Page 18 of 181 FirstFirst ... 8151617181920212868118 ... LastLast
Results 426 to 450 of 4519

Thread: AMD Zambezi news, info, fans !

  1. #426
    Xtreme Member
    Join Date
    Oct 2007
    Location
    Philadelphia, PA
    Posts
    120
    Some people really have nothing better to do than to stir up some fanbois on the net :P


  2. #427
    Xtreme Enthusiast
    Join Date
    Oct 2005
    Location
    Melbourne, Australia
    Posts
    529
    Quote Originally Posted by Dolk View Post
    Actually no, it would not be possible to do integer execution while the FP is being executed as well. This was one of my exam questions for my CPU Archi class. I'll have to dig up my response, but it pretty much has to do with the scheduler and the PC.
    That's when you're dealing with a traditional core with one FPU and one integer unit.
    Also, do traditional cores have a single scheduler for the integer plus FPU?

    What I'm talking about is, if you have a 256-bit AVX instruction, could AMD just make it so it just registers as using one of the two cores in the module? even though it needs to use both the 128-bit FPU pipelines (which belong to two cores)


    So if the OS sees the 256-bit AVX instruction as just being processed by core 1, then it allows core 2 to process integer instructions
    Last edited by Apokalipse; 05-12-2011 at 04:40 AM.

  3. #428
    Xtreme Member
    Join Date
    Jan 2009
    Posts
    128
    No I don't think so. Both Core 1 and 2 would be in use (technically).
    (╯°□°)╯︵ ┻━┻

    currently has .2748428927348 internets
    create your own rules to find the answer

  4. #429
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    @Apokalipse, Dolk:
    Nice to see at least some technical discussion going on, other than analyzing the wave of faked BD screenshots ^^

    To your discussion:
    Well, after providing 2 instruction streams to a BD module, the OS is kind of out of the loop of executing them.

    The fetch/decode/dispatch path works on 2 threads, capable of switching between them on a per-cycle-basis. The dispatch unit dispatches groups of "cops" (complex ops, AMD term) belonging to either thread (I hope, my English skills aren't killing the message here ). Such a dispatch group might contain ALU, AGU and FP ops of one thread. I assume the FP ops or the according dispatch group have a tag denoting their respective thread.

    From now on the integer and the FP schedulers are kind of independently working on the cops/uops they received, until control flow changes. The FP scheduler "sees" a lot of uops asking for ready inputs and producing outputs. Except for control flow changes it actually doesn't even need to know to which thread an op belongs to. The thread relation is assured by a register mapping (has to be done for OOO execution anyway). So the FPU could execute uops as their inputs become available.

    If there are 256b ops, they're already decoded as 2 uops ("Double decode" type) to do calculations on both 128b wide halves of the SIMD vector. I don't think that there is an explicit locking mechanism. The FPU could basically issue both uops in the same cycle or - if there are ready uops of the other thread - issue them to one 128b unit during 2 consecutive cycles (the software optimization manual show latencies of 256b ops supporting this). This actually depends on the implemented policy: issue uops based on their age only or issue them ensuring balanced execution of both threads.

    But what's important to your discussion: this happens independently from both integer cores.
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/

  5. #430
    Xtreme Mentor
    Join Date
    Nov 2005
    Location
    Devon
    Posts
    3,437
    Thanks for nice explanation Dresden!

    Hopefully not too long till we will be able to torture this design with variety of workloads and see how it performs. Also I think OS scheduling will be quite important for overlay CPU performance due to module based nature of it, but that's the same for any HT enabled CPU.

    Overlay I hope AVX performance won't be that far off 4 core SB while SSE should be higher. Now we just need some AVX enabled apps to start rolling out in bigger quantities.
    RiG1: Ryzen 7 1700 @4.0GHz 1.39V, Asus X370 Prime, G.Skill RipJaws 2x8GB 3200MHz CL14 Samsung B-die, TuL Vega 56 Stock, Samsung SS805 100GB SLC SDD (OS Drive) + 512GB Evo 850 SSD (2nd OS Drive) + 3TB Seagate + 1TB Seagate, BeQuiet PowerZone 1000W

    RiG2: HTPC AMD A10-7850K APU, 2x8GB Kingstone HyperX 2400C12, AsRock FM2A88M Extreme4+, 128GB SSD + 640GB Samsung 7200, LG Blu-ray Recorder, Thermaltake BACH, Hiper 4M880 880W PSU

    SmartPhone Samsung Galaxy S7 EDGE
    XBONE paired with 55'' Samsung LED 3D TV

  6. #431
    I am Xtreme FlanK3r's Avatar
    Join Date
    May 2008
    Location
    Czech republic
    Posts
    6,823
    new article at AMDblog about states (C6)
    http://blogs.amd.com/work/2011/05/16...AMD+at+Work%29
    ROG Power PCs - Intel and AMD
    CPUs:i9-7900X, i9-9900K, i7-6950X, i7-5960X, i7-8086K, i7-8700K, 4x i7-7700K, i3-7350K, 2x i7-6700K, i5-6600K, R7-2700X, 4x R5 2600X, R5 2400G, R3 1200, R7-1800X, R7-1700X, 3x AMD FX-9590, 1x AMD FX-9370, 4x AMD FX-8350,1x AMD FX-8320,1x AMD FX-8300, 2x AMD FX-6300,2x AMD FX-4300, 3x AMD FX-8150, 2x AMD FX-8120 125 and 95W, AMD X2 555 BE, AMD x4 965 BE C2 and C3, AMD X4 970 BE, AMD x4 975 BE, AMD x4 980 BE, AMD X6 1090T BE, AMD X6 1100T BE, A10-7870K, Athlon 845, Athlon 860K,AMD A10-7850K, AMD A10-6800K, A8-6600K, 2x AMD A10-5800K, AMD A10-5600K, AMD A8-3850, AMD A8-3870K, 2x AMD A64 3000+, AMD 64+ X2 4600+ EE, Intel i7-980X, Intel i7-2600K, Intel i7-3770K,2x i7-4770K, Intel i7-3930KAMD Cinebench R10 challenge AMD Cinebench R15 thread Intel Cinebench R15 thread

  7. #432
    Xtreme Member
    Join Date
    Aug 2009
    Posts
    431
    Quote Originally Posted by FlanK3r View Post
    new article at AMDblog about states (C6)
    http://blogs.amd.com/work/2011/05/16...AMD+at+Work%29
    Isn't that basically the same info that was already posted back in Feb. ?

  8. #433
    Xtreme Member
    Join Date
    Jul 2010
    Posts
    399
    Yes, but one interesting thing stood out for me:

    With the new “Bulldozer” core, when both cores in a module are idle for a pre-determined period of time, the power is gated off to the module. Our design engineers estimate that this will drop the power consumption by up to 95% in idle over the previous generation of processor cores
    I wonder how's that estimated, what's the best case scenario here. Possibly a 12 core Opteron (what are they idling at?), compared to a .... any Bulldozer with all but one module sleeping?

  9. #434
    Xtreme Member
    Join Date
    Sep 2010
    Posts
    139
    Quote Originally Posted by DarthShader View Post
    Yes, but one interesting thing stood out for me:


    I wonder how's that estimated, what's the best case scenario here. Possibly a 12 core Opteron (what are they idling at?), compared to a .... any Bulldozer with all but one module sleeping?
    Another interesting bit about the article is JF commenting on someone and thus also confirming turboboost over all cores are mostly related to non-fpu workloads. e.g. all integer applications will benifit from (at least) 500MHz higher clock. (~16% extra performance through turboost). Most people already speculated the turbo was related to the fpu but now it is somehwat confirmed.

  10. #435
    Xtreme Member
    Join Date
    Sep 2008
    Location
    Eastern Tennessee (from Minnesota)
    Posts
    241

    Red face

    Quote Originally Posted by DarthShader View Post
    I wonder how's that estimated, what's the best case scenario here. Possibly a 12 core Opteron (what are they idling at?), compared to a .... any Bulldozer with all but one module sleeping?
    Wonder if that is anything like the Bobcat power draw I saw happening at idle...

    The ASRock E350M-1 with the iGPU enabled would drop down to around only 0.836V when idle! Load was as high as 1.36V, but that wasn't ever constant either. This was all while running with C'n'Q OFF and Windows Power Plan set to Min-100% and Max-100% on CPU, so 0.8xxV at full speed (1600MHz) for dual core o_0

    On the Sapphire IPC-E350M1 I received, the iGPU was bad in some way The system had corrupted graphics immediate at POST, and would lock up on even the BIOS logo. Tossing in a discrete card and hastily turning off the iGPU in the BIOS (before it crashed) rectified the issue, which BTW a BIOS update didn't fix either. Anyways, point of that is the fact the iGPU is disabled and Sapphire's method completely turns it off, compared to ASRock which seems to just... I dunno... remove it's device ID so Windows doesn't detect it? Same situation as above, the idle voltage with no iGPU is something like 0.58V!! Again, still at 1600MHz, just idling in Windows (with programs open, not just after a fresh boot).

    The other thing is the voltage is not static, but fluctuates between specific voltages based on load. I had seen basically everything between the .5V to 1.35V (on the Sapphire) in 0.1V increments, with the typical range being around 1.15V to 1.25V for your every-day web browsing. Essentially what seemed to be occuring was the system automatically switching on/off power phases, like what a number of motherboards now do. Granted, what I saw might be exclusive to AMD's mobile line, but I have a feeling we'll see close to the same results with most/all the Bulldozer platforms too. Hell that might be one the main reason for 990FX!

    [/rambling]

  11. #436
    Xtreme Member
    Join Date
    Jul 2010
    Posts
    399
    You might want to measure the voltages with a Voltometer, if possible on that board, cause I don't think going as low as you describe is possible with silicone based semiconductors, but I might be wrong.

  12. #437
    Xtreme Member
    Join Date
    Sep 2008
    Location
    Eastern Tennessee (from Minnesota)
    Posts
    241
    Quote Originally Posted by DarthShader View Post
    You might want to measure the voltages with a Voltometer, if possible on that board, cause I don't think going as low as you describe is possible with silicone based semiconductors, but I might be wrong.
    I'd be more than happy to, but I'm not entirely sure where I'd need to measure at :\ I have no qualms about poking around, even at the small SMD caps, but if I took that approach it'd literally take me a week to get lucky enough to find it lol I'll drag out my spare PSU and HDD, boot the board and probe around the major components (MOSFETs to start).

    Quote Originally Posted by Dresdenboy View Post
    @Apokalipse, Dolk:
    Nice to see at least some technical discussion going on, other than analyzing the wave of faked BD screenshots ^^

    To your discussion:
    Well, after providing 2 instruction streams to a BD module, the OS is kind of out of the loop of executing them.

    The fetch/decode/dispatch path works on 2 threads, capable of switching between them on a per-cycle-basis. The dispatch unit dispatches groups of "cops" (complex ops, AMD term) belonging to either thread (I hope, my English skills aren't killing the message here ). Such a dispatch group might contain ALU, AGU and FP ops of one thread. I assume the FP ops or the according dispatch group have a tag denoting their respective thread.

    From now on the integer and the FP schedulers are kind of independently working on the cops/uops they received, until control flow changes. The FP scheduler "sees" a lot of uops asking for ready inputs and producing outputs. Except for control flow changes it actually doesn't even need to know to which thread an op belongs to. The thread relation is assured by a register mapping (has to be done for OOO execution anyway). So the FPU could execute uops as their inputs become available.

    If there are 256b ops, they're already decoded as 2 uops ("Double decode" type) to do calculations on both 128b wide halves of the SIMD vector. I don't think that there is an explicit locking mechanism. The FPU could basically issue both uops in the same cycle or - if there are ready uops of the other thread - issue them to one 128b unit during 2 consecutive cycles (the software optimization manual show latencies of 256b ops supporting this). This actually depends on the implemented policy: issue uops based on their age only or issue them ensuring balanced execution of both threads.

    But what's important to your discussion: this happens independently from both integer cores.
    To expand on this bit of info, I found this when looking for hints on 'Dozer pricing with Google lol It's from a 20 Questions AMD blog, from Aug 2010...

    Quote Originally Posted by John Fruehe@AMD[B
    ]“Is there any”programmable-tangible” improvement in synchronization between cores in the same module? In other words, will I get tangible performance improvement if I can partition my multi-threaded algorithm to pairs of closely interacting threads, and schedule each pair to a module?”[/B] – Edward Yang

    That is a very interesting question.

    For the majority of software, the OS will work in concert with the processor to manage the thread to core relationships. We are collaborating with Microsoft and the open source software community to ensure that future versions of Windows and Linux operating systems will understand how to enumerate and effectively schedule the Bulldozer core pairs. The OS will understand if your machine is setup for maximum performance or for maximum performance/watt which takes advantage of Core Performance Boost.

    However, let’s say you want to explore if you can get a performance advantage if your threads were scheduled on different modules. The benefit you can gain really depends on how much sharing the two threads are going to do.

    Since the two integer cores are completely separate and have their own execution clusters (pipelines) you get no sharing of data in the L1 – and there is no specific optimizations needed at the software level. However, at the L2 cache level there could be some benefits. A shared L2 cache means that both cores have access to read the same cache lines – but obviously only one can write any cache line at any time. This means that if you have a workload with a main focus of querying data and your two threads are sharing a data set that fits in our L2, then having them execute in the same module could have some advantages. The main advantage we expect to see is an increase in the power efficiency of the cores that are idle. The more idle other cores are, the better chance the busy cores will have to boost.

    However, there is another consideration to this which is how available other cores are. You need to weigh the benefits of data sharing with the benefit of starting the thread on the next available core. Stacking up threads to execute in proximity means that a thread might be waiting in line while an open core is available for immediate execution. If your multi-threaded application isn’t optimized to target the L2 (or possibly the L3 cache), or you have distinctly separate applications to run, and you don’t need to conserve power, then you’ll likely get better performance by having them scheduled on separate modules. So it is important to weigh both options to determine the best execution.
    It seems as though we won't really be able to utilize a Bulldozer chip to it's full extent until Windows 8 I'm no coder, but I have a feeling it's not going to be as easy a thing to do to make Windows 7 capable of that where M$ can just roll out a 50MB patch, but I'll cross my fingers just in case!!

    EDIT: JUST in case there was any confusion still, here are some comments he posted from the same Blog entry:

    "John Fruehe September 13, 2010
    A module will have 2 cores in it. It will be seen by the hardware and software as 2 cores. The module will essentially be invisible to the system.
    "

    "John Fruehe September 8, 2010
    We have already said one core per thread, period. But a single thread gets all of the front end, all of the FPU and all of the L2 cache if there is not a second thread on the module.
    "
    Last edited by Formula350; 05-17-2011 at 04:59 PM.

  13. #438
    Registered User
    Join Date
    Jul 2010
    Posts
    34
    Quote Originally Posted by DarthShader View Post
    Yes, but one interesting thing stood out for me:


    I wonder how's that estimated, what's the best case scenario here. Possibly a 12 core Opteron (what are they idling at?), compared to a .... any Bulldozer with all but one module sleeping?
    It could be worse: he might look at it at the core level, not the whole chip level. So he could be comparing the power consumption of two idle MC cores to a power gated BD module. If that's the case, the 95% figure wouldn't be too shocking.

  14. #439

  15. #440
    Xtreme Addict
    Join Date
    Jan 2009
    Location
    SF
    Posts
    1,070
    Quote Originally Posted by liberato87 View Post
    What do you think the msrp will be on that badboy?

  16. #441
    Registered User
    Join Date
    Mar 2011
    Location
    South Italy
    Posts
    89
    Quote Originally Posted by richierich View Post
    What do you think the msrp will be on that badboy?
    dunno
    I think on launch 150euro; and the crosshair V on 180-200euro.
    Sabertooth must cost 30-50euro less than the CH V otherwise nobody will prefer it to a CH V .
    But I think Sabertooth it will be very interesting looking the specs

    -sorry for my basic english

    EDIT

    posted here http://www.xtremesystems.org/forums/...72#post4851872

  17. #442
    Xtreme Member
    Join Date
    Sep 2008
    Location
    Eastern Tennessee (from Minnesota)
    Posts
    241
    Quote Originally Posted by liberato87 View Post
    Hmm, that isn't the TUF model shown in the leaked roadmap, unless they plan on having a TUF Armored and Non-TUF model o_0 I was kinda digging that P67 model heh
    EDIT: OK I guess technically it's a TUF, but I think everyone would consider a TUF Sabertooth to be synonymous with an Armored board :\

    Quote Originally Posted by liberato87 View Post
    dunno
    I think on launch 150euro; and the crosshair V on 180-200euro.
    Sabertooth must cost 30-50euro less than the CH V otherwise nobody will prefer it to a CH V.
    But I think Sabertooth it will be very interesting looking the specs
    Looks like the same image posted on page 14 or 15

    EDIT: Dug up a better pics



    Can anyone explain the function of the second oval crystal location next to the SB? Seems like all the AMD boards have two (at least the 4 I currently have, 890GX/FX and two E350), one being unpopulated... That for use if the manufacturer wants to offer the ability to adjust the SB's clock itself?

    Also somewhat disconcerting (IMO) is the SB heatsink looking plastic haha Does look to come in contact with whatever that other chip is, above the cap/crystal that sit to the right of the black SATA ports...
    Last edited by Formula350; 05-18-2011 at 10:18 AM.

  18. #443
    Xtreme Member
    Join Date
    Mar 2010
    Location
    Angeles/ HK/ Shenzen
    Posts
    444

  19. #444
    Xtreme Member
    Join Date
    Sep 2008
    Location
    Eastern Tennessee (from Minnesota)
    Posts
    241
    Had Jetway put a PCIe x1 above the first PEG slot and put on right-angle SATA ports, that'd be a really nice entry level 990X board I think.

    EDIT: Though one redeeming point in my eyes is that mini-PCIe slot! I'm becoming a real fan of those as it gives you the ability to run WiFi, cheaply, and not need a full sized ATX card. Which on this board would not be feasible if running Crossfire/SLI as the x1 and PCI slots would all be blocked (the second x1 wouldn't technically, but putting a card there might impede airflow).
    Last edited by Formula350; 05-18-2011 at 08:10 PM.

  20. #445
    Xtreme Cruncher
    Join Date
    Apr 2008
    Location
    Ohio
    Posts
    3,119
    With most of the cases I have used I like the placement of the SATA ports. They are above the Graphics card and I always have problems with access to the angled ports.
    ~1~
    AMD Ryzen 9 3900X
    GigaByte X570 AORUS LITE
    Trident-Z 3200 CL14 16GB
    AMD Radeon VII
    ~2~
    AMD Ryzen ThreadRipper 2950x
    Asus Prime X399-A
    GSkill Flare-X 3200mhz, CAS14, 64GB
    AMD RX 5700 XT

  21. #446
    Xtreme Mentor
    Join Date
    Dec 2007
    Location
    State of Confusion, USA
    Posts
    2,513
    Quote Originally Posted by charged3800z24 View Post
    With most of the cases I have used I like the placement of the SATA ports. They are above the Graphics card and I always have problems with access to the angled ports.
    I agree with you Charged!

    For someone who builds a machine and then never touches it again the angled ports usually result in a cleaner install, but for folks like us (most XS members) the angled SATA ports are a PITA.

    It's not my deciding factor when picking a board, but I'm not a big fan of the angled ports...
    They make swapping gear alot more difficult most of the time.
    AMD FX-8350 (1237 PGN) | Asus Crosshair V Formula (bios 1703) | G.Skill 2133 CL9 @ 2230 9-11-10 | Sapphire HD 6870 | Samsung 830 128Gb SSD / 2 WD 1Tb Black SATA3 storage | Corsair TX750 PSU
    Watercooled ST 120.3 & TC 120.1 / MCP35X XSPC Top / Apogee HD Block | WIN7 64 Bit HP | Corsair 800D Obsidian Case








    First Computer: Commodore Vic 20 (circa 1981).

  22. #447
    Xtreme Member
    Join Date
    May 2007
    Location
    Sweden
    Posts
    127
    It quite early info, but still official and very confusing and:

    http://www.anandtech.com/show/2881

    AMD refers to the module as being two tightly coupled cores, which starts the path of confusing terminology. A few of you wondered how AMD was going to be counting cores in the Bulldozer era; I took your question to AMD via email:

    Also, just to confirm, when your roadmap refers to 4 bulldozer cores that is four of these cores:

    http://images.anandtech.com/reviews/.../bulldozer.jpg

    Or does each one of those cores count as two? I think it's the former but I just wanted to confirm.

    AMD responded:

    Anand,

    Think of each twin Integer core Bulldozer module as a single unit, so correct.

    As said before, maybe AMD is happy to begin with a dual (two modules), tricore (three modules, one inactive) and a quadcore (4 modules) bulldozer and see the extra integer cores as "extra execution power" like Intel's Hyperthreading technology.

    Interlagos dual 4 module MCM cpu will be a server Opteron part, but maybe also the hexacore (two Bulldozer dies with one inactive module) and eightcore (two fully Bulldozer dies).

    Much talks for it, and much against it! One month left to go.
    Ivy Bridge 3770K @ ????MHz
    6c Intel Xeon X7460 24MB cache 16GB RAM 22TB HDD fileserver
    Dual Intel Xeon E5620 workstation
    SB 2600K @ 5016MHz 1.37v HT on AIR primestable
    AMD Athlon X3 425 @ B25 4GHz+ AIR
    AMD Athlon X2 6400+ @ 3811MHz AIR
    AMD Athlon X2 3600+ @ 3200MHz AIR
    AMD Athlon XP 1700+ @ 2714MHz AIR
    Thermalright Ultra-120 Extreme
    Corsair 8GB XMS3 2000MHz
    ATI Radeon HD5850 @ 1000MHz+/1200MHz+
    Windows 7 Enterprise x64
    Corsair HX750W

  23. #448
    Xtreme Member
    Join Date
    Apr 2005
    Location
    London, UK
    Posts
    261
    Hm, I remember John saying that they will not use modules for marketing, it will be cores.

  24. #449
    Xtreme Mentor
    Join Date
    Nov 2005
    Location
    Devon
    Posts
    3,437
    Quote Originally Posted by muziqaz View Post
    Hm, I remember John saying that they will not use modules for marketing, it will be cores.
    Yes, correct.

    We already know FX8, FX6 and FX4 monikers, what else do we need as a proof?
    RiG1: Ryzen 7 1700 @4.0GHz 1.39V, Asus X370 Prime, G.Skill RipJaws 2x8GB 3200MHz CL14 Samsung B-die, TuL Vega 56 Stock, Samsung SS805 100GB SLC SDD (OS Drive) + 512GB Evo 850 SSD (2nd OS Drive) + 3TB Seagate + 1TB Seagate, BeQuiet PowerZone 1000W

    RiG2: HTPC AMD A10-7850K APU, 2x8GB Kingstone HyperX 2400C12, AsRock FM2A88M Extreme4+, 128GB SSD + 640GB Samsung 7200, LG Blu-ray Recorder, Thermaltake BACH, Hiper 4M880 880W PSU

    SmartPhone Samsung Galaxy S7 EDGE
    XBONE paired with 55'' Samsung LED 3D TV

  25. #450
    Xtreme Member
    Join Date
    Jul 2004
    Location
    Berlin
    Posts
    275
    @Module, cores:
    It was even not a clear thing for AMD in the beginning.

    1. The engineers developed BD and called a module "core" (in patents), since in their view they developed a much bigger core than 10h and added a copy of integer execution resources (which is basically similar to a simplified CPU - not x86 compatible it just works on already translated (decoded) instructions like a RISC CPU). The module could also be seen as a heavily optimized dual core processor. It basically has everything, while sharing those parts, where sharing makes sense.

    2. Someone possibly thought that these small cores will perform much better than logical cores in a hypothetical SMT machine, so this needs to be marketed somehow.

    What they are depends on the applied definitions of "core".
    Now on Twitter: @Dresdenboy!
    Blog: http://citavia.blog.de/

Page 18 of 181 FirstFirst ... 8151617181920212868118 ... LastLast

Bookmarks

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •