
Thread: AMD cuts to the core with 'Bulldozer' Opterons


  1. #1
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by -Sweeper_ View Post
    An ~11% penalty is not what we'd get even with "independent" cores?
    That's the penalty due to the shared front end. You get around a 10% penalty for much, much less die space investment (no need for another front-end stage, the int cores can use the full potential of the 2 FMAC units, shared L2 per module, etc.).
    Also, this means that each core inside a module performs a bit better on its own (not counting the new Turbo that BD will have).

    Quote Originally Posted by -Boris- View Post
    But they perform the same when bottlenecked?
    So L3 matters, but performance doesn't? A Phenom II 910 and a Core i7 980X will perform the same in games in 2014? I have a feeling that the benchmarks made today which aren't bottlenecked will give a good clue about the 980X scenario.




    This is all wrong; there are no numbers for scaling across multiple modules. The only number we have is within a module, between one core and two.
    And 1.8x is 90% performance per core.

    How can you get from 80% scaling to an 11% penalty? The numbers are 90% and 10%.
    The Phenom II 910 is a quad core; the other is a 6-core Westmere. Yes, L3 matters, and you can see this with Agena vs Deneb. By 2014 games will use more than 4 cores, so yes, it (Westmere) will be the better performer, but not for the reasons you believe.

    I can't see what you don't understand. You have scaling within a module. Anything outside the module should behave the same as Lisbon does today (scaling to 6 cores which communicate over a shared L3). Each module is a "super core", if you will, and each of those will scale the same, hence 4 x 1.8 (or 8 x 0.9, since you like the 90% number more).
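    The arithmetic being argued here is easy to spell out (a sketch with the assumed numbers from this thread, not measurements):

```python
# Sketch of the module-scaling arithmetic from the discussion above.
# Assumed numbers: ~10% penalty per core from the shared front end,
# and modules scaling across the chip like independent cores do today.
def module_throughput(cores_per_module=2, per_core=1.0, penalty=0.10):
    """Throughput of one module when both cores share a front end."""
    return cores_per_module * per_core * (1 - penalty)

def chip_throughput(modules=4, **kw):
    """Whole-chip throughput, assuming modules scale independently."""
    return modules * module_throughput(**kw)

print(module_throughput())  # ~1.8: the '1.8x within a module' figure
print(chip_throughput())    # ~7.2: 4 x 1.8, or equivalently 8 x 0.9
```

    Under these assumptions the "90% per core" and "10% penalty" figures are the same claim stated two ways.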

    Quote Originally Posted by savantu View Post
    Back again to the same old stuff: ICC 8.0 did a check for vendor ID; newer versions (currently ICC 10) have the check removed and will check for feature flags (basically, whatever the CPU supports, the compiler will throw at it). However, Intel claims no responsibility for code quality and bugs.
    They say the check in 8.0 was introduced simply because AMD did not give them the detailed errata list for their CPUs (apparently AMD refrains from sending samples to Intel for validation).
    It would be like AMD sampling BD to Intel now so future updates to Intel's compiler can support BD features.
    You should read Agner Fog's latest blog posts, then.

    Quote Originally Posted by Agner Fog
    Intel have released a new version of their Math Kernel Library (v. 10.3) in beta test.

    I have tested the new libraries and found that the CPU dispatching works basically the same way as before. The standard math library, vector math library, short vector math library and the 64-bit version of other math kernel library functions still use an inferior code path for non-Intel processors.

    I have found the following differences from previous versions:

    * Many functions now have a branch for the forthcoming AVX instruction set, but still only for Intel processors. This will increase the difference in performance between Intel and AMD processors on these functions. Both Intel and AMD are planning to support AVX in 2011.

    * The CPU dispatcher for the vector math library has a new branch for non-Intel processors with SSE2. Unlike the generic branch, the new non-Intel SSE2 branch is used only on non-Intel processors, and it is inferior in many cases to the branch used by Intel processors with the same instruction set. The non-Intel SSE2 branch is implemented in the 32-bit Windows version and the 32-bit Linux version, but not in the 64-bit versions of the library.

    * A new Summary Statistics library uses the same CPU dispatcher as the vector math library.

    Obviously, I haven't tested all functions in the library. There may be more differences that I haven't discovered. But it is clear that many functions in the new version of the library still cripple performance on non-Intel processors. I don't understand how they can do this without violating the legal settlement with AMD.
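    The dispatching pattern Agner Fog describes can be modelled like this (a hypothetical sketch of the two strategies, not Intel's actual code; the path names are made up):

```python
# Hypothetical model of the two dispatching strategies discussed above.
# Vendor-gated dispatch sends non-Intel CPUs down a generic path even
# when they report the same feature flags; feature-based dispatch picks
# the code path from the flags alone, regardless of vendor string.

def dispatch_by_vendor(vendor, features):
    """The behaviour Agner Fog observed: fast paths are gated on the
    vendor string, not on what the CPU actually supports."""
    if vendor == "GenuineIntel":
        if "avx" in features:
            return "avx_path"
        if "sse2" in features:
            return "sse2_path"
    return "generic_path"

def dispatch_by_features(vendor, features):
    """What a vendor-neutral dispatcher would do instead."""
    if "avx" in features:
        return "avx_path"
    if "sse2" in features:
        return "sse2_path"
    return "generic_path"

# Same feature set, different vendor string:
print(dispatch_by_vendor("GenuineIntel", {"sse2"}))    # sse2_path
print(dispatch_by_vendor("AuthenticAMD", {"sse2"}))    # generic_path
print(dispatch_by_features("AuthenticAMD", {"sse2"}))  # sse2_path
```

    This is why removing the vendor-ID check in later ICC versions matters: the first function degrades an SSE2-capable AMD chip to the generic path; the second does not.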
    Last edited by informal; 08-05-2010 at 08:45 AM.

  2. #2
    Xtreme Enthusiast
    Join Date
    Oct 2008
    Posts
    678
    Quote Originally Posted by informal View Post
    The Phenom II 910 is a quad core; the other is a 6-core Westmere. Yes, L3 matters, and you can see this with Agena vs Deneb. By 2014 games will use more than 4 cores, so yes, it (Westmere) will be the better performer, but not for the reasons you believe.
    Stop this nonsense. You claim total performance doesn't matter in the real world, with this as proof?!
    But then you claim that L3 has an invisible impact, and core count too?
    Let me get this straight. Say I have my Phenom II running a bunch of games today, clocked at 2GHz and at 4.8GHz in two runs. Limited by my GPU, I get the exact same FPS in both cases. Will both perform equally in the games released in 2014? Same core count, same cache, same everything except frequency.

    Quote Originally Posted by informal View Post
    I can't see what you don't understand. You have scaling within a module. Anything outside the module should behave the same as Lisbon does today (scaling to 6 cores which communicate over a shared L3). Each module is a "super core", if you will, and each of those will scale the same, hence 4 x 1.8 (or 8 x 0.9, since you like the 90% number more).
    I'm not too sure; we don't know how fast the L3 is. It could be much improved. I often see the i7 with 3 times the bandwidth. A faster L3 could improve scaling.
    Last edited by -Boris-; 08-05-2010 at 09:03 AM.

  3. #3
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by -Boris- View Post
    Stop this nonsense. You claim total performance doesn't matter in real world, with this as proof?!
    But after that you claim that L3 has an invisible impact, and core count?
    Let me get this straight. If I have my Phenom II running a bunch of games today clocked at 2GHz and 4.8GHz in two runs. Limited by my GPU I get the exact same FPS in both cases. Will both perform equally in the games realesed in 2014? Same core count, same cache, same everything except frequency.
    What kind of game would run the same on a Deneb chip at 2GHz and at 4.8GHz?? There is no such game, and if there were, it would be extremely GPU-bound (I can't emphasize the word extremely enough). Both cache and core count are important (again, look at Agena vs Deneb, and look at C2D vs C2Q in modern games). There is a point where games stop scaling with CPU clocks, since the GPU (yes, even a 5970) starts to bottleneck and can't process enough data. This happens with both Deneb and Nehalem.

    Also, first you need to find and show me "a bunch of games today" that run the same at 2GHz and 4.8GHz. There are no such games, as I mentioned above. Second, even if there were, it wouldn't mean the games of 2014 would run the same at those two CPU clocks, since a) the games would be hardly playable with your current GPU if you stuck with it, and b) scaling would stop somewhere between those two frequencies if you bought a new GPU. The only way a game from 2014 could scale perfectly with clock speeds from 2 to 4.8GHz is if it were coded with excellent multi-core support and used all available CPU resources to the maximum (highly unlikely).

    I'm not too sure; we don't know how fast the L3 is. It could be much improved. I often see the i7 with 3 times the bandwidth. A faster L3 could improve scaling.
    L3 sharing policies are very different in Deneb/Thuban and Nehalem, so you can't just compare the bandwidth like that. AMD uses a victim (spill-over) cache, while Intel uses an inclusive one.
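    One concrete reason the raw numbers aren't comparable is effective capacity (a toy illustration with made-up sizes, not real Deneb/Nehalem figures):

```python
# Toy model of inclusive vs victim (exclusive) L3 capacity.
# Sizes are assumptions for the example, not real chip specs.
L2_KB, L3_KB = 512, 6144

# Inclusive L3 (Nehalem-style): every line in L2 is also kept in L3,
# so the duplicated L2 contents cost L3 capacity.
inclusive_unique_kb = L3_KB

# Victim L3 (Deneb-style): the L3 holds only lines evicted from L2,
# so L2 and L3 contents are disjoint and capacities add up.
victim_unique_kb = L2_KB + L3_KB

print(inclusive_unique_kb)  # unique data cached: 6144 KB
print(victim_unique_kb)     # unique data cached: 6656 KB
```

    Inclusion buys cheaper snoop filtering at the cost of duplicated capacity; a victim cache trades that the other way, which is why bandwidth alone doesn't tell you which design scales better.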

  4. #4
    Xtreme Enthusiast
    Join Date
    Oct 2008
    Posts
    678
    Quote Originally Posted by informal View Post
    What kind of game would run the same on a Deneb chip at 2GHz and at 4.8GHz?? There is no such game, and if there were, it would be extremely GPU-bound (I can't emphasize the word extremely enough). Both cache and core count are important (again, look at Agena vs Deneb, and look at C2D vs C2Q in modern games). There is a point where games stop scaling with CPU clocks, since the GPU (yes, even a 5970) starts to bottleneck and can't process enough data. This happens with both Deneb and Nehalem.

    Also, first you need to find and show me "a bunch of games today" that run the same at 2GHz and 4.8GHz. There are no such games, as I mentioned above. Second, even if there were, it wouldn't mean the games of 2014 would run the same at those two CPU clocks, since a) the games would be hardly playable with your current GPU if you stuck with it, and b) scaling would stop somewhere between those two frequencies if you bought a new GPU. The only way a game from 2014 could scale perfectly with clock speeds from 2 to 4.8GHz is if it were coded with excellent multi-core support and used all available CPU resources to the maximum (highly unlikely).


    L3 sharing policies are very different in Deneb/Thuban and Nehalem, so you can't just compare the bandwidth like that. AMD uses a victim (spill-over) cache, while Intel uses an inclusive one.
    I admit, I exaggerated a bit. Say a 2.8GHz Phenom II and a 4.8GHz one. Would they handle the games released in 2014 equally?


    And I know there are differences between the i7 and Phenom II caches. My point is we don't know if BD has 48-way, 64-way or 96-way caches. We don't know if they operate around 2GHz or if they are at core speed. We know nothing about the speed of communication between modules.

  5. #5
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by -Boris- View Post
    I admit, I exaggerated a bit. Say a 2.8GHz Phenom II and a 4.8GHz one. Would they handle the games released in 2014 equally?


    And I know there are differences between the i7 and Phenom II caches. My point is we don't know if BD has 48-way, 64-way or 96-way caches. We don't know if they operate around 2GHz or if they are at core speed. We know nothing about the speed of communication between modules.
    2.8GHz to 4.8GHz is a 71% improvement, and this brings only a very minor improvement in today's games. In future games I think GPU-based physics (not necessarily meaning NV's approach, just using the term) will play a much bigger role, and then, again, the CPU will become even less of a factor than it is today. What will matter is a number of relatively fast cores (IMO 4) and cache (by this I mean at least Penryn- and Deneb-class at 2.8-3GHz). With physics offloaded to the GPU and massively shader-heavy game engines, the CPU will use additional cores to offload AI, for example, among other things. Also, we have to keep in mind that games may use the new AVX instruction set, and this may play a role in how a certain chip performs.
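    For the record, the clock-speed arithmetic above works out like this:

```python
# The 2.8GHz -> 4.8GHz comparison, spelled out.
base_ghz, oc_ghz = 2.8, 4.8
improvement = oc_ghz / base_ghz - 1  # relative gain over the base clock
print(f"{improvement:.0%}")  # 71%
```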

  6. #6
    V3 Xeons coming soon!
    Join Date
    Nov 2005
    Location
    New Hampshire
    Posts
    36,363
    Slightly OT, but from what I see, AMD has an excellent product with the MCs; they just need to get the clocks up.
    A 12-core MC at 3000MHz would be a force to be reckoned with.
    Crunch with us, the XS WCG team
    The XS WCG team needs your support.
    A good project with good goals.
    Come join us,get that warm fuzzy feeling that you've done something good for mankind.

    Quote Originally Posted by Frisch View Post
    If you have lost faith in humanity, then hold a newborn in your hands.

  7. #7
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by Movieman View Post
    Slightly OT, but from what I see, AMD has an excellent product with the MCs; they just need to get the clocks up.
    A 12-core MC at 3000MHz would be a force to be reckoned with.
    Yeah, that's true. The only problem is they can't crank the clocks that high AND stay in the 105W ACP bracket. What I find interesting is that MC and Lisbon (D1) appear to still not be using the low-k dielectric tweak AMD implemented in Thuban silicon (E0), which practically made it possible for AMD to build a hex-core desktop chip with the same clock as the QC equivalent (955BE; 3.6GHz on 3 cores with Turbo) while staying within the 125W TDP and actually drawing less than the 955BE under full load. If they were to use this major process-node tweak on server parts, I think they could bring the clocks on MC up by at least 15% while staying in the same power bands as today.

    Quote Originally Posted by -Boris- View Post
    So you believe that there will be no practical difference between the two? You are very vague.
    No, I believe the difference won't be big or impact gameplay that much. Not to mention there won't be a 4.8GHz Deneb chip to actually compare this with (unless you do a dry-ice session and test it yourself). But I figure many will be looking to replace their SB or Bulldozer desktop chips with something new in 2014, so testing a Deneb on dry ice with the latest GPUs wouldn't be on anyone's priority list.
    Last edited by informal; 08-05-2010 at 10:48 AM.

  8. #8
    Xtreme Enthusiast
    Join Date
    Dec 2007
    Posts
    816
    Quote Originally Posted by Movieman View Post
    Slightly OT, but from what I see, AMD has an excellent product with the MCs; they just need to get the clocks up.
    A 12-core MC at 3000MHz would be a force to be reckoned with.
    More info here: http://www.youtube.com/watch?v=V0UcQDUR-fU
    Last edited by Drwho?; 08-08-2010 at 06:58 PM. Reason: Joking ... ;-)
    DrWho, The last of the time lords, setting up the Clock.
