Page 94 of 181
Results 2,326 to 2,350 of 4519

Thread: AMD Zambezi news, info, fans !

  1. #2326
    Xtreme Cruncher
    Join Date
    Jun 2008
    Location
    pacific NW usa
    Posts
    2,764
    Quote Originally Posted by informal View Post
    Well, if the Opteron 6220 results are true, as xsecret claims, then the C11.5 result should roughly follow the same route as SiSoft's MM benchmark. I say roughly since I have no idea what the ratio of memory and SIMD instructions is in these tests. If it does have a similar ratio, then instead of 5.24pts one should get 1.67x the result of the 1100T when running C11.5 on an FX8150 @ 3.6GHz. Or in numbers: 9.85pts.
    9.85pts is dangerously reminiscent of this (an early slide detailing the Scorpius platform and Zambezi's advantage over Thuban in 3 benchmarks; the slide was from Dec 2010 and pointed roughly at 10pts in C11.5 for an 8C Zambezi @ unknown clock).

    9.85 seems a little high, doesn't it?
    I would expect around 7.50 to 8.00 tops (best case scenario)
    and a more realistic score of 6.50 to 7.00?
    Then again, I don't know this stuff too well.
    _________________________________________________
    ............................ImAcOmPuTeRsPoNgE............................

    MY HEATWARE 76-0-0

  2. #2327
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by xsecret View Post
    You should be aware that Zambezi Turbo mode is not that simple. There are 2 "main" levels of Turbo. For example, a 3.6 GHz base CPU can reach 3.9 GHz with all cores used as long as TDP remains under a specified value AND can reach 4.2 GHz in single-core mode. So, depending on the usage in an 8-thread application, you can be at 3.6 GHz or 3.9 GHz.
    I actually used 3.8GHz in the above post, so Turbo for the 8150 is figured in.
    Also, I was under the impression that when you have 8 FP-heavy threads, like in the Multimedia benchmark from SiSoft or Cinebench, there won't be any turbo engaging and the chip will run at default (3.6GHz).
    In any case, the Opteron SSE/AVX results completely disprove the FX8120 score of 5.24pts in C11.5. It doesn't make sense that in one FP-heavy benchmark Zambezi kicks Thuban's ass (like in the SiSoft one, where the 8150 @ 3.6GHz is 67% faster than the 1100T) while in another it is practically slower than the same chip or barely faster (6pts for the 8150 according to xsecret and the Chinese leaks vs 5.91pts for the 1100T).

  3. #2328
    Registered User
    Join Date
    Feb 2005
    Posts
    39
    Quote Originally Posted by informal View Post
    I actually used 3.8GHz in the above post, so Turbo for the 8150 is figured in.
    Also, I was under the impression that when you have 8 FP-heavy threads, like in the Multimedia benchmark from SiSoft or Cinebench, there won't be any turbo engaging and the chip will run at default (3.6GHz).
    In any case, the Opteron SSE/AVX results completely disprove the FX8120 score of 5.24pts in C11.5. It doesn't make sense that in one FP-heavy benchmark Zambezi kicks Thuban's ass (like in the SiSoft one, where the 8150 @ 3.6GHz is 67% faster than the 1100T) while in another it is practically slower than the same chip or barely faster (6pts for the 8150 according to xsecret and the Chinese leaks vs 5.91pts for the 1100T).
    If Sandra uses AVX in the MM Benchmark, that's not strange. Thuban doesn't have AVX and all benchmarks using it are 10 times slower.
    Doc_TB @ CanardPC.Com (FR)

  4. #2329
    Xtreme Cruncher
    Join Date
    Jun 2008
    Location
    pacific NW usa
    Posts
    2,764
    So going from 3.1GHz to 3.6GHz, do these 2 CB scores seem to be in line with reality?


    _________________________________________________
    ............................ImAcOmPuTeRsPoNgE............................

    MY HEATWARE 76-0-0

  5. #2330
    Xtreme Member
    Join Date
    Apr 2007
    Location
    Serbia
    Posts
    102
    Quote Originally Posted by xsecret View Post
    If Sandra uses AVX in the MM Benchmark, that's not strange. Thuban doesn't have AVX and all benchmarks using it are 10 times slower.
    Why do you think that AVX is so much more powerful than SSE? A Thuban core and a BD module can execute the same number of raw FLOPS. AVX and SSE are vectorised packed FP instructions. A BD module can execute one 256-bit AVX instruction, which contains 4 DP FP operations, the same as two 128-bit AVX or SSE instructions. In some cases 256-bit AVX can be faster, but by how much? Two times...

    Quote Originally Posted by radaja View Post
    So going from 3.1GHz to 3.6GHz, do these 2 CB scores seem to be in line with reality?
    CB scales perfectly with frequency. 3.6/3.1*5.24 = 6.08. Either something is wrong with these results or the reported CPU frequency isn't accurate. Actually, I think the real frequency is much lower than CPU-Z's readings.
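The projection above is plain linear clock scaling. A minimal sketch of that arithmetic, assuming (as the post does) that Cinebench scales perfectly with frequency; the function name `scale_score` is made up for illustration:

```python
def scale_score(score, clock_from_ghz, clock_to_ghz):
    """Project a benchmark score to another clock, assuming perfectly linear scaling."""
    return score * clock_to_ghz / clock_from_ghz


# 5.24pts is the leaked C11.5 score at 3.1 GHz quoted in the thread.
projected = scale_score(5.24, 3.1, 3.6)
print(f"Projected C11.5 score at 3.6 GHz: {projected:.3f}")
```

Real chips rarely scale perfectly (memory and turbo behaviour get in the way), so this is best read as an upper bound for a pure clock bump.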
    Last edited by drfedja; 09-11-2011 at 03:22 PM.
    "That which does not kill you only makes you stronger." ---Friedrich Nietzsche
    PCAXE

  6. #2331
    Xtreme 3D Team
    Join Date
    Jan 2009
    Location
    Ohio
    Posts
    8,499
    @ radaja
    yes.

    13% increase in performance over 12.5% increase in base clock speed. Factor complex turbo in and it seems logical to me.

  7. #2332
    Xtreme Member
    Join Date
    Aug 2004
    Posts
    210
    Quote Originally Posted by drfedja View Post
    Why do you think that AVX is so much more powerful than SSE? A Thuban core and a BD module can execute the same number of raw FLOPS. AVX and SSE are vectorised packed FP instructions. A BD module can execute one 256-bit AVX instruction, which contains 4 DP FP operations, the same as two 128-bit AVX or SSE instructions. In some cases 256-bit AVX can be faster, but by how much? Two times...
    Yes, AVX would do nothing, but FMA could be the big difference. SiSoft normally programs special code for each CPU, so on BD it should use XOP & FMA.

  8. #2333
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by xsecret View Post
    If Sandra uses AVX in the MM Benchmark, that's not strange. Thuban doesn't have AVX and all benchmarks using it are 10 times slower.
    I don't know if you have followed the Bulldozer threads, but Bulldozer actually has the same throughput in all 3 modes: legacy SSE, AVX 128-bit and AVX 256-bit. This is because of the way AMD designed their FPU (or FlexFP, as they call it). You have 8 of these FMACs in an 8-core chip. All of them are 128 bits wide. 128-bit AVX usually carries very little to no performance benefit over standard SSE (think 5-10%). This is even seen in the leaked Zambezi SiSoft numbers:
    Attachment 119979
    As you can see, it is 11% faster in 256-bit AVX mode than in legacy SSE (128-bit) mode.
    With Bulldozer, when you go to 256-bit AVX you may even incur a small penalty, but this is not the norm (compiler patches state up to a 3% penalty, and AMD encourages devs to use AVX 128 instead of the 256-bit one).
    So the point is: AVX (both 128- and 256-bit) brings nothing or close to nothing, since Bulldozer has the same peak flops in all 3 modes I listed.
    The only difference is FMA-recompiled software, which can bring an additional 2x performance over AVX 128. At least this is what AMD listed in their HPC documents from last year. I can't find the PDF, but I can link to a recent presentation which included a slide on FlexFP. A picture is worth a thousand words:
    Attachment 119978
    As you can see, same peak flops in all 3 cases. I rest my case.

    BTW, the leak that I linked above showed that Zambezi @ 2.8GHz had 132mpix/s for the SSE score and 147 for AVX. I already showed that the Opterons score better than this (10% higher than Zambezi). There is no Turbo in heavy FP/SIMD mode, mind you. If you use the 132 score as the base and not 147 (the AVX one), you get for 3.6GHz: 132x3.6/2.8=170mpix/s vs 115 for the 1100T. That is 48% better, and based on the Zambezi leak (not the Opteron's score). 1.48x 5.91pts (Thuban score) = 8.74pts. This is still miles ahead of what you claim and what the Chinese leaks show. Again, remember that these numbers are based on the SSE score I linked above (so the legacy SSE code that Cinebench uses too).
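The chain of estimates in that last paragraph can be written out step by step. This is only a sketch under the post's own assumptions: linear clock scaling, and Cinebench tracking the Sandra SSE multimedia ratio. The variable names are illustrative, and the inputs are the leaked figures quoted in the thread, not measurements.

```python
def scale_score(score, clock_from_ghz, clock_to_ghz):
    """Project a benchmark score to another clock, assuming perfectly linear scaling."""
    return score * clock_to_ghz / clock_from_ghz


ZAMBEZI_SSE_AT_2_8 = 132.0  # mpix/s, leaked Zambezi SSE score at 2.8 GHz
THUBAN_MM = 115.0           # mpix/s, 1100T score quoted in the post
THUBAN_CB = 5.91            # C11.5 points, 1100T

zambezi_mm = scale_score(ZAMBEZI_SSE_AT_2_8, 2.8, 3.6)  # projected to 3.6 GHz
advantage = zambezi_mm / THUBAN_MM                      # ratio over Thuban
projected_cb = advantage * THUBAN_CB                    # implied C11.5 score

print(f"{zambezi_mm:.1f} mpix/s, {advantage:.2f}x, {projected_cb:.2f} pts")
```

Carrying the unrounded ratio through gives a result a couple of hundredths below the 8.74pts in the post, which multiplies by the rounded 1.48x; the conclusion is the same either way.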

  9. #2334
    Xtreme Member
    Join Date
    Apr 2007
    Location
    Serbia
    Posts
    102
    Quote Originally Posted by Opteron146 View Post
    Yes, AVX would do nothing, but FMA could be the big difference. SiSoft normally programs special code for each CPU, so on BD it should use XOP & FMA.
    I agree, but we don't know how SiSoft works with FMA and XOP turned on and off. We will know when we get BD on the bench table.

    Quote Originally Posted by informal View Post
    As you can see, same peak flops in all 3 cases. I rest my case.
    Yes, but what is the module count? For 64 DP FLOPS you must have 8 SB cores and 16 FlexFPs. That slide is BS, because there is no CPU with 32 BD cores, or 16 BD modules. Interlagos has 8 BD modules, or 8 FlexFPs, which can execute up to 32 DP FLOPS, or 64 SP FLOPS.
    If you compare an 8-core Xeon and a 16-core Interlagos, that slide makes sense.
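The module-count argument is easier to follow with the per-cycle arithmetic spelled out. A rough sketch, assuming two 128-bit FMAC pipes per module (which matches the "8 FMACs in an 8-core chip" figure earlier in the thread) and counting an FMA as two operations; `peak_flops_per_cycle` and its parameters are hypothetical names, not anything from AMD documentation:

```python
def peak_flops_per_cycle(modules, fmacs_per_module=2, fmac_width_bits=128,
                         element_bits=64, fma=False):
    """Peak FP operations per cycle for a chip built from identical modules.

    Assumes every FMAC pipe issues one packed instruction per cycle.
    """
    lanes = fmac_width_bits // element_bits  # elements per 128-bit pipe
    ops_per_lane = 2 if fma else 1           # FMA = multiply + add
    return modules * fmacs_per_module * lanes * ops_per_lane


print(peak_flops_per_cycle(8))                   # 8 modules, DP, no FMA
print(peak_flops_per_cycle(8, element_bits=32))  # 8 modules, SP, no FMA
print(peak_flops_per_cycle(8, fma=True))         # 8 modules, DP, with FMA
```

Under these assumptions, 8 modules give the 32 DP / 64 SP figures quoted above, and 64 DP is only reached when FMA is counted or the module count doubles.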

    Quote Originally Posted by BeepBeep2 View Post
    @ radaja
    yes.

    13% increase in performance over 12.5% increase in base clock speed. Factor complex turbo in and it seems logical to me.
    No, there is a 16% increase in clock speed (3.1GHz to 3.6GHz) and a 13% increase in performance. The gap between the increase in frequency and the increase in performance is too big, or the scaling is too bad.
    Last edited by drfedja; 09-11-2011 at 03:40 PM.
    "That which does not kill you only makes you stronger." ---Friedrich Nietzsche
    PCAXE

  10. #2335
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    @drfedja
    We already have Sisoft numbers for SSE and AVX/FMA. Sisoft uses AVX and doesn't use FMA since the speedup with AVX 256 versus SSE 128 is 11% (147/132.3).

  11. #2336
    Xtreme Member
    Join Date
    Apr 2007
    Location
    Serbia
    Posts
    102
    Quote Originally Posted by informal View Post
    @drfedja
    We already have Sisoft numbers for SSE and AVX/FMA. Sisoft uses AVX and doesn't use FMA since the speedup with AVX 256 versus SSE 128 is 11% (147/132.3).
    Yes, if those numbers are correct.
    "That which does not kill you only makes you stronger." ---Friedrich Nietzsche
    PCAXE

  12. #2337
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Well, they are correct in the sense that they show us what code path Zambezi runs (AVX and not FMA). Also, they kinda align with both Opteron 6200 series SiSoft results. The 2P 6282SE gets 585 @ 2.5GHz, which equates (with perfect scaling of 4x) to about 146, or 164mpix/s @ 2.8GHz (11% higher than the 8C Zambezi @ 2.8GHz). The 2P 6220 @ 3GHz gets 315mpix/s; with perfect scaling => 315/2=157.5mpix/s, or 147mpix/s @ 2.8GHz (exactly the same as that Zambezi @ 2.8GHz). So we can now say that the results of the Zambezi sample are true for SIMD and kinda off for the integer test.
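The normalization used here (divide the 2P result down to one 8-core chip, then rescale to 2.8GHz) can be sketched like this. It assumes perfect scaling across cores and sockets as well as with clock, which is the post's own simplification; the function name and the "8-core equivalents" parameter are illustrative.

```python
def per_chip_at_clock(total_score, eight_core_equivalents, clock_ghz, target_ghz):
    """Reduce a multi-socket score to one 8-core chip at a target clock.

    Assumes perfect scaling with both core count and frequency.
    """
    return total_score / eight_core_equivalents * target_ghz / clock_ghz


# 2P Opteron 6282SE: 2 x 16 cores = 4 eight-core equivalents, 2.5 GHz
opteron_6282se = per_chip_at_clock(585.0, 4, 2.5, 2.8)
# 2P Opteron 6220: 2 x 8 cores = 2 eight-core equivalents, 3.0 GHz
opteron_6220 = per_chip_at_clock(315.0, 2, 3.0, 2.8)

print(f"6282SE: {opteron_6282se:.1f} mpix/s, 6220: {opteron_6220:.1f} mpix/s")
```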

  13. #2338
    Xtreme Member
    Join Date
    Sep 2008
    Location
    Eastern Tennessee (from Minnesota)
    Posts
    241
    Quote Originally Posted by informal View Post
    No. The link says 1.78x higher than Thuban in render. Check again. It's the middle (yellow) bar. It represents Render performance. You can see it being roughly 1.78x higher than the 1100T's.
    It can't "clearly be" something and "roughly" at the same time. Nor can it "say" anything when nowhere are there words stating it. :P I get that you've taken the 1.5x bar and lined it up to get 1.78x, but it's a marketing slide. It's meant to look good, not be mathematically accurate lol

    I know I'm sounding like a total dbag, which I apologize for, but I'm just trying to point out all the work you're doing for something that wasn't meant to be taken so literally (by dissecting and comparing) :\ I know where you're coming from though, with doing what you're doing being mentally stimulating, as I get that way with stuff.
    Last edited by Formula350; 09-11-2011 at 03:58 PM. Reason: Typo

  14. #2339
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Well yes, it is a marketing slide (doh), but the bars are not drawn just for fun. There is clearly a ratio. 1.5x is for the last test. You can see the color for each individual test. The media benchmark (PCMark) shows the least advantage of the 3, and AMD didn't write "up to 15% in PC Mark TV and movies" for obvious reasons. Also note that it says "performance estimates and subject to change". This means they had no idea what clock speeds they would be hitting with retail chips when the time came. Maybe they expected 4GHz stock and now we have "only" 3.6GHz.
    But my point still stands. We had these performance projections from December last year. Rendering showed the greatest improvement. Now it (Zambezi) shows lower performance with the latest ES floating around.

    PS: You don't sound like a dbag at all. You just need to read up more. I said 1.78 since nobody knows exactly how long that bar is. It's longer than 1.7x and shorter than 2x. The last one is the only one listed with a solid number, even though everything was a projection back then.
    Last edited by informal; 09-11-2011 at 03:59 PM.

  15. #2340
    Xtreme Enthusiast
    Join Date
    Feb 2009
    Location
    Lima, Peru
    Posts
    600
    All I see is crippled chips. Who knows, integrated chip to enable FX performance on a given day?


    If a FX-8120 scores less than a 1090T, then what would be the point of the new chip?

    Just release an ironed Phenom II and call it Phenom III or Phenom FX.

    less latency/more L3 (8-10MB)
    1MB L2 per core
    DDR3 1866/2133 controller
    add SSE4.2 / AVX / FMA / etc
    Magically --> 20-25% IPC with more or less the same arch, a monster gaming/MT machine.

    10points CB11.5 on a Phenom 8 core
    Phenom III X4 3Ghz $149 ~ Phenom II X4 980 3.7Ghz

    That should give SB a run for its money.


    Really, what would be the point?
    Athlon II X4 620 2.6Ghz @1.1125v | Foxconn A7DA-S (790GX) | 2x2GB OCZ Platinum DDR2 1066
    | Gigabyte HD4770 | Seagate 7200.12 3x1TB | Samsung F4 HD204UI 2x2TB | LG H10N | OCZ StealthXStream 500w| Coolermaster Hyper 212+ | Compaq MV740 17"

    Stock HSF: 18°C idle / 37°C load (15°C ambient)
    Hyper 212+: 16°C idle / 29°C load (15°C ambient)

    Why AMD Radeon rumors/leaks "are not always accurate"
    Reality check

  16. #2341
    On the rise!
    Join Date
    Sep 2005
    Posts
    1,008
    I know this is probably old news to most. But wanted to show my findings just to verify any speculations:

    4. Entry Period: The Contest begins July 21, 2011 at 12:01am Eastern Time (“EDT”) and ends October 12, 2011 at 11:59 pm EDT (the “Entry Period”). Entries that are submitted before or after the Entry Period will be disqualified. Sponsor’s computer will be the official timekeeping device for the Contest.
    Can be seen here on the AMD giveaway contest rules!

  17. #2342
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Yeah, we discussed that a few days ago. They changed the date from Sept. 9 to October 12. This is in line with a Q4 launch or, as it was rumored, early October.

  18. #2343
    Banned
    Join Date
    Sep 2011
    Posts
    149
    Has AMD stated what's coming first? Server or desktop? The Opteron 6200 is scheduled to arrive on 10-11-11 at BLT, so should we assume the desktop chips come a week or so later?

    http://www.shopblt.com/cgi-bin/shop/...er_id=!ORDERID!

  19. #2344
    Xtreme Cruncher
    Join Date
    Jun 2008
    Location
    pacific NW usa
    Posts
    2,764
    Quote Originally Posted by Pestilence View Post
    Has AMD stated what's coming first? Server or desktop? The Opteron 6200 is scheduled to arrive on 10-11-11 at BLT, so should we assume the desktop chips come a week or so later?

    http://www.shopblt.com/cgi-bin/shop/...er_id=!ORDERID!
    All we know is that server is shipping for revenue and will launch in Q4, so said JF.

    He also said this in regards to BLT's ETAs:

    Quote Originally Posted by JF-AMD View Post
    When you see those things pop up on random web sites it is typically a bad data feed from their distributor. The disti turns on SKUs they shouldn't and the reseller just takes the whole feed.

    The funny thing is that you can't even be sure that the data is real. Sometimes they load with dummy data as a placeholder. I am specifically not looking at the link because I don't want to have to start answering questions about details. But I would be a bit careful on these things.
    _________________________________________________
    ............................ImAcOmPuTeRsPoNgE............................

    MY HEATWARE 76-0-0

  20. #2345
    Registered User
    Join Date
    Sep 2008
    Posts
    45
    The Sandra Processor Arithmetic benchmark is not a pure integer benchmark, but an aggregate score of the pure-integer Dhrystone benchmark and the floating-point-focused Whetstone benchmark.

  21. #2346
    Xtreme Addict
    Join Date
    Jun 2002
    Location
    Ontario, Canada
    Posts
    1,782
    Quote Originally Posted by AKM View Post
    What if IPC will be lower in some cases and higher in others?
    IPC should be higher in all cases on a new architecture. No excuses.
    As quoted by LowRun......"So, we are one week past AMD's worst case scenario for BD's availability but they don't feel like communicating about the delay, I suppose AMD must be removed from the reliable sources list for AMD's products launch dates"

  22. #2347
    Xtreme Addict
    Join Date
    Mar 2005
    Location
    Rotterdam
    Posts
    1,553
    Quote Originally Posted by freeloader View Post
    IPC should be higher in all cases on a new architecture. No excuses.
    It generally makes a lot of sense now that AMD delayed desktop and pulled in server chips. Because desktops depend heavily on IPC and single threaded workload, and if BD is very weak at both they need to tweak for maximum clocks they can to offset this. But for servers it is not as big of a problem so it became the new priority.

    Had BD been a spectacular product it would be in our computers already. I doubt any delays were due to bugs, but rather due to attempting to get clocks higher to make up for poor IPC.
    Gigabyte Z77X-UD5H
    G-Skill Ripjaws X 16Gb - 2133Mhz
    Thermalright Ultra-120 eXtreme
    i7 2600k @ 4.4Ghz
    Sapphire 7970 OC 1.2Ghz
    Mushkin Chronos Deluxe 128Gb

  23. #2348
    PerryR
    Guest
    Quote Originally Posted by Dimitriman View Post
    Had BD been a spectacular product it would be in our computers already. I doubt any delays were due to bugs, but rather due to attempting to get clocks higher to make up for poor IPC.
    I thought servers were more important than desktop? It makes perfect sense that they would get the product to a place that would, more than likely, produce the most revenue.

    I'm not sure about bugs or higher clocks being the issue; I think GF didn't produce enough quantity. Hell, from what I understand, the demand for Llano has been overwhelming.

  24. #2349
    Registered User
    Join Date
    Feb 2005
    Posts
    39
    Quote Originally Posted by freeloader View Post
    IPC should be higher in all cases on a new architecture. No excuses.
    When you're not able to increase the IPC on your current µarch, you must use faster clocks to increase the performance. In order to use faster clocks, you need a high-throughput engine and must remove all bottlenecks in your frontend. Sometimes you need to do some horrible things to achieve this, like making your L1 write-through while trying to amaze people with "ultra high bandwidth" FP/SIMD units... even if you're not able to feed them correctly with your decode/dispatch unit in all cases. Finally, you'll get a decent CPU, but only at very high frequency and with a LOT of power to dissipate. Worst of all: when your process is not able to give you high yields, you must launch it at low freq.

    Say hello to Netburst....

    ...and Bulldozer ?
    Last edited by xsecret; 09-11-2011 at 08:07 PM.
    Doc_TB @ CanardPC.Com (FR)

  25. #2350
    I am Xtreme
    Join Date
    Aug 2008
    Posts
    5,586
    Quote Originally Posted by xsecret View Post
    When you're not able to increase the IPC on your current µarch, you must use faster clocks to increase the performance. In order to use faster clocks, you need a high-throughput engine and must remove all bottlenecks in your frontend. Sometimes you need to do some horrible things to achieve this, like making your L1 write-through while trying to amaze people with "ultra high bandwidth" FP/SIMD units... even if you're not able to feed them correctly with your decode/dispatch unit in all cases. Finally, you'll get a decent CPU, but only at very high frequency and with a LOT of power to dissipate. Worst of all: when your process is not able to give you high yields, you must launch it at low freq.

    Say hello to Netburst....

    ...and Bulldozer ?
    best amd bd post ever

