
Thread: 5870 Bottleneck Investigation (CPU and/or Memory Bandwidth)

  1. #126
    Xtreme Enthusiast
    Join Date
    Dec 2003
    Location
    Nederlands
    Posts
    635
    Furmark has a strange way of stressing cards to their max. You can overheat an HD4870 or HD4890 with ease, even without renaming the .exe file. A Twin Turbo could not cool an HD4890; the VRMs got too hot. The stock cooler does a better job there, but you do get a lot more noise. The power draw is also way higher, so it's definitely doing something. I also think Furmark is able to keep all 800 shaders (RV770 and RV790) or all 1600 shaders (RV870) fully loaded, where games cannot.
    System Specs: -=Game PC=- | -=Lan Box=-

  2. #127
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    I don't think it's just that all the shader cores are loaded; games can definitely do that. What Furmark does is load the cores completely, all 5 parts of each, and OCCT even feeds the rest of the logic with work and keeps texture and geometry setup busy with bogus work, I think... It's kind of like LinX, which bombards CPUs with random instructions without any dependencies so every single unit is fully loaded. I think...

  3. #128
    Xtreme Enthusiast
    Join Date
    Dec 2003
    Location
    Nederlands
    Posts
    635
    Quote Originally Posted by saaya View Post
    I don't think it's just that all the shader cores are loaded; games can definitely do that. What Furmark does is load the cores completely, all 5 parts of each, and OCCT even feeds the rest of the logic with work and keeps texture and geometry setup busy with bogus work, I think... It's kind of like LinX, which bombards CPUs with random instructions without any dependencies so every single unit is fully loaded. I think...
    Indeed, that's what I tried to say :P Games mostly load 2-4 out of the 5 ALUs, and sometimes they get up to all 5.
    Last edited by Astennu; 10-02-2009 at 04:39 AM.
    System Specs: -=Game PC=- | -=Lan Box=-

  4. #129
    Xtreme Guru
    Join Date
    Jan 2005
    Location
    Tre, Suomi Finland
    Posts
    3,858
    Quote Originally Posted by Chickenfeed View Post
    Furmark 1.7
    1280x1024 8x AA 60000MS
    HD 5870 1GB

    Mem - Score
    1300 - 2733
    1200 - 2537
    1100 - 2360
    1000 - 2187
    900 - 1963

    http://img194.imageshack.us/img194/9013/graphtv.jpg
    My HD4890 at SXGA, 8xMSAA scored 1694 at core 850MHz, memory 900MHz, and 1859 at memory 1000MHz. So the HD5870 at the same clocks is a whopping 16-17% faster. And that, ladies and gentlemen, is nothing short of messed up considering Cypress' overwhelming HW-level superiority and the fact Furmark is very, very shader dependent...
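    For anyone who wants to redo the arithmetic, a throwaway script over the scores quoted above (purely illustrative):

```python
# Sanity check: HD5870 vs HD4890 Furmark scores at matching memory clocks.
hd4890 = {900: 1694, 1000: 1859}  # memory clock (MHz) -> score (my HD4890)
hd5870 = {900: 1963, 1000: 2187}  # from Chickenfeed's run quoted above

for mem in (900, 1000):
    gain = hd5870[mem] / hd4890[mem] - 1
    print(f"mem {mem} MHz: HD5870 is {gain:.1%} faster")
# mem 900 MHz: HD5870 is 15.9% faster
# mem 1000 MHz: HD5870 is 17.6% faster
```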
    Last edited by largon; 10-02-2009 at 07:22 AM.
    You were not supposed to see this.

  5. #130
    Xtreme Addict
    Join Date
    May 2003
    Location
    Hopatcong, NJ
    Posts
    1,078
    Any conspiracy theories yet?

    Withholding performance via driver for the GT300 launch? lol...

    'Gaming' AMD FX-6300 @ 4.5GHz | Asus M5A97 | 16GB DDR3 2133MHz | GTX760 2GB + Antec Kuhler620 mod | Crucial m4 64GB + WD Blue 2x1TB Str
    'HTPC' AMD A8-3820 @ 3.5GHz | Biostar TA75A+ | 4GB DDR3 | Momentus XT 500GB | Radeon 7950 3GB
    'Twitch' AMD 720BE @ 3.5GHz | Gigabyte GA-78LMT-S2P | 4GB DDR3 | Avermedia Game Broadcaster

    Desktop Audio: Optical Out > Matrix mini DAC > Virtue Audio ONE.2 > Tannoy Reveal Monitors + Energy Encore 8 Sub
    HTPC: Optoma HD131XE Projector + Yamaha RX-V463 + 3.2 Speaker Setup

  6. #131
    Xtreme Member
    Join Date
    Sep 2008
    Posts
    115
    Quote Originally Posted by largon View Post
    My HD4890 at SXGA, 8xMSAA scored 1694 at core 850MHz, memory 900MHz, and 1859 at memory 1000MHz. So the HD5870 at the same clocks is a whopping 16-17% faster. And that, ladies and gentlemen, is nothing short of messed up considering Cypress' overwhelming HW-level superiority and the fact Furmark is very, very shader dependent...
    Doesn't Furmark have to be updated to load newer cards better?

  7. #132
    Xtreme Addict
    Join Date
    May 2004
    Location
    Aland Islands, Finland
    Posts
    1,137
    Quote Originally Posted by Miwo View Post
    Any conspiracy theories yet?

    Withholding performance via driver for the GT300 launch? lol...
    Well, we haven't seen what the X2 can do yet. It's always possible that they've figured out a more efficient way for the GPUs to work together than traditional CFX...

    wild speculations
    Asus Crosshair IV Extreme
    AMD FX-8350
    AMD ref. HD 6950 2Gb x 2
    4x4Gb HyperX T1
    Corsair AX1200
    3 x Alphacool triple, 2 x Alphacool ATXP 6970/50, EK D5 dual top, EK Supreme HF

  8. #133
    Xtreme Guru
    Join Date
    Jan 2005
    Location
    Tre, Suomi Finland
    Posts
    3,858
    gamervivek,
    Of course not. It's just an OpenGL benchmark app. It's not like games and all 3D apps need to be patched to "support" new HW.
    You were not supposed to see this.

  9. #134
    Xtreme Addict
    Join Date
    Jul 2007
    Location
    Alberta, Canada
    Posts
    1,264
    Quote Originally Posted by largon View Post
    My HD4890 at SXGA, 8xMSAA scored 1694 at core 850MHz, memory 900MHz, and 1859 at memory 1000MHz. So the HD5870 at the same clocks is a whopping 16-17% faster. And that, ladies and gentlemen, is nothing short of messed up considering Cypress' overwhelming HW-level superiority and the fact Furmark is very, very shader dependent...
    I will be doing this test at more "real world" resolutions soon (real world as far as these cards go; I know 1280x1024 is still widely used, though). I do agree that something is pretty messed up given the small difference.

    I'll also mention that I get a "display driver has stopped responding" error at 900/1300 when running OCCT's Furmark-type test after 30-45 min (this got the core to around 88C and the fan to over 50%, which gets quite loud). With actual games it's usually in the 70s, and only Crysis has managed to get any higher. I'm not sure if this indicates the card can't handle the load at these clocks in something like Furmark, or merely drivers in their infancy. All I know is that no games or other programs have shown signs of issues so far.
    Feedanator 7.0
    CASE:R5|PSU:850G2|CPU:i7 6850K|MB:x99 Ultra|RAM:8x4 2666|GPU:980TI|SSD:BPX256/Evo500|SOUND:2i4/HS8
    LCD:XB271HU|OS:Win10|INPUT:G900/K70 |HS/F:H115i

  10. #135
    Xtreme Member
    Join Date
    Jun 2009
    Location
    Winnipeg, MB
    Posts
    137
    Quote Originally Posted by saaya View Post
    R600, RV670 and RV770 all adjusted the timings when you overclocked the memory... I'd be surprised if RV870 doesn't do it anymore...

    A drop from 8.3 to 7.7ns is a latency reduction of 7%, though; then how come we still only get a 1.5% boost? And memory timings tend to not really matter on VGAs, they need bandwidth, bandwidth, bandwidth... they usually run CAS 15 or so...

    I don't think the boost we see is only from reduced memory latency. As a matter of fact, I don't think memory latency decreases a lot: since R600, all GPUs use a formula to calculate memory timings based on memory clock. The latency still decreases and bandwidth still increases, otherwise we wouldn't see any gains at all, but the gain is smaller than we would see from a static timing config with increased memory speed.

    You are probably thinking of system memory performing better with lower timings; for VGAs that doesn't apply... they do score better with lower timings, but the boost is tiny compared to CPU system memory...

    Nice tests, chumbucket and chickenfield (sp?)
    What it shows is that in Furmark the 5870 definitely is bandwidth limited, but that doesn't seem to be the case in actual games...
    Furmark is a 99% shader-heavy load, isn't it?
    Ah I never knew that about the memory timings on ATI's cards.

    I still don't think we have a memory bandwidth issue in games.

    If a core increase of 10% nets 5% on average, and a memory bandwidth increase only nets 0.5% to 3.5%, I'd say we have a core bottleneck more than anything. :p

    Though this benchmark stands out as an oddity:



    - A 10% increase in memory nets... a 0% increase in frame rate.
    - A 10% increase in core nets... a 0.6% increase in frame rate.
    - Both increases combined, and now we have a 7.4% increase in frame rate.
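
    To put the core-vs-memory reasoning in concrete terms, here's a toy serial bottleneck model (my own sketch, assuming frame time simply splits into a core-scaled fraction f and a memory-scaled remainder):

```python
# Toy model: frame time = core-bound fraction f + memory-bound fraction (1 - f).
# Each part shrinks in proportion to its clock increase.

def fps_gain(core_oc, mem_oc, f):
    """Relative FPS gain for relative clock increases and core-bound fraction f."""
    new_time = f / (1 + core_oc) + (1 - f) / (1 + mem_oc)  # old frame time = 1
    return 1 / new_time - 1

f = 0.52  # backed out of the ~5% gain from a 10% core OC quoted above
print(f"+10% core only:   {fps_gain(0.10, 0.00, f):.1%}")  # ~5.0%
print(f"+10% memory only: {fps_gain(0.00, 0.10, f):.1%}")  # ~4.6%
print(f"both +10%:        {fps_gain(0.10, 0.10, f):.1%}")  # 10.0%
```

    Note the model then predicts ~4.6% from memory alone, well above the observed 0.5-3.5%, so the card looks even more core-bound than a single split can express; and the oddball result above (0% and 0.6% alone, 7.4% together) fits no additive model at all, which is exactly why it stands out.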

  11. #136
    Xtreme Cruncher
    Join Date
    May 2009
    Location
    Bloomfield
    Posts
    1,968
    Quote Originally Posted by saaya View Post
    I don't think it's just that all the shader cores are loaded; games can definitely do that. What Furmark does is load the cores completely, all 5 parts of each, and OCCT even feeds the rest of the logic with work and keeps texture and geometry setup busy with bogus work, I think... It's kind of like LinX, which bombards CPUs with random instructions without any dependencies so every single unit is fully loaded. I think...
    It's more complicated than that. If a shader program is flow-heavy it will do a lot better on NVIDIA cards. Loading all of the ALUs is harder than it sounds: it's VLIW, so not all instruction sequences can take full advantage of all of the shaders. They encourage programmers to use certain float and vec types to get the fastest performance. Memory size is probably the issue. Think about it: if you double the number of threads, you double the required memory. It's like having a 4870 512MB. That makes sense, seeing that they doubled everything but memory size and bandwidth.
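
    To illustrate why keeping all the ALUs fed is harder than it sounds on VLIW, here's a toy greedy bundle packer (my own sketch; ATI's real shader compiler is of course far more sophisticated):

```python
# Toy greedy VLIW packer: bundle up to `width` independent ops per cycle.
# An op may issue only after all of its dependencies issued in an earlier
# bundle, so a serial chain wastes slots no matter how wide the hardware is.

def pack(ops, width=5):
    done, bundles, pending = set(), [], list(ops)
    while pending:
        bundle = []
        for op in list(pending):
            name, deps = op
            if len(bundle) < width and all(d in done for d in deps):
                bundle.append(name)
                pending.remove(op)
        done.update(bundle)
        bundles.append(bundle)
    return bundles

# Ten independent ops pack densely; a five-op serial chain issues one per cycle.
independent = [(f"i{n}", ()) for n in range(10)]
chain = [(f"c{n}", (f"c{n-1}",) if n else ()) for n in range(5)]
for label, prog in (("independent", independent), ("chain", chain)):
    bundles = pack(prog)
    used = sum(len(b) for b in bundles)
    print(f"{label}: {len(bundles)} bundles, {used / (5 * len(bundles)):.0%} of slots used")
```

    Fully independent work fills all 5 slots; a dependency chain leaves 4 of 5 idle regardless of clock.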

  12. #137
    Xtreme Member
    Join Date
    Aug 2004
    Location
    TasMania
    Posts
    179
    Quote Originally Posted by Miwo View Post
    Any conspiracy theories yet?

    Withholding performance via driver for the GT300 launch? lol...
    I'm still waiting for this "investigation" to be finalised so we can see how ATi has completely stuffed the HD5870.
    CPU: i7 860 4000Mhz 1.3v
    MoBO: GA-P55-UD5
    RAM: GSkill RipJaws 800 7.7.7.
    HdD: Seagate 7200.11 500Gb
    CoOlinG: Noctua 12P
    VGa: Gigabyte HD6870
    PSu: Silverstone 550W

  13. #138
    Xtreme Member
    Join Date
    Aug 2004
    Location
    Bel Air, Maryland
    Posts
    143
    Quote Originally Posted by Miwo View Post
    Any conspiracy theories yet?

    Withholding performance via driver for the GT300 launch? lol...
    I know that's what's on my mind. I tried asking on Terry Makedon's (CatalystMaker) twitter, but he only evaded the question, then stopped answering when I pushed harder. I'm really beginning to think they're holding the card back.
    Intel i7-2700k@ 4.7ghz (46x102)
    Asus P8Z68 Deluxe GEN3
    G.Skill 2x4gb RipjawX 2133 11-11-11-30
    GTX 680 1220/7000
    Corsair TX750W
    Razer Lachesis w/ Razer Pro|Pad
    1x160gb Seagate HDD, 2x1tb Seagate HDD
    LG 22" 226WTQ & BenQ G2400WD
    Windows 7 Ultimate x64 SP1

  14. #139
    Registered User
    Join Date
    Nov 2008
    Posts
    16
    One theory why the 5870 is slower than its specs suggest: its 1600 shaders are nowhere near working at their max capacity. This would explain why:

    1) increasing memory clocks does not improve results as much as expected;
    2) load power is lower than other cards' in games but much higher in OCCT;
    3) raising memory has a greater effect in Furmark.

  15. #140
    Xtreme Addict
    Join Date
    Dec 2002
    Posts
    1,250
    "if" they can hold it back a little, it make sense as they wait for the late and very late g300 performances.
    4670k 4.6ghz 1.22v watercooled CPU/GPU - Asus Z87-A - 290 1155mhz/1250mhz - Kingston Hyper Blu 8gb -crucial 128gb ssd - EyeFunity 5040x1050 120hz - CM atcs840 - Corsair 750w -sennheiser hd600 headphones - Asus essence stx - G400 and steelseries 6v2 -windows 8 Pro 64bit Best OS used - - 9500p 3dmark11 (one of the 26% that isnt confused on xtreme forums)

  16. #141
    Xtreme Addict
    Join Date
    Mar 2007
    Posts
    1,489
    You know, ATI holding the card back really goes pretty far into tinfoil hat territory, but it almost makes sense...
    Asus G73- i7-740QM, Mobility 5870, 6Gb DDR3-1333, OCZ Vertex II 90Gb

  17. #142
    Xtreme Member
    Join Date
    Jun 2005
    Location
    MA, USA
    Posts
    146
    I think it might be accidentally held back by the drivers. If I understand it correctly, a lot of the performance of ATI's design comes from the GPU being fed information optimally. I'd wait for a few driver releases to see final performance numbers.
    | Cooler Master 690 II Advanced | Corsair 620HX | Core i5-2500K @ 5.0GHz | Gigabyte Z68XP-UD4 | 2x4096MB G.Skill Sniper DDR3-2133 @ 2134MHz 10-11-10-30 @ 1.55V | 160GB Intel X-25 G2 | 2x 2TB Samsung EcoGreen F4 in RAID 1 | Gigabyte HD 7970 @ 1340MHz/1775MHz | Dell 30" 3007WFP-HC | H2O - XSPC RayStorm and Swiftech MCW82 on an MCP350 + XSPC Acrylic Top, XSPC RX240 and Swiftech MCR220 radiators.

  18. #143
    I am Xtreme
    Join Date
    Dec 2008
    Location
    France
    Posts
    9,060
    Yep, they need better drivers; it's hard to expect some crappy release-candidate version to offer 100% performance.
    Donate to XS forums
    Quote Originally Posted by jayhall0315 View Post
    If you are really extreme, you never let informed facts or the scientific method hold you back from your journey to the wrong answer.

  19. #144
    Xtreme Enthusiast
    Join Date
    Dec 2003
    Location
    Nederlands
    Posts
    635
    I think it's driver related, but I don't think AMD does this on purpose, because the card now looks bad vs the HD4870 X2; the X2 is outgunning it. They did change a lot in the core, so I expect they will be able to boost performance by 10-20% in the future. We also saw massive gains with newer drivers for the HD48xx cards.

    And about Furmark: I know AMD does something in their drivers to cut down the load on the graphics card. Furmark puts an unusual load on the card, causing some to overheat, so AMD has put a limiter in the driver. You can disable it by renaming the .exe to something else. You might want to try that if you want a fair comparison, but keep an eye on your temperatures; they can get higher than before!
    Last edited by Astennu; 10-05-2009 at 04:22 AM.
    System Specs: -=Game PC=- | -=Lan Box=-

  20. #145
    Registered User
    Join Date
    Nov 2008
    Posts
    16

    poor performance with 16xAF

    So despite what they claimed, the new AF does take a more serious hit than before:

    http://pclab.pl/art38674-9.html

    It might be a better idea to run these cards at 8xAF instead of 16x.

  21. #146
    Xtreme Addict
    Join Date
    Nov 2005
    Location
    UK
    Posts
    1,074
    Quote Originally Posted by saaya View Post
    R600, RV670 and RV770 all adjusted the timings when you overclocked the memory... I'd be surprised if RV870 doesn't do it anymore...
    That's very interesting; I remember reading that somewhere before. Don't suppose you have a link?

    i7| EX58-EXTREME | SSD M225 | Radbox | 5870CF + 9600GT

  22. #147
    Xtreme Enthusiast
    Join Date
    Mar 2009
    Location
    In a van down by the river
    Posts
    852

  23. #148
    I am Xtreme
    Join Date
    Jul 2007
    Location
    Austria
    Posts
    5,485
    Quote Originally Posted by saaya View Post
    ... It's kind of like LinX, which bombards CPUs with random instructions without any dependencies so every single unit is fully loaded. I think...
    Sorry, but LinX is nothing more than a GUI for Linpack...

  24. #149
    Xtreme Mentor
    Join Date
    May 2008
    Location
    cleveland ohio
    Posts
    2,879
    http://www.xtremesystems.org/forums/...=235831&page=2

    Look at the scores on the earlier pages, before and after the OC on the card.
    Seems more like a BIOS problem; note this was an AMD system, though.
    HAVE NO FEAR!
    "AMD fallen angel"
    Quote Originally Posted by Gamekiller View Post
    You didn't get the memo? 1 hour 'Fugger time' is equal to 12 hours of regular time.

  25. #150
    Registered User
    Join Date
    Sep 2009
    Posts
    1
    Quote Originally Posted by JimmyH View Post
    One theory why the 5870 is slower than its specs suggest: its 1600 shaders are nowhere near working at their max capacity. This would explain why:

    1) increasing memory clocks does not improve results as much as expected;
    2) load power is lower than other cards' in games but much higher in OCCT;
    3) raising memory has a greater effect in Furmark.
    If you want to verify the capacities of the ALUs, you should consider testing the same stuff on the "Froblin" demo with a 4870 and a 5870, because you get adaptive tessellation even without DX11.
    That somewhat disconnects the internal from the external bandwidth requirements, as no (proportionally) more geometry is streamed into the processor.
    I would also make the very vague guess that crunching on up-LODed geometry creates a situation that favors shader-instruction grouping. But that's very hard to say; it can really only be verified if you can play around with the LOD strength in the demo.

    You'd check combinations something like these (see the sketch after the list):

    LOD: 1, 2, 4
    Chips: 4870, 5870
    Core-Clock: x, y, z
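
    Enumerating those runs is trivial (hypothetical snippet; the clock values are placeholders for the unspecified x, y, z):

```python
# Enumerate the proposed Froblin test matrix.
from itertools import product

lods   = (1, 2, 4)
chips  = ("HD4870", "HD5870")
clocks = (700, 775, 850)  # MHz; placeholders for the unspecified x, y, z

for chip, clock, lod in product(chips, clocks, lods):
    print(f"run: chip={chip} core={clock} MHz lod={lod}")
```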

    BTW, the architecture is well documented (ATI published a 392-page doc for the Linux guys), so someone eager may find the reason in there.

    Good luck
