Bring... bring the amber lamps.
You have to consider the values as percentages, not raw MHz. The 65MHz increase on the core is ~7.6%, while the 100MHz increase on the RAM is ~8.3%. Both are very close, and the increase in FPS reflects this. Because increasing either the core or the RAM frequency nets a noticeable performance gain, it appears that neither is bottlenecking the other (at the frequencies and in the game tested). I'd imagine that in older games, the RAM would become more of a bottleneck to the frame rate than the core frequency.
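To put numbers on the percentage comparison, here is a quick sketch. It assumes the baselines are the stock 850MHz core and 1200MHz memory clocks that show up in the benchmark tables later in the thread; the function name is only illustrative.

```python
def percent_increase(base_mhz, delta_mhz):
    """Return a clock bump as a percentage of the base clock."""
    return 100.0 * delta_mhz / base_mhz

core_gain = percent_increase(850, 65)    # assumed stock core clock: 850MHz
mem_gain = percent_increase(1200, 100)   # assumed stock memory clock: 1200MHz
print(f"core: +{core_gain:.1f}%  memory: +{mem_gain:.1f}%")
```

Both bumps land within a percentage point of each other, which is why comparing them in raw MHz is misleading.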
| Cooler Master 690 II Advanced | Corsair 620HX | Core i5-2500K @ 5.0GHz | Gigabyte Z68XP-UD4 | 2x4096MB G.Skill Sniper DDR3-2133 @ 2134MHz 10-11-10-30 @ 1.55V | 160GB Intel X-25 G2 | 2x 2TB Samsung EcoGreen F4 in RAID 1 | Gigabyte HD 7970 @ 1340MHz/1775MHz | Dell 30" 3007WFP-HC | H2O - XSPC RayStorm and Swiftech MCW82 on an MCP350 + XSPC Acrylic Top, XSPC RX240 and Swiftech MCR220 radiators.
Okay Mr. K6 - good point. They've basically raised the core & memory 8%. They get the same increase in framerates from either +8% core or +8% memory. To me this means that the 5870 could use all the extra core or memory speed you can give it.
This card is going to give very good results to anyone who overclocks the heck out of it.
Oh, most definitely. I'm very much anticipating a stable release of Rivatuner so I can get to work on the voltages. As it stands, AMD GPU Clock Tool isn't working with my card (drivers?), so I currently can't go above the limits of CCC (unless I'm using the program incorrectly). Anyway, with a little voltage, it seems these cores will easily do 1GHz+ on the GPU. I'd be interested to see not only the performance gains from this speed, but also whether a memory bottleneck finally shows itself.
RiG1: Ryzen 7 1700 @4.0GHz 1.39V, Asus X370 Prime, G.Skill RipJaws 2x8GB 3200MHz CL14 Samsung B-die, TuL Vega 56 Stock, Samsung SS805 100GB SLC SDD (OS Drive) + 512GB Evo 850 SSD (2nd OS Drive) + 3TB Seagate + 1TB Seagate, BeQuiet PowerZone 1000W
RiG2: HTPC AMD A10-7850K APU, 2x8GB Kingstone HyperX 2400C12, AsRock FM2A88M Extreme4+, 128GB SSD + 640GB Samsung 7200, LG Blu-ray Recorder, Thermaltake BACH, Hiper 4M880 880W PSU
SmartPhone Samsung Galaxy S7 EDGE
XBONE paired with 55'' Samsung LED 3D TV
This thread that we are now in is about the 5870's internal memory bottlenecking the card's performance.
The thread I purposely made separate was to investigate CPU bottlenecking of the 5870 as a whole.
At your discretion of course as a moderator, but I think maybe these threads should be un-merged. They were for two totally different topics and merging them will most likely prevent the original goal of this thread's OP from being reached.
edit: Ashraf edited thread title to "5870 Bottleneck Investigation (CPU and/or Memory Bandwidth)" to reflect two different topics being discussed within thread.
Thanks!
Last edited by iandh; 09-28-2009 at 03:37 PM.
Asus G73- i7-740QM, Mobility 5870, 6Gb DDR3-1333, OCZ Vertex II 90Gb
i dont see any different scaling in benchmarks. the 4870 was 2x faster than rv670 and it had 2.5 times more shaders. i would not expect doubling the shaders to double effective performance. the problem might be the fact that there really isnt much to compare it to. might be l1 to l2 cache bandwidth.
                      RV770      RV870
Texture units         40         80
L1 cache bandwidth    480GB/s    1,000GB/s
L1 to L2 bandwidth    384GB/s    435GB/s
please unmerge the thread... we can't have a coherent discussion if every thread about the 5870 gets merged... meh....
Yeah, it's pretty screwed now as far as coherency goes. I wish the person who reported the threads had bothered to read them before clicking submit... just because two threads have a similar title doesn't mean they are discussing the same exact subject. I can understand Ashraf's mistake because he probably just was going down the job queue and saw the thread titles, then hit "merge threads".
I asked Ashraf and he said that there isn't really an "unmerge" option, so basically I'd have to start a new empty thread, and then delete all the posts from my old thread that are now in this thread.
Got my 5870 today, but I can't seem to get the GPUclock utility to work. I'm using the RC7 drivers leaked from MSI, maybe that's the problem...
It depends on the game and the driver. You can see it like this:
So if instructions can be grouped together, you can get quite a performance boost. The downside is that if you can't group instructions, you only get 1/4 or 1/5 of the performance.
I don't know if it's true, but I have heard the AMD compiler does quite a good job, grouping up to 3-4 of them most of the time.
But with a heavily nVidia-optimized game you can get lower values and bad shader performance (you might get 1-3 then).
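A rough way to picture the grouping argument, as a sketch only: it assumes the 1600 ALUs quoted in this thread are arranged as 320 five-wide units, and that `slots_filled` (a made-up name) is the average number of instructions the compiler manages to pack into each unit per cycle. Real hardware behaviour is more complicated than this.

```python
TOTAL_ALUS = 1600       # ALU count quoted in this thread for the HD 5870
SLOTS_PER_UNIT = 5      # assumption: 5-wide units, so 320 units in total

def effective_alus(slots_filled):
    """ALUs doing useful work when each unit has `slots_filled` of its 5 slots used."""
    if not 0 <= slots_filled <= SLOTS_PER_UNIT:
        raise ValueError("slots_filled must be between 0 and 5")
    units = TOTAL_ALUS // SLOTS_PER_UNIT
    return units * slots_filled

for slots in (1, 3, 4, 5):
    busy = effective_alus(slots)
    print(f"{slots} of 5 slots filled: {busy} ALUs busy ({100 * busy / TOTAL_ALUS:.0f}%)")
```

With only one slot filled per unit, a fifth of the ALUs do any work, which is the 1/5 worst case described above.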
Then about the performance of the HD5870: it doesn't seem memory bandwidth limited to me. But I also think there is more performance inside this core than we see now. It might be driver related. It won't surprise me if we get up to 20% higher performance in the future.
The RV870 core is still new, and I think AMD could optimize the scheduling of the threads a bit better so you can keep those 1600 ALUs fed with data. It would be nice if there was a way to see the load on those shader units and compare it to the load on RV770 and RV790 cores. Don't forget those have been well optimized over the last years of driver releases.
Last edited by Astennu; 09-29-2009 at 01:57 AM.
System Specs: -=Game PC=- | -=Lan Box=-
Interesting read: http://firingsquad.com/hardware/ati_...ng/default.asp
Very interesting indeed. It seems that core clocks give the most gains in the games they tested.
I actually can't wait to see what these cards can do when more voltage is given. And I also want to see what the little brother 5850 can pull out of its hat in terms of overclocking.
Main rig 1: Corsair Carbide 400R 4x120mm Papst 4412GL - 1x120mm Noctua NF-12P -!- PC Power&Cooling Silencer MK III 750W Semi-Passive PSU -!- Gigabyte Z97X-UD5H -!- Intel i7 4790K -!- Swiftech H220 pull 2x Papst 4412 F/2GP -!- 4x4gb Crucial Ballistix Tactical 1866Mhz CAS9 1.5V (D9PFJ) -!- 1Tb Samsung 840 EVO SSD -!- AMD RX 480 to come -!- Windows 10 pro x64 -!- Samsung S27A850D 27" + Samsung 2443BW 24" -!- Sennheiser HD590 -!- Logitech G19 -!- Microsoft Sidewinder Mouse -!- Fragpedal -!- Eaton Ellipse MAX 1500 UPS .
Thanks for the link. Doesn't look like the 5870 is that bandwidth limited here, but it seems bottlenecked by something else. Most games return smaller gains than the increase on the core and memory. Poor drivers? Can't help thinking that the impressive power consumption of this card could actually be the result of the shaders sitting around doing nothing.
AnandTech's power test with OCCT shows the 5870 actually uses a lot more power than the other reviews suggest when loaded with a highly optimized application.
http://www.anandtech.com/video/showdoc.aspx?i=3643&p=26
Based on JimmyH's CoD4 benchmark:
(compute power to bandwidth ratio)
memory:
4870 speeds    5870 equivalent    FPS    % increase
750 / 530      850 / 1200         100    0%
750 / 575      850 / 1300         104    4%
750 / 618      850 / 1400         107    2.9%
750 / 663      850 / 1500         112    4.6%
750 / 900      850 / 2040         121    n/a
core:
4870 speeds    5870 equivalent    FPS    % increase
750 / 530      850 / 1200         100    0%
800 / 530      906 / 1200         103    3%
750 / 575      850 / 1300         104    0%
800 / 575      906 / 1300         107    2.9%
835 / 575      946 / 1300         110    2.8%
http://www.xtremesystems.org/forums/...7&postcount=88
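For anyone wondering where the "5870 equivalent" memory clocks come from, this sketch appears to reproduce them. It is my reading of the table, not something the post spells out, and it assumes 800 shaders on the 4870, 1600 on the 5870, and the same memory bus width on both cards so that bandwidth scales with memory clock alone.

```python
# Assumptions (not stated in the post): shader counts of 800 and 1600, identical
# memory bus width, so the bandwidth ratio reduces to the memory clock ratio.
SHADERS_4870, CORE_4870 = 800, 750
SHADERS_5870, CORE_5870 = 1600, 850

def equivalent_4870_mem(mem_5870_mhz):
    """4870 memory clock giving the same bandwidth per unit of compute as a stock-core 5870."""
    compute_ratio = (SHADERS_5870 * CORE_5870) / (SHADERS_4870 * CORE_4870)
    return mem_5870_mhz / compute_ratio

for mem in (1200, 1300, 1400, 1500, 2040):
    print(f"5870 @ 850/{mem} ~ 4870 @ 750/{equivalent_4870_mem(mem):.0f}")
```

The results land within a few MHz of the 530 / 575 / 618 / 663 / 900 steps in the table, so the table was probably rounded to whatever clocks the card would actually accept.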
Lightman's 3d06 benchmark:
core / mem / FPS / + fps / + %
850 / 3600 / 91.7 / 0 / 0%
850 / 4000 / 95.2 / +3.5 / +3.8%
850 / 4400 / 97.9 / +2.7 / +2.9%
850 / 4800 / 100.3 / +2.5 / +2.5%
850 / 5200 / 102.4 / +2.1 / +2.1%
http://www.xtremesystems.org/forums/...1&postcount=23
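One way to summarise Lightman's run is a "scaling efficiency": the share of a clock increase that actually shows up as FPS. The sketch below uses only the first and last rows of his table; the function name and the metric itself are my own illustration, not something from his post.

```python
# (memory clock, FPS) pairs copied from Lightman's 3d06 run, core fixed at 850
runs = [(3600, 91.7), (4000, 95.2), (4400, 97.9), (4800, 100.3), (5200, 102.4)]

def scaling_efficiency(clock_a, fps_a, clock_b, fps_b):
    """Fraction of a clock increase that shows up as an FPS increase (1.0 = perfect)."""
    clock_gain = clock_b / clock_a - 1
    fps_gain = fps_b / fps_a - 1
    return fps_gain / clock_gain

(first_clock, first_fps), (last_clock, last_fps) = runs[0], runs[-1]
overall = scaling_efficiency(first_clock, first_fps, last_clock, last_fps)
print(f"memory +{100 * (last_clock / first_clock - 1):.1f}% -> "
      f"FPS +{100 * (last_fps / first_fps - 1):.1f}% (efficiency {overall:.2f})")
```

A 44% memory clock swing buys roughly 12% more FPS here, so memory speed matters but is far from the only limit.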
Extrahardware.cz Crysis 1920 × 1200, 4× AA
core / mem / fps / +fps / +%
memory:
850 / 4400 / 40.9 / 0 / 0%
850 / 4800 / 42.0 / +1.1 / +2.7%
850 / 5200 / 43.1 / +1.1 / +2.6%
core:
785 / 4800 / 40.1 / 0 / 0%
850 / 4800 / 42.0 / +1.9 / +4.7%
915 / 4800 / 43.2 / +1.2 / +2.9%
core and memory:
785 / 4400 / 39.3 / 0 / 0%
850 / 4800 / 42.0 / +2.7 / +6.9%
900 / 5200 / 44.7 / +2.7 / +6.4%
http://www.extrahardware.cz/pretakto...adeonu-hd-5870
Firingsquad Crysis 1920 × 1200, 2× AA
core / mem / fps / +fps / +% (from stock)
memory:
850 / 4800 / 31.6 / 0 / 0%
850 / 5272 / 32.3 / +0.6 / +2.2%
core:
850 / 4800 / 31.6 / 0 / 0%
930 / 4800 / 33.1 / +1.5 / +4.7%
core and memory:
850 / 4800 / 31.6 / 0 / 0%
930 / 5400 / 34.3 / +2.7 / +8.5%
http://firingsquad.com/hardware/ati_...king/page5.asp
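To compare the core and memory overclocks on equal footing, the Firingsquad Crysis numbers can be turned into percentage changes against stock. This is only a sketch over the figures listed above; the dictionary layout and the `pct` helper are made up for illustration.

```python
STOCK = {"core": 850, "mem": 4800, "fps": 31.6}
# Overclocked runs copied from the Firingsquad Crysis numbers above
runs = {
    "memory only": {"core": 850, "mem": 5272, "fps": 32.3},
    "core only": {"core": 930, "mem": 4800, "fps": 33.1},
    "core + memory": {"core": 930, "mem": 5400, "fps": 34.3},
}

def pct(new, old):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old

gains = {}
for name, run in runs.items():
    gains[name] = pct(run["fps"], STOCK["fps"])
    print(f"{name}: core {pct(run['core'], STOCK['core']):+.1f}%, "
          f"mem {pct(run['mem'], STOCK['mem']):+.1f}%, fps {gains[name]:+.1f}%")
```

In this one test, a roughly 9% core bump returns about twice the FPS of a roughly 10% memory bump.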
Meh. You can get gains from overclocking core, or memory, or both together. You get higher gains (usually) from overclocking the core. Only in the game Batman did Firingsquad get higher performance from overclocking memory vs. core. Conclusion: overclock core & memory as much as possible. Memory bandwidth isn't the bottleneck, otherwise we would have little to no gain from core overclocking...?
Yeah seems like it... jeez.
I wouldn't say it isn't memory bandwidth bottlenecked just because core overclocking works. A memory bandwidth bottleneck isn't a hard cap the way the GPU core is. In my experience, overclocking a 4870's memory by over 10% from stock returns less than 1% FPS gain. So in comparison, yes, the 5870 is relatively bandwidth starved compared to the 4870/4890. Batman highlights the memory's importance: core +9% and memory +12% increase FPS by 10.5%.
Jimmy - what situation maxes out memory bandwidth? Highest texture quality + quality AA filtering? I'm not sure if this maximizes the need for bandwidth. I know it loads up a greater amount of video memory potentially maximizing capacity, but how do you go about maxing out bandwidth? Is there a special test?
So you DO think it has a memory bottleneck? IMO it could definitely use faster memory. IDC @ anandtech says, "If you can overclock the GPU cores and see a performance improvement that exceeds that which comes from increasing the memory clocks then that is about as close to proof you are going to get that your compute system is not memory bandwidth constrained."
I don't know. Maybe look for games benchmarks where 4870 outperformed 4850 by significantly more than 20%. 16xAF could likely be one scenario. Colorfill test could show up limitations here too: http://www.techreport.com/articles.x/17618/6
Well the gpu core is doing the actual rendering work so increasing that usually gains more unless you are really badly bottlenecked by slow memory.
Last edited by JimmyH; 09-29-2009 at 08:49 AM.
Has it been confirmed that all cards can have their voltages adjusted if flashed with the ASUS bios?
Computer: i7 2600k @ 4.7Ghz | Asus P8P67 Evo | Corsair LP 16gb 1600CL9 | Silver Arrow | Essence STX | Crucial m4 128gb | Silverstone Raven 3|
Video: 2x Sapphire 6950 Toxic 2gb @ 6970 Switch @ 880 / 1350 | Asus VG248QE |
Audio: ODAC+O2 => JH|13 Pro | STX => ATH-AD700X / Audioengine A5
what cpu speed did they test with?
might stop scaling cause cpu is limiting?
thx JimmyH for the 4870 results!
so cutting the 4870s memory clock from 900 to 530 (about 41% less bandwidth), down to the same bw/compute ratio as a 5870, results in a mere ~17% performance drop (121 to 100 fps). sounds like yet another hint that 5870 is NOT held back a lot by memory bandwidth...
i think hes saying the thread count is NOT a limitation cause even if all parts of the gpu are fully loaded its only using 30% of the max possible threads the dispatch processor can coordinate. and it can handle that many threads cause in xfire one dispatch processor apparently runs as master and oversees the threads running on all gpus in the system, hence the hint that in quad gpu configs the thread dispatch MIGHT limit.
3870 to 4870 was a 150% shader unit boost that resulted in a 100% performance boost. this time we have a 100% boost of not only shader units but tmus and rops too! yet the perf boost is only 40% or even less in some cases... that would be as if 4870 would only have been 60% faster with a 150% logic boost instead of 100% faster. theres def something limiting...
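The arithmetic behind that comparison, as a quick sketch. The percentages are the ones quoted in the post; the function name is just for illustration.

```python
def perf_per_unit_gain(unit_gain_pct, perf_gain_pct):
    """Performance gained per percent of extra hardware (1.0 = linear scaling)."""
    return perf_gain_pct / unit_gain_pct

rv670_to_rv770 = perf_per_unit_gain(150, 100)   # figures as quoted in the post
rv770_to_rv870 = perf_per_unit_gain(100, 40)

# What the 4870 would have gained over the 3870 at the 5870's scaling rate
hypothetical_4870_gain = 150 * rv770_to_rv870
print(f"{rv670_to_rv770:.2f} vs {rv770_to_rv870:.2f}; "
      f"4870 at the 5870's rate: +{hypothetical_4870_gain:.0f}%")
```

So each percent of added logic bought about 0.67% performance last generation and about 0.40% this time, which is where the "only 60% faster" comparison comes from.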
l1 to l2 cache bw... interesting!
was looking for 770 figures but couldnt find any...
l1 to l2 barely increased at all... but then again, doesnt each 5way processor or alu or whatever you wanna call it have its own L1? and each group of those shares the l2 right? the grouping hasnt changed, so then l1 to l2 bandwidth actually shouldnt matter and could have remained the same...
maybe it actually is if you normalize those numbers clockspeed wise for 770 and 870?
it depends on what it is doing. you can see here that Ati is the intel of synthetic benchmarks: http://www.bit-tech.net/hardware/gra...ture-review/10 the wider you make a vector, the harder it is to keep under full load.
on average one flop per second needs about 1 byte per second of memory bandwidth. this translates to roughly 2 terabytes per second of required bandwidth for rv870, so every gpu made is bottlenecked by this. the only way out is to further raise the calculation to memory operation ratio. its already 100:1 but it must go higher.
Then about the performance of the HD5870: it doesn't seem memory bandwidth limited to me. But I also think there is more performance inside this core than we see now. It might be driver related. It won't surprise me if we get up to 20% higher performance in the future.
Now that the 5850 is out, I think it's clear that something is holding the 5870 back, but what?
http://www.anandtech.com/video/showdoc.aspx?i=3650&p=14
Conclusion
When you take the Cypress based Radeon HD 5870 and cut out 2 SIMDs and 15% of the clock speed to make a Radeon HD 5850, on paper you have a card 23% slower. In practice, that difference is only between 10% and 15% depending on the resolution. What’s not a theory is AMD’s pricing: they may have cut off 15% of the performance to make the 5850, but they have also cut the price by well more than 15%; 31% to be precise.
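The "23% slower on paper" figure can be checked with a couple of lines. This sketch assumes 20 SIMDs on the 5870 and 18 on the 5850 (two cut, per the quote), with core clocks of 850MHz and 725MHz; those counts and clocks are my assumptions, not figures given in the quote itself.

```python
# Assumptions: 20 SIMDs on the HD 5870, 18 on the HD 5850 (2 cut, per the quote),
# and core clocks of 850MHz vs 725MHz (the "15%" in the quote).
simd_ratio = 18 / 20
clock_ratio = 725 / 850
paper_throughput = simd_ratio * clock_ratio
paper_deficit = 100 * (1 - paper_throughput)
print(f"5850 on paper: {100 * paper_throughput:.1f}% of a 5870, "
      f"i.e. {paper_deficit:.1f}% slower")
```

If the real-world gap is only 10% to 15%, then the 5870 is not turning all of its extra shader throughput into frames, which fits the suspicion that something else is holding it back.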
CM-Stalker Evga 8800GTX
Gigabyte GA-P35-DQ6 Q6600 3,8Ghz 726B
Crucial DDR2 BallistiX PC6400 4x1 Gig
Black ice Extreme II 4x Titan 120mm APOGEEGT
Thermaltake Toughpower 850W