Doesn't look too bad so far. Have to keep in mind it's early silicon.
ATi is making good gains in the value segment, which helps a lot.
Wrong. You just ate the marketing raw. Does the GTX 280 have 480?
800 sounds better than 160, especially if your competitor has 240.
Your 3870 outperforms the 9600GT because the 9600GT's SPs are counted 1:2 while the 3870's are 1:5. So even the doubled clock on the 9600GT leaves it around 1:4.
Why do you think a 112-128 SP card easily beats a 320 SP 3870? Even with the doubled SP clock it would still be 256 vs 320 at peak.
It is, but not that different. And a 4 GHz Yorkfield really screws up the 3DMark06 score. Compare a Q6600 at stock with a 2900XT, and then with it clocked at 4 GHz. The difference is obvious to anyone.
An R6xx card plus a high CPU OC inflates my system's 3DMark06 score, but in games at 2560 * 1600 with just 4 * AA / 16 * AF, that is not the case... 3DMark06 is useless for comparing anything but the same GPU architecture and CPU... it's no good comparison for games at high quality settings and resolutions.
BS!
Quote:
IMO nobody should be benching cards on '06 anymore; it is 99% CPU dependent
Anyway, gaming performance is what's needed. Let's see AA/AF in action with the 4800 series please :lol:
Why does a 112SP card easily beat a 320SP 3870? Well, it doesn't. http://www.firingsquad.com/hardware/...ance/page8.asp
The HD 3870 has more shading power than an 8800GT, as seen in the Perlin Noise test.
If we take the theoretical....
8800GT = 112SP * 1.5 = 168
HD 3870 = 320SP * .775 = 248
HD 3870 should be 47% faster, in real world it is 16% faster. So yes, R600's 320 stream processors are a bit less efficient than those of G92. But by no means is it fair to compare R600 as having 64 SPs and G92 as 128. It certainly is a means of comparison, but it does not at all reflect real world performance. It is much more accurate to describe the HD 3870 as 320 SP @ 775MHz than describe it as 64SP @ 775MHz, in comparison to the 8800GT's 112SP @ 1.5GHz.
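If anyone wants to check that math, here's the same calculation as a quick Python sketch (my own illustration; the numbers are the ones quoted above, and the 16% is the FiringSquad Perlin Noise lead already cited):

Code:
# Theoretical shader throughput = SP count * shader clock (GHz),
# per the post above.
gt8800 = 112 * 1.5    # = 168
hd3870 = 320 * 0.775  # = 248

predicted = hd3870 / gt8800 - 1  # ~0.476 -> the "47% faster" above
measured = 0.16                  # real-world Perlin Noise lead cited above

print(f"predicted lead: {predicted:.1%}")  # 47.6%
print(f"measured lead:  {measured:.1%}")   # 16.0%
# The gap between the two is the efficiency loss of R600's 5-wide SPs.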
Nice FUD... don't you understand the numbers?
ATI's way of counting SPs is in terms of execution units = 320, regardless of structure; really it is 64 SPs * 5 EUs = 320 EUs total.
NVIDIA was more correct and counted only SPs, not EUs = 128, but counted the way ATI does it, the number is 128 SPs * 2 EUs = 256 EUs... If the "missing MUL" is counted for NVIDIA, then it is 384 EUs, but that one is a grey area and shouldn't be counted, as I have yet to see it be effective.
Do the math and clocks with these numbers... but also remember shaders aren't everything; they're only one part of effective performance, relative to the program being executed.
By your numbers, R600 destroys G80... but that isn't the reality, now is it? Especially when both solutions are pressed: 4 MP resolutions with maxed AA and AF... or even just 4 * MSAA + 16 * AF.
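For what it's worth, here's that "math and clocks" written out as a small sketch (EU counts from the post above; the RV670's 775 MHz is quoted elsewhere in this thread, while the 1.35 GHz G80 GTX shader clock is my own figure, so treat that one as an assumption):

Code:
# EU counts per the post above, plus the clock math it invites.
ati_eus = 64 * 5      # 64 SPs * 5 EUs = 320, ATI's marketed number
nv_eus = 128 * 2      # 128 SPs * 2 EUs = 256, counted ATI's way
nv_eus_mul = 128 * 3  # 384 with the "missing MUL"; grey area, left out

# Fold in shader clocks: RV670 at 775 MHz (quoted in this thread) and
# the G80 GTX's 1.35 GHz shader domain (my figure, not from the thread).
print(ati_eus * 0.775)  # 248.0 EU-GHz
print(nv_eus * 1.35)    # 345.6 EU-GHz -> by this convention G80 GTX leads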
Then how can you boot up something that uses streams and get 320 simultaneous operations? ATI counts them like a Cell; NV counts per unit since it can't always execute 2 operations at a time (when acting like a pixel shader or using CUDA).
NV can score better since it makes faster use of emulated vertex shaders and has a much faster texture fill rate than its pixel fill rate. So when you have something optimized for 128 shaders that also uses texture overlays instead of rendering from scratch, the NV card will score much better. But if you have full 3D rendering and texture creation (3DMark06, Maya, CAD, SolidWorks, CoJ, Mass Effect somewhat), then ATI will score much better than the NV card. That's why Vantage scores better on NV: it uses very little texture rendering on the GPU and handles 3D interactions on the CPU/PPU.
I guess this pretty much confirms the 800 shader units.
Sweet.
Hmm. Where do we start... I'll use the R600 as a baseline since you seem to be familiar with that. The R600 had 64 Shader Processors, with each Shader Processor having 5 scalar ALUs (the Stream Processors). One of the 5 ALUs is capable of doing transcendentals (sin, cos, etc.). 64 * 5 = 320 (the magic number).
Based on the utilization achieved by the compiler (it's generally around 80% on average nowadays; it was somewhat worse at launch), you can at worst have 64 ALUs doing work (worst-case scenario: exclusively dependent serial code, which doesn't happen in practice), or at best have full utilization, manage to schedule 5 ops per Shader Processor, and thus get 320 ALUs doing work. Each ALU is MADD capable, so you get 2 FLOPS (floating point ops) per ALU.
nV's 128 ALUs for the G80 are also MADD capable, and they were also supposed to be able to co-issue a MUL alongside that scalar MADD, but that never quite materialized (the GT200 is supposedly capable of doing this). For all intents and purposes they're also pegged at 2 FLOPS per ALU. What's peculiar about nV's ALUs is that they work at a far higher frequency, so you can either claim you have 128 ALUs running at that frequency, or 256 ALUs running at frequency/2. One of the advantages of nV's independent scalar ALU arrangement is that scheduling is easier, and they're less sensitive to the type of code they're running.
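To put that into numbers, here's a minimal peak-FLOPS sketch built only from the description above, using the 775 MHz RV670 clock quoted earlier in the thread and the commonly cited 1.35 GHz G80 shader clock (the latter is my figure, not from this thread):

Code:
# Peak-FLOPS sketch: every ALU is MADD capable, so 2 FLOPs per cycle.
def peak_gflops(alus, clock_ghz, flops_per_alu=2):
    return alus * clock_ghz * flops_per_alu

# RV670: 64 shader processors * 5 ALUs at 775 MHz.
print(peak_gflops(64 * 5, 0.775))  # 496.0 GFLOPS at full utilization
print(peak_gflops(64, 0.775))      # 99.2 GFLOPS, worst-case serial code

# G80: 128 scalar ALUs at 1.35 GHz, the co-issued MUL excluded.
print(peak_gflops(128, 1.35))      # 345.6 GFLOPS
print(peak_gflops(256, 1.35 / 2))  # 345.6 GFLOPS -- same chip, other counting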
http://www.theinquirer.net/gb/inquir...ormance-posted
Oh yeah, could be fake... :shrug:
Quote:
According to the site’s post, the DAAMIT benchmarketing shows its single-GPU newborns thoroughly thrashing Graphzilla’s 9800GTX (4870) and 8800GT (4850). In the first instance, the 4870 outguns the 9800GTX in all the games with anything from a 36 per cent to 48 per cent lead, in the second instance (4850 vs. 8800GT) DAAMIT has anything from 26 per cent to a 48 per cent lead. Even if these numbers are only benchmarketing, it does look like ATI took special care on improving AA, AF and DX10 performance – which also means it’s been paying attention to its customers.
AMD has 5-way shaders, yes. They chose to count their stream processors a different way than nVidia, because in comparing G92 vs. RV670, we see that it reflects performance quite nicely.
By my numbers, R600 loses to G80 (8800GTX), but wins against the castrated parts like 8800GT / G80 GTS. In games, is this the case? No, definitely not in the case of the 8800GT. Games take into account much more than pure ALU performance. G92/G80 has more texture power and in the case of G80, more pixel power. Also, R600's SPs are not as efficient in real world usage as they are in benchmarks like Perlin Noise - the R600 design (1 "fat" SP + 4 weaker ones) proves less efficient in games than nVidia's G80 design. In theoretical ALU benchmarks like Perlin Noise, this inefficiency is not present and R600 presents itself well.
Doing the numbers your way makes no more sense than doing the numbers AMD's way. If I do what you are saying, then I compare HD 3870 with 64 SPs @ 775MHz to 9600GT with 64 SPs @ 1625MHz.... yet the 3870 wins by nearly a factor of 2x in that battle. So is that a fair comparison? Heck no.
G80/G92 and R600/RV670 have VERY different shader designs, so they deserve very different ways of counting them. AMD's way is certainly preferable for marketing, but when you factor in that AMD's parts have less than 1/2 the clock of nVidia's, it makes sense from a technical standpoint too. Comparing 320 RV670 SPs @ 775MHz to 64 G94 SPs @ 1625MHz is just as valid of a comparison as it would have been to compare 16 R520 pipes @ 625MHz to 24 G70 pipes @ 430MHz... in other words, not a good comparison because of the different architectures, but a fair one nevertheless.
When will people learn that shaders across architectures aren't meant to be literally compared and that not all shaders are made equal!
Great explanations Extelleron and Morgoth
Thank you, that is the kind of insight I have seen from the Beyond3D forum :)
Sorry, I didn't notice your comparison (G92 vs RV670).
I only assumed G80 A2 (GTX) / G80 A3 (Ultra) vs ATI's offerings at that time. I'll be the first to admit that G92 has its flaws vs G80 A2 / A3. I can still find benchmarks, especially with Crysis, where my two Ultras destroy 2 * G92 (GX2)... 2 * G80 A3 vs 4 * G92.
My apologies :yepp:
You can't compare the G80's shaders with R6xx ones. RV670/R600 only has 64 shader processors "equivalent" to the G80/G92 ones, and nVidia runs them at twice the clock speed. Everything is so different that you can't compare them that way. Besides, benchmark scores won't tell you the exact amount of shaders.
I don't remember where I saw it, but I remember seeing some calculations proving that if ATI said it could do 1 TeraFlop, the RV770 had to have 800 SPs for the numbers to match. Maybe it was in the Beyond3D forums... If I find it again I'll post it here.
We know that the 4850 comes stock at 625 MHz
ATI Flops: Frequency * SP count * 2 (for MADD)
so 625 * 480 * 2 = 600 GFlops
625 * 800 * 2 = 1000 GFlops = exactly 1 Tflop
The only way 480 SPs reach 1 TFlop is with a ~1000MHz shader clock, but we have no evidence it still has a shader domain.
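Same arithmetic as a trivial snippet, for anyone who wants to play with the numbers (all figures are the ones already given above):

Code:
# ATI FLOPS = frequency * SP count * 2 (for MADD), per the formula above.
def gflops(mhz, sps):
    return mhz * sps * 2 / 1000

print(gflops(625, 480))       # 600.0 GFlops -> 480 SPs fall short
print(gflops(625, 800))       # 1000.0 GFlops -> exactly 1 TFlop

# Or solve for the SP count the 1 TFlop claim implies at 625 MHz:
print(1_000_000 / (625 * 2))  # 800.0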
Fair enough, but are you sure it wasn't the R700 / 2 * RV770? Even if ATI could claim that theoretical max in terms of flops, the reality would hardly be the same. No architecture ever reaches 100% of its theoretical max, and ATI's architecture is very limited in this regard.
But then again, ATI still claims 320 / 480 SPs vs NVIDIA's 128 / 240 SPs on paper, and that alone should tell anyone that ATI is full of it...
As soon as they called their part 320 SP vs nVIDIA's G80 GTX / Ultra's 128 SP... anyone should have known the depths this company has sunk to in terms of PR / FUD.
In terms of single card / single GPU, I would love ATI to be competitive in raw performance and effective performance as well, but that is not going to happen, and thus this can never be an "R300 Deja-Vu" or even something like it.
ATI may well have the performance vs cost advantage in some markets, but in terms of raw effective performance, I can't see how nVIDIA can lose with the GTX 280.
If anyone wonders: yes, only the best in terms of performance per GPU chip matters to me, no matter the cost. The GTX 280 bought from the US, with tax, is about 60% of the cost of each of my current Ultras...
That is my personal view and wants, but I respect and accept others see it differently :)
They call them stream processors anyway, so it's not like they're using the same term.
And 800 SPs is the only way 1 TFlop works for the 4850 at those clocks.
And while I want the best performing GPU because I run at 2560x1600, only Crysis is limited at all at this moment.
When you talk about the shaders though, NV isn't always doing 2 operations; it's more like Hyper-Threading, while ATI is always doing the 5 per physical unit. I always thought ATI had something equivalent to a multi-core shader, like a Cell, and when I asked their rep at "Heroes Happen" he also said it's like a Cell (that's where I got that from). So I figured it didn't matter how they got there, as long as they were all working at the same time.
You wouldn't say that a P4 with HT has 2 cores, but you would say that a 6-stream Cell has 6 cores.
And NV says SP = shader processor; ATI and MS say SP = stream processor.
How do Supreme Commander and Call of Juarez work for you at that resolution, with HQ settings (in-game maxed + 4 * MSAA / 16 * AF)? :)
Not that I really disagree with your point. Aside from those 3 games, I have to wonder why I would want to upgrade my two XFX 8800 Ultra XXX 685M cards.
Still, I want more... and the new offerings from NVIDIA and ATI, plus Intel's Nehalem, are the closest thing in more than a year to an upgrade over what I have now.