meh, it's well known that GPU-Z uses a database to identify the card; it doesn't magically read the card and know what everything is. W1zzard himself has said the database is simply outdated and that the RV770 data is wrong
No, ORBR is more accurate in this case. The duplicate transfer of textures from the host to the GPU happens over the PCIe bus and it doesn't happen every frame. So even if the same transfer occurs to both cards it doesn't have anything to do with the on-board bandwidth usage.
What he's referring to is the use of the on-board bandwidth to render the frame. The two cards are not doing "identical tasks". They are processing different geometry and rendering different framebuffers for their individual frame. However, there may be instances where the two cards generate the same render target or other buffers that normally would be reused in a single-card scenario which is a bit of a waste.
In any case both 256-bit buses are being used somewhat in parallel to process two different frames at once. So it is closer to the bandwidth of a 512-bit bus than it is to a single 256-bit one.
It's very simple really. Compare 9600GT-SLI to an 8800GTS-512. The former is much faster even though the number of processing units is the same. Why? Double bandwidth.
No, DilTech is correct: the data is mirrored in both memory banks, so while the cards don't render the same frame, when using SLI you really only get the memory bandwidth and pool of one card
Nope. The G94 GPU is simply far more optimised for current games; even as a single card it is just a few frames behind the 8800GT as long as you don't crank up too much detail. Once you hit high detail, the single GTS will win, which shows that you aren't doubling the bandwidth, because if you were, the G94s would get better frame rates at max detail
Gah! You guys are confusing the duplication of data on the framebuffer with memory bandwidth. They are two completely separate things!
I'm not sure how else to explain it but I'll try.
The duplication happens when the host CPU sends texture and other reference data to the GPU over the PCIe bus. This happens only when needed...if a texture is already in local memory on the GPU it's not resent. At this point both cards have the same data.
Bandwidth however is used to do the work to actually render the frame. Reading texture data into the chip, writing buffers back out to memory etc. This is not duplicate work as it's being done for two different frames. This is stuff a single GPU would have to do twice anyway so it's not wasted bandwidth.
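To make the distinction concrete, here's a minimal Python sketch (hypothetical names, not any real driver API) of the difference between the one-off PCIe upload that gets duplicated across cards and the per-frame local-memory traffic that actually consumes bus bandwidth:
Code:
# Hypothetical sketch of upload-vs-render traffic; not real driver code.
class GPU:
    def __init__(self, name):
        self.name = name
        self.local_textures = set()  # textures resident in on-board memory

    def upload(self, texture):
        # PCIe traffic: happens once per texture, duplicated on both cards.
        if texture not in self.local_textures:
            print(f"{self.name}: uploading {texture} over PCIe")
            self.local_textures.add(texture)

    def render_frame(self, frame_id):
        # On-board bandwidth: spent every frame, but on *different* frames.
        print(f"{self.name}: rendering frame {frame_id} from local memory")

gpu0, gpu1 = GPU("GPU0"), GPU("GPU1")
for gpu in (gpu0, gpu1):   # same textures mirrored to both cards (storage)
    gpu.upload("rock")
    gpu.upload("rock")     # second request: already resident, not resent

gpu0.render_frame(1)       # AFR: each card burns its own 256-bit bus
gpu1.render_frame(2)       # on its own frame, roughly in parallel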
No, G92 is just bandwidth limited. Show me where a single G92 is faster than SLI 9600GT at any settings in a title that scales with SLI.
http://techreport.com/articles.x/14168/8
It's already well known that the G92 is bandwidth limited, but check this review out:
http://www.neoseeker.com/Articles/Ha...it9600gtsonic/
With the exception of World in Conflict (where the GTS beats the SLI setup in every test), in general the SLI setup wins until you hit very high resolutions with AA (such as in Crysis), and then the single GTS wins
http://i.neoseeker.com/a/palit9600gtsonic/cry_bar.png
Quote:
memory bandwidth are the weaknesses of G92.
One of these statements is wrong.
Quote:
the single GTS will win, which shows that you aren't doubling the bandwidth, because if you were, the G94s would get better frame rates at max detail
why does a single GTS beat the 9600GT SLI at high settings :wasntme:?
i.e. @ 16x12 4xAA... the 9600GT SLI is bottlenecked by something, it seems to me.
i still say (because those who know more than me have said it) that 9600GTs are bandwidth limited, even when you have 2 of them
even vs an 8800GT, or a GTS, or a 9800GTX, or particularly the 8800GTX/Ultra and the new 512-bit cards. :)
SLI gives you an fps boost, but there seems to be a bottleneck @ high res + AA, and i'm not sure what it is.
i thought it was the bandwidth.
people have rabbited on about the bus also, but i don't really understand it.
all i know is i don't like buses because they get stuck in traffic :lol:
no, the 9600GT isn't bandwidth limited. Why would it be? Think about it: it performs at its full potential and is almost at the level of the 8800GT with just over half the shaders but the same bandwidth
But either way, it doesn't matter because this is an ATI thread
well, it is limited by something; i don't care whether you call it kentucky fried chicken limited.
Quote:
no, the 9600gt isn't bandwidth limited
think about what?
Quote:
Think about it
you've provided the evidence that the 9600GT is bottlenecked at 16x12 4xAA, and yet you still haven't provided the explanation.
but don't worry about actually answering; just say it's an ATI thread :yawn2:
It's limited by shader power (well, I shouldn't really say limited, as it's pretty much at its full potential); 2x 64-shader cards in SLI != one card with 128 shaders, SLI just isn't that efficient yet. If you gave the 8800GTS 512MB a 512-bit memory interface, it would fly, as it is definitely held back by memory bandwidth.
But once again, this is the ATI thread, so let's drop it here; if you wish to continue, bring it up in the nvidia thread
you could have said that before now.
Quote:
It's limited by shader power (well, I shouldn't really say limited, as it's pretty much at its full potential); 2x 64-shader cards in SLI != one card with 128 shaders, SLI just isn't that efficient yet. If you gave the 8800GTS 512MB a 512-bit memory interface, it would fly, as it is definitely held back by memory bandwidth.
shader diplomacy? but thx for the clarification there.
This statement is correct regarding STORAGE.
This statement is incorrect regarding PROCESSING. A single card would have to do this job twice, over two frames. Two cards do the same thing somewhat in parallel.
Quote:
using the memory bandwidth of both cards to perform identical tasks. Seeing as how they can't share the same information from the same set of ram, they're effectively doing the same job twice.
No, it is not the same as a single 256-bit memory bus. It certainly isn't the same as a single 512-bit one either, but it's definitely closer.
Quote:
As such, it's the same as a single 256bit memory bus, and a single set of 512mb of ram. That's why most stores make sure to specify 256bit x2 and 512mb x2. 2x256bit doesn't equal 512bit. ;)
ORBR and trinibwoy:
1. what is your explanation for the microstuttering in 3870X2 and do you expect it to be solved in 4870X2?
2. Is/will be crossfire in practice more efficient on 4870X2 than on 2 single 4870 cards?
3. If a game "X" isn't optimized for cross/X-fire in drivers, will the 4870X2 in that case automatically act as a single card (2nd gpu not being a hindrance), and performance never drop to below that of a single 4870?
Many thanks in advance for answering. :)
Heh, if we had the answers to those questions everybody else would too. At this point nobody knows what AMD has done to improve on the 3870X2 formula.
1. There are a lot of threads explaining the issue. Essentially, the frames rendered by the individual GPUs are not distributed evenly, either in terms of game time or in the length of time they remain on-screen. This is an inherent issue with AFR, and the only true solution is to have all the GPUs co-operating on the same frame (SuperTiling, SFR, etc.). Of course, you then don't get the geometry-processing scaling of AFR.
2. It should be. Rumours have it that they are switching to a faster on-board PCIe interface between the chips or doing something totally different. I see no reason why two individual cards should be faster.
3. If it is AFR based then yes, if the game is not AFR friendly you will have the same issues that we have today with multi-GPU scaling.
But like I said nobody has the answers at this point.
IMO it makes sense that two individual cards would be faster (as 2x 3870s often outperform the 3870X2). It's a sacrifice made to get both GPUs on a single slot, a compromise if you like, which of course has advantages as well as disadvantages: for example, the possibility of Crossfire-like performance on boards that don't support it, and obviously quad-Crossfire potential on boards that do :up:
This of course goes for NV's dual-GPU solutions as much as for the Radeons.
The fact that 2x 3870 can outperform a 3870X2 has to do with the fact that the 3870X2 uses GDDR3 at 1800 MHz while a 3870 uses GDDR4 at 2250 MHz, and there is also a core clock difference: the 3870X2 has its cores clocked at 825 MHz while a 3870 has its core clocked at 775 MHz. Then you also have the fact that there is a PCIe gen 1 bridge chip on the 3870X2, so two 3870s have more bandwidth to communicate if they are installed on a mobo with two PCIe gen 2 16x slots.
This makes 2x 3870 faster if there is a need for bandwidth; the 3870X2 will be faster if there is more need for raw crunching power.
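For reference, a quick back-of-the-envelope check in Python of what those memory clocks mean for bandwidth (256-bit bus assumed for both cards, using the effective data rates quoted above):
Code:
def bandwidth_gb_s(bus_bits, effective_mhz):
    # peak bandwidth = bus width in bytes * effective transfer rate
    return bus_bits / 8 * effective_mhz * 1e6 / 1e9

print(bandwidth_gb_s(256, 1800))  # 3870X2, GDDR3 @ 1800 -> 57.6 GB/s per GPU
print(bandwidth_gb_s(256, 2250))  # 3870,   GDDR4 @ 2250 -> 72.0 GB/s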
Micro-stuttering is a result of AFR solutions, and it happens due to the way frames from each GPU are synchronized.
Example:
GPU1 -> frames 1,3,5,7,9,11,13 at 0.01 s, 0.05, 0.09, 0.13, 0.17, 0.21, 0.25
GPU2 -> frames 2,4,6,8,10,12,14 at 0.02 s, 0.06, 0.10, 0.14, 0.18, 0.22, 0.26
Notice that each GPU2 frame arrives only 0.01 s after the preceding GPU1 frame, but then there is a 0.03 s gap before the next GPU1 frame. That is the micro-stuttering: two frames displayed one immediately after the other, followed by a 0.03 s pause.
This is just a quick example, nothing really deep.
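Feeding those example timestamps through a couple of lines of Python makes the uneven cadence obvious:
Code:
# Interleaved frame timestamps from the example above (seconds).
timestamps = [0.01, 0.02, 0.05, 0.06, 0.09, 0.10, 0.13, 0.14]

deltas = [round(b - a, 2) for a, b in zip(timestamps, timestamps[1:])]
print(deltas)  # [0.01, 0.03, 0.01, 0.03, ...] -- a pair of frames, then a gap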
Read what I wrote again. How is "the CPU delivering the processing data too fast" different from "the frames are not distributed evenly in game time"? Obviously game time is controlled by the CPU. Don't argue just for arguing's sake... a little reading comprehension goes a long way.
Am I right in thinking that if they 'solved' this problem and synced the frames, there would likely be a sizable performance hit? Reminds me of V-sync.
If that's the case, perhaps they just don't want to do this, with SLI and XF scaling already being far from perfect.
I hope we both mean the same thing.
I've made a ton of graphs visualizing microstuttering, and it appears in cascades. For example, the frametime differences alternate between 10 and 60 ms.
this stutters
http://www.abload.de/img/crysis_cf_on_2t9w.jpg
this doesn't
http://www.abload.de/img/crysis_cf_on_34r1.jpg
Because the unevenly distributed frames are a result of the CPU being too fast :yepp:
a cold is the result of viruses... i don't have a problem understanding this
but this is way too OT for now
Hopefully these diagrams can make it a bit clearer for those who are interested.
http://img301.imageshack.us/img301/7051/gpuafrbe0.jpg
http://img119.imageshack.us/img119/3627/cpuafriy8.jpg
It is clear with different frame times.
The only thing nobody explains is why the frame times would be so different all the time in the middle of a game. I would understand them differing at moments when the picture starts to change, like a big new object appearing on the screen. But the next frame would still contain it! If you say the object changed position/size and so the next frame requires more processing, OK. But it is supposed to remain in the following frame(s) too! So the time difference should spread over all subsequent frames pretty evenly...
So it shouldn't stutter?!
Apples and oranges here, but as AMD are practically masters of interconnects, would it be out of the question for the graphics group to have leveraged some of that knowledge to improve core-to-core communication? They are the same company, after all.
Depends on the game. In something like a racing game, where the game world is going by rapidly and the view is changing at high speed, the stuttering can be very noticeable. In an RTS or RPG, and in most FPSes, it won't be that noticeable. And the issue isn't about how long it takes to render the frame on each GPU. It's about the timing of when frames are delivered to each GPU and how long each rendered frame stays on screen.
The problem with it isn't only that you notice it. The other big problem is that the big fps numbers are false. In some scenarios (highly GPU limited) you're getting essentially the same frame rendered twice in a row, followed by a long delay, then two similar frames rendered in a short time, etc. That's why some people find single-GPU play "smoother" even though the FPS numbers are lower.
AFR doesn't work well in extremely GPU-limited (frames thrown away, stuttering) or CPU-limited situations (no scaling). The sweet spot is somewhere in the middle (my second diagram above).
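A rough sketch (made-up frame intervals, not measurements) of one crude way to see why the average FPS number overstates smoothness in that GPU-limited case:
Code:
# Made-up AFR frame intervals in ms: near-duplicate pairs, then a long delay.
intervals_ms = [5, 35, 5, 35, 5, 35, 5, 35]

avg_fps = 1000 / (sum(intervals_ms) / len(intervals_ms))
felt_fps = 1000 / max(intervals_ms)  # the long gaps dominate perception

print(f"reported: {avg_fps:.0f} fps")  # 50 fps on paper
print(f"felt:     {felt_fps:.0f} fps")  # ~29 fps in practice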
Sorry if this has been posted, but it's a 100-page thread :)
Does anyone know whether the Accelero S1 will fit on the 4870, since it fits on the 3870?
the mounting seems to be the same as the 3870, so I would have to say yes
Is the 4870 coming out on June 25th as well?
And still nothing about the 70. No leaked bench, no nothing. :(
lol, there are already guys who have bought the 4850 and have them in their hands; head over to the 3DCenter or HWLuxx forum. :p:
http://www.forumdeluxx.de/forum/show...495808&page=62
He meant the 4870, I think, but thanks for the link.
Looks like 4850s are in a lot of retailers' hands already; some just want an early head start on sales ;)
I don't see how the 4870 would launch at the same time as the 4850. No pics, no benches... I find it hard to believe it will launch at the same time.
well, we will see, right? :)
New GPU-Z: http://www.techpowerup.com/downloads...-Z_v0.2.4.html
Quote:
Originally Posted by GPU-Z 0.2.4
me too, that's why I posted it here. I hope someone with an HD 4800 card will use it and show a validation/screenshot.
I suspect vSync fixes it. I only noticed micro-stuttering when I had 8800GT SLI: Crysis @ Very High on Vista x64 at 40 fps seemed like 18 fps. Weird thing :s
I didn't try vSync though. Past 60 fps, micro-stuttering isn't noticeable because enough frames are displayed to keep things fluid.
All this talk about dual GPU cards brings back memories of my old Rage Fury Maxx. I thought we were on to something then, putting two GPUs on a card. Guess it took a couple years longer till the idea came around :)
the idea is even older than the Rage Maxx -> just look at the Voodoo chips, or custom military-grade flight simulators. :D
Looks like jimmyz has the updated GPU-Z shot + oc results in his review thread
W1z updated GPU-Z to correctly read the 4850, and he's of the belief it is indeed 800 SPs...
yum, me waiting for two 4870s XD
hmmmmmmmm
x58/nehalem and 2 x 4870x2 looks to be my next upgrade
New GPU-Z detects 800SP: http://www.techpowerup.com/gpuz/mq8dr/
But AFAIK W1zzard has got his HD4850 already, so he should be able to detect it correctly.
I'm going to break it down for both of you....
When using AFR with 2x 256-bit, each frame is limited to the 256-bit bus of the GPU rendering it, just like with a single 256-bit bus. That's plain and simple. If a frame needs more bandwidth than the single 256-bit bus offers, guess what happens? The GPU chokes on, you guessed it, its single 256-bit bus. As such, even in AFR, 2x 256-bit is NOTHING like a 512-bit bus... with a 512-bit bus, the full bandwidth is available for every single frame.
See the difference? No matter how you break it down, or what claim you make, the fact is that 2x 256-bit just will not function like a single 512-bit bus, but rather closer to a single 256-bit one.
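For what it's worth, both positions in this argument can be put into numbers: under AFR each in-flight frame only sees its own card's bus, but two frames are serviced at once. A Python sketch (the 2000 MHz effective memory clock is an assumed figure for illustration):
Code:
def gb_s(bus_bits, effective_mhz):
    return bus_bits / 8 * effective_mhz * 1e6 / 1e9

clock = 2000  # MHz effective, assumed for illustration

print(gb_s(256, clock))      # 64 GB/s:  what any one frame sees on 2x256-bit AFR
print(gb_s(512, clock))      # 128 GB/s: what every frame sees on a true 512-bit bus
print(2 * gb_s(256, clock))  # 128 GB/s: aggregate across two AFR frames in flight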
quite interesting that they dropped tiling, because there you would have a bandwidth advantage.
Actually, the cards are only being produced right now because of the GDDR5 memory delay, so don't worry about ATI; their schedule is very strict, and it looks like they will manage to produce the cards in time. There were samples of the 4850 in late May, so it looks like the only reason for this delay is the memory.
P.S. Are there any predictions yet for the difference between the 4850 and 4870 at the same GPU clocks?
i thought with AFR that each GPU was rendering alternate frames :)
so you have the potential for increased fps using AFR, but you also have all the limitations of each card for each frame that is rendered. i think that is how it works.
am i right? or wrong?
based on what aliG said, i think it comes down to shader power. i don't know where the choke point is reached with bandwidth and framebuffer size, and what exactly happens when the framebuffer is exceeded and/or the bandwidth is exceeded? i guess you'll get lag/judder or some such
must be some benefit to having a 512-bit bus and 1GB of ram and heaps more shaders in the newer cards. :)
assuming a non-GPU-bound situation, you are obviously gonna get more fps if the cpu can feed the gpu more info.
as for microstutter, are there any ways of choosing different settings to eliminate it :shrug: different drivers or something? anyone fixed their MS problem/s?
getting some more fps is all well and good when it comes to a dual-card setup, but only so long as you are happy with the limitations of each single card's abilities / choking points / bottlenecks... shader power / core speeds / etc. wtf am i talking about? i got no clue :rofl: but if someone would like to enlighten me i'm all ears :)
microstuttering can't be fixed with just a simple driver change (at least not with today's drivers; probably at some point nvidia and ati will find a way around it through software); for now, only design changes by ati or nvidia can eliminate it
About stutter again:
Then, does that mean that stuttering usually looks like single jumps from time to time?
I thought frames are delivered to the screen as they're ready (if sync waiting is off, of course).
Quote:
And the issue isnt about how long it takes to render the frame on each GPU. It's about the timing of when frames are delivered to each GPU and how long each rendered frame stays on screen.
just wanna ask: how much power does the 4870 consume? will it need 2x 6-pin, or 1x 6-pin + 1x 8-pin?
2 6-pins is the official design, although I suspect vendors might use GDDR4 to cut costs and with less power usage 1 6-pin would be okay.
Looking forward to 48x0 launch, as are all the Nvidia users after those GTX 280 and 260 numbers :p:
Perkam
The 4850 also comes with 2x 6-pin connectors, even though it's equipped with GDDR3 memory, so there's no way the 4870 will come with only one :). I'm quite glad that ATI left the 6+8-pin configuration behind; it seems to bring bad luck, to the GTX 280 now just like to the R600 a year ago :).
They should start praying and go light a candle for ATI to fail in their effort :).
"The other thing I have to say before I wrap this all up is that I’ve tested the HD 4850, and I’ve tested it in Crossfire. Now, if I hadn’t tested those cards I may have been more impressed with the GTX 280, but I have. I’ve seen the performance figures the cards put out. We also know the price on a pair of HD 4850s is going to be under $600 AUD, while the new GTX 280 in stock form seems to be launching at the absolute cheapest in Australia in the low $700 AUD area. Ouch."
- Tweaktown...
http://www.tweaktown.com/reviews/146...ion/index.html
P.S. Forgot the link. Quote from the last page: http://www.tweaktown.com/reviews/146...hts/index.html
<---
Nahh I just have that avatar because I dislike Nvidia's marketing program :D
I hope some 4850's come with GDDR4, would make for killer "budget" cards :up:
Well, I don't think so.
If I'm correct, GDDR4 was quite expensive and wasn't much better than GDDR3; that's why almost nobody uses it...
anyone got addresses of Chinese / Japanese / Taiwanese webshops which already have the 4k series in stock? :D
Hello guys, I'm new :D
Does anyone know anything about the claims that the new ATI cards will have physics capabilities? I think I saw it on some slide from AMD. What is known about this?
It's looking good for the ATI camp xD
lol mascaras beat me by seconds. :p:
In that diagram, pay attention to "CrossfireX Sideport", on the right side of the image.
What is it? New technology for Crossfire?
that's not an official pic; it's by Edison, showing how he thinks RV770 looks.
still no pics of the 4870 guys?
That's only a mockup?
Well if true, compare that to:
http://www.pcworldtech.com/Categorie...s/image004.jpg
CrossFireX Sideport... probably used as a dedicated lane/path for chip-to-chip communication? Might be what R700 has in store for us; there were hints that R700 was designed early on for multi-GPU support, so a dedicated area of the chip for it might make sense.
Anyway, it looks like 4 stops on the ringbus (4x 64-bit memory controllers), each with RBEs (4 x 4 = 16 ROPs/RBEs). But it looks like they beefed up the RBEs too, because each has 4 units within it instead of 2.
Shaders look to be arranged in 10 arrays of 16 ALUs (x5 each) for 80 SPs per array, so that means there are 40 TMUs. Looks like the 4:1 ALU:TEX ratio is maintained from R600, but the raw units are significantly beefed up.
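The arithmetic behind that breakdown, in Python, using the rumoured (not confirmed) unit counts:
Code:
arrays = 10           # rumoured shader arrays
vliw_per_array = 16   # ALUs per array
vliw_width = 5        # 5 SPs per VLIW ALU

sps = arrays * vliw_per_array * vliw_width   # 10 * 16 * 5 = 800 SPs
tmus = arrays * 4                            # 4 TMUs per array = 40 TMUs
alu_tex = (sps // vliw_width) / tmus         # 160 VLIW ALUs / 40 TMUs

print(sps, tmus, alu_tex)  # 800 40 4.0 -> the same 4:1 ALU:TEX ratio as R600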
If that figure is indeed true, it could be amazing! I was definitely thinking 32 TMUs as well, with w0mbat, but at the same time 32 TMUs suggests 8 arrays of 20 ALUs, which seems a little "long".
I don't know anything; in fact, I am probably just as credible as "le Fud" or "Teh Inq". However, the way I see it, the HD 4850 series would be more or less equal in performance to the GTX 260 series.
The 1GB HD 4870 would be within 5-10% (slower) of the GTX 280.
HOWEVER
The ATI cards would be a lot cheaper; I mean, at the moment you are talking £460 for a GTX 280. I estimate the Radeon 4870 1GB GDDR5 will be around £350.
The only reservations I have when it comes to turning from Green to Red (yep, call me a shameless nVidia fanboi if you must) are:
1) Poor FSAA performance. The R6x0 series cards were dogged by FSAA performance issues.
2) Lack of PhysX support. Yes, it is not really a big deal now... but will it become one?
3) ATI compatibility issues. This might sound like a mindless fanboy rant, but back in the day of my Radeon 9700 I had a few issues with games due to a lack of support by the devs for ATI cards (SimCity 4 and Metal Gear Solid). Later the X1800XT had issues too, due to not quite fully supporting Shader Model 3 (a lack of vertex texture fetch). I just don't want to end up getting a card which lacks dev support or something...
4) Heat and noise. Let's face it, the 2900XT cards were blimmin' hot and noisy.
Although that sounded pretty much like my Green side talking, the ATI cards do have a lot of positives:
1) DirectX 10.1. Yes, it is not a big deal now... but it will be... soon?
2) If rumours are correct, a more elegant design and less power consumption.
3) Crossfire. Yes, I know Crossfire is pretty much the Devil's poison, but my Intel X38 chipset supports Crossfire... and it means I might be able to buy another card further down the line to enjoy it :) (thanks to the evil Green Team locking their SLI in a cupboard marked nForce-only).
4) Image quality. Traditionally speaking, ATI have always had better quality for video, FSAA and texture filtering... the G92 did perhaps level this somewhat, however I am sure that ATI with their double precision will regain the image quality crown.
At the moment I am close to ordering a GTX 280 (Zotac AMP), however this shameless nVidia fanboy may change back to the red team... and is eagerly awaiting the reviews of the HD 4870 1GB (GDDR5).
Speaking of which....when are the reviewers allowed to remove their gags and give us something to read and salivate over?
Here endeth my mindless waffling
John
Simple mathematics:
several fillrate tests from 4850s in the wild show ~19,500 MT/s, and the clock of the card (TMUs) is 625 MHz.
With 32 TMUs at 625 MHz you get a theoretical fillrate of 20,000 MT/s.
With 40 TMUs at 625 MHz you get 25,000 MT/s.
I don't think RV770 is that inefficient. :p:
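The same sums in Python, including the efficiency each TMU count would imply against the ~19,500 MT/s measured figure:
Code:
measured = 19500  # MT/s, from the fillrate tests mentioned above
clock = 625       # MHz

for tmus in (32, 40):
    theoretical = tmus * clock  # one texel per TMU per clock
    print(f"{tmus} TMUs: {theoretical} MT/s, {measured / theoretical:.0%} efficiency")
# 32 TMUs: 20000 MT/s, 98% -- plausible
# 40 TMUs: 25000 MT/s, 78% -- would make RV770 look unusually inefficient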
Can't say too much due to NDA, but it seems like ATI's press-release figure of 30-40% better than the G92 GT for the 4850 is correct; not enough time to bench thoroughly due to a PSU brownout.
Thanks for the tip about the TIM, was necessary :down: and did help!
BTW we should have new nVidia cards soon:cool:.
I had a feeling from when andreas said there would be 800 shaders that it would have 40 TMUs, simply because otherwise you'd be running into ridiculous ALU:TEX ratios. That, and I can't see ATI changing the architecture that much, as R600 was designed to be modular and highly scalable; there would be no need to create new shader array structures, etc.
shh! 8 TMUs are asleep, that's why you only get 19,500 MT/s instead of 25,000 MT/s :p:
I honestly think the card has 40 TMUs. For one, it's kinda hard to fake a diagram like that, and since there are 10 shader arrays, 32 TMUs would mean 10 clusters of 3.2 TMUs each :shocked:, which I don't think is likely. That, and ATI would further weaken their TEX:ALU ratio if they stuck with 32 TMUs. If the 480 number were correct, I could see 32 TMUs working out, as that would actually increase the ratio from 1:4 to 1:3, but since there are 800, 40 makes more sense.