Horrible ATI yields (3%!!)

07-23-2005, 04:48 PM
P_1

Horrible ATI yields (3%!!)

http://www.vr-zone.com/?i=2487&s=1 :eek:
07-23-2005, 05:04 PM
crodan85

That's bad, and this is after they re-taped it about 3 times.
07-23-2005, 05:16 PM
freecableguy

Mother f@#&*@(!
07-23-2005, 05:35 PM
[FoRcE]

What seems to be the problem with ATI not being able to turn the chips out?
07-23-2005, 05:35 PM
Disposibleteen

Wow, this is really bad, and all this time Nvidia is just basking in the sun of domination.
07-23-2005, 05:37 PM
charlie

whew!!! And to think, I almost bought some ATI stock (atyt) this week... will wait til it hits 11 now.

C
07-23-2005, 06:09 PM
turtle

This is kinda old news...Thought it had been posted already. There's been lots of discussion about it at B3D and R3D.

The idea is that R520 has been 16 pipes all along, and peeps that thought otherwise are now covering their bases...IE expect a similar article at the Inq soon. This has also been speculated on...I don't know who released it but they've been referred to as fact for quite some time...although no one knows what they mean:

R520: 16-1-1-1
R530: 4-1-3-2
R580: 16-1-3-1
R515: 4-1-1-1

Assumption is the first number is pipes, third number is TMUs, last is Z/stencil. All should do 3 ALUs per pipe, and R530 is 50% the speed of R520...Meaning R580 could be wicked fast.

Also things to consider are that ATi has said on a couple of different occasions to stop thinking about pipes in the traditional sense from here on out, including not only R500 (C1, Xenos) but also R520 and R580. If it processes 3 ALUs per pipe instead of 2, this makes the pipes 1.33x more effective, which has been something that's been rumored all along. It could also contain a unified architecture ala 24 programmable pipes (16pp + 8vsu)...although unlikely. Also take on top of that G70 is still 16 ROPs, just 24 pixel shader pipelines. R520 is prolly 16pp(x1.3) and 16 ROPs.

Furthermore baseline assuming regular pipes:

7800GTX (430Mhz /24p): 10.32 Gigapixel/s fill rate
X900XT PE (700Mhz /16p): 11.2 Gigapixel/s fill rate
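
For anyone who wants to redo the numbers above, here is a minimal sketch of the arithmetic. It assumes the usual simplification of one pixel per pipe per clock; the function name is made up for illustration, and the X900XT PE clock and pipe count are the rumoured figures, not confirmed specs.

```python
def fill_rate_gpix(clock_mhz: float, pipes: int) -> float:
    """Theoretical fill rate in gigapixels/s, assuming one pixel per pipe per clock."""
    return clock_mhz * pipes / 1000.0

# (clock in MHz, pipes); the ATI entry is a rumoured configuration
cards = {
    "7800GTX": (430, 24),
    "X900XT PE (rumoured)": (700, 16),
}

for name, (clock, pipes) in cards.items():
    print(f"{name}: {fill_rate_gpix(clock, pipes):.2f} Gigapixel/s")
```

This is a paper number only; it says nothing about ROPs, memory bandwidth or how much work each pipe does per clock.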

Don't count them out yet guys...Even if it's 16 pipes, which it may have been all along...It could still turn out faster than G70. Also remember R580 is supposed to be more than just a little refresh. There's a lot more to ATi's architecture than just the pipe number...It's such a ridiculous thing to base performance on. Granted, it may suck, or if it is better than G70 the 90nm G71 may be better than it...who knows...It may even be 24/32...I haven't a clue. But the thing is we all really don't know until we see some numbers. The only thing that we have in that respect is numerous reports from independent sources of it scoring over 10,000 in 3dmark05, and that surely is better than G70. Some may say this was a part with more pipelines, I don't think so. I think it's the same product that will end up being released.

That being said...I still think this is its launch date:
http://www.gecube.com/support-events-detail.php?id=321
http://www.sydneyshowground.com.au/i...86&SectionID=1
07-23-2005, 08:29 PM
Reznik Akime

Although I am an Nvidia fan, I think that ATI is going to mop the floor with Nvidia this round performance wise. Turtle's post makes me think to how AMD and Intel are. Intel seems to focus on one thing while AMD wants to focus on the grand picture and optimize everything.

That's what seems to be the case here. Nvidia is just focusing on small smudges while ATI is wanting to repaint the picture, and I say I'm all for it.

Just how I see things as of now.
07-23-2005, 08:38 PM
$a1Ty

It doesn't matter if ATI will have a better product if they can't produce enough of it and get it to the consumer.
07-23-2005, 08:50 PM
Cybercat

3 ALUs is quite a lot. The G70 has two, but then it also has a second shader unit (akin to the original NV40 architecture).
07-23-2005, 09:12 PM
turtle

Latest rumor...expect this to show up on the INQ shortly!
_____________________________________________________
R520 ala XT:
500mhz "sample speed" (ie some vendors will clock higher)
24pipe, (was 32 but failed quads caused the cut)

R530 ala "pro"

16pipes
similar clocks

IT WILL BE SHOWN Aug 26/27 at the launch thinger in Australia.

You will not see 32 real pipes until R580 for sure.
_____________________________________________________
07-23-2005, 09:21 PM
einCe

fanboyism is the key to ati success
07-23-2005, 09:53 PM
turtle

lol...Hardly a fanboy...only to AMD. ;)

Just trying to share which way the wind seems to be blowing, and give both sides. If G71 kicks everything's ass, good for nvidia, and even better for us.

I'll be the first to say it...No one (besides a select few who ain't talking) really knows what's going on.
07-23-2005, 11:19 PM
Cybercat

500MHz? That's it?
07-23-2005, 11:29 PM
turtle

That's what this dude is saying:

http://www.rage3d.com/board/showthre...post1333818229

and it's been something many diff people have been saying for QUITE some time. First they gave the same specs with 32 pipes...now it's the exact same specs cut to 24 pipes because of yield problems...and apparently RV530/rv515's will be R520's with failed quads and similar clocks. It does make sense...
07-24-2005, 12:12 AM
Quanticles

Who does ATI partner their manufacturing with?
07-24-2005, 12:20 AM
turtle

I don't know exactly what you're asking...

I know Sapphire makes all the pcbs, or at least they did.
07-24-2005, 01:07 AM
craig588

Random thoughts: Why can't someone just make a card with 200 pipes or something really high like that and have the drivers automatically disable all of the failed pipelines? Even if the yields are really low there should be a pretty good chance of at least 24 or however many they want to still be intact for their high end model. Also, do GPUs have cache?
07-24-2005, 02:53 AM
Cybercat

lol, any more than 32 would be murder with the limits of current technology.
07-24-2005, 02:57 AM
crodan85

Quote:

Originally Posted by Quanticles

Who does ATI partner their manufacturing with?

It is TSMC, same as Nvidia uses. I'm not sure, but I think ATI tapes out the chip and then sends it to TSMC.
07-24-2005, 03:58 AM
craig588

Quote:

Originally Posted by Cybercat

lol, anymore than 32 would be murder with the limits of current technology.

That's why you have all of those dead pipes disabled on a driver level automatically. Just by random chance, even if more pipes are going to be destroyed, having so many of them possible would mean that at least enough to perform competently would be left fine. Why doesn't this work? (Other than the time it takes to design a 200 pipe chip and drivers that can figure out how to stop using a pipe if it's damaged)

I have no idea how chips are made.
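
The "random chance" part of this idea can be put into numbers with a binomial calculation. This is only a sketch under a big assumption: each pipe fails independently with the same probability, which ignores defects in shared logic that would kill the whole chip. The survival probabilities used are made-up illustration values, not real fab data.

```python
from math import comb

def prob_at_least(n: int, k: int, p: float) -> float:
    """P(at least k of n pipes work) if each pipe works independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical per-pipe survival rates, chosen only for illustration
for p in (0.5, 0.9):
    print(f"p={p}: 24 of 200 -> {prob_at_least(200, 24, p):.6f}, "
          f"24 of 32 -> {prob_at_least(32, 24, p):.6f}")
```

So the odds of finding 24 good pipes do go up with more spares; what the calculation leaves out is what the extra silicon costs.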
07-24-2005, 04:00 AM
alexio

Quote:

Originally Posted by craig588

That's why you have all of those dead pipes disabled on a driver level automatically. Just by random chance, even if more pipes are going to be destroyed, having so many of them possible would mean that at least enough to perform competently would be left fine. Why doesn't this work? (Other than the time it takes to design a 200 pipe chip and drivers that can figure out how to stop using a pipe if it's damaged)

I have no idea how chips are made.

bigger die = less chips per wafer = more expensive chips :fact:
07-24-2005, 04:46 AM
Cybercat

200 pipes, even 50 pipes, would equate to a die the size of Texas (figuratively speaking of course). It would require massive amounts of power, create an insane amount of heat (even with a majority of the pipes disabled), and, like someone said before me, it would be very expensive. The raw complexity of a chip with that many pipes would be beyond the limits of current fabrication techniques. You know how many transistors that would come out to? The word impractical doesn't begin to describe the problem with such a design. Not to mention, the more pipes you try to squeeze into a chip, the higher the rate of failure will be within the chip. NVIDIA didn't try to go for any more than 24, while still using a refined and well-progressed process size (110nm), and they ended up with very good yields for their chip. ATI is trying to go for 32 pipes, on a process that hasn't even been tested as a mass production vessel, and that's precisely why they're having so many problems. They pretty much jumped-the-gun with the best they could afford, in an effort to push clockspeed and fillrates to another level and beat out NVIDIA in much the same manner as they have in the past. This ended up being a double-edged sword for them.

in short, more pipes = bad
a conservative number of pipes = good
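
To put rough numbers on "bigger die = fewer chips per wafer" and "more pipes = higher failure rate", here is a sketch using two textbook approximations: a gross dies-per-wafer estimate with an edge-loss term, and the Poisson yield model. The wafer size, die areas and defect density are hypothetical values picked for illustration; none of them are ATI, NVIDIA or TSMC figures.

```python
import math

def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Rough gross die count: wafer area / die area, minus an edge-loss term."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)

def poisson_yield(die_area_mm2: float, defects_per_mm2: float) -> float:
    """Poisson model: probability that a die has zero defects."""
    return math.exp(-die_area_mm2 * defects_per_mm2)

def good_dies(wafer_diameter_mm: float, die_area_mm2: float,
              defects_per_mm2: float) -> float:
    """Expected number of fully working dies per wafer."""
    return (dies_per_wafer(wafer_diameter_mm, die_area_mm2)
            * poisson_yield(die_area_mm2, defects_per_mm2))

# Hypothetical inputs: 300 mm wafer, 0.005 defects/mm^2, three made-up die sizes
for area in (150, 300, 600):
    print(f"{area} mm^2: {dies_per_wafer(300, area)} gross dies, "
          f"{good_dies(300, area, 0.005):.1f} expected good dies")
```

Doubling the die area hurts twice in this model: fewer candidates fit on the wafer, and each one is more likely to catch a defect.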
07-24-2005, 05:33 AM
turtle

What he said ^^^

R520 is supposed to be a big-ass chip...and that would follow the "It was supposed to have 32pipes rumour." G70 is smaller but it also was made to have 24 pipes. I imagine pipes require a lot of transistors and therefore a lot of space on chip.

I'm thinking this "Was 32pipe 500mhz but is now 24pipe 500mhz" thing is true, although I've been wrong in the past. I guess we'll prolly see by the end of next month, or by early September at the latest.

I was doing the calculation on fillrates though, and it looks like 24/500 would be better than 16/700 by a good chunk (12,000 as opposed to 11,200). So let's hope that's true. I know there's a lot more to it than fillrates (as I explained above possible things that could be happening) but higher ones never hurt. Also if it's 500mhz on a 90nm process, who knows, maybe it has room to overclock...although with 32pipes on-board (although 2 failed quads), they may cut down the ability for it to overclock well. It just looks to me like it's going to be a little bit faster than G70 on paper if this is true, but again, who knows how "optimized" the pipes are in terms of ALUs, TMUs, ROPs, etc.

I also was thinking about this:

16x500 = 8,000 (rumoured "pro" part)
20x400 = 8,000 (confirmed 7800gt clock)

Makes sense kinda, doesn't it? Both will prolly be in the 600-650 area in mem (in theory) as well, although ATi's might be higher than the 550 stock Nvidia's will be. Like I said, it's all going to be about which chip/mem combo has more headroom in that sector, RV530 vs 7800gt. It looks like R520 will be a little faster than G70 no matter what, without a doubt...Just don't expect miracles. Of course, that's again just my speculation. It may be different tmw when we have a whole new set of info. :rolleyes:
07-24-2005, 05:42 AM
Cybercat

Call me naive if you will, but I just expected a lot more in terms of clockspeed with this jump to 90nm low-k. You know, closer to 600 or so.
07-24-2005, 05:50 AM
turtle

Word, so did I. Here we were expecting 90nm to bring super high clocks from ATi, but 32pp and a high clock (the best of both worlds) may have been asking too much.

In theory, if it was MADE to be 16pp, the room to clock to 700mhz would be there, but with a chip made to have 32pp, couldn't that hamper its ability to have a high clock? This makes sense to me as when nvidia moved to 110nm you'd think the chip would clock a good deal better than the 130nm gf6, but it doesn't, just a little. It makes me think this is hampered by the addition of pipes. I'm not an engineer though, so don't take that as fact.

Also, obviously ATi is having yield problems. Some chips probably do clock really well, but they have to set a standard somewhere so they can get an acceptable amount of chips out to sell. Maybe it's 3% that can do 700mhz with 24pipes? Maybe 700mhz @ 16pp was possible with 16 pipes (4 quads) not working at an acceptable yield, but they figured 500 @ 24pp was an optimal design that also had an acceptable yield? It would make sense then that they'd release it at 500mhz if a greater yield could operate at that clock with 24pipes, and still beat G70. Who knows what they did and why they did it, maybe those 3% will become the xtpesupermegadeluxeultra's to beat the G70 ultra. That would also give you a clue how high the R520's will clock (If 3% hit target speeds...say 600/700/800...we'll obviously hit a wall in our overclocking much below that under normal cooling...but who knows what that target was...)

It still looks to me as though the refresh will look more exciting.

ATi R580 - 32pp, clock prolly similar to R520 but really anyone's guess, more TMUs? (500x32?)
Nvidia G71 - 90nm G70...so 24pp, prolly a super high clock. (24x650?)
^^anyone's guess on clocks

That'll be a fun one...
07-24-2005, 06:11 AM
Cybercat

Yeah, more pipes would definitely limit the clockspeed. Think of it this way, more pipes means more transistors. More transistors means more heat, and power consumption, plus the signal integrity suffers (communication via wires between transistors). I mean if you've ever overclocked before you know that heat can be a real obstacle when you're trying to get up there in clockspeed.

What I find funny is that people who buy the RV530 are gonna have a chip with 2/3 of the pipelines that the R520 has, less transistors, while being built on the same process. Now see, last gen the X700 series had only half as many pipes as the X8x0, but they were built on a less mature process, 110nm. As the X800XL proved, this 110nm wasn't as efficient in clocking as the refined 130nm low-k process. This time around mainstream users are going to get all the clocking benefits of the highend card, without the added heat. I mean the R520 doesn't just have 16 or 24 pipes creating heat, it has all 32 contributing thermal waste, whether they're all enabled or not. People who buy the RV530 are going to be able to clock MUCH higher than the R520 can achieve, and get very similar (though not quite as much, as you observed turtle) performance, for like half the price.

NVIDIA will probably make the G72 (7600 series) on a 90nm process. I'm eager to see if this line has 16 pipes or not. Most likely 12, I'm guessing.

BTW, what are TMUs?
07-24-2005, 06:11 AM
alexio

Quote:

Originally Posted by turtle

What he said ^^^
I was doing the calculation on fillrates though, and it looks like 24/500 would be better than 16/700 by a good chunk (12,000 as opposed to 11,200). So let's hope that's true.

Well I've done some testing on my Asus 6800NU and I have to say that the card running 8 pipes @ 400 mhz is faster than 16 pipes @ 200mhz. 3dmark gives higher scores @ 8 pipes. 16 pipes is like a dual-core 8 pipeline card, it can handle more threads at the same time but it completes a thread in a slower time.

What I'm trying to say is that the fastest card would just have 1 pipeline at an xtreme speed. That's why you can't say that 16 pipes at 700mhz is slower than 24 pipes at 500mhz. The fillrate is lower, that's for sure. But just look at the R420 vs NV40. The R420 has a much higher fillrate but isn't faster in particular.
07-24-2005, 06:23 AM
turtle

Quote:

Originally Posted by Cybercat

Yeah, more pipes would definitely limit the clockspeed. Think of it this way, more pipes means more transistors. More transistors means more heat, and power consumption, plus the signal integrity suffers (communication via wires between transistors). I mean if you've ever overclocked before you know that heat can be a real obstacle when you're trying to get up there in clockspeed.

What I find funny is that people who buy the RV530 are gonna have a chip with 2/3 of the pipelines that the R520 has, less transistors, while being built on the same process. Now see, last gen the X700 series had only half as many pipes as the X8x0, but they were built on a less mature process, 110nm. As the X800XL proved, this 110nm wasn't as efficient in clocking as the refined 130nm low-k process. This time around mainstream users are going to get all the clocking benefits of the highend card, without the added heat. I mean the R520 doesn't just have 16 or 24 pipes creating heat, it has all 32 contributing thermal waste, whether they're all enabled or not. People who buy the RV530 are going to be able to clock MUCH higher than the R520 can achieve, and get very similar (though not quite as much, as you observed turtle) performance, for like half the price.

NVIDIA will probably make the G72 (7600 series) on a 90nm process. I'm eager to see if this line has 16 pipes or not. Most likely 12, I'm guessing.

BTW, what are TMUs?

1. Your thing on transistors, heat, etc...Exactly what I was thinking. Makes perfect sense.

2. 2/3 the pipes, 16 broken pipes, 16 working pipes, same amount of transistors (RV530 sounds like it'll be a R520 with 4 broken quads, 2 more broken quads than R520, which already has 2 because it was meant to be 32pp...that's what i've gathered anyway...IF THIS WHOLE THING IS TRUE) who knows how well they'll clock. True, they may clock much better because only 4 quads have to hit that number instead of 6...who knows? Yes, it being on 90nm, just like 520 helps it, but there still will be all the "waste", at least the way I understand it at the moment...there will actually be more, % wise. My guess is RV530 will be a great mainstream part and compete against 7800gt well, and you'll pay more than the last gen high-end mainstream card. It'll prolly be $100 less than R520 or something.

3. No idea about 7600gt. Hope it kicks ass. Everyone that only had $150-$200 to spend on a gfx card last gen really got a good deal on the 6600gt, hope they keep up that tradition.

4. Texture Mapping Units.
07-24-2005, 06:40 AM
Cybercat

Eh, I really have my doubts about the rumors for the RV530. Maybe it's just R530, as the "RV" series is reserved for ~$200 parts usually. A codename like "RV510" would be more in line with what's been done in the past. Then again, with so few details on the big flagship itself, you can't fault the details on the mainstream parts for being a bit shaky at best.
07-24-2005, 06:48 AM
turtle

Alexio, you're right. In SOME instances one will be faster, and others it'll be reversed, won't it? I would think in some instances a greater fillrate would be too much of an advantage, as in some benchies/games R420 beats up on the NV40 rather well...like HL2 and Far Cry for example. I imagine those use the higher fill rate? Granted there are times it doesn't help it, and nv40 beats up on R420.

Admittedly, I don't know a lot about greater mhz vs more pipes. I just know the fillrate would be higher, and it seemed logical as a possible reason why they'd do it. Couldn't ATi's pipe architecture be more efficient than nvidia's when doing this comparison, especially in this upcoming generation? That would be one reason to go with 24pp@500 rather than 16@700 right?

Also 200x16 = 3200, 400x8 = 3200. They're equal. The latter is more efficient, but would it be enough to make up the difference in fillrate assuming the pipe architectures are similar (11,200 as opposed to 12,000)? Feel free to enlighten me...I'm intrigued.
07-24-2005, 07:01 AM
turtle

Well, I assume RV530 is the pro model...and it's supposedly confirmed that it's 16pp...who knows if it's a different chip or just a failed R520. Considering the yields though, it'd make sense they'd use failed R520's to make the lower-end chips, as otherwise (according to the yield rumours) they'd be wasting a WHOOOLE lot of money on those wafers.

And remember, 7800GT's are prolly just gtx's with a failed quad and they're using them to use up the failed yields of the gtx, and they will probably overclock pretty well...I mean, look at how close the core clockspeed is to the GTX...and even if the mem is rated at 1100, it probably uses the same stuff as GTX, so you can expect a similar clock there as well. I truly only think the 20 vs 24 pipes is going to make a 2-3% difference anyway...(based on a 5% difference of a GTX vs 6800GT/U at same clocks..16pp vs 24pp) so assuming the GT can hit similar clocks to gtx, it'll be a good deal just like the 6800gt was. Who's to say RV530 couldn't do the same, it'd especially be true under Alexio's reasoning. I do think that ATi's pipes are better than nvidia's though, and it might make a bigger difference than 5%. I imagine RV530 will clock overall just slightly better than a x850xt or something, with the same amount of pipes, but probably better pipes than that previous gen. I expect RV530 and 7800gt to fight tooth and nail, but who knows...There will obviously have to be a winner. We'll know when we see some benchies for it...We've already seen 7800gt benchies, and they're smack between 68U and 78GTX, I expect RV530 to be the same with x850xt and R520...so again, it'll be close, might come down to price and availability.

But yes, things are shaky. We should know in the next month or so though...
07-24-2005, 07:15 AM
alexio

Quote:

Originally Posted by turtle

Alexio, you're right. In SOME instances one will be faster, and others it'll be reversed, won't it? I would think in some instances a greater fillrate would be to much an advantage, as some benchies/games R420 beats up on the NV40 rather well...like HL2 and Far Cry for example. I imagine those use the higher fill rate? Granted there are times it doesn't help it, and nv40 beats up on R420.

Admittedly, I don't know a lot about greater mhz vs more pipes. I just know the fillrate would be higher, and it seemed logical as a possible reason why they'd do it. Couldn't ATi's pipe architecture be more efficient than nvidia's when doing this comparison, especially in this upcoming generation? That would be one reason to do with 24pp@500 rather than 16@700 right?

Also 200x16 = 3200, 400x8 = 3200. They're equal. The later is more efficient, but would it be enough to make up the differance in fillrate assuming the pipe architectures are similar (11,200 as opposed to 12,000). Feel free to enlighten me...I'm intrigued.

It's all about optimizing really. You can optimize a game to be the fastest at 1 pipeline but also for 24 pipelines. Today's games are made to also run on slower cards with fewer pipelines. This is the part where the unified shader model becomes interesting for high-end cards.

On a normal fixed card with 16 pipes and 6 vertex shaders there are one or more pipes and/or vertex shaders doing nothing, waiting for a new task. With the unified shader model every task can be handled by all the pipes and vertex shaders; that way there will never be a scenario where for example the vertex shaders are working their asses off while the pixel pipelines are hardly doing anything, or the opposite of course.

A combination of different game executables for different types of hardware (graphics cards, cpu's, amount of RAM) and the unified shader model will make games run much better. For example in a game all physics are being handled by the cpu, this is fixed. If you can make an executable that gives this task to the graphics card when the cpu has too many tasks to handle while the gpu has too little, then you can achieve much better performance in the game. The first step will of course be making games multithreaded for use with multi core cpu's. In the beginning this might be fixed, so for example the A.I. in the game is being handled by the second core (only). Further on in the development of games this process can be made dynamic, just like the unified shader model.
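
The idle-unit argument for unified shaders can be sketched as a toy model. Big assumptions, all mine: work is measured in abstract units, every shader unit has the same throughput, a unified unit handles either kind of work at no extra cost, and the 16 + 6 split is the fixed card from the example above. Real hardware is far messier than this.

```python
def frame_time_fixed(pixel_work: float, vertex_work: float,
                     pixel_units: int = 16, vertex_units: int = 6) -> float:
    """Fixed split: the frame is done when the slower of the two groups finishes."""
    return max(pixel_work / pixel_units, vertex_work / vertex_units)

def frame_time_unified(pixel_work: float, vertex_work: float,
                       total_units: int = 22) -> float:
    """Unified pool: any unit can take any work, so nothing sits idle."""
    return (pixel_work + vertex_work) / total_units

# Hypothetical workloads (pixel work, vertex work) in abstract units
for pixel, vertex in [(160, 60), (200, 20), (100, 120)]:
    print(f"pixel={pixel} vertex={vertex}: "
          f"fixed={frame_time_fixed(pixel, vertex):.2f} "
          f"unified={frame_time_unified(pixel, vertex):.2f}")
```

In this model the two designs tie only when the workload happens to match the fixed 16:6 ratio; any other mix leaves units idle on the fixed card.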
07-24-2005, 07:45 AM
Cybercat

Alexio got me interested in doing some runs in 3DMark05 to see how the NV40 architecture scales with clockspeed. He said he scored higher at 400x8 than at 200x16. Well I've got a 6600 so I don't have the luxury of 16 pipes but I can go lower with higher clockspeeds.

So at 280MHz with all 8 pipes enabled, this is what I got:
http://www.angryelf.org/pics/graphics/280x8.jpg

At 560MHz (my max core OC) with 4 pipes, this is what I got:
http://www.angryelf.org/pics/graphics/560x4.jpg

I was pretty surprised with the results. Having higher clockspeeds versus more pipes gives about a 19% overall improvement. Here's how they do in each game.

Return to Proxycon: 7%
Firefly Forest: 28%
Canyon Flight: 23%

It could have something to do with the architecture, or it could have something to do with software optimizations like Alexio said. Can't know for sure. But this is very interesting, and it could mean that lower end cards with 16 pipes at higher clockspeeds could perform the same or better than a 24-pipe card at lower clockspeeds (not taking into account other differences and dis/advantages).
07-24-2005, 07:51 AM
HARDCORECLOCKER

:upset:Grrrrrrrr........., me wants a X900XT PE with 32 pipes 0.09... :(

:toast:
07-24-2005, 07:53 AM
alexio

Can you run at 200mhz 8 pipes and at 400mhz 4 pipes for a better comparison? The results I got when I tested 200mhz 16 pipes vs 400mhz 8 pipes were with a card having some problems so I'm not 100% sure that I got the right scores. I don't have 3dmark on my hdd right now so can you please test it for me?
07-24-2005, 07:56 AM
Cybercat

Why would that yield different results? 200 is half of 400, just like 280 is to 560. The scores would be lower, but the % difference should be the same.
07-24-2005, 07:59 AM
alexio

When testing with 4 pipes you should enable only 1 vertex shader and when testing 8 pipes you should use 2. This is the best comparison possible. Your card normally has 3 vertex shaders but you can't divide 3 into whole numbers. You need to use double the vertex shaders at 8 pipes because the vertex shaders run at the same speed as the pipelines, in other words: you need double the amount at half the speed.

EDIT: I just want to know if 3dmark can use the extra clockspeed better or the extra pipes. Like I said above, at half the speed you also need double the vertex shaders.
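
The reasoning here can be checked with simple products. A sketch only: it assumes pixel pipes and vertex shaders both run at the core clock and that throughput is just units times MHz, and it assumes the earlier 280/560 runs left all 3 vertex shaders enabled, which the thread does not state.

```python
def throughput(clock_mhz: int, pixel_pipes: int, vertex_shaders: int) -> tuple[int, int]:
    """Return (pixel, vertex) throughput in unit-MHz, assuming both run at core clock."""
    return clock_mhz * pixel_pipes, clock_mhz * vertex_shaders

# (clock MHz, pixel pipes, vertex shaders); the 3 VS entries are an assumption
configs = {
    "280x8, 3 VS": (280, 8, 3),
    "560x4, 3 VS": (560, 4, 3),
    "200x8, 2 VS": (200, 8, 2),
    "400x4, 1 VS": (400, 4, 1),
}

for name, cfg in configs.items():
    pixel, vertex = throughput(*cfg)
    print(f"{name}: pixel={pixel} vertex={vertex}")
```

With the vertex shader count left alone, the higher-clocked run gets double the vertex throughput for the same pixel throughput, so it is not a like-for-like test; the 2 VS vs 1 VS pairing makes both products equal.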
07-24-2005, 08:02 AM
Cybercat

Well that's something entirely different. I'll be back with the results soon.
07-24-2005, 08:26 AM
7he]-[0rr0r

As far as I've seen it's been rumored to be above 600mhz; in fact, the latest Inq rumor:

http://theinquirer.net/?article=24698
but then again it's all a crap shoot till we see em
07-24-2005, 08:38 AM
WiCKeD

Quote:

Originally Posted by alexio

Well I've done some testing on my Asus 6800NU and I have to say that the card running 8 pipes @ 400 mhz is faster than 16 pipes @ 200mhz. 3dmark gives higher scores @ 8 pipes. 16 pipes is like a dual-core 8 pipeline card, it can handle more threads at the same time but it completes a thread in a slower time.

What I try to say is that the fastest card would just have 1 pipeline at an xtreme speed. That's why you can't say that 16 pipes at 700mhz is slower than 24 pipes at 500mhz. The fillrate is lower, that's for sure. But just look at the R420 vs NV40. The R420 has a much higher fillrate but isn't faster in particular.

You're forgetting that if the clockspeed isn't up there, the extra bandwidth is not being utilized. That is the main advantage of having more pipes. 200MHz is probably not enough to push 16-pipes worth of data.

Quote:

Originally Posted by crodan85

Thats bad and this is after they retaped it about 3 times

Where does it say that? They don't even list a source or when they got the info.

Quote:

Originally Posted by charlie

whew!!! And to think, I almost bought some ATI stock (atyt) this week... will wait til it hits 11 now.

C

Ya don't. They won't be turning a real profit until next year anyway.
07-24-2005, 08:53 AM
Cybercat

200x8 (2 VS)

400x4 (1 VS)

As you can see, dividing the vertex shaders made a very large difference to how the performance scales. Overall difference is only 3%.

Return to Proxycon: 2%
Firefly Forest: 6%
Canyon Flight: 3%
07-24-2005, 12:14 PM
WeStSiDePLaYa

Looks like ATI might be screwed now. No matter how fast their card is, it doesn't mean :banana::banana::banana::banana: if they can't get it to the consumer.
07-24-2005, 04:24 PM
perkam

Quote:

Originally Posted by einCe

fanboyism is the key to ati success

Nods Enthusiastically :D :D :D lol

Well, the nVidiots have vanished because more often than not nowadays, Nvidia is the sensible choice...whereas ATI is still the choice of the ATI Loyalist (aka fanboy) .. but at the end of the day, ppl will choose what gives the best performance.

ATI's problem right now isn't its yields, it's the multiple dilemmas between 16/700 and 24/500 and what not. Once it makes a decision, it'll be ready to release the chips.

BTW Kudos to Turtle for an R520 thread that is more informative than these usually are. :up:

Perkam
07-24-2005, 05:15 PM
crodan85

Quote:

Originally Posted by WiCKeD

You're forgetting that if the clockspeed isn't up there, the extra bandwidth is not being utilized. That is the main advantage of having more pipes. 200MHz is probably not enough to push 16-pipes worth of data.

Where does it say that? They don't even list a source or when they got the info.

Ya don't. They won't be turning a real profit until next year anyway.

http://theinquirer.net/?article=24540 Only going on what the Inq said, and as far as I'm concerned they know more than me and you.
07-24-2005, 05:44 PM
DilTech

Crodan, that article was written by fuad, aka Fraud, we all know we can't trust anything he has to say...
07-24-2005, 06:57 PM
turtle

First of all, thanks for the compliment Perk. :)

I figure there's some extremely intelligent minds floating around this forum with some free time, and since we're all eagerly awaiting this thing called the R520, we might as well hash out what we know, think we know, or hear and try to make sense of it all, using the extra brain cycles of said people. I find it xtremely interesting...even if it's not what we think, at least I learned something, and perhaps it helps me make more informed guesses and choices in the future. :)

Thanks for running those tests. As you can see, with one less quad (on the gf6 gen) it only makes a 3% difference at the same clocks. That's half the quads, 1/2 the vertex shaders, and 7800gt will have 4/5 the quads of gtx, assuming the same number of rops and vertex shaders, but on a newer architecture whose pipes are prolly slightly more efficient than gf6, so again, 7800gt probably will be pretty close to 7800gtx if it clocks well, probably VERY close. The fewer pipes might make more of a difference than they did before, but it still probably won't be substantial, especially since the amount of pipes (in percentage) is more, and all vertex shaders supposedly will still be there from the gtx on the gt.

When making an assumption on R520, wouldn't it be best to run similar tests using an X800 card? Granted, supposedly R520's pipes are supposed to be more efficient than x800, but wouldn't it give a closer idea? Is there a way to do this, as obviously there's no nvstrap for ATi?

As for the Inq...Well, the last article I read is that it could be 16-32 pipes and 600-700+ mhz. Um...okay. They're right, it probably will be somewhere in there. ;) Even if R520 is 500mhz, it may be able to overclock that well in some cases (3%...:) ) scaling that high, who knows? While anything could happen, it seems likely they couldn't get 32 pipes to work at an optimal clockspeed to feed the pipes in a large enough quantity, and 500 might be the spot where they can get out enough cards that will perform at that clock with 6 quads, while still being able to feed the pipes. Granted, maybe they can get 700/16 to work, just like maybe they could get 300/32 to work (following the same logic) but this was the most optimal. Who knows...Maybe the pro at 16 pipes in more cases than the xt will clock at 600-700mhz. If that's true, the pro would be an overclocker's darling indeed...especially if all the vertex shaders were intact on the lesser card. In that case, it would all boil down to the efficiency of the pipes...which is again why the xt may be 24 at a lower clockspeed, if they're 24 very efficient fatty pipes.
07-24-2005, 07:14 PM
crodan85

Quote:

Originally Posted by DilTech

Crodan, that article was written by fuad, aka Fraud, we all know we can't trust anything he has to say...

Ok but better to have any news than no news even if it is eventually confirmed to be wrong.
07-24-2005, 07:27 PM
turtle

I wouldn't put the Inq's sources as any more reliable than the three or four people I've seen claiming to know ATi partners/fab peeps (that all posted the same specs). Surely, they could all be wrong, without a doubt. Like I said, it's all speculation and/or news, some of which may be outdated.
07-24-2005, 07:38 PM
.sentinel

I would expect them to release it as 24/500 rather than 16/700, because normally people will only look at what's higher in numbers for pipelines and say it is better. On the other hand, my friend once said "get this card, it is 100 dollars cheaper and is only 50 mhz slower" when it was 5 generations old: a 9600XT rather than a 6800.
This is going to be one interesting SpecWars
07-25-2005, 05:04 AM
Hans.Gruber

Quote:

Horrible ati yeilds (3%!!)

Maybe someone is selling lowgrade silicon ?
I wonder if NVIDIA has something to do with it..
;)

Quote:

Originally Posted by perkam

Well, the nVidiots have vanished because more often than not nowadays, Nvidia is the sensible choice...where as ATI is still the choice of the ATI Loyalist (aka fanboy)

IMO ATI is still a good choice if midrange is enough, like X800XL, but at highend ATI is not so good choice.
07-25-2005, 07:45 AM
P_1

Quote:

Originally Posted by Hans.Gruber

Maybe someone is selling lowgrade silicon ?
I wonder if NVIDIA has something to do with it..
;)

IMO ATI is still a good choice if midrange is enough, like X800XL, but at highend ATI is not so good choice.

Err, I doubt that, because of all the reports of ATI needing 3 tape-outs to get something successful, meaning they had to tweak their process, meaning that it's not the silicon. :rolleyes: :rolleyes: :rolleyes:
07-25-2005, 12:38 PM
-Acid-

This is the most informative thread I've read in ages. Lots of information flying about and then benchies to prove points. Well done people, esp Turtle.
07-25-2005, 02:54 PM
turtle

Merci. I'm glad that all the people have contributed their bits and pieces, it is very helpful in getting to the bottom of things.

multi-threading for R520?

Pcpop (for whatever it's worth) posted something similar to this a while back too, saying it could do 128-way as opposed to 64-way for Xenos.

http://babelfish.altavista.com/ - Japanese

http://pc.watch.impress.co.jp/docs/2.../kaigai198.htm

So R500 (Xenos) is 48 ALUs, 16 x 3 SIMD ALUs

So could R520 be 96 ALUs, 32 x 3 SIMD ALUs? (or of course, 24x3 or 16x3 following the same logic?) I imagine if yields were bad at 32 pipes, one could get 72 ALUs at 24 pipes, or 48 ALUs at 16 pipes, if R520 is similar to R500 in that respect.

This would follow the logic that ATi has been saying each pixel shader pipe is 33% more efficient, and the 3 ALU per pipe rumour. Stuff below is fact about Xenos:

"XENOS has 48 ALUs that are 16-way, and are grouped into 3 arrays of SIMD
ALUs. Each ALU can co-issue a Vector4 and a scalar instruction
simultaneously, essentially a "5D" operation per cycle (basically 2 Vec4 and
2 scalar instructions per cycle per ALU). The ALUs process everything in
FP32 precision with no internal partial precision requirements for FP16.
Additionally each of the 48 ALUs contains additional logic that performs all
the pixel shader interpolation calculations. ATI suggests that this would
basically equates to an extra 33% pixel shader computional capacity."

Perhaps they're closer than we thought? Perhaps the same setup, just probably not unified architecture?

IIRC, g70 is 2 alus per pipe, right? So 48 ALUs?
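
The ALU counts being tossed around here are just pipes times ALUs per pipe. A quick sketch of that arithmetic, assuming the rumoured 3 ALUs per pipe for R520 and 2 per pipe for G70 as stated in this post (neither is confirmed, and the function name is only for illustration):

```python
# Total ALUs = pipes * ALUs per pipe.
# The per-pipe counts are the thread's rumoured figures, not confirmed specs.
def total_alus(pipes: int, alus_per_pipe: int) -> int:
    return pipes * alus_per_pipe

# Xenos (R500) as described above: 16 x 3 SIMD ALUs.
print("Xenos 16x3:", total_alus(16, 3))

# Possible R520 configurations if it keeps 3 ALUs per pipe.
for pipes in (16, 24, 32):
    print(f"R520 {pipes}x3:", total_alus(pipes, 3))

# G70 with 24 pixel shader pipes and 2 ALUs per pipe.
print("G70 24x2:", total_alus(24, 2))
```

By this count a 16-pipe R520 with 3 ALUs per pipe would land on the same 48 ALUs as a 24-pipe G70 with 2 per pipe.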
07-25-2005, 05:06 PM
Cybercat

I just realized something. Why couldn't they release cards with 28 pipes?

Quote:

Originally Posted by turtle

IIRC, g70 is 2 alus per pipe, right? So 48 ALUs?

Yes, but don't forget, the G70/NV40 architecture also has a second shader unit.
07-25-2005, 05:27 PM
turtle

Quote:

Originally Posted by Cybercat

I just realized something. Why couldn't they release cards with 28 pipes?

Yes, but don't forget, the G70/NV40 architecture also has a second shader unit.

I don't know what stops them from doing that. It's most likely they'll find the best ratio of pipes/clock/performance they can get from the yields.

As for G70, I imagine you're referring to the Geometry unit, the one that's clocked 40mhz higher? I think in the G70, the rops, pixel shader units, and vertex units are all at different clock speeds. Nvidia has said even MORE than three different units are clocked at different speeds, although who knows what else could have its own clock. I believe, and I could be wrong, the geometry unit refers to the vertex unit clock. IOW, G70 just tricks out all the pieces to run at their own optimal clocks. R520 still will have all these pieces (rops, VSUs, PSUs); whether they run at different clocks...Well, that remains to be seen.
07-25-2005, 06:08 PM
Cybercat

What? I never said anything about geometry units, ROPs, or clockspeeds. I mean the G70/NV40 architecture contains a whole other shader unit altogether.

http://www.angryelf.org/pics/graphics/G70_arch.jpg

That's two shader units, plus two ALUs.

The R520, as far as we know, has one shader unit, plus three ALUs.
07-25-2005, 06:53 PM
turtle

Ah, I see. I wasn't aware. Thanks for the skoolin'.
07-25-2005, 07:13 PM
Turok

Isn't 32 pipes more possible on 90nm than 110nm temp-wise, since it needs less voltage because it's smaller?

It is a challenge tho. The rumored specs are never-before-seen tech on a GPU.
If they make it, and it's brilliant, this will bring ATi way above nVidia since they will have SM3.0, 512mb, 32 pipes, 90nm low-k tech, higher clocks, Crossfire (which could be better than SLI), and so on.
They should release their lower end r520 first, powerful enough to match up with the 7800GTX, make it cheaper, and provide Crossfire compatibility, and the sale of Crossfire mobos with this card could be a good income.
If they build up enough money, they should bring the 32 pipe r520 to life.
It would be a monster of a card and may make it really hard for nVidia to catch up.
07-25-2005, 08:22 PM
turtle

That, in theory anyway, is what people speculate about R520 and R580. R520 will slightly beat out the GTX, nvidia will release either an ultra and then G71, or just G71 (90nm shrink with higher clocks), and then ATi will come back with R580 that would be 32 pipes.

Silent_Buddha over at R3D deciphered some of that article (as he can read Chinese and Japanese iirc) and has speculated that ATi may think of ROPs as pipes, while Nvidia may think of pixel shaders as pipes. Interesting theory. This in turn would make R520 a "16 pipe card" because it has 16 ROPs, as does G70, and they may have been shooting for 24 or 32 ROPs. Now that...That would've been killer, and would definitely explain yield problems, if not the multi-threading itself, or both. He also goes on to state that R520 was supposed to have twice the threads of R500, and assuming ATi wanted 32 pipes, that's 4 per pipe...Because it was planned to be 128 threads as opposed to 64 on the R500. Who knows though, maybe it's 16 pipes and does 8 threads a pipe. That's what no one knows: whether it's the current approach or what they were hoping for. Therefore no conclusion on pipe count or threads per pipe can be made. The difference between R500 and R520 is that R500 uses one thread to cache info to reduce stalls on the other threads, so it's effectively 3x16=48. R520, according to that bit of info, would use all 4 threads, and even if 16 pipes, it would output 64 threads, therefore creating similar performance to R500.

So...perhaps 24x4 = 96 ALUs...or something? :rolleyes:
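
The thread-count reasoning above can be laid out as a small sketch. It assumes the rumoured totals (64 threads for R500, 128 planned for R520) and the claim that R500 reserves one of its 4 threads per pipe for caching; all of that is speculation from the linked posts, and the function names are hypothetical.

```python
# Rumoured thread totals from the discussion above (not confirmed specs).
R500_THREADS = 64
R520_THREADS = 128

def threads_per_pipe(total_threads: int, pipes: int) -> float:
    """How many threads each pipe would carry for a given pipe count."""
    return total_threads / pipes

def effective_threads(pipes: int, threads_each: int, reserved_each: int = 0) -> int:
    """Threads doing shader work once any reserved (caching) threads are removed."""
    return pipes * (threads_each - reserved_each)

# 128 threads spread over the candidate R520 pipe counts.
for pipes in (16, 24, 32):
    print(f"{pipes} pipes: {threads_per_pipe(R520_THREADS, pipes):.2f} threads per pipe")

# R500: 16 pipes x 4 threads, one per pipe reserved for caching.
print("R500 effective:", effective_threads(16, 4, reserved_each=1))
# R520 at 16 pipes using all 4 threads per pipe.
print("R520 16-pipe:", effective_threads(16, 4))
```

Note that 128 threads only divide evenly over 16 or 32 pipes; at 24 pipes the split is not a whole number, so the 24x4 = 96 guess implies a different thread total than the rumoured 128.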

You can check it out for yourself, articulated better here at post #10:
http://www.rage3d.com/board/showthread.php?t=33823638

and here at post #39:
http://www.rage3d.com/board/showthre...3810283&page=2

Sorry if I sound like the Inq, not trying to spread rumours or anything, but just trying to connect y'all with the latest info and ideas that some have put out there, just in case it interests you. I figure instead of 8 million R520 threads (guilty of creating one or two), we could just post all the crap here. A compiled source for those who care, if you will...'cause I know many of you don't give a crap, you'll just wait until it's out and see. But I, and I know some peeps (going by the fact this thread has 1000 views), enjoy the ol' rumour/latest info mill...and when R520's out, the R580/g71 fun will start. ;)
07-26-2005, 04:46 AM
Cybercat

Quote:

Originally Posted by Silent Buddha

Remember when their 4x2 (2 texture units per ROP) cards were 8 pipeline cards while this generation they are using pixel shader units? Different naming conventions for different companies driven by their respective marketing departments.

I could be wrong, but I think this guy is a little confused about what an ROP actually does. 2 texturing units per ROP? The ROP doesn't even do texturing! Texturing takes place in the shader pipeline, especially when talking about NVIDIA cards. The FX series used a 4x2 shader pipeline configuration, and never had 8 pipes (although this was debated for quite some time originally).

Now his theories about ATI referring to their ROPs as the primary pipes could be very true. But I've formed my own theories after reading through this information.

NVIDIA up until the FX series combined the ROPs with the shader pipelines. Back then they were referred to as pixel processing pipelines, or sometimes 3D pipelines. They went like this:

http://www.angryelf.org/pics/graphics/3dpipeline.gif

Now what you see there is a diagram for the architecture of a GF3. Notice how EVERYTHING, all the functions that the graphics card does, are meshed together in this one flow chart. TnL, rasterization, multisampling, vertex processing, everything's there, out of order.

Then you get to the GeForce 6 series, and vertex shading, pixel shading, and ROP suddenly become (or are suddenly referred to as) three separate parts of the chip. Pixel shading, calculations, and texturing become so complex in the GF6 series that they get their own pipeline now. ROPs (rasterization, z culling, multisampling) are now separate, but not really addressed until the G70. ROPs are really pipelines in and of themselves. They take the scene and actually turn it into a flat image of pixels. The shader pipes do special color calculations, while mapping textures to surfaces, along with other operations.

The G70 is the first card to have a different number of pixel shading pipelines and ROPs. Well, actually, the NV43 (GeForce 6600 Series) was the first, but this wasn't revealed until the G70 launch. The NV43 has 8 pixel pipelines, and 4 ROPs, and the G70 has 24 pipes and 16 ROPs.

ATI on the other hand still merges the functions of ROPs, pixel shading, and texturing into one pipeline. Up until the X800 series, they don't have a separate number of either. With the R520 who knows what's up. It could be that instead of separating pixel shading and ROP pipelines, ATI refers to them as threads. One thread of the pipeline does pixel shader calcs, the other does ROP functions, while others do other things. So when ATI says they have 16 pipes, that means they have 16 shader pipes and 16 ROPs, more or less combined in their terms.

Silent Buddha believes that ATI refers to the ROPs primarily, and I don't really think that's the case. They're referring to pipelines, which includes shading, texturing, and pixel processing. They're not as distinct and separate with ATI's architecture as they are with NVIDIA's.
07-26-2005, 01:08 PM
turtle

Right, I totally understand what you're saying, and I think you're right.

Here's Gibbo, sales manager at OCUK (or that's what someone referred to him as), the first dude that I knew of to have a G70... claiming almost the exact same specs:

http://forums.overclockers.co.uk/sho...2&page=3&pp=66

I'm thinking it's getting close to say R520 XT will for sure be 24pp, ~500/~1.3. That's like the fifth dude posting the specs saying he knows what's up. He also claims it's STRAIGHT FROM ATi.

It's starting to look like anyone that buys a 7800gt/gtx isn't going to be disappointed in the short term if they can handle the fact ATi might have a slightly faster part. Unless the cores on the xt/xtpe or more importantly pros overclock wonderfully, or these predictions are all wrong, 7800gt is going to be where it's at for price/performance, at least for now as the refreshes look promising in terms of clock for both companies and pipes for ATi. Granted, he could be right in saying that ATi is betting on crossfire to be more efficient and take overall crowns, but with the looming 7800U/90nm 71/whatever the hell it is/ and G80...oofta.

I hate saying this, but anyone looking to upgrade may want to hold onto their hats if they can over the next 6 months; it's looking like it'll be an evil game of back and forth...ending with R580 and either a 90nm G70 or G80. If G80's out that early, and R580 doesn't kick major ass...that really doesn't sound too great for ATi. Again, assuming all these guys aren't just people spreading complete crap. One thing's for sure: these specs, although they may not be indicative of performance, are starting to look more and more true.
07-26-2005, 04:26 PM
Cybercat

Quote:

Originally Posted by Gibbo

R520 which will be released in Platinum 512MB (9000 on 3D Mark 2005) available end of September but severe allocation issues, the willy waving product as ATI put it, but not easily available. Plus 512MB means a £450+ price area ish, I have pleaded with ATI to release a 256MB version so I hope they listen, but due to yield rates they may stick to 512MB for ultimate high-end as such. Same as G70, 24 pipelines, approx 520MHz core and 1.4GHz memory.

More readily available at end of September will be the non platinum which will no doubt perform identical to a 7800GTX, 24 pipelines, 500MHz core and 1.3GHz memory expect. Still only 512MB, but again pleaded for a 256MB version, lets keep our finger crossed.

OK, so he's saying the PE will score 9k in 3DMark05, and the non-PE will score the same as the 7800GTX? The 7800GTX scores like 8000, and the non-PE is supposed to be only 20MHz/100MHz underclocked? That doesn't sound right.
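
To see why the numbers look off, here is a rough sketch comparing the quoted clock differences with the quoted score difference. The clocks and the 9000 figure come from Gibbo's post; the ~8000 figure for the non-PE is the 7800GTX ballpark mentioned just above, so treat it as an assumption rather than a measured result.

```python
def pct_gain(new: float, old: float) -> float:
    """Percentage increase of `new` over `old`."""
    return (new / old - 1.0) * 100.0

# Quoted clocks: Platinum ~520MHz core / 1.4GHz memory,
# non-Platinum 500MHz core / 1.3GHz memory.
core_gain = pct_gain(520, 500)
memory_gain = pct_gain(1400, 1300)

# Quoted/assumed 3DMark05 scores: ~9000 for the PE, ~8000 for a GTX-level non-PE.
score_gain = pct_gain(9000, 8000)

print(f"core clock gain:   {core_gain:.1f}%")
print(f"memory clock gain: {memory_gain:.1f}%")
print(f"score gain:        {score_gain:.1f}%")
```

A score jump larger than either clock bump is hard to explain from clocks alone when both cards have the same pipe count, which is the inconsistency being pointed out.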
07-26-2005, 04:31 PM
perkam

Omg Turtle you never cease to amaze !!!!

Quote on R520 info:

Quote:

Hi there

To be short and breif I have this information which is accurate and from the main cheeses at ATI, it is NDA but I am not under NDA as such so can give you some info:-

Crossfire working motherboards, graphics cards (X850 XT & X800 XT) available early September.

R520 which will be released in Platinum 512MB (9000 on 3D Mark 2005) available end of September but severe allocation issues, the willy waving product as ATI put it, but not easily available. Plus 512MB means a £450+ price area ish, I have pleaded with ATI to release a 256MB version so I hope they listen, but due to yield rates they may stick to 512MB for ultimate high-end as such. Same as G70, 24 pipelines, approx 520MHz core and 1.4GHz memory.

More readily available at end of September will be the non platinum which will no doubt perform identical to a 7800GTX, 24 pipelines, 500MHz core and 1.3GHz memory expect. Still only 512MB, but again pleaded for a 256MB version, lets keep our finger crossed.

There will also be lower end 16 pipeline version too, no doubt called Pro's.....

So ATI will have a product by end of summer, it has no real performance advantage over the 7800GTX, except for the Platinum edition which will be so severe on availability it will have a £500+VAT price. Obviously R5xx series adds SM3.0 to the lineup and other features NV cards have...

What ATI will rely on is their Crossfire out performing SLi.

Then expect NVIDIA to release a 7800 Ultra in October to beat the Platinum that ATI release. NVIDIA will do their best to stay on top. As NVIDIA are using the 0.11 process they have much better yield rates. Wheras ATI are on a newer 0.09 process which means yields rates to begin with are poor but with time this will improve and when it does ATI cores will overclock like champions, but expect that around R580 time which is Jan/Feb next year which is when NV's new botty kicker cores will show face.

So there you have it, those who have gone ahead with buying 7800GTX you've not lost out and you will no doubt not change your cards for ATI's new one and anybody thinking to buy its clear 7800 GTX is a safe buy.

P.S. X850 XT Platinum on both AGP and PCI-E is now EOL but X850 XT is remaing on both platforms and OcUK might have some treats for you soon on prices.

16 pipe nex gen x900 pro...here i come !!! :p:

Perkam
07-26-2005, 05:42 PM
turtle

Thanks Perk, just thought everyone would like to know what's up. :)

Cybercat: I wouldn't trust his figures on the 3dmark05 score. While he got everything right about the 7800gtx specs before it came out, he also said it would score 20k in SLI in 3dmark05...lol. Remember though, a 3dmark score is all relative to how high you have it overclocked, and the specs of your system like fsb, cpu clock, cache, etc. So I would disregard the score, as we have no idea what kind of system that score was taken from, if it is indeed true. It makes one wonder though...could that 10,000+ score everyone whispered about...maybe it was on a high-clocked version of this R520 (24pp, 700mhz?), or perhaps on a 32-pipe at 500mhz. If this iteration scores ~9000 stock, that score would make sense in both cases. I wouldn't rule out the possibility, or even probability, of an overclocked R520 hitting 10,000 either.

Other people claiming to know how it fares though have said that R520's XT at stock clock (~500/1.3) will be right about even with the GTX, although it might fare a little better. Around 9000 seems like an accurate guess to me. Again, the wildcard is how it overclocks, which no one knows. The other wildcard of course is the pro, which while prolly 16pp, 16rops, 8vs, may clock substantially higher. I hear it's supposed to be set at the same clocks as R520, but I'd assume the overclock will be much better...so it may put up a good value fight against the 7800gt, which looks like it will score almost identical to the 7800gtx.

I'm sure you've seen these benchies, but look here:
http://www.gdhardware.com/hardware/v...7800gt/001.htm

A 7800gt at 400/1000 scores 91% of that of the GTX at 430/1200...and that's with 4 pipes (1 quad) disabled. Like I said before, unless the GT is using inferior ram (and even if it is) it'll prolly end up being exactly like the 6800gt was to the ultra when overclocking is all said and done. Unless ATi comes out with a great $400 card, R520 is a great clocker, or crossfire is proven vastly superior, I think 7800gt is going to steal this cycle because of its sheer value in terms of price/performance vs R520 and GTX. How the future with the ultras and the 90nms and the R580s and the G80s and all that jazz (yes, that run-on sentence was on purpose) will fare, and if investing in a card now is worth it when those are coming so shortly with seemingly major improvements, I guess is anyone's guess. I suppose it's also their prerogative as well though.
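
For a sense of how the 91% figure compares with the raw spec gap, here is a back-of-the-envelope sketch. It assumes the GT has 20 pipes (the GTX's 24 minus the one disabled quad) and that theoretical shader throughput scales as pipes times core clock; the 0.91 ratio is the benchmark result quoted above, not something computed here.

```python
def throughput_ratio(pipes_a: int, clock_a: float, pipes_b: int, clock_b: float) -> float:
    """Ratio of theoretical pipes*clock throughput of card A to card B."""
    return (pipes_a * clock_a) / (pipes_b * clock_b)

# 7800GT: 20 pipes (assumed, 24 minus one quad) at 400MHz core, 1000MHz memory.
# 7800GTX: 24 pipes at 430MHz core, 1200MHz memory.
shader_ratio = throughput_ratio(20, 400, 24, 430)
memory_ratio = 1000 / 1200
observed_ratio = 0.91  # benchmark ratio quoted in the post above

print(f"theoretical shader ratio: {shader_ratio:.3f}")
print(f"memory clock ratio:       {memory_ratio:.3f}")
print(f"observed score ratio:     {observed_ratio:.3f}")
```

The observed ratio sits above both paper ratios, which is consistent with the point that losing a quad costs less in practice than the raw pipe count suggests.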

Myself, personally...I'd wait until the 6800gt/x850xt drops a few more bucks after 7800gt comes out and wait out the back-and-forth until it quiets down. They're not that far apart in terms of performance, and they are probably going to end up being great values...prolly at $250-275 price tags pretty quick here. Add the crossfire $100 thing and another, and you've got yourself a cheaper next-gen setup that will probably perform the same. Without ps3/sm3 of course. ;)
07-26-2005, 07:27 PM
Cybercat

Well, maybe I'm biased, but I couldn't justify buying any SM2.0 ATI part right now. :p SM3.0 is becoming a big deal. A good handful of games and applications make use of it, some require it to run high details, or special features like HDR or soft shadows. Not to mention I just find the R4x0 architecture inferior to the efficient shader calculating capabilities of the NV40/G70. *puts on flamesuit*

The R520 and chips based on it will bring them back up of course. 3 ALUs is hot hot hot stuff. ;)

All times are GMT -8.

XtremeSystems