The 82A issue and so on seems to suggest an issue with OCP kicking in.
Well, there are still a few problems:
- Take an old game and play it with your brand new card. Not a lot of geometry, simple shaders. Wow, black screen? That could happen. It's worth a try. Any idea of a game or benchmark? I'd say 3DMark2003 or 06... or any game of that era that uses simple shaders. I can't think of one at the moment.
- GPGPU applications. They may reach this value also, since they don't have to wait for geometry there. ATI will have to lower the values for those also. Even if I don't think they'll reach this value, my limited experience in the field cannot answer that question for sure.
- You buy a card that boasts it can do X, yet when you have it, it can't. I'd say that's a problem.
Now, right now, what ARE the implications of this problem?
- Limited overclocking margin. You'll reach it quickly.
- You cannot run any 3D app you want on your card. The list is not that huge, I'd agree, so it's not that bad. But morally, ethically, this comes into question. Especially since this list can grow in the future.
I have checked the logs, and AMD downloaded OCCT yesterday. I guess they must not be taking the problem lightly, or at least they're taking a look at it ;)
It's true it is not a bug that will make everyone ditch their card, or make ATI/AMD recall their cards, IMHO. But still, it's a flaw, a quirk in their design, and worth noting.
I understand the concern many may have, but as shaders get more complex, won't the problem become less and less relevant? I have played games like Star Wars Jedi Academy @ 1600x1200 with all the settings on high with no problem (a great game). It's old and has very simple geometry, but my card has no trouble with it; it barely loads the GPU and runs great...
I'm not so sure GPGPU apps would do the same.
Sure, it will stress the shaders, but not every part of the chip to the max, e.g. the TMUs/ROPs.
The overclocking margin is only limited if you are solely going by OCCT. Most people are going to base their overclocks on their favorite games and benches, since that is where the extra performance might be needed.
What other 3D app can you not run, based on the card failing OCCT?
I think one of the senior members here should assist Tetedeiench in contacting Macci. That way, someone close to the community who works for AMD can help corroborate these findings.
Just my 2 cents...
One question I keep asking and have had no response on: even though this is a potential problem, how does the performance compare to previous ATI cards and to the competing NVIDIA cards? To generate that much heat and power, I'd expect this thing to be demolishing all the others.
I think in the future they will put less into shaders and more into other areas; since these units are never used to their max, they can probably optimize their cores better.
Largon concurred with the results I was seeing from other forum members.
An underclocked 4890 (850/850), ~83 FPS @ 1680x1050 on setting 3, is quite a bit faster than even a GTX285, ~53 FPS @ 1680x1050 on setting 3.
Don't know how much weight the CPU has on the score but the Nvidia card has the CPU advantage as well...
I don't see the issue here; it's a single app out of thousands, and the first one able to show this "problem".
I think AMD did this on purpose to prevent the card from burning in case someone loads the card with shaders so simple that each ALU is fully utilized, which is impossible under normal circumstances.
I never EVER saw my card pulling more than 70A :confused:
Oh yeah, the Volterra ones... but why such a limitation on the Volterra VRMs? I think the Volterra ones are more expensive only because they offer more digital support (tweaking via RT/BIOS, etc.), and the 55nm NV cards do just fine with analog VRMs. 55nm GT200 cards have a higher reliability rate than the 65nm cards, which are more prone to failure, and a GTX285 can eat just as much power as a GTX280. I remember reading somewhere recently that the analog VRMs are actually better than the digital ones.
W1zzard, have you done a Vmod yet to disable the OVP on 4870/4890 cards? Anybody? I'm curious whether those reference cards can handle 100A or so without any problems. It would give some peace of mind, since there are so many older games out there--literally thousands of them--that we never know which one could actually push the card into the same black-screen crash scenario (which we would probably just dismiss as a "buggy" piece of software when it's really the hardware). This issue is interesting, nonetheless.
Strange how RT reports it as 40A when GPU-Z reports it as 80A... which one do you think is true?
I run a myriad of new and old games. No game comes close to stressing the card as much as this OCCT.
I run Folding@Home on my 4870 all the time that I'm not playing games, ever since I got the card pretty much. GPU-Z reports only 45-ish amps or so, for 96-98% GPU load.

Quote:
- GPGPU applications. They may reach this value also, since they don't have to wait for geometry there. ATI will have to lower the values for those also. Even if I don't think they'll reach this value, my limited experience in the field cannot answer that question for sure.
I don't know about that - so far, everything the card boasts it can do, it has done :shrug:

Quote:
- You buy a card that boasts it can do X, yet when you have it, it can't. I'd say that's a problem.

I'm maxed out at the overclock that CCC will allow. But we all know that overclocking is a gamble, so even if you couldn't OC the card more than 30MHz, well, you still got what you paid for.

Quote:
Now, right now, what ARE the implications of this problem?
- Limited overclocking margin. You'll reach it quickly.
- You cannot run any 3D app you want on your card. The list is not that huge, I'd agree, so it's not that bad. But morally, ethically, this comes into question. Especially since this list can grow in the future.
Any 3D app does run. Only the OCCT stress test doesn't, but OK...
And if it is a real issue then I'm sure they'll take a look at it. Seems they already are, if they downloaded OCCT, which is a good thing.

Quote:
I have checked the logs, and AMD downloaded OCCT yesterday. I guess they must not be taking the problem lightly, or at least they're taking a look at it ;)
It's true it is not a bug that will make everyone ditch their card, or make ATI/AMD recall their cards, IMHO. But still, it's a flaw, a quirk in their design, and worth noting.
So what's the final verdict here: bad hardware or bad software?
I wouldn't call it "bad", just something to take note of. I'm thinking of it like the redline of a car's engine: sure, you can run up to it, but you can't go over it. Keep it at redline too long and you could break it. For whatever reason, OCCT is trying to push it over the redline, and that's a no-go.
Doesn't really bother me too much.
I wasn't trying to prove a point, I was simply answering a question.
I can see where he might have been headed by asking that question though.
4890 is what ~10-15% behind a GTX285 with both at stock on average in "normal" games and apps?
Yet with OCCT, the 4890 is ~56% faster, using the 83FPS vs 53FPS.
However this app is programmed, it stresses every part of the chip to the max, or at least quite a bit more than other "normal" apps/games.
Also, none of the numbers, i.e. the floating-point performance figures, seem to add up.
4890 @ 850MHz = 1.36 TFLOPS
GTX285 @ 1476MHz = 1.06 TFLOPS (MADD+MUL), 0.708 TFLOPS (MADD)
1.36/1.06 = 1.28x greater (half the FPS difference)
1.36/0.708 = 1.92x greater (amusing since it doesn't mean anything, but it equals largon's power draw increase)
Simply using max theoretical floating-point throughput is not an accurate way to estimate performance, but in this case it seems to be related. Since this app has been said to use simple shaders to completely load the ALUs, you could come to the conclusion that the MUL is only being used ~45% of the time.
Basically, the way this app is programmed, it is able to use the 4890's architecture to the max and seems not to fully load NVIDIA cards, per se.
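If anyone wants to sanity-check that ~45% figure, here's a back-of-the-envelope sketch. It only assumes the shader counts and clocks quoted above, and that OCCT is purely ALU-bound with the 4890 running at its theoretical peak, which is exactly the open question in this thread:

[code]
# Rough check of the TFLOPS figures above (Python).
# Assumes 800 ALUs on the 4890 and 240 on the GTX285, MADD = 2 FLOPs/clock,
# plus an extra MUL (1 FLOP/clock) available on the GTX285.
hd4890_tflops = 800 * 2 * 0.850 / 1000      # 1.36 TFLOPS
gtx285_madd   = 240 * 2 * 1.476 / 1000      # ~0.708 TFLOPS (MADD only)
gtx285_full   = 240 * 3 * 1.476 / 1000      # ~1.06 TFLOPS (MADD + MUL)

fps_ratio = 83 / 53                         # ~1.57x, the observed OCCT gap

# If the 4890 really runs at its peak, the FPS gap implies this effective
# throughput on the GTX285:
gtx285_effective = hd4890_tflops / fps_ratio            # ~0.87 TFLOPS

# ...which corresponds to the extra MUL being issued this fraction of the time:
mul_usage = (gtx285_effective - gtx285_madd) / (gtx285_full - gtx285_madd)
print(f"implied MUL usage: {mul_usage:.0%}")            # ~45%
[/code]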
Edit - Anyone know what the stock voltage for a GTX280 is under load? 1.3-1.4V?
I think it's 1.19V under load at stock volts, 1.11V at idle.
Link Here
Well, that will be useful to tell my friend, who had to RMA 3 Wolfdales before he realized he was damaging them with Orthos Prime. Granted, he was overclocking them (and then using OP to test for errors...); the test would not have caused enough stress at stock settings. In any case, he can keep his 4.2GHz overclock under normal operation (games, movies, etc.), while after an hour of OP it will start BSODing until underclocked to 4.1GHz. And so on.
If you were an enthusiast, you'd certainly call it a good thing. Install a turbocharger, and you're trying to push it. Install NO2, and you're doing the same. When you race around the track, you're always keeping it at redline. When you go drag-racing, you want to push it as far as you can even at the risk of blowing the engine or tranny.
Guess what? This guy here just gave us a new way of testing engines and trannies to see if they can be pushed without breaking. And we can find out which engines are capable of handling 10000rpm instead, and which ones have a built-in limiter of 8000rpm, so you do not have to waste your time buying the engines and racing them yourselves. All you gotta do is look up the online database, thanks to this program!
That is what being an Xtreme enthusiast is all about, and for that reason, I will not tolerate pansies or threadcrappers here.
Edit: If you guys still want to debate this issue, then it's just the same as saying that LN2 suicide runs are pointless, just like drag racing is pointless. A mod would probably ban a troll if he didn't stop speaking out against the purpose of enthusiast thrills.
Agreed. I ran my Wolfdale C0 stepping at 70C+ in IBT for some pretty long runs (hours), and hours of Prime at 1.37V under load (more once I picked up a TRUE), and never had a hint of voltage degradation. My Q9650 has seen the same sort of abuse and, again, not a hint of voltage degradation. Whenever I hear stuff like that I think of BenchZowner's VTT thread.
Guys, we're talking about a chip with 800 shader cores whose performance (4870) isn't too far off, and in some cases better than, a GTX280 in games. I'm not surprised that it stresses a 4870/4890 more and gets such a higher framerate. I doubt that NVIDIA is limiting framerates in this benchmark. Trust me, it stresses my card hard.
Chill :rolleyes:
Your comparison isn't quite valid. When you install a turbo, or NO2, etc, you don't do it on a bone stock bottom end. You have to upgrade the bottom end to handle the increased output.
And I am hardly trolling. Seriously dude, just because someone doesn't meet your definition of "enthusiast" doesn't mean they are threadcrapping or trolling or anything like that.
LOL @ more car analogies. They never end.
I have never seen ATI boast or make claims about their cards' ability to run stress tests; they made no such promises to anyone. The only claims I have seen are about what the cards will do in games, and so far they live up to them.
Don't get me wrong, I look at this as a problem now myself. I just don't feel that ATI has failed to deliver on what they sold me.
Oh, and BTW Sparky, I've been running a 150hp shot on a bone-stock bottom end on an '88 Mustang for years now ;)
Let me correct you there: voltage doesn't kill chips; current kills chips, through electromigration.
If you run silicon cool enough you can pile on the voltage until you start getting too much leakage through the gate oxide for stable operation. And because resistance is proportional to temperature, the current is lower and the rate of electromigration is reduced.
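For what it's worth, the standard empirical model here is Black's equation (general device physics, not anything measured on these particular cards):

MTTF = A * J^(-n) * exp(Ea / (k * T))

where MTTF is the mean time to failure of the interconnect, J is the current density, T the absolute temperature, Ea the activation energy, k Boltzmann's constant, and A and n fitted constants. Higher current density and higher temperature both shorten the expected lifetime, which is the point being made above.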
Even if you have an ATI GPU that can pass this test, you may be drastically reducing its lifespan by running it. It is possible that the non-reference cards that pass also contain higher-binned GPUs with a higher current threshold, or have the limiting mechanism disabled. Maybe someone with a 25x16 or 2160p screen could find the limit, if it's there.
I've tested the OCCT GPU stress test and agree that the RV770 is limited to ~80A, as indicated by the upper limit in GPU-Z. With my clocks (775/1100) the test will not run at 1280x1024 on complexity 3; I get a grey screen with a white square where the mouse was left (CCC 9.4).
I am going to stick with FurMark as my GPU stress test for two reasons. One is that this test causes an over-current condition that is potentially damaging to the GPU; the second is that, given the difference in FPS at the same settings on different architectures with otherwise equivalent performance, it is not a fair test, as it clearly stresses ATI GPUs far more than NVIDIA GPUs. However, it may also mean that the quoted FLOPS figures are not achievable due to this apparent limitation. I also don't like that Crossfire does not kick in in windowed mode and that windowed mode is limited to 1280x1024.
That's not to say it is not a worthwhile finding, as it may explain why an overclocked ATI GPU is stable in some games but crashes in others, and armed with this knowledge ATI may be able to introduce, through driver updates, an over-current mechanism that does not require a reboot.
You have to remember the OP has said multiple times that this program stresses different GPUs differently with the same settings, i.e. setting 3 may be the hardest on ATIs, while NVIDIAs might get loaded down the most with setting 5.
I'd really like to hear what AMD has to say about this; they have been asked many times and even banned my account when I asked in their public forums! If there was nothing to it, they wouldn't try so hard to hide it. Come on AMD, I don't think I was screwed on my cards, but I would like to know what is going on and why!!!!
I run 3 3870s and all 3 pass this test, as well as the 3 together in Crossfire. Temps get high! Up to 55C with EK full-cover blocks; they NEVER break 40C while gaming. The VRMs get crazy hot, I measured 90C using an IR gun.
I don't know if that matters to anyone, but I thought it was worth a mention. I use OCCT a lot for benching. The GPU test and the power supply test are my favorites; the power supply test in particular really pushes your entire system to its limits.
In an attempt to contribute something more than speculation and admonishment to this thread I thought I'd try to dig up some OCP mods for the 4870/4890 in case some brave soul wanted to test the 82A theory.
The problem therein is that XS is usually my first choice for Vmod info, and since largon, initialised and other worthies have posted here and haven't linked to the relevant guide... but WTF, eh? Search is said to be my friend.
Well, I searched, and wouldn't you know, I ended up back here at XS in the Xtreme Graphx Vmods forum, but no love on the OCP fix.
Although I failed in my quest it seems others are looking for this as well.
Hmmm, this sounds familiar (same thread).
There were many other similar posts.
It would seem that this problem may well manifest itself in games on overvolted/overclocked cards. Not merely while testing OCCT.
Hey, I downloaded the OCCT 3.1.0 app and ran the GPU test on my HIS 4870 1GB (the one with the reference cooler). Is it supposed to black-screen right away? Mine didn't crash. Oh, and I know you said it has nothing to do with heat, but I do have a Thermalright HR-03 GT on it with a Noctua fan.
It does. Check if your PCB is the reference one (it does not have to do with temperature, but with the VRM). Check also if you forced V-sync in the driver, or antialiasing, or anisotropic filtering, or anything like that.
Also make sure you followed the configuration stated in the first post (shader complexity 3, fullscreen, high resolution, no error check).
OK guys, some news, as we had a lot happening yesterday:
Two French websites had the problem happen in their testing labs. One of them has a huge reputation and cannot be questioned about it. He actually came to the same conclusions as I did: we did hit a limit in the design of the boards. Phew, I'm not a complete moron ;)
He tried a non-reference design board: passed. Reference board: doesn't pass. He reproduced my test: very same conclusion. My test is not at stake, the cards are. Truly, he IS trustworthy. I'd say he is a demi-god of hardware in France. He is from x86-secret and memtest86.org, just so you know who he is.
Here is the link to the first news:
http://www.canardpc.com/news-36049-g...et_4890__.html
His conclusion is interesting. If I translate it quickly (I'll let you use Google to get an idea): he says that it is unacceptable that a card cannot handle any code that can be produced with an API it is supposed to support (OCCT fits into that category, as it is pure DirectX9). AMD's answer, according to him, is almost already known: limit OCCT's execution speed, just as with FurMark. But is this acceptable?
The interesting part is this: according to him, this defect is sure to have been detected in AMD's quality check process, but was ignored on purpose. And he goes on by saying that the OCP mechanism is badly implemented, and that this is a true problem on those cards... unless it's not the VRMs that are at stake here but the OCP protection mechanism, which is his supposition in the news (and which I tend to believe myself).
And the second one, which was reproduced on Yahoo News:
http://www.pcinpact.com/actu/news_multi/51011.htm
I'm actually waiting for AMD's official response now. I'm truly wondering what's going on.
I had a PM on my board waiting for me from AMD, asking me for contact information. I answered yesterday and am still waiting for input from them. I'll keep you guys updated.
I can say that right now we're moving from "hypothesis" to "there's a problem for sure". We just don't know if the problem is the VRMs themselves or the OCP, but we do know that the problem is the cards, and that the problem occurs:
- With OCCT GPU at stock speed
- With FurMark with a slight bump in vGPU/frequencies (as it is a tad less efficient than OCCT)
And we're starting to get professional views on the matter.
OK, the sources are unknown to you; it's not AnandTech, X-bit Labs, or the like...
We're starting to get a lot of cards affected, a lot of testing done, and some professional reviews that confirm our testing.
I updated the first post to include the first "professional" review we got. I'm sure people will question them because they don't know them. Well, I can't do a thing about that :(
I hope they are completely independent. Otherwise AMD will just tell them to shut up :) We have already seen their attitude in forums. Sorry, but this time AMD you have failed.
OCCT 3.1.0 fails on my reference Diamond 4870 that I bought one month after they were released. I just ran Furmark after changing the furmark.exe filename (to disable any driver tweaks) and it ran perfectly fine. I also run games going back the last 11 years! I've had no problems with any software I've run on my GPU, except OCCT. I'm not saying OCCT is flawed, I'm just saying there haven't been any issues on my end while overclocked. Extreme overclocking is where folks will be having their problems so that doesn't apply to me (yet). The next time I buy an ATI GPU, I'll just be sure to get a non-reference design though.
Any idea which manufacturers make the strongest, most robust non-ref boards so far? How about you make a list of the ones that work the best and overclock the highest while remaining stable?
Until we have real-world code (be it GPGPU related or game related) that fails on reference Radeon cards, my opinion will be that this test is unrealistic and borderline a power virus. There is not one real-world workload that can cause the same effect on reference Radeon HD 4xxx cards.
One more thing: whoever runs this test on their system with a Radeon HD 4xxx risks either immediate or long-term damage to their previously working Radeon card. The VRM of reference cards is pushed beyond its specifications, so no wonder the test fails (and the card could ultimately fail too).
We're not saying "this will happen in every game". We're saying "the design is flawed". We're saying "there's a possibility this will happen in another program". Why is that? Because it happened in a legitimate program (OCCT is just a DirectX9 scene, I remind you), so it can happen again.
When the Pentium bug appeared for Intel (remember: http://en.wikipedia.org/wiki/Pentium_FDIV_bug ), it appeared very rarely, in very specific applications, and in none that were publicly available. Yet Intel did recall the CPUs. Am I saying we have the exact same thing? No, the Pentium bug wasn't stress related. Was the problem described by the one who discovered the Pentium bug a power virus? No, of course not. Am I saying AMD should do the very same thing, a huge recall? That's up to them; I am not knowledgeable enough to know if they should or not. I'd say no, personally.
I am not fiddling with frequencies, timings, or anything of the sort. I'm linking with DirectX9 libraries, happily loading a donut mesh, applying the shaders, and seeing the results.
So no, OCCT is not a power virus of any sort. You choose to install it, you choose the settings, you choose when you start it and stop it, and how to configure it. So no, it is not a power virus. A stressful stability test, yes, indeed. But a power virus? No. Let's stop this 18-page-old assertion that's adding nothing to the current problem, please.
I do wonder how a legitimate DirectX9 scene could push a VRM beyond its specifications on a card at stock frequencies. They should have been stronger, more robust, don't you think? They should be able to handle everything. They are supposed to be compatible with DirectX... Yet here's a completely valid DirectX app they can't handle... Interesting, isn't it?
Instead of posting the whole code, would you consider posting some pseudocode of the algorithm used in the stress test?
That's exactly what will happen: a new driver release fixes the issue. The rendering output is unchanged, so isn't that an acceptable solution?
It's basically a software-based OCP that results in exactly what the canardpc author suggests: "the regulator should warn the pilot that the limits of the components are about to be reached, resulting in a secure and automatic underclocking."
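In rough pseudocode, that kind of driver-side protection would boil down to something like the sketch below. This is only an illustration: read_gpu_current() and set_engine_clock() are hypothetical stand-ins for whatever internal VRM telemetry and reclocking interfaces the driver actually uses, and 82A is just the limit discussed in this thread.

[code]
import random
import time

OCP_LIMIT_A = 82.0   # current ceiling chosen by whoever designed the power circuitry
STEP_MHZ = 10        # how far to back off per violation
STOCK_MHZ = 850

def read_gpu_current():
    # Hypothetical stand-in for the driver's VRM current telemetry (amps).
    # Faked here so the example actually runs.
    return random.uniform(70.0, 90.0)

def set_engine_clock(mhz):
    # Hypothetical stand-in for reclocking the GPU core through the driver.
    print(f"engine clock -> {mhz} MHz")

def software_ocp(iterations=50):
    clock = STOCK_MHZ
    for _ in range(iterations):
        amps = read_gpu_current()
        if amps >= OCP_LIMIT_A and clock > STEP_MHZ:
            # Warn and underclock instead of letting the hard limit black-screen the card.
            clock -= STEP_MHZ
            set_engine_clock(clock)
        elif amps < 0.9 * OCP_LIMIT_A and clock < STOCK_MHZ:
            # Load fell back under the limit: creep back toward stock.
            clock += STEP_MHZ
            set_engine_clock(clock)
        time.sleep(0.05)

if __name__ == "__main__":
    software_ocp()
[/code]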
Uh, what evidence does he have for such an accusation? Just wild guesswork? Maybe he called Fudo and asked for advice on how to write a story.

Quote:
according to him, this defect is sure to have been detected in AMD's quality check process, but was ignored on purpose. And he goes on by saying that the OCP mechanism is badly implemented, and that this is a true problem on those cards...

The OCP mechanism implementation is perfectly fine; it's just the OCP limit that is set too low.
I doubt the problem lies in 3-phase vs. 4-phase, but rather in OCP set too low vs. OCP not set too low. No data or evidence in the whole article. Next, please.
It could happen again, but it seems unlikely. Programs doing a real workload aren't going to be using only such simple shaders and doing nothing else. Even the next closest stress program (FurMark) doesn't come close - and the most stressful game or GPGPU app is even further from the limit than that.
It is possible that a program could reach the limit doing ordinary operations, but it's hard to imagine what it'd be doing (besides stress testing).
It would be a good move from the PR side of things but a bad move from the business side of things, IMO.

Quote:
When the Pentium bug appeared for Intel (remember: http://en.wikipedia.org/wiki/Pentium_FDIV_bug ), it appeared very rarely, in very specific applications, and in none that were publicly available. Yet Intel did recall the CPUs. Am I saying we have the exact same thing? No, the Pentium bug wasn't stress related. Was the problem described by the one who discovered the Pentium bug a power virus? No, of course not. Am I saying AMD should do the very same thing, a huge recall? That's up to them; I am not knowledgeable enough to know if they should or not. I'd say no, personally.
It's not like the chips themselves have a problem. It's just the card implementation details that are a problem. For the 99% of people that don't use OCCT, the reference board is fine. For the rest of us there are 4 phase boards and easy workarounds for 3 phase cards.
The real problem isn't deciding if a recall is a good idea, it's how to address this issue without it becoming a PR disaster. There are people out there who will use this as a chance to smear AMD, no matter how they handle it.
But wouldn't the 4 phase cards probably have a higher OCP limit as well?
Physical limit before the card explodes? Probably, yes.
But the OCP limit is an artificial limit designed to avoid exactly that. It is chosen by the person who designs the power circuitry of the card. I am sure that there is plenty of headroom left in the 3-phase design.
I'm still confused at how people are missing the fact that some reference cards are not failing...
I fail to see that as a valid mechanism. It is not a valid OCP. Why? Because of two things, one of which is funny, one less so:
- Let's imagine that in the future I want to turn the OCCT GPU:3D test into a benchmark. That is NOT planned, but let's say I want to (after all, this code is pretty efficient, isn't it? It would be interesting to get some scores out of it). As ATI slowed down some cards because of an engineering problem, I wouldn't be able to do so. That would defeat the purpose of the benchmark. It's the very same problem as 3DMark optimizations, but backwards.
- Imagine that in Final Fantasy XXV, PC DX9 version, you encounter the Great Furry Donut god, which is supposed to reveal your future as a hero. Poof, black screen. Yes, I'm joking by giving a silly example, but only mildly. They would ensure the sole app out there right now that can trigger the bug at stock values is castrated. They would have to redo it for every other app out there... To me, that doesn't sound like an OCP that's viable.
I already used the "run as another exe" trick to defeat limitations in the Catalysts (I guess I learned from the FurMark story already). I have already decided that if I ever see any limit put in the drivers on OCCT GPU:3D, I'll work on releasing, very quickly, another version that doesn't have this limit. Just for impartiality purposes, and because I fail to see the point, ethically speaking, of any restraint put on an app due to an engineering bug. Because the card will still boast its capabilities... which it cannot use fully under all conditions. And that is, to me, a problem.
As for the article, I am not the one who wrote it, but yes, those are assumptions.
Yes, it's unlikely it'll happen again, just like the Pentium bug ;) I already stated that.
I don't think a recall is a good idea, personally. I fail to see it as a good solution. But people had to know that the design has a problem, that my app wasn't responsible for the black-screen bug, and that the overclocking capabilities, which are a marketing argument, are limited on those cards.
I really see a recall as being highly unlikely.
That is due to the fact that, for whatever reason, those cards are not drawing greater than 82A. There is a multitude of possible reasons for this outlined throughout the thread (not running the app with the correct settings, having AA/AF forced in the control panel, etc.), but the bottom line is that under the correct circumstances these cards can draw more than 82A, and all evidence indicates that once this happens the system will lock up instantly.
I guess the cards without the bug fell into the following categories:
- A card that was not a reference design
- A setting that was forced in the driver that lowered the load, and thus made the test unable to reach the limit (anisotropic 16x, FSAA, etc.)
- A bad configuration of the test
It is exactly that, if you purely look at the definition: "overcurrent protection". Your app brings the card into an overcurrent state. The driver does not allow that. -> It implements overcurrent protection.
The Final Fantasy devs will notice that issue, talk to ATI devrel, and they will tell them to change their rendering method/shaders. Problem fixed. If the dev isn't willing to do that, ATI will just add app/scene detection to fix it.
Good luck playing cat and mouse with ATI.
What do you expect ATI to do? Recall all cards? Send all 4870 owners a 4890? Leave the bug unfixed? There are not really many options. They could give you a chunk of money to recode your application :)
That makes sense. But I imagine that the designer of the 4 phase board would feel comfortable picking a higher limit without running into the artificial safety margin.
It wouldn't be a very good benchmark though. It would only be testing how fast a card can process that very specific simple shader. It would be able to test raw shader power but not much else (which is why the RV770 whips the G200 in your test).
I guess you could make a whole series of tests, with each one running a different simple shader. It would make a good benchmark, though an extremely synthetic one.
No other scene geometry, AA/AF, vsync, or user interface? Seems a bit unrealistic and easy to work around by just turning up the settings (forcing AF or whatever).

Quote:
- Imagine that in Final Fantasy XXV, PC DX9 version, you encounter the Great Furry Donut god, which is supposed to reveal your future as a hero. Poof, black screen. Yes, I'm joking by giving a silly example, but only mildly. They would ensure the sole app out there right now that can trigger the bug at stock values is castrated. They would have to redo it for every other app out there... To me, that doesn't sound like an OCP that's viable.
I personally wouldn't expect ATI to recall cards, but rather to make newer revisions with a higher OCP limit or a more robust VRM. Judging by how hot the VRM was getting on some of these cards, I'd question its capability.
Well, the goal of my app is to put the card under the highest load possible, just like FurMark. That's the problem. I worked hard to get to this point. Should I get castrated without reacting? I don't know.
Playing cat and mouse doesn't appeal to me at all. I have to say I don't know what to do.
If anything they are going to limit the load in the driver while running OCCT. It's not a big deal, the end user can rename the exe if they want the full effect.
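As a sketch of how trivial that workaround is (the paths and names below are made up, and this assumes the driver profile matches on the executable name, which is how the FurMark renaming trick mentioned in this thread behaves):

[code]
import shutil
import subprocess

# Hypothetical install location - adjust to wherever the test actually lives.
original = r"C:\Program Files\OCCT\OCCT.exe"
renamed  = r"C:\Program Files\OCCT\NotOCCT.exe"

# A copy under a different name won't match a name-based application profile,
# so any per-app throttling in the driver would not kick in.
shutil.copy2(original, renamed)
subprocess.run([renamed])
[/code]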
Tetedeiench, here are some questions for you.
- What would AMD have to do to satisfy you? Admit to you privately there is a problem? Recall every card affected? Issue a press release that they have sold millions of GPUs with a "fatal" flaw?
- Why does your test not load down the NVIDIA cards with the same severity as the ATI hardware? Have you made any effort to recode your application to address this?
Please shelve the ignorant accusations. I thought we'd settled this already.
If the 82A cutoff is appropriate to the card design, then I agree that running the OCCT 3.1.0 GPU test may risk damaging 4870/4890 series cards. However, that being the case, AMD is in for a bit of fun, seeing as one could easily break VGA cards without invalidating the warranty. Furthermore, if 82A is a realistic limit to ensure continued function, what does this portend for non-reference designs that allow greater current?
You can't have it both ways. Someone has screwed up a card design here, either AMD/ATI or their AIB partners. Time will tell.
What is really at issue here is that some 4870/4890s must be downclocked to pass this test.
What AMD should have done was set a lower clock speed on the reference design. Thus any problems experienced while overclocking would be the fault of whoever tweaked, rather than the marketing department.
Then again, if these cards were clocked 100-150MHz lower, how would they stack up to their NVIDIA competitors at stock frequencies?
Much better to misrepresent the capabilities of your product and trigger the PR nightmare that is beginning right here and now, eh?
Tetedeiench, please superimpose a maze over your GPU test with a little mobile furry donut to navigate said maze. Then we can call it a game and be done with the illogical dismissal of this problem.
Uh, it would have been much better to just have a slightly higher OCP limit. AMD isn't misrepresenting anything - these cards run games and gpgpu apps just fine.
As I said, it's only Orthos that does it. He can run Prime all day without any degradation, doing ~4.2GHz @ 1.4-ish vcore with one of those 120mm heatpipe towers.
(There is also no degradation during the "normal use" that some people over here are sneering at.)
On the other hand, he is dumb for running it overclocked as high as possible for 24/7 use. He doesn't even need it - games are all about the GPU.
A higher OCP limit may have been appropriate for these designs. But I'm not an EE and I don't work for AMD designing video cards; do you?
Your dismissal of OCCT as a valid application is illogical. I asked Tetedeiench to make his test the background for a simple game in order to demonstrate this (I don't really expect him to comply).
What is misrepresented is that this card is able to function in a stable manner at the specified frequency.
If you are reading these forums, you are indeed in the <1%. If you are posting here, you're part of an even smaller group. I didn't want to be the guy to say it, but this is "Xtreme Systems", not "Good enough systems" or "Mildly defective systems".
In my opinion, any component that is running at factory clocks should be able to deal with anything you throw at it. If you overclock it and it fails, then OK, you are over spec and that's what you get, but at factory clocks you should be able to throw anything at it.
It's not because you don't have issues in games that it's OK; a component has to be able to run at 100% and not fail. If it fails at 100% load then there is an issue.
OCCT is not running over the red line, it is running exactly at the red line and at original specs. Underclocking the card fixes the issue, which proves that the cards crashing at stock speed are not 100% stable, since they cannot cope with a 100% load.
It is not acceptable if my CPU crashes at 100% load at stock speed, so why would it be acceptable for a GPU? Why do some of the reference cards work fine and others not? This again proves that some cards are good and others faulty; it would be another story if they all crashed.
ATI pushed the limits of the design and found a setting that prevented most of the cards from burning out while still running all games fine in most situations, yet many cards crash at 100% load. They have made a compromise and they know it; that's why they won't comment about it.
I don't think the manufacturers using stronger power supply circuits and non-reference designs are doing it for fun; they know why they do it and why they spend more money on it. How much does one of these components cost? $1 maybe? Imagine ATI saving hundreds of thousands of dollars just because they use one component less and it still works in 99.99% of situations.
I'm not saying NVIDIA is better, and maybe the software does not load NVIDIA cards as much as ATI, but that's no excuse for ATI playing with design limitations to save $$$.
W1zz doesn't work for AMD designing cards either but I am willing to take his word for it that there is probably enough overhead for a higher OCP limit. And even if he is wrong, there are lots of cards out there to choose from with a beefier power section.
I did no such thing.

Quote:
Your dismissal of OCCT as a valid application is illogical.

Feel free to dig up some product literature that says that.

Quote:
What is misrepresented is that this card is able to function in a stable manner at the specified frequency.

The 99% represents normal users. I said "the rest of us" in describing extreme users, clearly including myself in the group of OCCT users and overclockers (i.e. the 1%). How was that not clear from my posts?

Quote:
If you are reading these forums, you are indeed in the <1%. If you are posting here, you're part of an even smaller group. I didn't want to be the guy to say it, but this is "Xtreme Systems", not "Good enough systems" or "Mildly defective systems".
I'll look into it. The temps are skyrocketing, as well as the VRM usage. I do wonder why the FPS is lower, but that's about it.
I'll try to look at it next week. I promise you it'll be done ASAP, but I probably won't have enough time before the end of next week, looking at my schedule.
I do wonder if this is not caused by an optimization AMD did on a function and NVIDIA didn't. I'll do some testing.
Haven't been on XS in a while and so missed this thread. My reference HD 4890 crashes when heavily overclocked. Only OCCT used to do this, and I suspected it was due to the 85A load, something nothing else could reach. Since no game comes anywhere close to this figure, I don't care.
That's what I'm doing, don't worry. I'm looking at the combination I found, and seeing if I can optimize it for both architectures :D
If it comes from the shader, that is. And the more I look at it, the more I doubt that part of the shader is at stake. I'll test it and let you guys know.
I have an Asus 4870 with the Glaciator heatsink. Non-reference design, no problems with OCP, but... it has a 1.20V GPU stock voltage and it generates errors at absolutely stock settings.
Assuming this is a valid stability testing program - not good at all :p:
http://img29.imageshack.us/img29/7366/4870stock.jpg
Time to build higher quality PWM circuits; we are moving into 1GHz frequencies and yet have such cheap designs, resulting in ringing or unacceptable temperatures.
I don't know, does anyone design a car that can go 1000 miles on a highway in 1st gear? Technically that would be valid driving too.
I'm a car noob, so I might be wrong.
I did some tests:
Used the latest DX9 updates, set Catalyst 9.5 to defaults, set the fan to 90% to prevent overheating, and set GPU and mem to the 750/900 default clocks (temps never got above 60 degrees in any of the tests performed).
Used FurMark (OpenGL): renamed it to something else, set the resolution to 1920x1200, set fullscreen, and ran the benchmark and stability test. It ran without a problem - no artifacts, no crashes, no VPU restarts - and the VDDC current maxed at 80A according to GPU-Z 0.3.3.
So I failed at making it go above 83A, but let's see if it matters, because no games ever make it reach that limit.
Next I used ATT (DX9): this little app has a very small 3D render test, and in less than 3 minutes it made my card crash (computer restarted)!!! I was watching GPU-Z and the VDDC current never got above 57A, so that seemed weird?
I then used OCCT (DX9): GPU test in windowed mode at a low resolution of 1024x768, shader complexity at the default of 0, and it crashed (computer restarted) even faster than ATT, and there is no way it got to 83A with those settings.
Why does it matter that ATT and OCCT never reached 83A and still crashed?
Because my card, a Gainward 4870 512 reference design, is unstable in games like Crysis, Unreal 3, and Mass Effect in DX9 (WinXP 32) even at stock clocks, and games never reach that high a VDDC current.
After reading http://www.geeks3d.com/?p=3246 it made me think that maybe the Catalyst drivers are buggy, or the last DX9 update isn't getting along with the Catalyst drivers (ATI have been focusing on DX10 optimizations), or it contains some shader bug that affects ATI cards in some way. That would explain why some reference cards don't crash (they don't have the latest DX9 update).
Any other explanation as to why only DirectX stuff crashes?
PS: I am not a 3D guru, so don't get mad if I did something wrong.
Thanks for your tests ;) However, this bug can be reproduced with FurMark if you bump the vGPU a little, as FurMark is a tad less efficient than my test. It bumps up the amps on the VRM (and temps too), and you get the very same crashes.
I'm working on the FPS on the NVIDIA architecture at the moment. I'll keep you updated on my findings.
I would love a new feature in OCCT. :)
A graphics test that generated a 50% Graphics workload, and a 50% PhysX workload, at the same time.
But that's another thread...
http://www.xtremesystems.org/forums/...44#post3806644
I couldn't resist having the ear of the programmer who made such a handy app. :up:
Carry on with your discussion. (Derail end.)
I've spent about 4 hours on the FPS "problem" and cannot find the cause. Everything I tried that seemed to increase the FPS was simply simplifying the compiled shader code, and thus increasing efficiency on both architectures. Truly.
I'd think the difference between the ATI FPS score and the NVIDIA score is due to the way they implemented the shaders. The latter are still under heavy load, mind you. I asked users to do some testing on their NVIDIA cards, and here are the results. Overclocked, they could get a GTX280 to 114A:
http://img199.imageshack.us/img199/4052/aroxx.jpg
I'm trying to get a screenshot @ stock values, which is what you are going to ask for next. I do wonder if the latest value is @ stock. I asked the user to do it.
Unfortunately, the card I have is a GTX285, without VRM current readings :(
Mind you, I'm not abandoning the matter that fast; I'll still investigate. But the test is already really stressful on those cards. That's why I'm starting to think the FPS value is "normal".
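To put the current figures being quoted in this thread into watts, here's a rough conversion. The ~1.26V load VDDC for the reference 4870/4890 is my assumption, the 1.19V GTX280 load voltage is the figure quoted earlier in the thread, and the overvolted card in the screenshot above would be drawing somewhat more than this estimate:

[code]
# Rough core-rail power from the VDDC current readings mentioned in this thread.
def rail_watts(amps, volts):
    return amps * volts

print(rail_watts(82, 1.26))    # ~103 W - the apparent limit on reference 4870/4890 cards
print(rail_watts(114, 1.19))   # ~136 W - the overclocked GTX280 reading above (stock voltage assumed)
[/code]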
Tet, install NVIDIA PerfHUD. Very useful for debugging such things.
Please, everybody understood you want to defend ATI/AMD by any means, with every argument you can find, so let's try to quit the fanboyism for a bit...
As the architectures are different, the apps and tests will give different results, and, globally, you'll have a ranking. But every test taken one by one will give you a different ranking, or performance rating, for the cards.
I'm waiting for the RivaTuner VRM current screenshot @ stock.
Moreover, I said I didn't give up on the issue. I'm following W1zzard's advice right now, as I did hit the limit of what I could do manually. I'm still working on it.
Tetedeiench, I've loved OCCT since version 1, but there is no point in declaring this a design flaw. Every single game on the market runs completely fine on the ATI 4800 series; I've got my ATI 4850 @ 715/2150. (I'm debating between 720/2180 and 715/2150.)
Does it fail the test? Yes, in less than 5 minutes. Does it run all my games for 6 hrs straight every single day? Yes. It's stable. What you need to do is put an approximately 90% load on the card and monitor it.
Not really. It's probably loading the NV cards 100% also. It's just that the ATI cards whip the nVidia cards when it comes to processing this shader code.
Tetedeiench, I'm curious about the nature of the instructions you are sending these cards. Is it mostly really simple instructions like FP/INT multiplies, adds, etc.? Or are you using the complex functions more, like sin, cos, log, etc.? Or is it a mixture of simple and complex functions? It is interesting that there is such a gap between how well the two architectures handle this code.
And you seem to be in Nvidia's pocket...
Get off your high horse and take a look at the numbers; they don't make sense.
You agree they don't make sense and then dismiss that there is a problem.
As I stated and showed, it is not loading Nvidia cards 100%; Tet even agreed to this.
Also, why are you asking him about his code? He obviously has something to hide and doesn't want anyone to know what it is.
He agreed that the NV cards aren't as fast, not that they aren't being loaded 100%.
I'm asking him about his code because I am genuinely curious about how the differences in the architectures play out in real code, instead of grasping for straws to find any gotcha I can use to insult Tetedeiench and dismiss the entire issue.
Well, based on car terminology, it would be like this in ATI's case:
They designed a very competitive and good car. But little does the consumer know that it has a 5-speed transmission with the 4th gear missing. This helps cut production costs. Casual people don't know this 4th gear is missing. Only enthusiast-level people know.
:rofl::rofl::rofl::rofl:
Car analogies are the worst, lol.