Page 19 of 30
Results 451 to 475 of 730

Thread: OCCT 3.1.0 shows HD4870/4890 design flaw - they can't handle the new GPU test !

  1. #451
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by Tetedeiench View Post
    he says that it is unacceptable that a card cannot handle any code that can be produced with an API it is supposed to support (OCCT fits into that category, as it is pure DirectX9). AMD's answer, according to him, is almost already known : limit OCCT's execution speed, just as furmark. But is this acceptable ?
    that's exactly what will happen. a new driver release fixes the issue. the rendering output is unchanged, so isn't that an acceptable solution?
    it's basically a software based OCP that results in exactly what the canardpc author suggests: "the regulator should warn the pilot that the limits of the components are about to be reached, resulting in a secure and automatic underclocking."

    according to him, this defect is sure to have been detected in AMD's quality check process, but has been ignored on purpose. And he goes on by saying that the OCP mechanism is badly implemented, and that this is a true problem on those cards...
    uh what evidence does he have for such an accusation? just wild guesswork? maybe he called fudo and asked for advice how to write a story.
    the ocp mechanism implementation is perfectly fine, it's just the ocp limit that is set too low.

    i doubt the problem lies in 3 phase vs. 4 phase but ocp set too low vs. ocp set not too low. no data or evidence in the whole article, next please
    Last edited by W1zzard; 05-23-2009 at 07:50 AM.

  2. #452
    Xtreme Addict
    Join Date
    Jul 2007
    Posts
    1,488
    Quote Originally Posted by Tetedeiench View Post
    We're not saying "this will happen in every game". We're saying "the design is flawed". We're saying "there's a possibility this will happen in another program". Why is that ? Because it happened in a legitimate program (OCCT is just a DirectX9 scene, i remind you), so it can happen again.
    It could happen again, but it seems unlikely. Programs doing a real workload aren't going to only be using such simple shaders and doing nothing else. Even the next closest stress program (furmark) doesn't come close - and the most stressful game or GPGPU app is even further from the limit than that.

    It is possible that a program could reach the limit doing ordinary operations, but it's hard to imagine what it'd be doing (besides stress testing).

    When the pentium bug appeared for Intel (remember : http://en.wikipedia.org/wiki/Pentium_FDIV_bug ), it appeared very rarely, and in very specific applications, and in none that were available publicly. Yet Intel did recall the CPUs. Am i saying we do have the exact same thing ? No, the Pentium bug wasn't stress related. Was the problem described by the one who discovered the Pentium bug a power virus ? No, of course not. Am i saying AMD should do the very same thing, a huge recall ? That's up to them, i am not knowledgeable enough to be able to know if they should, or not. I'd say no, personally.
    It would be a good move from the PR side of things but a bad move from the business side of things, IMO.

    It's not like the chips themselves have a problem. It's just the card implementation details that are a problem. For the 99% of people that don't use OCCT, the reference board is fine. For the rest of us there are 4 phase boards and easy workarounds for 3 phase cards.

    The real problem isn't deciding if a recall is a good idea, it's how to address this issue without it becoming a PR disaster. There are people out there who will use this as a chance to smear AMD, no matter how they handle it.

    Quote Originally Posted by W1zzard View Post
    i doubt the problem lies in 3 phase vs. 4 phase but ocp set too low vs. ocp set not too low
    But wouldn't the 4 phase cards probably have a higher OCP limit as well?

  3. #453
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by Solus Corvus View Post
    But wouldn't the 4 phase cards probably have a higher OCP limit as well?
    physical limit before the card explodes? probably, yes.

    but the ocp limit is an artificial limit designed to avoid exactly that. it is chosen by the person who designs the power circuitry of the card. i am sure that there is plenty of headroom left in the 3 phase design

  4. #454
    Xtreme Addict
    Join Date
    May 2007
    Location
    'Zona
    Posts
    2,346
    I'm still confused at how people are missing the fact that some reference cards are not failing...
    Originally Posted by motown_steve
    Every genocide that was committed during the 20th century has been preceded by the disarmament of the target population. Once the government outlaws your guns your life becomes a luxury afforded to you by the state. You become a tool to benefit the state. Should you cease to benefit the state or even worse become an annoyance or even a hindrance to the state then your life becomes more trouble than it is worth.

    Once the government outlaws your guns your life is forfeit. You're already dead, it's just a question of when they are going to get around to you.

  5. #455
    Xtreme Member
    Join Date
    Dec 2006
    Posts
    213
    Quote Originally Posted by W1zzard View Post
    that's exactly what will happen. a new driver release fixes the issue. the rendering output is unchanged, so isn't that an acceptable solution?
    it's basically a software based OCP that results in exactly what the canardpc author suggests: "the regulator should warn the pilot that the limits of the components are about to be reached, resulting in a secure and automatic underclocking."


    uh what evidence does he have for such an accusation? just wild guesswork? maybe he called fudo and asked for advice how to write a story.
    the ocp mechanism implementation is perfectly fine, it's just the ocp limit that is set too low.

    i doubt the problem lies in 3 phase vs. 4 phase but ocp set too low vs. ocp set not too low. no data or evidence in the whole article, next please
    I fail to see that as a valid mechanism. It is not a valid OCP. Why ? Because of two things. One which is funny, one less so :
    • Let's imagine that in the future i want to turn OCCT GPU:3D test into a benchmark. That is NOT planned, but let's say i want to (after all, this code is pretty efficient, isn't it ? would be interesting to get some scores out of it). As ATI slowed down some cards because of an engineering problem, i wouldn't be able to do so. That would defeat the purpose of the benchmark. That's the very same problem as 3dmark's optimization, but backwards
    • Imagine that in Final Fantasy XXV, PC DX9 version, you encounter the Great Furry Donut god, which is supposed to reveal your future as a hero. Poof, black screen. Yes, i'm joking by giving a silly example, but only mildly. They ensure the sole app out there right now that could trigger the bug at stock value is castrated. They would have to redo it for every other app out there... to me, that doesn't sound like an OCP that's viable.


    I already used the trick "run as another exe" to defeat limitations in the catalysts (i guess i learned from the furmark story already). I have already decided that if i ever see any limit put in the drivers on OCCT GPU:3D, i'll be working on releasing, very quickly, another version that doesn't have this limit. Just for impartiality purposes, and because i fail to see the point, ethically speaking, of any restraint put in an app due to an engineering bug. Because the card will still boast its capabilities... which it cannot use fully under all conditions. And that is, to me, a problem.

    As for the article, i am not the one who wrote it, but yes, they're assumptions.

  6. #456
    Xtreme Member
    Join Date
    Dec 2006
    Posts
    213
    Quote Originally Posted by Solus Corvus View Post
    It could happen again, but it seems unlikely. Programs doing a real workload aren't going to only be using such simple shaders and doing nothing else. Even the next closest stress program (furmark) doesn't come close - and the most stressful game or GPGPU app is even further from the limit than that.

    It is possible that a program could reach the limit doing ordinary operations, but it's hard to imagine what it'd be doing (besides stress testing).


    It would be a good move from the PR side of things but a bad move from the business side of things, IMO.

    It's not like the chips themselves have a problem. It's just the card implementation details that are a problem. For the 99% of people that don't use OCCT, the reference board is fine. For the rest of us there are 4 phase boards and easy workarounds for 3 phase cards.

    The real problem isn't deciding if a recall is a good idea, it's how to address this issue without it becoming a PR disaster. There are people out there who will use this as a chance to smear AMD, no matter how they handle it.


    But wouldn't the 4 phase cards probably have a higher OCP limit as well?
    Yes, it's unlikely it'll happen again, just like the Pentium bug. I already stated that.

    I don't think a recall is a good idea, personally. I fail to see it as a good solution. But people had to know that the design had a problem, that my app wasn't responsible for the black screen bug, and that the overclocking capabilities, which are a marketing argument, are limited on those cards.

    I really see the recall being highly unlikely.

  7. #457
    Xtreme Addict
    Join Date
    Oct 2004
    Location
    Boston, MA
    Posts
    1,448
    Quote Originally Posted by LordEC911 View Post
    I'm still confused at how people are missing the fact that some reference cards are not failing...
    That is due to the fact that, for whatever reason, those cards are not drawing greater than 82A. There are a multitude of possible reasons for this outlined throughout the thread (not running the app with correct settings, having AA/AF forced in control panel, etc.) but the bottom line is that under the correct circumstances these cards can draw more than 82A and that all evidence indicates that once this happens the system will lock up instantly.
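    The behaviour described here can be pictured as a plain threshold check. The sketch below is only an illustration: the 82 A figure comes from this thread, while the function names and the sample current readings are made up, and nothing here reflects how the card's regulator is actually implemented.

```python
# Illustrative model only: the 82 A limit is the figure quoted in the thread;
# the sample readings and function names are hypothetical.
OCP_LIMIT_AMPS = 82.0


def ocp_trips(current_amps: float, limit_amps: float = OCP_LIMIT_AMPS) -> bool:
    """Return True when the drawn current is strictly above the limit."""
    return current_amps > limit_amps


def first_trip(samples, limit_amps: float = OCP_LIMIT_AMPS):
    """Return the index of the first sample that trips the limit, or None."""
    for index, amps in enumerate(samples):
        if ocp_trips(amps, limit_amps):
            return index
    return None


# A card with AA/AF forced might stay under the limit for the whole run,
# while the same card at default settings crosses it and locks up.
forced_aa_run = [70.0, 76.5, 79.0, 80.5]
default_run = [70.0, 80.0, 82.0, 83.5, 60.0]

print(first_trip(forced_aa_run))  # no sample above the limit
print(first_trip(default_run))    # index of the first sample above 82 A
```

    In this picture, a reference card that "passes" is simply one whose readings never cross the line, for any of the reasons listed above.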

    File Server:
    Super Micro X8DTi
    2x E5620 2.4Ghz Westmere
    12GB DDR3 ECC Registered
    50GB OCZ Vertex 2
    RocketRaid 3520
    6x 1.5TB RAID5
    Zotac GT 220
    Zippy 600W

    3DMark05: 12308
    3DMark03: 25820

  8. #458
    Xtreme Member
    Join Date
    Dec 2006
    Posts
    213
    Quote Originally Posted by LordEC911 View Post
    I'm still confused at how people are missing the fact that some reference cards are not failing...
    I guess the cards without bugs fell into the following categories :
    • A card that was not a reference design
    • A setting that was forced in the driver that lowered the load, and thus made the test unable to reach the limit (Anisotropic 16x, FSAA, etc)
    • Bad configuration of the test

  9. #459
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by Tetedeiench View Post
    I fail to see that as a valid mechanism. It is not a valid OCP.
    it is exactly that, if you purely look at the definition. "overcurrent protection". your app brings the card into an overcurrent state. the driver does not allow that. -> it implements overcurrent protection.

    final fantasy devs will notice that issue, talk to ati devrel, they will tell them to change their rendering method/shaders, problem fixed. if the dev isn't willing to do that ati will just add app/scene detection to fix it.

    good luck playing cat and mouse with ati.

    what do you expect ati to do? recall all cards? send all 4870 owners a 4890 ? leave the bug unfixed ? there are not really many options. they could give you a chunk of money to recode your application
    Last edited by W1zzard; 05-23-2009 at 08:23 AM.

  10. #460
    Xtreme Legend
    Join Date
    Jan 2003
    Location
    Stuttgart, Germany
    Posts
    929
    Quote Originally Posted by Tetedeiench View Post
    I guess the cards without bugs fell into the following categories :
    • A card that was not a reference design
    • A setting that was forced in the driver that lowered the load, and thus made the test unable to reach the limit (Anisotropic 16x, FSAA, etc)
    • Bad configuration of the test
    another possible idea could be tolerances in components employed. it's not uncommon to see 5% for resistors.
    Last edited by W1zzard; 05-23-2009 at 08:30 AM.
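    To put a rough number on the tolerance idea: the sketch below assumes the trip point is set by comparing the voltage across a sense resistor with a fixed reference, so the trip current scales as 1/R. That sensing scheme is an assumption made for illustration, not something confirmed in this thread; only the 82 A limit and the 5% tolerance are taken from the posts.

```python
# Assumption (not from the thread): trip current = V_ref / R_sense, so a
# resistor that is off by +/- tolerance shifts the trip point by 1/(1 +/- tol).
def trip_current_range(nominal_amps: float, tolerance: float) -> tuple[float, float]:
    """Return (lowest, highest) trip current for a given resistor tolerance."""
    lowest = nominal_amps / (1.0 + tolerance)   # resistor at the high end
    highest = nominal_amps / (1.0 - tolerance)  # resistor at the low end
    return lowest, highest


low, high = trip_current_range(82.0, 0.05)
print(f"trip current between {low:.1f} A and {high:.1f} A")
```

    Under that assumption, two nominally identical reference cards could trip several amps apart, which would be enough for one to pass the test and another to black-screen.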

  11. #461
    Xtreme Addict
    Join Date
    Jul 2007
    Posts
    1,488
    Quote Originally Posted by W1zzard View Post
    physical limit before the card explodes? probably, yes.

    but the ocp limit is an artificial limit designed to avoid exactly that. it is chosen by the person who designs the power circuitry of the card. i am sure that there is plenty of headroom left in the 3 phase design
    That makes sense. But I imagine that the designer of the 4 phase board would feel comfortable picking a higher limit without running into the artificial safety margin.

    Quote Originally Posted by Tetedeiench View Post
    I fail to see that as a valid mechanism. It is not a valid OCP. Why ? Because of two things. One which is funny, one less so :
    • Let's imagine that in the future i want to turn OCCT GPU:3D test into a benchmark. That is NOT planned, but let's say i want to (after all, this code is pretty efficient, isn't it ? would be interesting to get some scores out of it). As ATI slowed down some cards because of an engineering problem, i wouldn't be able to do so. That would defeat the purpose of the benchmark. That's the very same problem as 3dmark's optimization, but backwards
    It wouldn't be a very good benchmark though. It would only be testing how fast any card could process that very specific simple shader. It would be able to test raw shader power but not much else (which is why the R770 whips the G200 in your test).

    I guess you could make a whole series of tests with each one running a different simple shader. It would make a good benchmark, though extremely synthetic.

    • Imagine that in Final Fantasy XXV, PC DX9 version, you encounter the Great Furry Donut god, which is supposed to reveal your future as a hero. Poof, black screen. Yes, i'm joking by giving a silly example, but only mildly. They ensure the sole app out there right now that could trigger the bug at stock value is castrated. They would have to redo it for every other app out there... to me, that doesn't sound like an OCP that's viable.
    No other scene geometry, AA/AF, vsync, or user interface? Seems a bit unrealistic and easy to work around by just turning up the settings (force AF or whatever).

  12. #462
    Xtreme Cruncher
    Join Date
    Nov 2005
    Location
    Rhode Island
    Posts
    2,740
    I personally wouldn't expect ATI to recall cards, but make newer revisions with a higher OCP or a more robust VRM. Judging by how hot the VRM was getting on some of these cards, I'd question the capability of it.
    Fold for XS!
    You know you want to

  13. #463
    Xtreme Member
    Join Date
    Dec 2006
    Posts
    213
    Quote Originally Posted by W1zzard View Post
    it is exactly that, if you purely look at the definition. "overcurrent protection". your app brings the card into an overcurrent state. the driver does not allow that. -> it implements overcurrent protection.

    final fantasy devs will notice that issue, talk to ati devrel, they will tell them to change their rendering method/shaders, problem fixed. if the dev isn't willing to do that ati will just add app/scene detection to fix it.

    good luck playing cat and mouse with ati.

    what do you expect ati to do? recall all cards? send all 4870 owners a 4890 ? leave the bug unfixed ? there are not really many options. they could give you a chunk of money to recode your application
    Well, the goal of my app is to put the card into the highest load possible, just like furmark. That's the problem. I worked hard to get to this point. Should i get castrated without reacting ? I don't know.

    Playing cat and mouse doesn't appeal to me at all. I have to say i don't know what to do.

  14. #464
    Xtreme Addict
    Join Date
    Jul 2007
    Posts
    1,488
    If anything they are going to limit the load in the driver while running OCCT. It's not a big deal, the end user can rename the exe if they want the full effect.
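    A toy model of why renaming the exe would sidestep such a limit, assuming the driver keys its throttling purely on the executable file name. That is a guess for illustration: the profile table, the 0.8 scale factor and the function name below are hypothetical, and the thread does not describe how Catalyst actually detects applications.

```python
import ntpath  # handles Windows-style paths regardless of the host OS

# Hypothetical per-application profiles: executable name -> load scale factor.
THROTTLE_PROFILES = {
    "occt.exe": 0.8,
    "furmark.exe": 0.8,
}


def load_scale_for(exe_path: str) -> float:
    """Return the load scale applied to a process, 1.0 meaning unthrottled."""
    exe_name = ntpath.basename(exe_path).lower()
    return THROTTLE_PROFILES.get(exe_name, 1.0)


print(load_scale_for(r"C:\Tools\OCCT.exe"))     # matched by name, throttled
print(load_scale_for(r"C:\Tools\renamed.exe"))  # same binary, no match
```

    If detection really worked like this, the limit would protect the card only from users who never rename the file, which is the cat-and-mouse game discussed above.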

  15. #465
    Xtreme Addict
    Join Date
    May 2007
    Location
    'Zona
    Posts
    2,346
    Quote Originally Posted by Tetedeiench View Post
    Well, the goal of my app is to put the card into the highest load possible, just like furmark. That's the problem. I worked hard to get to this point. Should i get castrated without reacting ? I don't know.

    Playing cat and mouse doesn't appeal to me at all. I have to say i don't know what to do.
    Then maybe you should ask Nvidia why the MUL isn't 100% efficient or maybe why your code doesn't fully load Nvidia cards?

    Anyone try running a renamed .exe on Nvidia cards?

  16. #466
    Banned
    Join Date
    Jan 2008
    Location
    Canada
    Posts
    707
    Tetedeiench here are some questions for you.

    1. What would AMD have to do, to satisfy you? Admit to you privately there is a problem? Recall every card affected? Issue a press release that they have sold millions of GPU's with a "fatal" flaw?
    2. Why does your test not load down the Nvidia cards with the same severity as the ATI hardware? Have you made any effort to recode your application to address this?

  17. #467
    Xtreme Addict
    Join Date
    Jul 2007
    Posts
    1,488
    Quote Originally Posted by eleeter View Post
    1. What would AMD have to do, to satisfy you? Admit to you privately there is a problem? Recall every card affected? Issue a press release that they have sold millions of GPU's with a "fatal" flaw?
    I don't know about him, but I'd be happy with: don't make the same mistake on the R800 reference board.

  18. #468
    Xtreme Member
    Join Date
    Nov 2008
    Posts
    238
    Please shelve the ignorant
    Quote Originally Posted by informal View Post
    borderline power virus
    accusations. I thought we'd settled this already.

    Quote Originally Posted by informal View Post
    One more thing:whoever runs this test on their systems with Radeon HD4xxx risks either immediate or long term damage to their previously working Radeon cards.The VRM of reference cards is pushed beyond its specifications so no wonder the test fails(and the card could ultimately fail too).
    If the 82A cutoff is appropriate to the card design then I agree that running OCCT 3.1.0 GPU test may risk damaging the 4870/4890 series cards. However that being the case, AMD is in for a bit of fun seeing as one could easily break VGA cards without invalidating warranty. Furthermore, if 82A is a realistic limit to ensure continued function, what does this portend for non-reference designs that allow greater current?

    You can't have it both ways. Someone has screwed up a card design here. Either AMD/ATI or their AIB partners. Time will tell.

    What is really the issue at hand is that some 4870/4890 must be downclocked to pass this test.

    What AMD should have done was set a lower clock speed on the reference design. Thus any problems experienced while overclocking would be the fault of he who tweaked rather than the marketing department.

    Then again if these cards were clocked 100-150MHz lower how would they stack up to their NVidia competitors at stock frequencies?
    Last edited by Nightstar; 05-23-2009 at 11:15 AM. Reason: forgot quote code
    OCZ, where life-time warranty means until we're out of stock!

  19. #469
    Xtreme Addict
    Join Date
    Jul 2007
    Posts
    1,488
    Quote Originally Posted by Nightstar View Post
    What AMD should have done was set a lower clock speed on the reference design. Thus any problems experienced while overclocking would be the fault of he who tweaked rather than the marketing department.

    Then again if these cards were clocked 100-150MHz lower how would they stack up to their NVidia competitors at stock frequencies?
    Downclock the card and reduce performance across the board just to handle a corner case? Horrible idea.

  20. #470
    Xtreme Addict
    Join Date
    May 2007
    Location
    'Zona
    Posts
    2,346
    Quote Originally Posted by Solus Corvus View Post
    Downclock the card and reduce performance across the board just to handle a corner case? Horrible idea.
    Yep, to satisfy the .1% using a program that came out after release...

  21. #471
    Xtreme Member
    Join Date
    Nov 2008
    Posts
    238
    Much better to misrepresent the capabilities of your product and trigger the PR nightmare that is beginning right here and now eh?

    Tetedeiench, please superimpose a maze over your GPU test with a little mobile furry donut to navigate said maze. Then we can call it a game and be done with the illogical dismissal of this problem.

  22. #472
    Xtreme Addict
    Join Date
    Jul 2007
    Posts
    1,488
    Uh, it would have been much better to just have a slightly higher OCP limit. AMD isn't misrepresenting anything - these cards run games and gpgpu apps just fine.

  23. #473
    Xtreme Addict
    Join Date
    Dec 2004
    Location
    Flying through Space, with armoire, Armoire of INVINCIBILATAAAAY!
    Posts
    1,939
    Quote Originally Posted by Greg83 View Post
    Damage was not caused by Orthos
    Damage was caused by overvolting, lack of proper cooling and pebkac

    I've run 52 hour prime95 25.6 64bit with my q9650 at 4.42Ghz using 1.32v-1.42v multiple times and its still doing great !
    as i said, its only orthos that does it. he can run prime all day without any degradation. doing like ~4.2ghz @ 1.4-ish vcore with one of those 120mm heatpipe towers.

    (there is also no degradation during "normal use" that some people over here are sneering at)

    on the other hand, he is dumb for running it overclocked as high as possible for 24/7 use. Doesn't even need it - games are all about the gpu.
    Last edited by iddqd; 05-23-2009 at 11:48 AM.
    Sigs are obnoxious.

  24. #474
    Xtreme Member
    Join Date
    Nov 2008
    Posts
    238
    A higher OCP limit may have been appropriate for these designs. But I'm not an EE and I don't work for AMD designing video cards. Do you?

    Your dismissal of OCCT as a valid application is illogical. I asked Tetedeiench to make his test the background for a simple game in order to demonstrate this(I don't really expect him to comply).

    What is misrepresented is that this card is able to function in a stable manner at the specified frequency.

    If you are reading these forums, you are indeed in the <1%. If you are posting here, you're part of an even smaller group. I didn't want to be the guy to say it but this is "Xtreme systems", not "Good enough systems" or "Mildly defective systems".

  25. #475
    Xtreme Member
    Join Date
    Apr 2004
    Location
    Switzerland
    Posts
    184
    In my opinion, any component running at factory clocks should be able to deal with anything that you throw at it. If you overclock it and it fails, then OK, you are over spec and that's what you get, but at factory clocks you should be able to throw anything at it.
    Just because you don't have issues in games doesn't mean it's OK; a component has to be able to run at 100% and not fail. If it fails at 100% load then there is an issue.
    OCCT is not running over the red line, it is running exactly at the red line and at original specs. Underclocking the card fixes the issue, which proves that the crashing cards are not 100% stable at stock speed, since they cannot cope with a 100% load.
    It is not acceptable if my CPU crashes at 100% load at stock speed, so why would it be acceptable for a GPU? Why do some of the reference cards work fine and others not? This again proves that some cards are good and others faulty; it would be another story if all crashed.

    ATI pushed the limits of the design and found a setting that kept most of the cards from burning out while still running all games fine in most situations, yet many cards crash at 100% load. They have made a compromise and they know it, that's why they won't comment about it.

    I don't think that the manufacturers using stronger power supply circuits and non reference designs are doing it for fun; they know why they do it and why they spend more money on it. How much does one of these components cost? $1 maybe? Imagine ATI saving hundreds of thousands of dollars just because they use one component less and it still works in 99.99% of the situations.

    I'm not saying Nvidia is better, and maybe the software does not load Nvidia cards as much as ATI, yet that's no excuse for ATI playing with design limitations to save $$$.
    Last edited by Nano2k; 05-23-2009 at 12:18 PM.
    P8Z68 Pro
    2600k@4.5 NH-14
    8GB Corsair Vengeance 1600
    GTX 570 SLI
    Windows 7 x64
    1x Corsair F115 SSD
    WD Black 2T Raid 0
    Xfi Titanium PciX
