Thread: The Book of Bulldozer - Revelations: Episode 2 (SuperPI / x87)

  1. #1
    Xtreme Legend
    Join Date
    Nov 2003
    Location
    Helsinki, Finland
    Posts
    1,692

    The Book of Bulldozer - Revelations: Episode 2 (SuperPI / x87)

    Exactly two years ago, when I tested a Bulldozer-based Zambezi CPU for the first time, I was shocked.
    The early sample units were even hotter and slower than the final silicon revision CPUs, which were finally released four months later.
    One of the biggest letdowns came from a benchmark from way back: SuperPI.

    SuperPI mainly uses legacy x87 instructions, which have been almost completely superseded.
    SuperPI gives no indication whatsoever of SMP performance, as it can only utilize a single thread. On top of that, it has no real-world use or purpose, as there are newer programs which can calculate PI almost 100 times faster.

    Still, SuperPI can almost be considered an industry standard.
    Nowadays it is generally a VERY poor indicator of real-world performance, yet it is so addictive for any old-school overclocker. It scales very well with CPU/NB/DRAM/IO performance, and tweaking it is a big challenge. An overclocker who has never benched SuperPI simply doesn't exist.

    SuperPI has a special place in my heart simply because it was one of the first benchmarks I ever ran... almost 14 years ago...

    So, why are all of the 15h (Bulldozer) based CPU/APU/NPUs performing so badly in SuperPI?
    Some people say it is because the 15h family has 50% fewer FPs per core than the preceding 10h family.
    In the 15h family, a compute unit (two cores) shares one FP, whereas the 10h/12h families had a dedicated FP for each of the cores.

    If this were the only reason, the issue would be solved when the "slave" core of the CU is disabled, leaving a "private" FP for the "master" (BSC) core. However, this is not the case, and it shouldn't be either, as SuperPI is single-threaded, remember?

    The caches on the 15h family have higher latency than those on the 10h family, for example, and SuperPI happens to love large, low-latency caches.
    The 15h family was initially designed for high frequencies. Just like F1 engines, these CPUs produce no power at low revs. And unfortunately, it currently doesn't seem to be possible to build an engine capable of revving high enough. We might discuss the caches further in "Episode 3"... If possible.

    Agner Fog from Copenhagen University College of Engineering has written an excellent document on the instruction latencies of modern CPUs.

    Values for the 10h family start on page 26, while the 15h family values are on page 36.

    Anyway...

    A few days ago, when I was doing some low-level testing for other purposes, I found something that didn't make any sense to me.
    Now I roughly know what it is and what it does, but some questions still remain: why does this "feature" exist in the first place, and why is it activated on all 15h family parts?

    I would normally assume it is a workaround for some erratum; however, no bulletin exists for this one either.
    Also, this feature is not mentioned in any documentation, or if it is, only AMD has access to the required level.

    I find it hard to believe that it would be a design issue, as the affected instructions work fine (but slowly). It has existed since the early Zambezi revisions and is currently still present in Richland, and probably beyond (within family 15h)...

    I'd say it is either an errata fix or an errata fix gone wrong.
    If it is a programming mistake which has gone un-noticed during the last two years...
    That would make me just sad

    Parts affected: AMD Barracuda (Zambezi, Vishera), AMD Comal (Trinity, Richland), AMD Virgo (Trinity, Richland)

    Effect: A massive performance hit in applications that heavily utilize x87 instructions.

    Negative effects: TBD, none found yet. The performance in non x87 applications remains the same or improves very slightly. No instability, increased power consumption, reduced overclockability or anything else abnormal has been observed. However the final conclusion requires far more extended testing than I am able to do myself.

    After the fix has been applied, SuperPI shows an 18-30% improvement in performance.
    The bigger the calculation, the bigger the improvement.

    Since this kind of fix is quite unheard of, I knew that I would be crucified if I made such claims without providing any evidence.

    I generally hate doing videos; however, this time it was mandatory.

    I apologize for the quality. 1080p is available, but the picture is quite grainy due to poor lighting.
    It was a cloudy day in Helsinki today.

    The video shows a few important things:

    - In the video the fix is called "The Plow of Bulldozer"
    - SuperPI 1.5 XS Mod validated by online MD5 checksum
    - CPU-Z 1.64.3 x32 validated by online MD5 checksum (can be found in Stasio's CPU-Z thread)
    - The clocks are being shown during the calculation (look for the affinity and CPU-Z core selection)
    - An external clock reference is provided (to prove there is no tampering with the timers, i.e. "Lab Burst" by MSI)
    - The air cooled setup is shown and so are the CPU temperatures (HWMonitor)
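
For anyone who wants to repeat the MD5 validation mentioned in the list above on their own copies of the benchmark files, here is a minimal Python sketch. The file name and the published hash in the usage comment are placeholders (assumptions for illustration, not values from this thread); only the standard `hashlib` module is used.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_md5):
    """Compare a local file against a published checksum (case-insensitive)."""
    return md5_of_file(path) == published_md5.strip().lower()


# Hypothetical usage: both arguments are placeholders.
# matches_published("super_pi_mod.exe", "<checksum from the download page>")
```

A match only shows that the local file is byte-identical to the published one; it says nothing about timers or clocks, which is why the video also shows an external clock reference.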





    For the 32M SuperPI run (time), you might want to look at a reference from the HWBot Piledriver 5G challenge thread.
    http://hwbot.org/submission/2386335_...n_14sec_718ms/

    That is a 39-second better time with stock CPU clocks (4.1GHz, NB 2500, MEMCLK DDR-2400) than on a 5GHz Trinity with 2777MHz NB and DDR-2666 memory clocks.

    Since I 'happened' to have some LN2 at my disposal, I decided to do some high-clock SuperPI runs on Richland.





    AMD 32nm SuperPI 32M record taken easily.
    Tomorrow when I throw in a Vishera, the reign of 10h should be finally over

    All of the runs are either completely or partially on video.
    I will upload them once I have time to edit them. I've filmed around 28GB worth of video during the last 48 hours.

    FAQ (as I would assume):

    Q: Does it help on my grandma's Palomino Athlon or on my Phenom / II or 12h Llano?
    A: NO

    Q: Does it work on my xxx branded motherboard?
    A: It has nothing to do with the motherboard, or the OS or any other component. If the CPU says AMD on it and it is Bulldozer based then it will work.

    Q: When will the fix become available to the public?
    A: Once I have started and finished the programming. The program is quite simple really; only some minor hardware detection features are required.

    And now...
    I will go and have at least a proper meal and some sleep.
    During the last 72 hours I have worked 45 hours; currently I probably look and smell like a pirate.

    Please don't send me PMs regarding the software.
    It will be available to everyone at the same time.
    If you have something to ask, please do. I'll answer when I have time.

    "The OP will surely deliver. Let's just wait."
    Just kidding

    Soon.

    Update on 06/21/2013: Available for download. Please check post #42 for the link.

    Update on 06/26/2013: Version R1.02B available, see post #75.
    Last edited by The Stilt; 06-26-2013 at 03:18 AM.

  2. #2
    Xtreme Enthusiast
    Join Date
    May 2006
    Posts
    625
    Very interesting read, whoo!
    'He is no fool who gives what he cannot keep to gain what he cannot lose' - Jim Elliot


  3. #3
    Xtreme Member
    Join Date
    Dec 2012
    Location
    Buenos Aires
    Posts
    306
    O.o

  4. #4
    I am Xtreme
    Join Date
    Mar 2005
    Location
    Edmonton, Alberta
    Posts
    4,594
    I thought FX and such chips sucked at x87 since they didn't have the hardware for it...that it's "software emulated".

    Are you saying the X87 FPU exists again?




  5. #5
    Xtreme Mentor
    Join Date
    Dec 2007
    Location
    State of Confusion, USA
    Posts
    2,513
    I've been wondering why AMD has sucked in SPi going all the way back to the A64 X2's...
    Not sure your findings with BD are relevant, but it was weird (even back then) that X2's crushed P4's in everything except SPi.
    You invest so much time and have awesome skills when it comes to AMD's offerings!

    Can't wait to see your patch/fix bud! I've never been a huge fan of synthetic benchies though.
    It'll be interesting to see if it offers a performance boost in any other areas.

    I officially offer you "GURU Status"!
    AMD should put you on the payroll!
    AMD FX-8350 (1237 PGN) | Asus Crosshair V Formula (bios 1703) | G.Skill 2133 CL9 @ 2230 9-11-10 | Sapphire HD 6870 | Samsung 830 128Gb SSD / 2 WD 1Tb Black SATA3 storage | Corsair TX750 PSU
    Watercooled ST 120.3 & TC 120.1 / MCP35X XSPC Top / Apogee HD Block | WIN7 64 Bit HP | Corsair 800D Obsidian Case








    First Computer: Commodore Vic 20 (circa 1981).

  6. #6
    Xtreme Addict
    Join Date
    Feb 2005
    Location
    OZtralia
    Posts
    2,051
    Verrrry interesting

    I haven't benched in years (since Conroe was first released) but I was always a little disappointed and left wondering why AMD cores were faring so badly when tortured compared to Intel

    Was it just bad µArch design, MS windoz/benchmark apps not optimized to work well with AMD parts, along with all the conspiracy theories.................maybe we will find out soon
    lots and lots of cores and lots and lots of tuners,HTPC's boards,cases,HDD's,vga's,DDR1&2&3 etc etc all powered by Corsair PSU's

  7. #7
    Xtreme Enthusiast
    Join Date
    Mar 2005
    Location
    Buenos Aires, Argentina
    Posts
    644
    Can you get into direct contact with AMD engineers to tell them your findings? It is hard to believe that an erratum around the size of the Barcelona TLB bug or the P5 x87 bug wasn't fixed in two entire generations.

    More interesting will be your fix. I'm expecting a run-every-boot application that changes some bit in an internal register, which makes Bulldozer's x87 disregard the time-consuming workaround. Do you expect that it would be included as a fix via microcode in newer BIOS upgrades? Besides, can you guess what it was supposed to fix?

  8. #8
    Xtreme Mentor
    Join Date
    Nov 2005
    Location
    Devon
    Posts
    3,437
    Very interesting finding The Stilt!

    Have you tested other x87 code to see if it improves performance there as well?
    RiG1: Ryzen 7 1700 @4.0GHz 1.39V, Asus X370 Prime, G.Skill RipJaws 2x8GB 3200MHz CL14 Samsung B-die, TuL Vega 56 Stock, Samsung SS805 100GB SLC SDD (OS Drive) + 512GB Evo 850 SSD (2nd OS Drive) + 3TB Seagate + 1TB Seagate, BeQuiet PowerZone 1000W

    RiG2: HTPC AMD A10-7850K APU, 2x8GB Kingstone HyperX 2400C12, AsRock FM2A88M Extreme4+, 128GB SSD + 640GB Samsung 7200, LG Blu-ray Recorder, Thermaltake BACH, Hiper 4M880 880W PSU

    SmartPhone Samsung Galaxy S7 EDGE
    XBONE paired with 55'' Samsung LED 3D TV

  9. #9
    Xtreme Cruncher
    Join Date
    Jun 2006
    Posts
    6,215
    Quote Originally Posted by cadaveca
    I thought FX and such chips sucked at x87 since they didn't have the hardware for it...that it's "software emulated".

    Are you saying the X87 FPU exists again?
    How do you think the 15h generation can run x87 code, then? Of course they have hardware for it; it's part of the basic ISA.
    They sucked at it since apparently there is some "un-optimization" going on in hardware when these instructions are in question. That's how the chip was designed, the latencies for x87 instructions are noticeably higher vs 10h generation.

  10. #10
    Xtreme Addict
    Join Date
    Oct 2007
    Location
    Chicago,Illinois
    Posts
    1,182
    Test physx for me, like vantage and mafia 2.



  11. #11
    Xtreme Enthusiast
    Join Date
    Nov 2009
    Posts
    526
    When this fix is applied, does it persist even if the CPU is taken out of the socket, or do people need to apply it more often than once?

  12. #12
    Registered User
    Join Date
    Nov 2010
    Posts
    26
    It depends... If it's a microcode patch you'd have to do it each time the system booted up (easily done with a start up task) or if it's an OS patch, once until you re-install the OS.

    You won't have to physically modify the CPU at all.

  13. #13
    Xtreme Legend
    Join Date
    Nov 2003
    Location
    Helsinki, Finland
    Posts
    1,692
    Quote Originally Posted by cadaveca View Post
    I thought FX and such chips sucked at x87 since they didn't have the hardware for it...that it's "software emulated".

    Are you saying the X87 FPU exists again?

    There is no dedicated x87 FPU / co-processor, that's for sure.

    AMD documents: #47414 (Software Optimization Guide for AMD Family 15h Processors) & #26569 (AMD64 Architecture Programmer's Manual Volume 5: 64-Bit Media and x87 Floating-Point Instructions) should explain how the x87 instructions are executed on 15h family. The latter one I cannot provide as it is confidential (?).

    Quote Originally Posted by zir_blazer View Post
    Can you get into direct contact with AMD engineers to tell them your findings? It is hard to believe that an erratum around the size of the Barcelona TLB bug or the P5 x87 bug wasn't fixed in two entire generations.

    More interesting will be your fix. I'm expecting a run-every-boot application that changes some bit in an internal register, which makes Bulldozer's x87 disregard the time-consuming workaround. Do you expect that it would be included as a fix via microcode in newer BIOS upgrades? Besides, can you guess what it was supposed to fix?
    I have no interaction whatsoever with AMD engineers, unfortunately.

    There are three different ways the fix could be activated permanently.
    AMD could release either a new microcode or AGESA version which reconfigures the necessary registers or a motherboard vendor could add a small piece of custom code to be activated during the CPU initialization.

    The registers can also be controlled 'in flight', which is the way I am doing it.

    Neither AGESA nor the microcode seems to have any interaction with this feature.
    I've checked most of the AGESA and microcode versions for the 15h family since early 2011.
    I cannot see any indication that any of these has ever taken any action to control this feature.

    This would indicate that it is not an errata workaround of any kind.
    The value has been left to default setting as defined by the RCU/ROM/Fuse of the CPU (i.e. no action taken at any point).

    Since AMD uses encrypted microcodes I could not verify their part on this feature any other way than disabling the microcode completely.
    While it changed several other things, it had no effect on this feature. Just as I expected.

    Quote Originally Posted by Lightman View Post
    Very interesting finding The Stilt!

    Have you tested other x87 code to see if it improves performance there as well?
    As I said, x87 has been almost completely superseded.
    This actually makes it quite hard to even find an application which would still fully utilize x87 instructions, especially when newer instructions are available (supported). I should be able to disable all of the newer instructions to force the program to execute x87 only; however, it might cause some undefined behavior. I haven't had time to try that yet.

    There are even more x87 related options which certainly look interesting, so I will need to try them at some point too.

    Quote Originally Posted by Mechanical Man View Post
    When this fix is applied, does it persist even if the CPU is taken out of the socket, or do people need to apply it more often than once?
    The fix needs to be applied each and every time after a power cycle (cold / warm reset, shutdown, etc.).
    Unless there is an AGESA / microcode update, of course. I would not hold my breath on that, as only AMD knows entirely what this feature does and why it is turned on in the 15h family in the first place.

  14. #14
    I am Xtreme
    Join Date
    Mar 2005
    Location
    Edmonton, Alberta
    Posts
    4,594
    Quote Originally Posted by informal View Post
    How do you think the 15h generation can run x87 code, then? Of course they have hardware for it; it's part of the basic ISA.
    Don't get mad for what AMD reps told me. I was told this exact statement for 8150 launch, 8350 launch, APU launch reviews I did...I guess you didn't see the sarcasm in my post, however.

    Quote Originally Posted by The Stilt View Post
    There is no dedicated x87 FPU / co-processor, that's for sure.

    AMD documents: #47414 (Software Optimization Guide for AMD Family 15h Processors) & #26569 (AMD64 Architecture Programmer's Manual Volume 5: 64-Bit Media and x87 Floating-Point Instructions) should explain how the x87 instructions are executed on 15h family. The latter one I cannot provide as it is confidential (?).

    I already have that, thanks. :p Very interested to see how this works out, although...who's gonna get any use out of it...just benchers? So can you make updated AMD SPi32m?

  15. #15
    Xtreme Legend
    Join Date
    Nov 2003
    Location
    Helsinki, Finland
    Posts
    1,692
    Here are some of the results for other applications.
    As said, most of the results are within the margin of error.

    The only program that shows a larger difference is Linpack; however, I consider the result inconclusive for the following reason: Linpack had to be patched in order to make it execute on AMD CPUs in the first place. It is also fully optimized for Intel ONLY.

    Note: + = improvement over default
    ME = Margin of error

    Code:
    BarsWF SSE2 (MD5 Brute-Force) - Difference: +0.032% (ME)
    
    Lame V3.99.5 (MP3) - Difference: -0.4% (ME)
    
    Fritz 4.2 (Chess) - Difference: +0.042% (ME)
    
    Euler3D 2.2 (Stars CFD) - Difference: +0.039% (ME)
    
    wPrime 2.09 (1024M) - Difference: -0.308% (ME)
    
    Multicore PI 4.0.0.0 - Difference: -0.002% (ME)
    
    LuxMark 2.0 (Sala Scene) - Difference: 0.000% (ME)
    
    Linpack 11.3.008 (Read the note!) - Difference: -3.093%
    
    Fluidmark 1.5.2 - Difference: -0.850% (ME)
    
    
    
    7-Zip 9.30 Alpha x64 (Compress, 32M Dict.) - Difference: 0.000% (ME)
    
    7-Zip 9.30 Alpha x64 (Decompress, 32M Dict.) - Difference: 0.000% (ME)
    
    7-Zip 9.30 Alpha x64 (Compress, 64M Dict.) - Difference: +1.447% (ME)
    
    7-Zip 9.30 Alpha x64 (Decompress, 64M Dict.) - Difference: +0.032% (ME)
    
    
    
    WinRar 5.0 Beta5 x64 - Difference: -0.259% (ME)
    
    
    
    Y-Cruncher 0.61 (x86 SSE3 - Single Thread, Chudnovsky 50M) - Difference: +0.689% (ME)
    
    Y-Cruncher 0.61 (x86 SSE3 - 4 Threads, Chudnovsky 50M) - Difference: +0.850% (ME)
    
    Y-Cruncher 0.61 (x86 SSE3 - Single Thread, Ramanujan 50M) - Difference: +1.305% (ME)
    
    Y-Cruncher 0.61 (x86 SSE3 - 4 Threads, Ramanujan 50M) - Difference: +0.820% (ME)
    
    
    
    Y-Cruncher 0.61 (Kasumi x64 SSE3 - Single Thread, Chudnovsky 50M) - Difference: +0.768% (ME)
    
    Y-Cruncher 0.61 (Kasumi x64 SSE3 - 4 Threads, Chudnovsky 50M) - Difference: +1.169% (ME)
    
    Y-Cruncher 0.61 (Kasumi x64 SSE3 - Single Thread, Ramanujan 50M) - Difference: +1.715% (ME)
    
    Y-Cruncher 0.61 (Kasumi x64 SSE3 - 4 Threads, Ramanujan 50M) - Difference: +0.814% (ME)
    
    
    
    Y-Cruncher 0.61 (Nagisa x64 SSE4.1 - Single Thread, Chudnovsky 50M) - Difference: +0.766% (ME)
    
    Y-Cruncher 0.61 (Nagisa x64 SSE4.1 - 4 Threads, Chudnovsky 50M) - Difference: +0.424% (ME)
    
    Y-Cruncher 0.61 (Nagisa x64 SSE4.1 - Single Thread, Ramanujan 50M) - Difference: +1.743% (ME)
    
    Y-Cruncher 0.61 (Nagisa x64 SSE4.1 - 4 Threads, Ramanujan 50M) - Difference: +0.821% (ME)
    
    
    
    Y-Cruncher 0.61 (Ushio x64 SSE4.1 - Single Thread, Chudnovsky 50M) - Difference: +0.702% (ME)
    
    Y-Cruncher 0.61 (Ushio x64 SSE4.1 - 4 Threads, Chudnovsky 50M) - Difference: +0.857% (ME)
    
    Y-Cruncher 0.61 (Ushio x64 SSE4.1 - Single Thread, Ramanujan 50M) - Difference: +1.489% (ME)
    
    Y-Cruncher 0.61 (Ushio x64 SSE4.1 - 4 Threads, Ramanujan 50M) - Difference: -0.190% (ME)
    
    
    
    Y-Cruncher 0.61 (Hina AVX x64 - Single Thread, Chudnovsky 50M) - Difference: +0.835% (ME)
    
    Y-Cruncher 0.61 (Hina AVX x64 - 4 Threads, Chudnovsky 50M) - Difference: +0.891% (ME)
    
    Y-Cruncher 0.61 (Hina AVX x64 - Single Thread, Ramanujan 50M) - Difference: +0.807% (ME)
    
    Y-Cruncher 0.61 (Hina AVX x64 - 4 Threads, Ramanujan 50M) - Difference: +0.210% (ME)
    
    
    
    3DMark03 - Difference: -0.192% (ME)
    
    3DMark05 - Difference: +0.291% (ME)
    
    3DMark06 - Difference: -0.255% (ME)
    
    3DMark Vantage (Performance) - Difference: -0.186% (ME)
    
    3DMark11 (Performance) - Difference: -0.060% (ME)
    
    3DMark Fire Strike - Difference: -0.136% (ME)
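
The figures in the listing above are relative changes against the default configuration. As a rough illustration of how such a figure can be computed and tagged, here is a minimal Python sketch. The 2% margin-of-error threshold and the sample numbers are assumptions for illustration only; they are not the measurements or the method used for this listing.

```python
def percent_difference(baseline, patched, higher_is_better=True):
    """Relative change of `patched` against `baseline`, in percent.

    A positive result means an improvement. For time-based results
    (lower is better), pass higher_is_better=False to flip the sign.
    """
    change = (patched - baseline) / baseline * 100.0
    return change if higher_is_better else -change


def label(diff_percent, margin=2.0):
    """Format a difference; tag it (ME) if it is inside the assumed margin."""
    tag = " (ME)" if abs(diff_percent) < margin else ""
    return f"Difference: {diff_percent:+.3f}%{tag}"


# Hypothetical score-based result: 1000 points before, 1003 points after.
print(label(percent_difference(1000.0, 1003.0)))
# Hypothetical time-based result: 100 s before, 82 s after.
print(label(percent_difference(100.0, 82.0, higher_is_better=False)))
```

Flipping the sign for time-based benchmarks keeps "+" meaning "improvement over default" for every row, which matches the note at the top of the listing.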
    Last edited by The Stilt; 06-13-2013 at 08:04 AM.

  16. #16
    Xtreme Mentor
    Join Date
    May 2008
    Location
    cleveland ohio
    Posts
    2,879
    I'm curious to see if it changes Cinebench 11.5r single thread score?
    I'm doubting it would.
    HAVE NO FEAR!
    "AMD fallen angel"
    Quote Originally Posted by Gamekiller View Post
    You didn't get the memo? 1 hour 'Fugger time' is equal to 12 hours of regular time.

  17. #17
    Xtreme Legend
    Join Date
    Nov 2003
    Location
    Helsinki, Finland
    Posts
    1,692
    Quote Originally Posted by demonkevy666 View Post
    I'm curious to see if it changes Cinebench 11.5r single thread score?
    I'm doubting it would.
    No.
    Neither does x264 r2334.

  18. #18
    Xtreme Mentor
    Join Date
    May 2008
    Location
    cleveland ohio
    Posts
    2,879
    Quote Originally Posted by The Stilt View Post
    No.
    Neither does x264 r2334.
    So it only helps code that uses the x87 FPU, and finding anything that still uses x87 is kind of hard.

    Skyrim I think is one game that still uses it.

    So it's not going to make anything else faster.
    HAVE NO FEAR!
    "AMD fallen angel"
    Quote Originally Posted by Gamekiller View Post
    You didn't get the memo? 1 hour 'Fugger time' is equal to 12 hours of regular time.

  19. #19
    Xtreme Member
    Join Date
    Aug 2004
    Posts
    210
    Hmm, I see you have tested y-cruncher too, but only the "modern" versions. There is also an old x87 one, which however was removed in the newest 0.61 version (everybody has at least SSE3 these days, don't they?). If you have time, please download the old 0.5.5 package:
    http://www.numberworld.org/y-crunche...0(fix%202).zip

    and then make a test run with the "x86.exe", I assume that it uses x87. What else can it use ;-)
    The build description is:
    Version: x86

    No Processor Specific Tuning


    Target Systems:
    - Legacy x86.
    Apart from that I wonder what "magic" is switched on. Could it maybe be some FDIV-circuits? I know that the Integer DIV-Units are enabled for the Piledriver cores (even though they were already there at the first Bulldozers, but they were deactivated). Now an FDIV-Unit is announced for Steamroller - I wonder if there is already some test-unit for that purpose integrated in the current core versions.

  20. #20
    Xtreme Addict
    Join Date
    Feb 2008
    Posts
    1,209
    Nice find! It would have helped Bulldozer not look so dull in the beginning if it had been activated. SuperPi is SuperPi, we all trust in it. I will install it for sure just to know my x87 is switched on again (dammit, maybe it's a licensing issue with Microsoft?)
    1. ASUS Sabertooth 990fx | FX 8320 || 2. DFI DK 790FXB-M3H5 | X4 810
    8GB Samsung 30nm DDR3-2000 9-10-10-28 || 4GB PSC DDR3-1333 6-7-6-21
    Corsair TX750W | Sapphire 6970 2GB || BeQuiet PurePower 450w | HD 4850
    EK Supreme | AC aquagratix | Laing Pro | MoRa 2 || Aircooled

  21. #21
    Xtreme Legend
    Join Date
    Nov 2003
    Location
    Helsinki, Finland
    Posts
    1,692
    Quote Originally Posted by Opteron146 View Post
    Hmm, I see you have tested y-cruncher too, but only the "modern" versions. There is also an old x87 one, which however was removed in the newest 0.61 version (everybody has at least SSE3 these days, don't they?). If you have time, please download the old 0.5.5 package:
    http://www.numberworld.org/y-crunche...0(fix%202).zip

    and then make a test run with the "x86.exe", I assume that it uses x87. What else can it use ;-)
    The build description is:


    Apart from that I wonder what "magic" is switched on. Could it maybe be some FDIV-circuits? I know that the Integer DIV-Units are enabled for the Piledriver cores (even though they were already there at the first Bulldozers, but they were deactivated). Now an FDIV-Unit is announced for Steamroller - I wonder if there is already some test-unit for that purpose integrated in the current core versions.
    I'll try that as soon as I can.

    The same fix seems to be required on Steamroller too.

  22. #22
    Xtreme Enthusiast
    Join Date
    Nov 2009
    Posts
    526
    Quote Originally Posted by The Stilt View Post
    I'll try that as soon as I can.

    The same fix seems to be required on Steamroller too.
    You are such a tease..

  23. #23
    Xtreme Member
    Join Date
    Aug 2004
    Posts
    210
    Quote Originally Posted by The Stilt View Post
    I'll try that as soon as I can.
    Great, thanks
    The same fix seems to be required on Steamroller too.
    Ok then I honestly have no clue what is going on ... mysterious.

  24. #24
    Registered User
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    63
    The Stilt

    I am pretty sure demonkevy666 is right about Skyrim using x87 code. I remember that one of the first big patches included a community modder's programming work, as he had found a load of x87 code and optimised some areas with SSE2, if I remember correctly.
    Not sure how much x87 remains, but it could be interesting. It also suggests to me that Bethesda likely used x87 code in the previous Elder Scrolls games.

    PhysX by Nvidia also uses x87 code rather than SSE. That could be interesting to test. Games that heavily use PhysX could be interesting to test too, e.g. Batman from 2010.

    Other games heavily using physx I have found, Warframe, The secret world, Arma3, PlanetSide2, Hawken, any U4E game...

  25. #25
    Xtreme Member
    Join Date
    Aug 2004
    Posts
    210
    Quote Originally Posted by Oese View Post
    I will install it for sure just to know my x87 is switched on again (dammit, maybe it's a licensing issue with Microsoft?)
    It is not switched off, it just works slower than it could. The question is why... I don't think that it is a licensing problem; x87 is more than 20 years old. Yes, MS says that nobody should use it any more, but still that is no reason for AMD to handicap their own CPU.
