Thread: Let's Hunt and Seek for the TLB bug? ;)

  1. #1
    Xtreme Member
    Join Date
    Apr 2007
    Posts
    118

    Let's Hunt and Seek for the TLB bug? ;)

    Hi guys.

    Well, first of all, sorry for my poor English.

    I chose to create this topic here because a guy here in Brazil tried exactly that: to systematically hunt down the TLB bug. He had some success with it but, unfortunately, got robbed during the holidays.

    So I've chosen to translate his findings here, since there are a lot of people here who have Phenom processors, the ability to overclock them, and many different motherboards, chipsets and BIOS versions available.

    The original topic (in Portuguese) is this: http://www.forumpcs.com.br/viewtopic.php?t=227912

    The latest update to his findings, translated below:

    Quote Originally Posted by Schakal
    Locking up a Phenom? Test results:

    Hi!

    First, I would like to thank everyone for all the suggestions and say that this testing surprised even me, due to two unexpected events.

    Another observation:
    In all tests I used the configuration below (no longer available), but with a different video card and amount of RAM. The video card was my old X1950 XTX and the system memory was 4GB (4x1GB) of Corsair CM2X1024-6400. None of the tested configurations was overclocked, so that the results could not be mistaken for overclocking issues.

    NelsonK wrote:
    From what I could understand, the error (system lockup) happens under a specific condition of L3 cache usage, and seems to happen more frequently when one uses some kind of virtualization program.

    That's right. I won't go into detail yet, in order to create some "suspense", but indeed I could not find any other occurrence of the bug; that is, I could only trigger it when using VMs.

    Well friends, on to the testing!

    Test 1 - Super PI, DVD Shrink and WinRAR

    To load the CPU fully, with no moments below 100% usage, each core ran a 32M Super PI calculation. In addition, WinRAR archiving and DVD Shrink video conversions ran at the same time.

    Result: 100% usage on all cores and no bug.
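
    For anyone who wants to repeat the "one busy task per core" part of Test 1 without Super PI, here is a minimal sketch. It is not what Schakal ran: the busy loop is only a stand-in for a 32M Super PI run, the name `load_all_cores` is made up for illustration, and the processes are not pinned to cores (the OS scheduler is assumed to spread them out).

```python
import subprocess
import sys
from typing import List

# Child program: spin in a tight loop until the deadline given in argv[1].
# This is a stand-in workload, not Super PI.
BURN_SNIPPET = (
    "import sys, time\n"
    "deadline = time.perf_counter() + float(sys.argv[1])\n"
    "while time.perf_counter() < deadline:\n"
    "    pass\n"
)


def load_all_cores(seconds: float, workers: int) -> List[int]:
    """Start `workers` busy-loop processes, wait for all, return exit codes."""
    procs = [
        subprocess.Popen([sys.executable, "-c", BURN_SNIPPET, str(seconds)])
        for _ in range(workers)
    ]
    return [p.wait() for p in procs]
```

    Calling `load_all_cores(600, 4)` would keep four processes spinning for about ten minutes on a quad core. Separate processes are used rather than threads because Python threads would not load more than one core with this kind of loop.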

    Test 2 - Test 1 + Cache Benchmark:

    The previous test was repeated, with an extra cache benchmark run in PC Wizard.

    Result: 100% usage on all cores and no bug.

    Test 4 - Prime95

    Four instances of Prime95 at the same time, one on each core.

    Result: 100% usage on all cores and no bug.

    Test 5 - VM 1

    With VMware, I created and ran an ordinary VM with Windows Vista HP 64-bit. In this VM I ran all the previous tests except the fourth.

    Result: 100% usage on all cores and no bug.

    Test 6 - VM 2

    With VMware, I created and ran an ordinary VM with Windows Vista HP 64-bit. In this VM I ran all the previous tests except the fourth. A curious event happened: the computer restarted AND one of the VMs got corrupted. According to the Windows event log, something happened with "svchost.exe", which seemed curious because nothing had ever happened there before.

    Result: no 100% usage on the 4 cores, the bug didn't happen, and something new appeared: the only 32-bit VM that I can't run over 64-bit is Vista HP. I already checked the installation disc, hardware, etc., and found nothing.
    Off-test: any clue on this one?

    Test 7 - VM 3

    With VMware, I created and ran two ordinary VMs with Windows Vista Ultimate 64-bit. In these VMs I ran all the tests up to test 3, assigning two cores per VM.

    Result: 100% usage on all cores and no bug.

    Test 8 - VM 4

    With VMware, I created and ran four ordinary VMs with Windows Vista Ultimate 64-bit. In these VMs I ran all the tests up to test 3, assigning one core per VM.

    Result: Is that you, BUG?

    In test 8, the system restarted spontaneously when, according to the system manager, cores 1 and 4 had reached 100% usage, core 2 was at 50% and core 3 at a bit more than 80%. I had started running the tests on VM #3.

    Looking through the four available logs, I found absolutely no information on this.

    By a happy coincidence, I turned off Cool'n'Quiet and ran test 9:

    Test 9 - VM 4 WITHOUT Cool'n'Quiet

    Exactly as in test 8, but with Cool'n'Quiet turned off the whole time.

    Result: 100% usage on all cores and no bug.

    It may seem really odd, but without Cool'n'Quiet the CPU executed all tasks normally.

    Went to sleep.

    In the morning, I ran a few extra tests like test 9, using Prime95 and all the other programs in the 4 VMs. No bug at all, and 100% on all 4 cores, as would be expected because running 4 VMs is a big job.

    Conclusion:

    4 VMs + Cool'n'Quiet = BUG

    ...
    What do you guys think about trying to find the TLB error systematically? I admit that's an odd thing to try, but now we have an initial "systematic way" of doing it.

    Is it worth the effort? I would think so. But, of course, it's up to you all here.
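
    One way to keep such a hunt systematic is to log every run as a row and look for pairs of runs that differ in a single factor but end differently. The sketch below does that for the runs reported above. It rests on two assumptions that are not stated in the post: Cool'n'Quiet is taken to have been on for tests 1 to 8, and test 6 is left out because its restart was not counted as the bug. The names `Run`, `RUNS` and `single_factor_flips` are made up for illustration.

```python
from typing import List, NamedTuple, Tuple


class Run(NamedTuple):
    test: int           # test number as given in the post
    vms: int            # VMware guests running at once (0 = host only)
    cool_n_quiet: bool  # ASSUMED on for tests 1-8; off only in test 9
    outcome: str        # "ok" or "reboot"


# Test 3 is missing from the post and test 6 is ambiguous, so both are omitted.
RUNS = [
    Run(1, 0, True, "ok"),
    Run(2, 0, True, "ok"),
    Run(4, 0, True, "ok"),
    Run(5, 1, True, "ok"),
    Run(7, 2, True, "ok"),
    Run(8, 4, True, "reboot"),
    Run(9, 4, False, "ok"),
]

FACTORS = ("vms", "cool_n_quiet")


def single_factor_flips(runs: List[Run]) -> List[Tuple[int, int, str]]:
    """Return (test_a, test_b, factor) for runs that differ in exactly one
    factor and also differ in outcome."""
    flips = []
    for a in runs:
        for b in runs:
            if a.test < b.test and a.outcome != b.outcome:
                changed = [f for f in FACTORS if getattr(a, f) != getattr(b, f)]
                if len(changed) == 1:
                    flips.append((a.test, b.test, changed[0]))
    return flips
```

    With these rows, tests 8 and 9 come out as a pair that differs only in Cool'n'Quiet, which matches the conclusion in the quote. The table ignores that the workloads also differ between tests, so more rows from other boards, BIOS versions and VM counts would be needed before reading much into it.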

  2. #2
    Banned
    Join Date
    Dec 2006
    Location
    Brazil
    Posts
    165
    You should try to "hunt and seek" for whoever stole your hardware.
    Anyway, there is some info about the TLB here: http://www.xtremesystems.org/forums/...d.php?t=170462

  3. #3
    Xtreme Member
    Join Date
    Apr 2007
    Posts
    118
    It's not my hardware.

    My house is about to be "unrobbable". :P

    And yes, I've seen that thread, where everybody just says it has never been seen, and that the BIOS fix is . This thread is something different: trying to see the bug happen, in a reproducible way.

    Anybody?

  4. #4
    Xtreme Mentor
    Join Date
    May 2007
    Posts
    2,792
    Yes, virtualization is what the K10 core's TLB is most optimized for, since K10 moved it into hardware (which is why it's better and was one of its key advertised selling features). That is why the cores were recalled: virtualization is the biggest raw server market to capture at the moment.

    I've already tried every way of reproducing the bug, out of my own curiosity. I use Linux quite a bit through Windows Vista and XP, and have run 3 VMs (Ubuntu/Slackware/Gentoo), a game, Firefox (68 extensions, 12 sites open, incl. YouTube and YouOS), Thunderbird (~42 extensions, 31 3yr+ email accounts, 12 feeds (plus 400MB each)), POV-Ray 3.7 and an Office document (2GB of RAM full in total, 1GB into the pagefile) at 100% core usage, but nothing appears.

    Errors are not the bug though. A lockup under extremely heavy virtualization load would be the bug (heavier than with Prime95; check a watt meter/temps), and I don't see the need to go hunting for it if you've tried P95 in-place large FFTs and the EVEREST FPU tests, because from my testing those place the maximum possible load on a Phenom CPU on the desktop. Orthos max is around 12W lower. Can you ever get more than that beyond Intel TAT on any day?

    In either case, for me it is the same as any erratum you weren't openly told about: pointless for us unless it has affected you. No normal software will reproduce it; only virtualization will, and only if you can get a load consistently heavier than max P95 with it, in which case it'll be reproducible at stock clocks. No need for oc.

  5. #5
    Xtreme Addict
    Join Date
    Jul 2007
    Posts
    2,103
    Hey, from what I've read and heard, Linux is unaffected... It will only bug on non-Linux OSes... I'm not 100% sure of this though... I always thought that if there was a TLB bug it could be present on any OS, because it's in the chip?

    Oh well, I want a B3 and 3.5GHz and I'll be happy
    "AMD...Like the perfect Storm,...Everything needs to be just right"
    X555x4SuperCore@4450mhz@1.64v..........

    RYZEN 7 1800x/ ASUS ROG STRIX VEGA64/ =EK NICKEL WB, Feser THC 2x360 1x480
    X470 Gigabyte Aorus7, Patriot 3400mhz 16gb dual2x8
    SSD Samsung 970pro,,860EVO

  6. #6
    Xtreme Mentor
    Join Date
    May 2007
    Posts
    2,792
    Quote Originally Posted by gOtVoltage
    Hey, from what I've read and heard, Linux is unaffected... It will only bug on non-Linux OSes... I'm not 100% sure of this though... I always thought that if there was a TLB bug it could be present on any OS, because it's in the chip?
    The workaround applied for Unix-based OSes doesn't affect them as much as it does Windows -> yep.
    Oh well, I want a B3 and 3.5GHz and I'll be happy
    3.5GHz retail or oc?
    Maybe that's a bit too much to ask of Phenom, looking at it right now. 3.5GHz stable is like a top 6400+ on air; actually, most on water do 3500-3600 stable max. So that would indeed be very good. I know 3.5/3.6GHz sounds good, but you should base your expectations on the performance you're after rather than on speed figures. I mean, by B3, bear in mind Penryn quads will be running at 3.8-4.2GHz on air, with the performance that comes with it.

    I've never been the crazy type to look at others though. I only change a system if I feel the need to and if it's well worth it to me and will last me 12-18 months, not because something better/faster is out. Heck, I'd be in the red in every bank account if I did that with just one thing out of millions in life, and would have to resort to fraud all the time.

  7. #7
    Xtreme Member
    Join Date
    Apr 2007
    Posts
    118
    KTE: the point of trying to find it is: if I can't trigger it even when searching for it, what's the probability of it happening on a desktop in normal usage?

    It's supposed to be the kind of test where, if one can find and reproduce the bug, "ok, THIS is the ONE SINGLE THING one should NOT do", and if not, "Come on, I did ALL THIS and this bug didn't appear, it's absolutely pointless to take it into consideration".

    To me, it's looking like it's easier to lock up a Linux system through a Linux bug, reproducibly across different distributions, than to find the much-mentioned Phenom bug once!

  8. #8
    Xtreme Mentor
    Join Date
    May 2007
    Posts
    2,792
    VMware with Linpack is a b1ch to set up or I would've run it to test. There's no point going further than that for me. I'll run Linpack on the BE in Linux and XP, and maybe in Vista too (if I can), soon.

    To answer your 1st question: as all reviewers, testers and AMD officials have stated so far, you won't be able to replicate it on the desktop with normal software at max load. They have all tried, and so have we, and ultimately failed as expected. In CPU development QC, the standard stability testing is far, far more extreme and lengthy than what we can do on the desktop, and these retail chips passed it perfectly fine. That is why it was hard to track down and found late.

  9. #9
    Xtreme Addict
    Join Date
    Sep 2007
    Location
    Munich, DE
    Posts
    1,401
    Does VMware Workstation use hardware virtualisation? I think it's pure sw-virt.
    VMware's latest ESX/GSX Servers use hardware virtualisation and should also use the latest virtualisation features available in Barcelona (and, I think, Phenom) processors.

  10. #10
    Xtreme Mentor
    Join Date
    May 2007
    Posts
    2,792
    I know VMWare ESX Server 3/3i, Virtual SMP, and VMFS support hardware virtualization but it seems Workstation 6.0, 6.0.1, 6.0.2 also support it: http://kb.vmware.com/selfservice/mic...200%2039351283

  11. #11
    D.F.I Pimp Daddy
    Join Date
    Jan 2007
    Location
    Still Lost At The Dead Show Parking Lot
    Posts
    5,182
    Personally I think this sh**t is blown way out of proportion with these Phenom Processors!
    SuperMicro X8SAX
    Xeon 5620
    12GB - Crucial ECC DDR3 1333
    Intel 520 180GB Cherryville
    Areca 1231ML ~ 2~ 250GB Seagate ES.2 ~ Raid 0 ~ 4~ Hitachi 5K3000 2TB ~ Raid 6 ~

  12. #12
    Banned
    Join Date
    Apr 2006
    Posts
    25
    Maybe, but Phenom is still shít.

    I had mine, within 2 weeks I was sick of it, sent it back.

    AMD have got a cheek to release such a processor, in this day and age, clocked at 2.3GHz. Nearly 2 years later than Intel's offerings, and it cannot even compete?

    They did a Microsoft by ditching support for 939, which at the time was the most stupid, thick thing to do.

    They now release a processor which is slow, bugged and expensive, and it isn't really much faster than K8!

    The future is bleak for AMD, and this is from someone who is most definitely not a fan of Intel.
    Last edited by Syz`; 01-06-2008 at 08:35 AM.

  13. #13
    Xtreme X.I.P.
    Join Date
    Nov 2002
    Location
    Shipai
    Posts
    31,147
    dont forget k11 is delayed/cancelled and so is 45nm
    ...

  14. #14
    Xtreme Enthusiast
    Join Date
    Aug 2007
    Location
    Warren,MI
    Posts
    561
    yup k12 and 32nm here we come

    i hope....
    cpu- Intel I7 3930K
    Asus P9x79 Deluxe
    2x HD7970
    32gb ddr3-1600
    corsair ax1200
    Corsair 800D
    Corsair H100 lapped
    2x 128gb M4 raid 0
