Thread: Heads up: WCG update can error out your machine's entire queue!

  1. #1
    Xtreme Cruncher
    Join Date
    Jun 2007
    Location
    SK, Canada
    Posts
    836

    Heads up: WCG update can error out your machine's entire queue!

    knreed explaining the update: https://secure.worldcommunitygrid.or...d_thread,30943

    My post showing what happened to my machines: https://secure.worldcommunitygrid.or...d_thread,30946

    Sekerob posed a possible cause in post #5. There's no need to take action at this point; if your machine hasn't errored out yet then it's likely fine. If it has errored out, let it crunch and return some results and your cache will refill on its own.
    i7 3970X @ 4500MHz 1.28v
    Asus Rampage IV Extreme
    4x4GB Corsair Dominator GT 2133MHz 9-11-10-27
    Gigabyte Windforce 7970 OC 3-way Crossfire
    Windows 7 Ultimate x64
    HK 3.0-MCP655-Phobya 400mm rad
    Corsair AX1200i
    SanDisk Extreme 240GB
    3x2TB WD Greens for storage
    TT Armor VA8003SWA





  2. #2
    version 2.0
    Join Date
    Feb 2005
    Location
    Flanders
    Posts
    3,862
    Haven't noticed anything unusual here, but thanks for the heads up.


    EDIT: My 980X has the same issue :-(

    <core_client_version>6.4.5</core_client_version>
    <![CDATA[
    <message>
    app_version download error: couldn't get input files:
    <file_xfer_error>
    <file_name>wcg_hcmd2_maxdo_6.15_windows_intelx86 </file_name>
    <error_code>-120</error_code>
    <error_message>signature verification failed</error_message>
    </file_xfer_error>

    </message>
    ]]>
    Last edited by Jaco; 03-01-2011 at 02:43 AM.
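
For anyone checking several machines for the same failure, the error report quoted above can be picked apart with a few lines of Python. This is only a rough sketch: the `REPORT` text is copied from the post, while `parse_xfer_errors` is a hypothetical helper written for illustration and is not part of BOINC or the WCG client. It uses regular expressions rather than an XML parser because the pasted fragment is not a complete XML document.

```python
import re

# Error report as pasted in the post above.
REPORT = """<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>wcg_hcmd2_maxdo_6.15_windows_intelx86 </file_name>
<error_code>-120</error_code>
<error_message>signature verification failed</error_message>
</file_xfer_error>

</message>
]]>"""


def parse_xfer_errors(report):
    """Return one dict per <file_xfer_error> block (hypothetical helper)."""
    errors = []
    # Grab the body of every <file_xfer_error>...</file_xfer_error> block.
    for block in re.findall(r"<file_xfer_error>(.*?)</file_xfer_error>", report, re.S):
        # Collect the simple <tag>value</tag> pairs inside the block.
        fields = dict(re.findall(r"<(\w+)>(.*?)</\1>", block, re.S))
        errors.append({
            "file_name": fields.get("file_name", "").strip(),
            "error_code": int(fields.get("error_code", "0")),
            "error_message": fields.get("error_message", "").strip(),
        })
    return errors


errors = parse_xfer_errors(REPORT)
for e in errors:
    print(e["file_name"], e["error_code"], e["error_message"])
```

A "signature verification failed" message on the application file itself, rather than on a work unit, fits the guesses in this thread that the problem lies in how the update was signed and not in the crunching hardware.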

  3. #3
    Registered User
    Join Date
    Feb 2009
    Posts
    470
    I had a strange thing happen to me two days ago. I crunch on my lappie, and when I got back the lappie was idling. WCG told me I had reached the quota and all my WUs had errored (10 or so), and it went on for like 20 hours.


    Tell it it's a :banana::banana::banana::banana::banana: and threaten it with replacement

    D_A on an UPS and life

  4. #4
    Back from the Dead
    Join Date
    Oct 2007
    Location
    Stuttgart, Germany
    Posts
    6,602
    Epic fail across the board here, all queues effed up.
    Thanks for posting though, I thought at first that maybe something was wrong with my SR-2
    World Community Grid - come join a great team and help us fight for a better tomorrow!


  5. #5
    Xtreme Cruncher
    Join Date
    Sep 2007
    Location
    PA, USA
    Posts
    1,504
    Dang, wish I had checked here sooner. Checked my email this morning before going to work (don't normally) and saw that my PC was idle and all WUs had errored out. Figured my overclock had become unstable or something, so I haven't been crunching all day. At least I only lost 50+ WUs. At least I didn't lose the rest of the night running Prime95.
    XS WCG Rules: #1: don't pull fart_plume's finger #2: Dave aka Movieman, don't give him your phone number if you like your hearing
    XS WCG Note: There are 2 sets of points, WCG and Boinc. WCG = 7x Boinc

    Project: Dark Matter (<- link) - Asus Maximus II Formula, Intel X3330 3.4ghz @1.32v under load, corsair ddr2 1066 8gigs, evga gtx260 core 216, pc p&c 750W, EK Supreme HF Nickel, iandh 175 res, Swiftech MCP355, Black Ice GTX G2 240, Lian Li v1200b

    silverstone tj07 build log


  6. #6
    Xtreme crazy bastid
    Join Date
    Apr 2007
    Location
    On mah murder-sickle!
    Posts
    5,878
    Not a blip here. George just luck I guess.


  7. #7
    Xtreme Member
    Join Date
    Jun 2010
    Location
    Crab Nebula
    Posts
    493
    Originally posted by D_A:
    Not a blip here. George just luck I guess.
    None here either. *Big sigh of relief* Wonder why only some were hit?



    You'll never know what you're living for until you know what you're willing to die for.

  8. #8
    Xtreme crazy bastid
    Join Date
    Apr 2007
    Location
    On mah murder-sickle!
    Posts
    5,878
    Something to do with how the security certificates are handled at a guess, but that's about as far as I can work it out.


  9. #9
    c[_]
    Join Date
    Nov 2002
    Location
    Alberta, Canada
    Posts
    18,728
    got hit with that last night, it even managed to crash utorrent when it did it somehow.

    All along the watchtower the watchmen watch the eternal return.

  10. #10
    Worlds Fastest F5
    Join Date
    Aug 2006
    Location
    Room 101, Ministry of Truth
    Posts
    1,615
    Yup, it got me as well and trashed 2 machines' queues :/

    I now have over 60 pages pending as a result, and my 2 most productive machines are no longer trusted to process DDDT2 WUs (no WUs available for this project).
    X5670 B1 @175x24=4.2GHz @1.24v LLC on
    Rampage III Extreme Bios 0003
    G.skill Eco @1600 (7-7-7-20 1T) @1.4v
    EVGA GTX 580 1.5GB
    Auzen X-FI Prelude
    Seasonic X-650 PSU
    Intel X25-E SLC RAID 0
    Samsung F3 1TB
    Corsair H70 with dual 1600 rpm fan
    Corsair 800D
    3008WFP A00



  11. #11
    Nanoseconds from Permaban
    Join Date
    Mar 2008
    Location
    Del City, OK
    Posts
    2,859
    What project are the errors occurring on?

  12. #12
    c[_]
    Join Date
    Nov 2002
    Location
    Alberta, Canada
    Posts
    18,728
    everything.

    It's a security key bug. Notice the "6.40" at the end of the "application" name? That's the new key version; everything that crashed was 6.12 or something.


  13. #13
    Xtreme Cruncher
    Join Date
    Oct 2008
    Location
    Chicago, IL
    Posts
    840
    How come there are so many problems with reliability of the WCG network? It seems like regularly we hear about a failed file server, corrupt file allocation table, etc.

  14. #14
    Linux those diseases
    Join Date
    Mar 2008
    Location
    Planet eta pie
    Posts
    2,930
    Wonder if they started 'testing' this on the 25th, as that's when I started having multiple HCC errors as described with all my Linux rigs. Thought it could be a virus. Seems to have sorted itself out now.

  15. #15
    Xtreme crazy bastid
    Join Date
    Apr 2007
    Location
    On mah murder-sickle!
    Posts
    5,878
    Originally posted by josh1980:
    How come there are so many problems with reliability of the WCG network? It seems like regularly we hear about a failed file server, corrupt file allocation table, etc.
    Well, given that there are several hundred thousand users requesting and returning work daily, I'd think they don't do too badly. Yes, they might be able to do better in absolute terms, but that's a lot of database transactions and sooner or later you're going to get a bad something no matter what you do.


  16. #16
    Xtreme Cruncher
    Join Date
    Jun 2007
    Location
    SK, Canada
    Posts
    836
    4 out of 6 of my farm rigs blew their caches out. It appears to be random, as all 6 are identical software-wise and very similar in hardware. Type of WU didn't matter either, as I lost queues of DDDT2, CEP2 and HFCC alike. Didn't lose much crunch time, and the rigs are back on the reliable list and receiving repair WUs, so nothing to get too upset about here.





  17. #17
    Xtremely Retired OC'er
    Join Date
    Dec 2006
    Posts
    1,084
    Agree, don't press update!

  18. #18
    c[_]
    Join Date
    Nov 2002
    Location
    Alberta, Canada
    Posts
    18,728
    Originally posted by Ego:
    Agree, don't press update!
    The fix is to press update to manually flush the cache.


  19. #19
    Xtreme crazy bastid
    Join Date
    Apr 2007
    Location
    On mah murder-sickle!
    Posts
    5,878
    That's kind of the point. If you DON'T hit "update" then it will still happen, just at some time that will probably be even less convenient. Better to get it over and done with when you're there to get it sorted.


  20. #20
    Xtreme Enthusiast
    Join Date
    Jun 2008
    Location
    Northern Ohio
    Posts
    664
    I can't tell if I am getting hit by this or if I have an unstable overclock. I have 30 WUs that have just errored on a machine I just put up. Its first two days' worth of units were valid, and now all of the new ones are showing up as errored.

    The error:
    autogrid4: WARNING: I just prevented an attempt to take the arccosine of 1, a value greater than 1.

    That sounds COMPLETELY different from what you guys are talking about, and seems like a stability error. HOWEVER, I am looking at the quorum for the WUs and there seem to be a lot of other people getting errors on the same units.

    I am at work right now so I can't take that machine down, but I am going to take it out and do stability tests. Very sad about 30 errored WUs; it seemed completely stable before.

    EDIT:
    Every single WU that I errored out had another person error on it, unless I was the only one who had finished. Also, about half of my errors look like they errored out instantly with an out-of-memory error. I have 4 GB of memory on the machine so that is a surprise to me. Do not like.
    Last edited by Tom128; 03-03-2011 at 01:30 PM.
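
The autogrid4 warning quoted above ("the arccosine of 1, a value greater than 1") reads like a floating-point rounding case: a value that prints as 1 but is actually a hair above it, which is outside the domain of arccosine. The Python sketch below only illustrates that general effect and the usual clamping workaround. It is not autogrid4's code, and `safe_acos` is a hypothetical helper.

```python
import math


def safe_acos(x):
    """Clamp x into [-1, 1] before calling acos (hypothetical helper)."""
    return math.acos(max(-1.0, min(1.0, x)))


# Smallest double above 1.0; at six significant digits it still prints as "1".
slightly_over = 1.0 + 2.0 ** -52
print(f"{slightly_over:.6g}")

# Plain acos rejects the value because it lies outside [-1, 1].
try:
    math.acos(slightly_over)
    raised = False
except ValueError:
    raised = True
print(raised)

# The clamped version treats it as exactly 1.0.
print(safe_acos(slightly_over))
```

The message alone does not say whether the out-of-range value came from the work unit's data or from a hardware miscalculation, which is why checking the other results in the quorum, as done above, is a reasonable next step.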


    Work/Game System - ~24/7 WCG
    ASUS P8P67 PRO / i7 2600k @ 4.1Ghz / Gigabyte Radeon HD5870 / 4x4GB Corsair Vengeance @ 1600Mhz 9-9-9

    HTPC -~24/7 WCG
    Gigabyte GA-Z68AP-D3 / i7 2600k @ 4.0Ghz / Sapphire Radeon HD5830 / 2x2GB Mushkin Enhanced Essentials @ 1333Mhz 9-9-9

    XS WCG Team Forum - http://www.worldcommunitygrid.org/

  21. #21
    Registered User
    Join Date
    Feb 2009
    Posts
    470
    Only thing I would try is check temps, downclock to stock and see what happens. If it persists it's likely going to be the WU; if not, OC as far as it's stable.



  22. #22
    Xtreme Enthusiast
    Join Date
    Jun 2008
    Location
    Northern Ohio
    Posts
    664
    Well, my system was unstable. Apparently very unstable. I've used a program called K10Stat to clock my Phenom IIs from Windows for a few years now, as it's much faster than using the BIOS. Long story short, it does not seem to like my X6s, perhaps just on that motherboard and/or its BIOS, but the voltages it was applying and reporting were pretty far off, around 0.05v from the settings I was using. Very confused, because it was passing Prime95 for a few hours, and it did 35K+ WCG points without issue and THEN completely bombed out.

    I stopped using it and went back to good old-fashioned BIOS clocking and have it running so far at 3.5GHz @ 1.25v. I'm going to make sure it's ultra super deluxe stable before I put it back in.

    Now the question is what kind of impact the ~15 errored and ~15 insta-canceled WUs are going to have on what WCG will send the machine once I put it back in.



  23. #23
    Xtreme Enthusiast
    Join Date
    Sep 2005
    Location
    Louisiana
    Posts
    1,039
    Originally posted by Tom128:

    Now the question is what kind of impact the ~15 errored and ~15 insta-canceled WUs are going to have on what WCG will send the machine once I put it back in.
    Not much, if any.

  24. #24
    Xtreme Cruncher
    Join Date
    Apr 2007
    Location
    Western Canada
    Posts
    1,004
    Originally posted by Ego:
    Agree, don't press update!
    Had the same problem on two machines; the only way I could get the client/server to start talking again was to do an update. Working fine since then.

  25. #25
    Xtreme Enthusiast
    Join Date
    Jun 2008
    Location
    Northern Ohio
    Posts
    664
    Originally posted by David_L6:
    Not much, if any.
    Seems correct. For about an hour the machine got zero work units, and then the floodgates opened and I got several dozen.


