Check systems please, reports of high failure rate with some new 4.97 HBLR_x.x WU's :mad:
Rosetta Link
Another new version of rosetta already? I thought 4.83 was only released like last week lol...
Looks like I have about 40 of those coming up soon - I hope to god you're wrong :(
Yikes !!
My X2 box is loaded with them starting in about 8 units :(
hmm, I'll see what the first one does
Quote:
Originally Posted by gpcola
Since most people had some 4.83 WU's waiting in line, most errors start popping up right now when they start crunching the new ones... and it doesn't look like they'll be able to fix the problem on short notice :(
Code:
April 7, 2006
The rosetta application was updated to include some new scientific code.
The application version numbers have been changed to be consistent with Ralph@home.
Since error rates have decreased significantly, we decided to increase the default cpu run time to 4 hours.
The previous default was 2 hours.
woke up, and had not gained any credits.
Looked at jobs, 8 had failed, using my CPU all night. #""!!!"#?`?"#¤#"arrrrrrrrrrghhh ffffffffffffffffffffffffffffffffffffffffffffffff
I had claimed 300 credits, didn't get one ####=?)==)(/&/%&%&#%#¤&#¤#¤/aaaaaaaaaaaaaaaaaaargggggggh.
Looked at the time, it was up to 4 hours, all lost.
Looked for errors. arrrghhhhhhh
I'm really upset here fffffffff########"""""##
Ok so they just screwed it.
I'm calm, not to worry.
Easy now, yes easy now....... ok i'm totally calm........
Just checked this laptops messages, looks like every 4.97 WU failed :(
My Linux rig seemed to only have 1 4.97 WU fail though
every box has em here. and every box has been erroring since they started.
this is bs! i run a lot of machines and pay a lot of money for electricity. i wish these people would wake up to the fact that when they waste people's resources they waste their tolerance and patience.
i aborted all on 4 boxes. hit "retry communications" and just get a bunch more of the same ones.
jeez, this sh*t pisses me off.
I had just begun to feel the hunger for roasted beef, and now i get sick cows.
DAAAAAAAMMMNNNNN
Hmmm, seems I've crunched two of these already without realising - one on my X2 and the other on a P4 - both completed without fault.
[edit] actually maybe not - crunching an HBLR_1.1 right now on the P4 but it's a v4.83 not 4.97.
On a more serious note, I have stopped it, and am awaiting some confirmation of error-free crunching.
Damn, I lost all overnight WUs to errors and thought it was my RAM dying - but SP2004 / SuperPi checked out ok this morning. Yes, they're HBLR_* units... http://boinc.bakerlab.org/rosetta/re...?hostid=190981
Got 2 more running on that box now, fingers crossed.
Wow, I've still got about 20 units till I get to the HBLR_1.* units, should I abort them just in case? It'll take this slow pos a few days/weeks to get to them, hopefully there will be a fix.
serlv, you're right.
Quote:
Originally Posted by serlv
They should do some error check, before they send new stuff out, we are using electricity here.
two boxes down and off.
HBLRs aborted, queue empty ( aborted all HBLRs queued ) and i don't feel like DLing more garbage work that will error.
On the other two, I aborted all queued work, suspended network communications and if the two that are running error, then they get shut down, too.
Now on to the other nine *&(^$$#!!!!
Somebody PM me when they get this fixed.
Looks like I'm probably going to be shutting down all machines but two. I would have two on even if I didn't run DC projects.
I still have some 4.83 WU's on two rigs... but I'm not shutting down when I get to the 4.96 units... I am however switching all rigs over to 1 hour runs, some 4.96 WU's will surely finish with a limited runtime of 1 hour.
They shouldn't have said that :p:
Quote:
The rosetta application was updated to include some new scientific code. The application version numbers have been changed to be consistent with Ralph@home. Since error rates have decreased significantly, we decided to increase the default cpu run time to 4 hours. The previous default was 2 hours.
I have one box with some 4.83s left. The one that failed lots of 4.97s last night just failed another after an hour, I'm aborting and abandoning 4.97s on all boxes until this gets fixed. (I do 4 hour runs, not gonna change that). One failed after 61 seconds. O_O
Bummer. :mad:
Thus far my first two failed, but after that I haven't had any failures... hopefully I don't have to deal with a lot of problems before that gets fixed.
Ok, running a 4.97 one more time. If this fails, I'll spare the electricity bill until further notice.
I have the same problems, was wondering what the hell was going on with my opty. Nothing but errors now while yesterday it was clocked higher at same vcore (testing what rosetta could take) and all was well.
built a new opty 165 box yesterday and here i thought it was the box, all the 4.97 WU's errored out. whew, it wasn't just mine, i was going to have to troubleshoot
Lucky you, I did troubleshoot, for no reason at all.
Quote:
Originally Posted by odb
That makes you twice as mad. :mad: :D
from the rosetta forum, doesn't sound to me like it'll be fixed right away
Quote:
Originally Posted by David Baker
Well I can go on for only 24 hours on both of my rigs, time for a pause :p