Message boards : Number crunching : What's with all the errors???
Author | Message |
---|---|
Sid Celery Joined: 11 Feb 08 Posts: 2219 Credit: 42,280,090 RAC: 24,002 |
I know I made the claim, but it's an assumption until Roger confirms it's the same in the real world. A couple of exception errors on one machine, but otherwise still looking good from here. |
FernValleyIT Joined: 1 Dec 05 Posts: 7 Credit: 84,334 RAC: 0 |
I know I made the claim, but it's an assumption until Roger confirms it's the same in the real world. Not sure if those exceptions were from past or present. I'm showing 33 successes without error. Looking good to me. Thanks again to everyone. |
Joined: 29 Dec 07 Posts: 3 Credit: 65,405,031 RAC: 0 |
I am not getting any errors, but since the new year I have been getting WUs that should take around 3 hours running for 120 hours or more if I don't catch them and squash them. I have a bunch of boincers (different types, different operating systems, different BOINC versions) and they are all acting up this way. Anyone else having this problem? |
mikey Joined: 5 Jan 06 Posts: 1896 Credit: 10,138,586 RAC: 20,966 |
I am not getting any errors, but since the new year I have been getting WUs that should take around 3 hours running for 120 hours or more if I don't catch them and squash them. WOW, some of your machines have many, many day caches on them! Anyway, do you have the setting checked to leave units in memory when swapping? It is under Your Account, Computing Preferences, in the top section, and says "Leave applications in memory while suspended? (suspended applications will consume swap space if 'yes')". If you have it set to No, change it to Yes and see if the problems clear up. |
Joined: 29 Dec 07 Posts: 3 Credit: 65,405,031 RAC: 0 |
I am not getting any errors, but since the new year I have been getting WUs that should take around 3 hours running for 120 hours or more if I don't catch them and squash them. I'll check that out, thanks. The weird thing is that before the Rosetta servers went down over the holidays they all worked fine; the problem started after the Rosetta servers came back online. |
Snags Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
I am not getting any errors, but since the new year I have been getting WUs that should take around 3 hours running for 120 hours or more if I don't catch them and squash them. 120 hours!? Are you sure you are not looking at elapsed (wall clock) time rather than CPU time? At some point BOINC Manager began displaying wall clock time in addition to CPU time, and I think in some configurations of the display only elapsed time is visible. Snags |
Joined: 29 Dec 07 Posts: 3 Credit: 65,405,031 RAC: 0 |
I am not getting any errors, but since the new year I have been getting WUs that should take around 3 hours running for 120 hours or more if I don't catch them and squash them. Nope, actual CPU time. Most of the time I just abort them, but I found that if I suspend the WU and then resume it, it starts running normally. I have watched some of the ones that have stalled, and the progress is .001% every few minutes on a C2Q machine. At the same time, the other WUs on that machine are progressing at a normal rate. |
Evan Joined: 23 Dec 05 Posts: 268 Credit: 402,585 RAC: 0 |
Nope, actual CPU time. I am running two of the cl1... and found that they were going for over ten hours while the CPU time said 2 and 4 hours respectively. After restarting BOINC they are running from their last checkpoint and, ironically, both are working on model 10. |
Evan Joined: 23 Dec 05 Posts: 268 Credit: 402,585 RAC: 0 |
After I restarted these cl1... work units, they finished correctly at just over 6 hours. |
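
For anyone wanting to spot this kind of stall across several machines, here is a minimal Python sketch of the check Snags and Evan describe: compare CPU time with elapsed (wall-clock) time and flag tasks where the CPU time lags far behind. The `Task` record, the task names, and the 0.5 threshold are all assumptions made up for illustration; nothing here reads real BOINC data, so the numbers would need to be copied in by hand or from whatever export you have.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str             # hypothetical label, not a real work unit name
    cpu_hours: float      # CPU time reported for the task
    elapsed_hours: float  # wall-clock time since the task started


def cpu_efficiency(task: Task) -> float:
    """Fraction of wall-clock time the task actually spent on the CPU."""
    if task.elapsed_hours <= 0:
        return 1.0  # nothing has elapsed yet, so treat the task as healthy
    return task.cpu_hours / task.elapsed_hours


def find_stalled(tasks, min_efficiency=0.5):
    """Return names of tasks whose CPU time lags far behind elapsed time."""
    return [t.name for t in tasks if cpu_efficiency(t) < min_efficiency]


# Figures loosely modelled on the post above: over ten hours elapsed,
# with CPU time showing 2 and 4 hours. The third task is a healthy one.
tasks = [
    Task("task_a", cpu_hours=2.0, elapsed_hours=10.0),
    Task("task_b", cpu_hours=4.0, elapsed_hours=10.0),
    Task("task_c", cpu_hours=2.9, elapsed_hours=3.0),
]
print(find_stalled(tasks))
```

A task flagged this way is a candidate for the suspend/resume or restart workaround mentioned in the thread. The threshold is arbitrary and would likely need tuning on machines that are shared with other work.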
©2025 University of Washington
https://www.bakerlab.org