Message boards : Number crunching : Report Problems with Rosetta Version 5.25
Author | Message |
---|---|
NJMHoffmann Send message Joined: 17 Dec 05 Posts: 45 Credit: 45,891 RAC: 0 |
I'm having the same problems with Rosetta@Home "hanging" (it shows "running" but the CPU is at 0%). You don't happen to have Boinc alpha 5.5.10 running? Norbert |
Tino Ruiz Send message Joined: 12 Oct 05 Posts: 13 Credit: 397,392 RAC: 0 |
No, just the regular Rosetta@Home. |
Send message Joined: 1 Apr 06 Posts: 26 Credit: 176,432 RAC: 0 |
Hi, FRA_t322_CASPR_hom001_6_t322_3_1u1zA_IGNORE_THE_REST_341_1079_74_0 did the same on one of my systems. 20.153% progress; running; cpu idle. Shutting down Boinc and then restarting it caused rosetta to start running a second WU and to show FRA_t322_CASPR_hom001_6_t322_3_1u1zA_IGNORE_THE_REST_341_1079_74_0 as preempted ... Running Boinc 5.4.11 |
Tino Ruiz Send message Joined: 12 Oct 05 Posts: 13 Credit: 397,392 RAC: 0 |
So...what does that mean? Sorry if I'm being stupid, but is there anything that can be done? I've tried everything I could, resetting the project etc. Nothing works. If I have to keep aborting all workunits like this I'm afraid I'm not much use to the project and I'll probably have to quit. :-( |
Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
MonsterTruck, I see you are running Linux and BOINC 5.4.9. How do you have your General preferences set for "run while PC is in use"? And for "keep in memory while preempted"? Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
Tino Ruiz Send message Joined: 12 Oct 05 Posts: 13 Credit: 397,392 RAC: 0 |
Feet1st: I have BOINC set to run 24/7 (which it does; my PC is up 24/7), and the unit is set to be removed from memory when BOINC switches to another project. The interval is 1 hour; then it moves to the next project, and so on. |
NJMHoffmann Send message Joined: 17 Dec 05 Posts: 45 Credit: 45,891 RAC: 0 |
"Feet1st: I have BOINC set to run 24/7 (which it does, my PC is up 24/7), the unit is set to be removed from memory when it switches to another project. Interval is 1 hour, then it moves to the next project and so on." Rosetta still :-( has WUs with checkpoints more than 1 hour apart. So you should either leave the app in memory, use a bigger interval, or use a BOINC version that waits for a checkpoint before switching (e.g. 5.5.13). Norbert |
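Norbert's point about checkpoint spacing can be illustrated with a toy model. This is only a sketch under stated assumptions, not how the BOINC client actually schedules work: the function name, the fixed checkpoint spacing, and the rule that a task removed from memory resumes from its last checkpoint are all simplifications made up for illustration.

```python
def completed_minutes(total_minutes, switch_interval, checkpoint_gap, leave_in_memory):
    """Toy model: minutes of work that survive repeated project switches.

    Assumptions (hypothetical, for illustration only):
    - the task is preempted after every `switch_interval` minutes of run time;
    - it checkpoints exactly every `checkpoint_gap` minutes of progress;
    - if it is removed from memory on preemption, it later resumes from its
      most recent checkpoint and loses everything after it.
    """
    progress = 0  # minutes of work that currently count
    elapsed = 0   # minutes of CPU time spent on the task so far
    while elapsed < total_minutes:
        run = min(switch_interval, total_minutes - elapsed)
        elapsed += run
        progress += run
        if not leave_in_memory:
            # Roll back to the last checkpoint when the task is unloaded.
            progress -= progress % checkpoint_gap
    return progress


if __name__ == "__main__":
    # 4 hours of CPU time, checkpoints 90 minutes apart.
    print(completed_minutes(240, 60, 90, leave_in_memory=False))
    print(completed_minutes(240, 120, 90, leave_in_memory=False))
    print(completed_minutes(240, 60, 90, leave_in_memory=True))
```

In this model a 60-minute switch interval with 90-minute checkpoints never reaches a checkpoint, so no work survives, while a 120-minute interval or leaving the app in memory does make progress. That matches the three workarounds suggested above, though the real client's behavior is more involved.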
Tino Ruiz Send message Joined: 12 Oct 05 Posts: 13 Credit: 397,392 RAC: 0 |
Oh, I see...thanks! :-D Hopefully that'll fix it. |
Send message Joined: 11 Oct 05 Posts: 153 Credit: 4,387,904 RAC: 23 |
I have been reporting this problem for over a week, but nothing has come of it yet; they are still mucking around with the crediting system, I suppose. I have both Windows and Linux machines. The Windows ones are not having the problem (only 2 WUs have stuck), but the Linux machines virtually can't process a WU. Both are Opteron machines, and I changed the preferences from 90-minute swap times to 60 minutes, but this has made no difference. WUs stop doing anything: the CPU goes idle, but the Boinc Manager says the WU is still running and will not switch to another project (I have more than just Rosetta on each computer) despite the preference setting; it stays locked to the WU and does nothing. Suspending/resuming does nothing and restarting Boinc does nothing; a machine reboot is the only way to restart the unit, which usually then errors out. The current WUs I am doing on my AMD Opteron 275 machine have all failed with computation errors or I have aborted them (about 10, I think, over the last 2 days); none have been successful. I am not using the latest 5.5.9, or whatever the current version is; I am using the previous version or versions. All my other projects are having no problems. I have preferences set to remove from memory due to the other projects I am running across all my machines. |
Send message Joined: 7 Oct 05 Posts: 234 Credit: 15,020 RAC: 0 |
Sorry, I forgot to report this WU: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=28166866 Result: https://boinc.bakerlab.org/rosetta/result.php?resultid=32544227 It crashed when I opened the graphics, as I remember. "I'm trying to maintain a shred of dignity in this world." - Me |
BennyRop Send message Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0 |
Conan: Does the problem on your Linux machines go away if you switch to "leave app in memory"? If so.. you can always change the mix of the projects on those two machines so that Rosetta is on only one machine, with the "leave app in memory" and limit the number of other projects running on the second machine. i.e. instead of two machines giving Rosetta 25% - have one machine giving Rosetta 50%. And keep mentioning the "app in memory" problem every so often, until it gets fixed. |
TCU Computer Science Send message Joined: 7 Dec 05 Posts: 28 Credit: 12,861,977 RAC: 0 |
"Does the problem on your Linux machines go away if you switch to "leave app in memory"?" I leave the app in memory and still have the problem, but it doesn't occur very often. Boinc Manager says Rosetta is running, but the CPU Time is not increasing and the CPU is idle. Usually when the problem occurs and I stop the Boinc process, the Rosetta process remains in the process list. I have to manually kill it or reboot the machine. I have seen the problem on Mac OS X and Linux (CentOS 4.3) but never under Windows. It has occurred on machines running only Rosetta and on machines running Rosetta and Einstein, but it only affects the Rosetta app. |
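The symptom described here (status "running" while CPU time stays flat) can be checked mechanically once CPU-time readings are collected. The following is a minimal sketch: the function name and the threshold are hypothetical, and the readings are assumed to be taken by hand or by a script at regular intervals, for example from the CPU time column in Boinc Manager.

```python
def looks_hung(cpu_seconds_samples, min_increase=1.0):
    """Return True if accumulated CPU time never grew by at least
    `min_increase` seconds between any two consecutive readings.

    `cpu_seconds_samples` is a list of accumulated CPU-time readings for one
    task, oldest first. With fewer than two readings there is nothing to
    compare, so the function returns False.
    """
    if len(cpu_seconds_samples) < 2:
        return False
    pairs = zip(cpu_seconds_samples, cpu_seconds_samples[1:])
    return all(later - earlier < min_increase for earlier, later in pairs)


if __name__ == "__main__":
    # Hypothetical readings taken a minute apart.
    print(looks_hung([5120.0, 5120.0, 5120.0]))  # flat CPU time
    print(looks_hung([5120.0, 5179.4, 5238.9]))  # CPU time still growing
```

A flat series only suggests a hang; a task that is legitimately suspended or preempted would look the same, so the task's reported state needs to be checked alongside it.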
Send message Joined: 11 Oct 05 Posts: 153 Credit: 4,387,904 RAC: 23 |
"Conan: Does the problem on your Linux machines go away if you switch to "leave app in memory"? If so.. you can always change the mix of the projects on those two machines so that Rosetta is on only one machine, with the "leave app in memory" and limit the number of other projects running on the second machine." Thanks, BennyRop, for your reply. It is impractical to change the project mix at this stage. I was having problems with leaving the app in memory, as it was causing problems with one of my Windows machines that also runs Rosetta. The preferences are not computer-specific, so they have to change for all machines. With about 5 projects on 3 of my machines, leaving the app in memory (I was leaving all apps in memory at one stage) caused problems. I could try it again though. The problem does not affect my Windows machine, only the Linux machines. Rosetta is the only app affected; Einstein, Cp, QMC and Predictor are all OK. Ralph is not affected as much as Rosetta. |
MikeMarsUK Send message Joined: 15 Jan 06 Posts: 121 Credit: 2,637,872 RAC: 0 |
I believe there is an <something>_override.xml file in the boinc directory which can be edited to change the settings on one specific machine (i.e., 10 machines would use the web settings, and one would use the settings from the override file). But I've never looked into it myself (no need), so can't do more than point you in the general direction. |
Marky-UK Send message Joined: 1 Nov 05 Posts: 73 Credit: 1,689,495 RAC: 0 |
"I believe there is an <something>_override.xml file in the boinc directory which can be edited to change the settings on one specific machine (i.e., 10 machines would use the web settings, and one would use the settings from the override file)." Details of the global_prefs_override.xml file can be found here: http://boinc.berkeley.edu/prefs_override.php It needs BOINC 5.4 or later. |
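For anyone who wants to try the per-machine override, here is a small sketch that builds the contents of a global_prefs_override.xml. The element names `leave_apps_in_memory` and `cpu_scheduling_period_minutes` are given as recalled from the BOINC preferences format and should be checked against the page linked above; the helper function and its defaults are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_override(leave_in_memory=True, switch_minutes=120):
    """Return override XML as a string.

    Assumption: the element names below match what the BOINC client expects
    in global_prefs_override.xml; verify them against the BOINC documentation
    before relying on this.
    """
    root = ET.Element("global_preferences")
    # "1" keeps preempted apps in memory, "0" removes them.
    ET.SubElement(root, "leave_apps_in_memory").text = "1" if leave_in_memory else "0"
    # How long the client runs one project before switching, in minutes.
    ET.SubElement(root, "cpu_scheduling_period_minutes").text = str(switch_minutes)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_override())
```

The output would then be saved as global_prefs_override.xml in the BOINC directory on the one machine that needs different settings, leaving the web preferences untouched for the others. The client presumably has to re-read its preferences or be restarted before the change takes effect.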
BennyRop Send message Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0 |
I'm the second one to error out on these two WUs. The first: 1c9oA_BOINC_BACKBONE_HN_PENALTY_ABRELAX_SAVE_ALL_OUT__1175_75 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=29000987 We both get errors at line 401: <core_client_version>5.4.9</core_client_version> <message> Yanlış işlev. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # random seed: 1690956 # cpu_run_time_pref: 10800 ERROR:: Exit at: .dock_structure.cc line:401 The second: FRA_t367_CASPR_hom001_6_t367_4_1wolA_IGNORE_THE_REST_568_1076_12 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=28126307 Both results for this one error out at line 1860 of a different module: <core_client_version>5.4.9</core_client_version> <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # random seed: 2331689 # cpu_run_time_pref: 10800 ERROR:: Exit at: .pack.cc line:1860 |
BennyRop Send message Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0 |
Here's another: 1tit__BOINC_BACKBONE_HN_PENALTY_ABRELAX_SAVE_ALL_OUT__1175_470 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=29074487 <core_client_version>5.4.11</core_client_version> <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # random seed: 1660561 # cpu_run_time_pref: 21600 # cpu_run_time_pref: 7200 ERROR:: Exit at: .dock_structure.cc line:401 |
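Reports like the ones above are easier to compare when the exit location is pulled out of the stderr text. The sketch below assumes the `ERROR:: Exit at: .<file> line:<n>` format seen in these three excerpts; the regular expression and function name are illustrative and not part of Rosetta or BOINC.

```python
import re
from collections import Counter

# Matches e.g. "ERROR:: Exit at: .dock_structure.cc line:401"
EXIT_RE = re.compile(r"ERROR:: Exit at: \.?(?P<file>[\w.]+) line:(?P<line>\d+)")


def exit_location(stderr_text):
    """Return (source_file, line_number) from a stderr excerpt, or None."""
    match = EXIT_RE.search(stderr_text)
    if match is None:
        return None
    return match.group("file"), int(match.group("line"))


if __name__ == "__main__":
    # Excerpts copied from the three reports in this thread.
    reports = [
        "# random seed: 1690956 # cpu_run_time_pref: 10800 "
        "ERROR:: Exit at: .dock_structure.cc line:401",
        "# random seed: 2331689 # cpu_run_time_pref: 10800 "
        "ERROR:: Exit at: .pack.cc line:1860",
        "# random seed: 1660561 # cpu_run_time_pref: 21600 "
        "# cpu_run_time_pref: 7200 ERROR:: Exit at: .dock_structure.cc line:401",
    ]
    counts = Counter(exit_location(text) for text in reports)
    for location, count in counts.most_common():
        print(location, count)
```

Grouping by file and line makes it clear that two of the three failures share the same exit point in dock_structure.cc, which may be useful to mention when reporting them.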
Tino Ruiz Send message Joined: 12 Oct 05 Posts: 13 Credit: 397,392 RAC: 0 |
Great news guys: it appears that the "CPU hanging issue" has disappeared for me. I've set the interval to 120 minutes (2 hours), and that seems to have worked so far. :-) |
BennyRop Send message Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0 |
Congratulations, MonsterTruck.. :) |
Send message Joined: 19 Sep 05 Posts: 403 Credit: 537,991 RAC: 0 |
Two errors with "Incorrect function. (0x1) - exit code 1 (0x1)": https://boinc.bakerlab.org/rosetta/result.php?resultid=33558610 https://boinc.bakerlab.org/rosetta/result.php?resultid=33525876 Anders n |
©2025 University of Washington
https://www.bakerlab.org