Important message concerning the map predictor application |
Message boards : Malaria Control : Important message concerning the map predictor application
Author | Message |
---|---|
The new version of the mappredictor runs more stably than the first version. We do see quite a few errors, but they are concentrated on a small number of hosts. The error usually occurs right at the beginning of the workunit. |
|
ID: 2945 | Rating: 1 | rate: / | |
As we are under a lot of pressure to deliver these results [...] Hope you don't mind my asking, what's the trouble? I wouldn't mind running the Map Predictor application exclusively on my lonely Win32 host, but I don't have any option to select that to help out. ____________ Scientific Network : 44800 MHz - 77824 MB - 1970 GB |
|
ID: 2949 | Rating: 0 | rate: / | |
Bring em on... |
|
ID: 2955 | Rating: 0 | rate: / | |
Try this:
1. Go to Your Account and select Malariacontrol.net Preferences
2. Set up one of the three venues (Home, Work, School) for Map Predictor only
3. Set your Win32 host to that venue and do a manual update on it
Your host should then complete any outstanding WUs and only download Map Predictor WUs (as available) from then on. HTH |
|
ID: 2958 | Rating: 0 | rate: / | |
As we are under a lot of pressure to deliver these results [...] Deadlines... When we started this one, it all looked so simple. Wrap the binary of the science app, create a few hundred thousand wus, 2 or 3 weeks later everything would be done. This was a few months ago, and there are a few people here who depend on these results. This is why we need a large proportion of the Windows hosts to contribute. If we can deliver, this would be a nice example of volunteer computing solving a hard problem with a relatively small investment (still, despite the unforeseen complications). Nick ____________ Nicolas Maire Swiss Tropical and Public Health Institute http://www.swisstph.ch |
|
ID: 2959 | Rating: 0 | rate: / | |
For those of you who manage separate locations using the 'Combined preferences' view: There's currently a small (cosmetic) issue. You'll find numbers instead of Yes/No in the table summary. Non-zero values mean: Get work for this application, zeros mean: No work for this application. Nick ____________ Nicolas Maire Swiss Tropical and Public Health Institute http://www.swisstph.ch |
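The rule Nick describes for reading the table summary can be sketched in a few lines of Python. This is only an illustration: the function name `summary_label` and the sample row are hypothetical and are not taken from the project's web code.

```python
def summary_label(value: int) -> str:
    """Translate a numeric cell from the combined preferences summary.

    Per the post above: non-zero means "get work for this application",
    zero means "no work for this application".
    """
    return "Yes" if value != 0 else "No"


# Hypothetical row for one venue; the keys and values are made up.
row = {"malariacontrol simulation": 0, "map predictor": 1}
labels = {app: summary_label(value) for app, value in row.items()}
print(labels)
```

Any non-zero number reads as Yes, so the exact value shown in the table does not matter.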
|
ID: 2960 | Rating: 0 | rate: / | |
There are a few wu's where every single host errors out, like this wu. The only issue I have with wu's erroring out is that the thread hangs until you acknowledge the error. During this time no work is completed. Due to this I know of some people who steer away from this project during this time and then have not come back. |
|
ID: 2964 | Rating: 0 | rate: / | |
I'm also trying to remember how much credit was granted per wu last time. Was it 1.8? I would expect the next round of these, which have 2-3 times the execution time, to have 2-3 times the credit. I had a few of those fail last time, but not many. The problem is, as stated above, when they hang, they hang BOINC completely and no work gets done until you click the message box. What I was doing was downloading a number of units, then suspending all the map ones and letting the regular ones run out (time slicing with other projects). When I knew I was going to be sitting at the machine for a while, I'd suspend everything else and run the map wu's exclusively. Then if a problem appeared, I could click the box and away she goes again. I have set my local machines to allow these wu's. The machines that run at my remote site I have disallowed. I only visit that site 1-2 times a week, sometimes less, and don't wish to risk them standing idle. ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
|
ID: 2975 | Rating: 0 | rate: / | |
Bring em on... Right on!....moving our Windoze machines over from R@H to help...Cheers, Rog. ____________ |
|
ID: 2979 | Rating: 0 | rate: / | |
I tried setting the Venue of my Win32 box to "Work" and set it to run Map Predictor exclusively... Seems that this will work out fine indeed, for as long as there is a constant supply of these WorkUnits :) |
|
ID: 2987 | Rating: 0 | rate: / | |
I've set Home, Work and School to be Yes, Yes, Yes, but have not received any of the Map units. 1 machine in Home group, 1 in Work and 2 in School. When they were circulating before, I got them. |
|
ID: 3029 | Rating: 0 | rate: / | |
I've set Home, Work and School to be Yes, Yes, Yes, but have not received any of the Map units. 1 machine in Home group, 1 in Work and 2 in School. When they were circulating before, I got them. Have you tried setting "Run malariacontrol simulation" to No? That works for me. Of course then the map WU are the only ones you get. ____________ BOINC.BE: For Belgians who love the smell of glowing red cpu's in the morning Tutta55's Lair |
|
ID: 3031 | Rating: 0 | rate: / | |
No, I haven't, but reading the original post, particularly... We will make use of the total Windows host population to go through the workunits as quickly as possible. ... it sounded like they were pretty keen to get these things done, so would preferentially send them to people not opting out. ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
|
ID: 3038 | Rating: 0 | rate: / | |
It doesn't work with my computer, I presume. If I am running the mappredictor, I get a message like this every second: |
|
ID: 3044 | Rating: 0 | rate: / | |
It doesn't work with my computer, I presume. If I am running the mappredictor, I get a message like this every second: Are you the only user on this machine? If so, are you running as Admin or just a User? It looks like a permissions issue. ____________ |
|
ID: 3049 | Rating: 0 | rate: / | |
I've set Home, Work and School to be Yes, Yes, Yes, but have not received any of the Map units. 1 machine in Home group, 1 in Work and 2 in School. When they were circulating before, I got them. I set my preferences for my home venue as follows to force my home Windows hosts to only krunch map predictor for now.
Run test applications? Yes
Run malariacontrol simulation application: No
Run malariacontrol test application: Yes
Run map predictor application: Yes
This gives me only mappredictor 5.17 and NO malariacontrol 5.50. Before, I had lots of mc5.50 and few map5.17. Over 100 done in the past 24 hours. Hope this helps expedite things. |
|
ID: 3052 | Rating: 0 | rate: / | |
Shame the wu replication is so high. All the wu's with errors keep getting sent out, even after 2 or 3 errors! Basically all these wu's are not going to work and just hold up the processing of other wu's, both globally and on each host that gets one. |
|
ID: 3068 | Rating: 0 | rate: / | |
Are you the only user on this machine? If so are you running as Admin or just a User? It looks like a permissions issue. I'm the only user, but I am using Windows 98 SE, not XP or Vista... ____________ |
|
ID: 3069 | Rating: 0 | rate: / | |
Shame the wu replication is so high. All the wu's with errors keep getting sent out, even after 2 or 3 errors! Basically all these wu's are not going to work and just hold up the processing of other wu's, both globally and on each host that gets one. The replication is only 2. I can't see them dialing it back to 1. Or am I missing something? Alain posted in another thread yesterday that they have corrected a bug in the WUs themselves. He also says we won't get more buggy WUs which makes me think they deleted all the buggy ones. That should help a lot. Still, I get WUs that have 2 to 4 compute errors against them. 99% of them come from hosts running older versions of BOINC. They crunch error free on my system and validate and yield a canonical result if there is one other successful crunch on them. Then they're done and won't be replicated again. Also, the crunch reports indicate the WUs crash immediately on startup so next to zero CPU time is wasted. I've been sending the following text in a private message to all my quorum partners who have failed a WU due to running an old version of BOINC...
____________ -- |
|
ID: 3071 | Rating: 0 | rate: / | |
Japp, there are two different errors. |
|
ID: 3072 | Rating: 0 | rate: / | |
Japp, there are two different errors. Yeah, but when I get an _4 or a _3 wu the odds are pretty high that the wu is buggy and going to fail out, leaving my computer doing nothing until I acknowledge the error, and since I do not sit at my computer anywhere near 24/7 it could be sitting idle a long time, just like yesterday. Plus there is no way to find the _4 and _3 wu's without having to go through them one by one, which obviously is not something I'm too willing to do. @Dagorath said... The replication is only 2. I can't see them dialing it back to 1. Or am I missing something? Yes, initial replication is 2, but on error they will replicate much higher. That is what the max # of error/total/success results 7, 20, 10 is all about. Alain posted in another thread yesterday that they have corrected a bug in the WUs themselves. He also says we won't get more buggy WUs which makes me think they deleted all the buggy ones. That should help a lot. I will take this on board and not abort any further _3 or _4 wu's. Thanks for the feedback guys. Paul. |
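For anyone who does not want to go through their task list one by one: BOINC result names end in an underscore and a copy number, so a short script can pick out the _3, _4 and higher copies from a list of names. The following is a minimal sketch only. The sample names are invented, the helper names `copy_number` and `likely_resends` are hypothetical, and how you get the list of names out of your client is left open.

```python
import re
from typing import List, Optional

# A result name is assumed to end in "_<copy number>", e.g. "..._3".
COPY_SUFFIX = re.compile(r"_(\d+)$")


def copy_number(result_name: str) -> Optional[int]:
    """Return the trailing copy number of a result name, or None if absent."""
    match = COPY_SUFFIX.search(result_name)
    return int(match.group(1)) if match else None


def likely_resends(result_names: List[str], threshold: int = 3) -> List[str]:
    """Keep only the names whose copy number is at or above the threshold."""
    flagged = []
    for name in result_names:
        number = copy_number(name)
        if number is not None and number >= threshold:
            flagged.append(name)
    return flagged


# Invented example names, not real workunits from this project.
names = [
    "wu_example_map_1001_0",
    "wu_example_map_1002_3",
    "wu_example_map_1003_4",
]
print(likely_resends(names))
```

With an initial replication of 2, copies _0 and _1 are the first pair sent out, so a higher number means at least one earlier copy errored or timed out. It does not prove the wu itself is buggy.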
|
ID: 3074 | Rating: 0 | rate: / | |
Hmm... I don't really have time or will to read through all this but from what I can gather I'm using the latest test version of BOINC which is quite stable at present, and I'm still getting about a 40-50% fail rate with these errors... Am I missing something? |
|
ID: 3075 | Rating: 0 | rate: / | |
Hmm... I don't really have time or will to read through all this but from what I can gather I'm using the latest test version of BOINC which is quite stable at present, and I'm still getting about a 40-50% fail rate with these errors... Am I missing something? No, you're not missing anything. It looks like all the errors you had, had errored out on other hosts as well. So it is nothing to do with you. In fact I've had a wu error out recently as well that all other hosts had errored out on. Based on this, I'm still tempted to abort any wu with an _3 or _4 suffix and will definitely abort any I get that are _5, _6 or _7. |
|
ID: 3079 | Rating: 0 | rate: / | |
Based on this, I'm still tempted to abort any wu with an _3 or _4 suffix and will definitely abort any I get that are _5, _6 or _7. I shall join you in this. Will we continue to get them or are they forever gone now? ____________ |
|
ID: 3081 | Rating: 0 | rate: / | |
I shall join you in this. Will we continue to get them or are they forever gone now? Still krunching. I have 5 running now (3 hosts) and 51 standing by to go. Last download was 5 minutes ago, no wait, make that now: another completed and the host is grabbing another to refresh its queue. So yes, they still have some. |
|
ID: 3082 | Rating: 0 | rate: / | |
I finally had one crash too and got the "application had a problem, want to report this to Microsoft?" popup which halted all crunching which sucks. Can that dirty crunch halting popup be turned off somehow? I don't see a way to turn it off via Windows Control Panel. Maybe tweaking some registry setting turns it off? |
|
ID: 3085 | Rating: 0 | rate: / | |
I finally had one crash too and got the "application had a problem, want to report this to Microsoft?" popup which halted all crunching which sucks. Can that dirty crunch halting popup be turned off somehow? I don't see a way to turn it off via Windows Control Panel. Maybe tweaking some registry setting turns it off? On Windows XP SP2: Control Panel -> System -> Advanced -> Error reporting button (bottom right) -> disable. More detailed instructions: http://support.microsoft.com/kb/310414 ____________ |
|
ID: 3086 | Rating: 10 | rate: / | |
I finally had one crash too and got the "application had a problem, want to report this to Microsoft?" popup which halted all crunching which sucks. Can that dirty crunch halting popup be turned off somehow? I don't see a way to turn it off via Windows Control Panel. Maybe tweaking some registry setting turns it off? Now that is a useful piece of info I didn't know yet. If I could click 100 times on the '+' icon, I would ;) |
|
ID: 3088 | Rating: 0 | rate: / | |
I finally had one crash too and got the "application had a problem, want to report this to Microsoft?" popup which halted all crunching which sucks. Can that dirty crunch halting popup be turned off somehow? I don't see a way to turn it off via Windows Control Panel. Maybe tweaking some registry setting turns it off? I'll second that! |
|
ID: 3089 | Rating: 0 | rate: / | |
Me three! Thanks Kabal :) |
|
ID: 3103 | Rating: 0 | rate: / | |
Me three! Thanks Kabal :) Me four! Thanks from me too! ____________ |
|
ID: 3107 | Rating: 0 | rate: / | |
You are correct when using 5.8.x or later clients. There is one thing to consider though: the client will wait no longer than double the switch interval before forcing a switch. Even in that case it should suspend to memory if the app has never checkpointed. ____________ BOINC WIKI BOINCing since 2002/12/8 |
|
ID: 3109 | Rating: 0 | rate: / | |
Thanks, John. Will it override and suspend to memory even if "leave apps in memory while suspended" setting = no? |
|
ID: 3112 | Rating: 0 | rate: / | |
Thanks, John. Will it override and suspend to memory even if "leave apps in memory while suspended" setting = no? Yes, but only if the app has never checkpointed. ____________ BOINC WIKI BOINCing since 2002/12/8 |
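The two rules John describes can be written down as a small decision sketch. This models only what is stated in the posts above and is not taken from the BOINC client source: the function and parameter names are hypothetical, and forcing the switch at exactly double the interval is an assumption about the boundary case.

```python
def must_force_switch(seconds_in_slice: float, switch_interval_s: float) -> bool:
    """The client waits no longer than double the switch interval."""
    return seconds_in_slice >= 2 * switch_interval_s


def keep_in_memory(leave_apps_in_memory: bool, has_ever_checkpointed: bool) -> bool:
    """Decide whether a preempted app stays in memory.

    An app that has never checkpointed is kept in memory even when the
    "leave apps in memory while suspended" preference is set to no.
    """
    return leave_apps_in_memory or not has_ever_checkpointed


# Example: a 60-minute switch interval, preference set to "no".
print(must_force_switch(seconds_in_slice=7200, switch_interval_s=3600))
print(keep_in_memory(leave_apps_in_memory=False, has_ever_checkpointed=False))
print(keep_in_memory(leave_apps_in_memory=False, has_ever_checkpointed=True))
```

The point of the override is that an app with no checkpoint would otherwise lose all of its progress when removed from memory.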
|
ID: 3113 | Rating: 0 | rate: / | |
|
|
ID: 4297 | Rating: 0 | rate: / | |