Status update -- June 2010 |
Message boards : Malaria Control : Status update -- June 2010
Author | Message |
---|---|
Update June 2010 |
|
ID: 13055 | Rating: 0 | rate: /
|
|
Getting significant errors in v6.41. See thread here. |
|
ID: 13064 | Rating: 0 | rate: /
|
|
Hi, |
|
ID: 13068 | Rating: 0 | rate: /
|
|
thanks for the update. But one question remains: Linux? Hi Michael, Linux installations can vary quite a bit with regard to installed library versions, and unfortunately we're finding it a bit difficult to make Linux binaries that work on all systems. In any case, Linux binaries are available for the openmalariaBeta application, and once we've managed to solve the incompatibility problems, we'll release Linux versions for the non-testing applications too. |
|
ID: 13073 | Rating: 0 | rate: /
|
|
One of the issues I have is that there is very little time for the WU to complete. I usually want to store 2 days' worth of WUs (in case the server is down for maintenance or the internet connection is down). When Malaria Control sees that I share my computer with any other BOINC project, it will dispatch Malaria Control in HIGH PRIORITY because the deadline to return the WU result is too short. |
|
ID: 13340 | Rating: 0 | rate: /
|
|
One of the issues I have is that there is very little time for the WU to complete. I usually want to store 2 days' worth of WUs (in case the server is down for maintenance or the internet connection is down). When Malaria Control sees that I share my computer with any other BOINC project, it will dispatch Malaria Control in HIGH PRIORITY because the deadline to return the WU result is too short. This question has come up a few times. Basically, when we use openmalaria for parameter fitting, we iteratively create work units based on previously completed sets of work units. Thus, if we allow work units more time and sets take longer to complete, our server has to use older completed sets when creating new work, and the fitting becomes less efficient. As far as I understand, however, the BOINC client should still attempt to balance the overall workload between projects according to your settings, although it may do a whole batch from one project before doing a batch from a different project. |
|
ID: 13350 | Rating: 0 | rate: /
|
|
Thanks for the information. It's nice to know that our crunching is contributing to a worthy cause. |
|
ID: 13454 | Rating: 0 | rate: /
|
|
Thanks for the information. It's nice to know that our crunching is contributing to a worthy cause. Yes and maybe: Yes, they're getting ready to test a BOSSA (BOINC-extension for volunteer thinking) project called africaMap. This should be a worthy project, and for this they have the server up and running. In addition, there's a possibility that there will be another BOINC project run out of UCT (currently in early planning). Maybe, I'm not sure the workunits they're currently sending out are of any use. I asked their project admin to switch workunit creation off, should this not be the case. Nick ____________ Nicolas Maire Swiss Tropical and Public Health Institute http://www.swisstph.ch |
|
ID: 13457 | Rating: 0 | rate: /
|
|
Thanks for the prompt response, Nick. |
|
ID: 13460 | Rating: 0 | rate: /
|
|
I don't suppose you can get them to turn the servers back on for a bit? I have several dozen dangling WUs O_0 |
|
ID: 13511 | Rating: 0 | rate: /
|
|
Yes, please turn the project back on long enough for us to report the work in progress. |
|
ID: 13512 | Rating: 0 | rate: /
|
|
Thanks for the prompt response, Nick. I agree with the sentiments expressed by others: that it would (at least) be polite if the UCT : malariacontrol.net admins could switch their site back on, with work creation switched off, to allow users to return work that they have completed. It would also allow users to set their resource share on that project to 0. Nick, you seem to have some influence with the UCT Computer Science Department ... any chance you could use it one more time? ____________ |
|
ID: 13516 | Rating: 0 | rate: /
|
|
I'll check with them and see what I can do. Thanks for the prompt response, Nick. ____________ Nicolas Maire Swiss Tropical and Public Health Institute http://www.swisstph.ch |
|
ID: 13528 | Rating: 0 | rate: /
|
|
Thank you, Nick. I'll check with them and see what I can do. ____________ |
|
ID: 13530 | Rating: 0 | rate: /
|
|
I'll check with them and see what I can do. Any progress? We are all still sitting on a bunch of tasks that need to be returned. Thanks! ____________ Dublin, CA Team SETI.USA |
|
ID: 13577 | Rating: 0 | rate: /
|
|
The problem with the behaviour of the Malaria Control server at the moment seems to be that it is delivering WUs assuming that it has 100% of the BOINC resources. The BOINC client works out that it needs to run those WUs at high priority to make the deadline. |
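The deadline reasoning above can be sketched roughly as follows. This is only a simplified illustration, not the BOINC client's actual scheduling code: the function name and its parameters are hypothetical, and it assumes a task gets a fixed fraction of one CPU until its deadline.

```python
def needs_high_priority(remaining_cpu_hours, hours_until_deadline, resource_share_fraction):
    """Return True if the task would miss its deadline at its normal share.

    Hypothetical helper (not a BOINC API): assumes the task receives only
    `resource_share_fraction` of one CPU's time between now and the deadline.
    """
    if not 0.0 < resource_share_fraction <= 1.0:
        raise ValueError("resource_share_fraction must be in (0, 1]")
    # CPU time the task can expect if it only runs in its normal time slice.
    cpu_hours_available = hours_until_deadline * resource_share_fraction
    return remaining_cpu_hours > cpu_hours_available


# 20 CPU-hours of work due in 48 hours fits if the project owns the whole CPU...
print(needs_high_priority(20, 48, 1.0))
# ...but not if the project is only entitled to a quarter of it.
print(needs_high_priority(20, 48, 0.25))
```

Under these assumptions, work that was sized for a host running one project alone ends up in high priority as soon as the host shares its time with other projects.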
|
ID: 13581 | Rating: 0 | rate: /
|
|
The problem with the behaviour of the Malaria Control server at the moment seems to be that it is delivering WUs assuming that it has 100% of the BOINC resources. The BOINC client works out that it needs to run those WUs at high priority to make the deadline. What BOINC does is run first the projects that it thinks won't finish in time, based on activity. When you suspend a project, it throws the calculations off: the numbers for the other projects continue to count up or down, but the number for the suspended project does not change while it is suspended. As each project runs, it builds up a Long Term Debt, and in some cases a Short Term Debt as well, especially when tasks run in High Priority outside the normal time slice the project should be allowed. After a while BOINC will settle down, give more time back to the other projects, and slack off running the project it overran before, in this case MC. You just have to be patient and let it run; it does this over a longer term, like a week to a month or more, not hour to hour. That is how the long term debt is supposed to work. |
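A minimal sketch of the long term debt idea described above, assuming a simplified model in which each project is credited with its resource-share portion of the elapsed time and debited the CPU time it actually used. The function and project names are hypothetical, and the real BOINC client applies additional rules that are not modelled here.

```python
def update_long_term_debt(debts, shares, cpu_seconds_used, wall_seconds):
    """Return new per-project debts after `wall_seconds` of crunching.

    Simplified, hypothetical model: a project's debt grows by the time it
    was entitled to (by resource share) and shrinks by the time it used.
    """
    total_share = sum(shares.values())
    new_debts = {}
    for project, debt in debts.items():
        entitled = wall_seconds * shares[project] / total_share
        used = cpu_seconds_used.get(project, 0.0)
        new_debts[project] = debt + entitled - used
    return new_debts


# Two projects with equal shares; MC ran the whole hour in high priority.
debts = {"malariacontrol": 0.0, "other": 0.0}
shares = {"malariacontrol": 100, "other": 100}
debts = update_long_term_debt(debts, shares, {"malariacontrol": 3600.0}, 3600.0)
print(debts)
```

In this model the project that overran ends up with a negative debt and the other with a positive one, so a scheduler that favours the largest debt would give the other project more time until the balance evens out.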
|
ID: 13591 | Rating: 0 | rate: /
|
|
The problem with the behaviour of the Malaria Control server at the moment seems to be that it is delivering WUs assuming that it has 100% of the BOINC resources. The BOINC client works out that it needs to run those WUs at high priority to make the deadline. Also, what is your cache size for BOINC set to? |
|
ID: 13593 | Rating: 0 | rate: /
|
|
I'll check with them and see what I can do. Will let you know when I hear back... Nick ____________ Nicolas Maire Swiss Tropical and Public Health Institute http://www.swisstph.ch |
|
ID: 13606 | Rating: 0 | rate: /
|
|
I'll check with them and see what I can do. Sorry to say that there seems to be no one in Cape Town at the moment in a position to take the project back online. I understand it is likely that the server will come back at the time they are ready for production hosting. At this point there should be a possibility to return tasks. Don't know when this will happen though.. Nick ____________ Nicolas Maire Swiss Tropical and Public Health Institute http://www.swisstph.ch |
|
ID: 13692 | Rating: 0 | rate: /
|
|
Okay. Thanks for the update. |
|
ID: 13701 | Rating: 0 | rate: /
|
|
While I like your theory, it is not what Malariacontrol is doing. I told BOINC not to accept new tasks from Malariacontrol but to finish the ones it had. |
|
ID: 13862 | Rating: 0 | rate: /
|
|
While I like your theory, it is not what Malariacontrol is doing. I told BOINC not to accept new tasks from Malariacontrol but to finish the ones it had. As soon as you told BOINC not to accept any Malaria units, it went negative in the time it 'owed' Malaria for the overall crunching; it is now making it up. BOINC is always set to use 10 GB of disk and all of the processors; it had been only 30% of the processors, but for this test I have it using all of the processors (and it had been running in that state for 12 hours before bringing the MC data down). I can only say what has been said in the past: this project uses the returned data in a timely manner to adjust the actual treatments being given in the field. Extending the deadlines can mean more people die. |
|
ID: 13865 | Rating: 0 | rate: /
|
|
We are currently fitting some parameters, that's why the deadline is so short. For further information, please have a look at this post. Thank you for your feedback Guillaume ____________ Guillaume Gnaegi Swiss Tropical and Public Health Institute http://www.swisstph.ch |
|
ID: 13889 | Rating: 0 | rate: /
|
|
Sorry to say that there seems to be no one in Cape Town at the moment in a position to take the project back online. Thanks, Nick, for that clarification. But, in fact, there should still be an appeal to the university management, or to the executive research services. Even if UCT-Malaria was only a test for Malaria, and leaving aside that it's just not polite to switch off a BOINC server like this, I don't get why it wasn't possible for such a university to keep a "left-behind" server (not even a machine, maybe a virtual server) that can handle the last requests of BOINC volunteers. Today, it's not even possible to detach from UCT-Malaria! When people give their time and computer resources to a project, the least courtesy is that they can finish their WUs or detach properly from the project. When correct information is given, when a deadline for the final shutdown is given, people can act accordingly. For UCT-Malaria, it looks like someone just pulled the plug one morning... Actually, when you have dozens of machines stuck with UCT-Malaria, the only possibility left now is to clean up manually, one by one, all previously attached computers... Nice... If UCT is not willing to do anything, do you think that people at MalariaControl could put a server online just to respond in place of UCT-Malaria, so we can detach and clean our BOINC clients? (I understand this could be delicate, not least because of domain name ownership.) |
|
ID: 14167 | Rating: 0 | rate: /
|
|
When people give their time and computer resources to a project, the least courtesy is that they can finish their WUs or detach properly from the project. When correct information is given, when a deadline for the final shutdown is given, people can act accordingly. For UCT-Malaria, it looks like someone just pulled the plug one morning... I do agree. The funny thing is, in July this year, someone did manage to switch work creation for this project on for a few weeks ... just before they pulled the plug. ____________ |
|
ID: 14171 | Rating: 0 | rate: /
|
|
Hi, someone did manage to switch work creation for this project on for a few weeks ... just before they pulled the plug. I'm truly sorry to read this, because I obviously didn't follow the project news closely enough to terminate it properly... Too bad. Still, I continue to think that, nowadays, setting up a virtual machine in a "corner" of the network, with just the scheduler service left running, is a minimal cost for IT services, and it would have given us the ability to detach quietly from UCT-Malaria over a longer period. Regards |
|
ID: 14179 | Rating: 0 | rate: /
|
|
I'm truly sorry to read this, because I obviously didn't follow (enough) the project news in order to terminate it properly... Too bad. There was no project news to follow. All that happened was that work creation was switched on in July (after a long hiatus), and then in August (after Nicolas Maire contacted UCT) the plug was pulled, completely and without warning. One can follow this by looking at the monthly Total Credit graph for UCT Malaria on BOINCstats. ____________ |
|
ID: 14186 | Rating: 0 | rate: /
|
|
Thanks for the information. It's nice to know that our crunching is contributing to a worthy cause. Any news on when or if UCT will return? ____________ |
|
ID: 15145 | Rating: 0 | rate: /
|
|