A THIRD science application for malariacontrol


Michael
Volunteer moderator
Project scientist
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

This post was last updated on 9. of May 08.


Latest update is here


A new science application called "optimizer" will be launched from Monday, 17 September onwards, or shortly thereafter. Watch this thread for news; we will announce it when we start.

At first it will run as a test application, meaning that only users who have "run test applications" and "run optimizer application" checked in their account settings (under malariacontrol.net preferences) will get work.



In addition, only Windows hosts will get work.


Work units will take from 5 minutes up to one hour, depending on the model parameters. No checkpointing is done and progress is not indicated, so wait at least an hour before concluding that a work unit is stuck.
The calculation is done by a Java program, contained within the standard BOINC "wrapper" application. You need to have Java installed; if it is not, you will be prompted to install it.
Deadlines: 48 hours in the testing phase, three days after that.
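
For anyone who wants to check beforehand whether a Java runtime is reachable, here is a minimal sketch in Python. It only looks for a `java` launcher on the PATH, which is an assumption: Java can be installed without being on the PATH, and nothing here reflects how the wrapper itself locates Java.

```python
import shutil
import subprocess


def find_java():
    """Return the path of a `java` launcher on PATH, or None if none is found."""
    return shutil.which("java")


def java_version_banner(java_path):
    """Run `java -version` and return its banner text (Java prints it to stderr)."""
    result = subprocess.run(
        [java_path, "-version"], capture_output=True, text=True, timeout=30
    )
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    path = find_java()
    if path is None:
        # Hedged message: a missing PATH entry does not prove Java is absent.
        print("No java launcher found on PATH; Java may be missing.")
    else:
        print(f"Found {path}")
        print(java_version_banner(path))
```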

The name "optimizer" was chosen because the server-side components are essentially a general-purpose optimization framework that scientists in our group will use to work on more specific questions, e.g. to fit simpler models for which the "big" malaria model would not be the right tool. The insights from those calculations will help us to improve the main malariacontrol application in the future.


On the science of the project:


To make quantitative predictions of malaria transmission, it is very important to know how long an infection lasts in an infected human: the longer it lasts, the more mosquitoes can get infected; the more infected mosquitoes there are, the more humans get infected, and so on.
At first it may seem very straightforward to measure this: you note when somebody gets infected, and then you keep taking blood samples until that person is no longer infected.
Unfortunately, you only have about a 50% chance of detecting an infection that is actually there. So you already have a problem: you don't know exactly when the infection started, and you don't know exactly when it ended.
In addition, in areas of high malaria transmission people very often carry up to ten or more infections simultaneously, so you never know whether what you're seeing is still the same infection or a new one.
Recently, work at our institute has used new DNA-based methods (which allow different infections to be distinguished), together with a mathematical approach, to estimate the average duration of an untreated P. falciparum infection.


See Sama et al. 2006
(sorry, only the abstract is freely available to the public)


So far so good; this was an important step forward. The problem that remains is: how are the durations distributed? In other words, do all infections last exactly 200 days and then stop? Or does an infection have a constant probability of disappearing, no matter how old it is? Probably neither is true, but we need to describe the shape of that distribution of durations somehow in order to make sensible predictions.
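
To make the two extremes concrete, here is a small Python sketch of the fraction of infections still present after t days under each assumption. The 200-day mean is just the illustrative number from the paragraph above. The Weibull family is added purely as one common way to interpolate between the two extremes with a single shape parameter; it is an assumption for illustration, not the distribution used by the project.

```python
import math

MEAN_DURATION_DAYS = 200.0  # illustrative value from the "200 days" example


def survival_fixed(t, duration=MEAN_DURATION_DAYS):
    """Every infection lasts exactly `duration` days, then stops."""
    return 1.0 if t < duration else 0.0


def survival_constant_hazard(t, mean=MEAN_DURATION_DAYS):
    """Constant clearance probability per unit time (exponential durations)."""
    return math.exp(-t / mean)


def survival_weibull(t, shape, mean=MEAN_DURATION_DAYS):
    """Weibull durations with the given mean (assumed family, for illustration).

    shape == 1 reproduces the constant-hazard case; a large shape
    approaches the fixed-duration case.
    """
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return math.exp(-((t / scale) ** shape))


if __name__ == "__main__":
    for day in (50, 100, 200, 300):
        print(
            day,
            survival_fixed(day),
            round(survival_constant_hazard(day), 3),
            round(survival_weibull(day, shape=3.0), 3),
        )
```

All three curves have the same mean duration, yet they imply very different numbers of long-lived infections, which is why the shape matters for predictions.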


For more on that, see Sama et al. 2006b


That's almost where we want to go, except for one thing: the above paper measures the distribution in people living in the US who had never experienced malaria before. They were infected on purpose, to cure their syphilis (the method of choice at that time). We don't know what the picture looks like in people living in areas of high transmission, with multiple infections at a time and after decades of being constantly infected.

Attempts to find a mathematical solution to this problem did not work out; the equations become unsolvable. But there is a way out: instead of using equations, we can use individual-based simulations. That means we simulate every single infection in a computer program and see which parameters best reproduce the data we have. The big drawback is that this takes too long to calculate on a single computer.
That's what we need you guys and girls for, and thanks a lot for making this possible!
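
As a rough idea of what "simulating every single infection" can look like, here is a minimal Python sketch for one person. Everything in it is an assumption for illustration: new infections arrive at a constant rate, durations are exponential, and each infection present on a survey day is detected independently with the roughly 50% chance mentioned above. The function name and parameters are hypothetical and are not taken from the actual optimizer application.

```python
import random


def simulate_person(infection_rate, mean_duration, survey_days,
                    detect_prob=0.5, rng=None):
    """Simulate one person's infections and what blood surveys would show.

    infection_rate: new infections per day (must be > 0)
    mean_duration:  mean infection duration in days
    survey_days:    days on which a blood sample is taken
    Returns (infections, observed): infections is a list of (start, end)
    times; observed holds, per survey, the indices of detected infections.
    """
    rng = rng or random.Random()
    horizon = max(survey_days)

    # Generate every infection acquired before the last survey.
    infections = []
    t = rng.expovariate(infection_rate)
    while t < horizon:
        end = t + rng.expovariate(1.0 / mean_duration)
        infections.append((t, end))
        t += rng.expovariate(infection_rate)

    # Each infection present on a survey day is detected with detect_prob.
    observed = []
    for day in survey_days:
        present = [i for i, (s, e) in enumerate(infections) if s <= day < e]
        observed.append([i for i in present if rng.random() < detect_prob])
    return infections, observed


if __name__ == "__main__":
    infections, observed = simulate_person(
        infection_rate=0.05, mean_duration=200.0,
        survey_days=[60, 120, 180, 240], rng=random.Random(42))
    print(len(infections), "infections simulated")
    print("detected per survey:", [len(seen) for seen in observed])
```

Fitting then means repeating this for many simulated people and many parameter combinations and keeping the combinations whose simulated survey patterns look most like the real data, which is where the computing time goes.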

P.S.: Something about the data collection mentioned above, to prevent misunderstandings: There are strict ethical guidelines on how one is allowed to obtain such data. Since most malaria infections in high transmission areas don't cause any symptoms (because of acquired immunity), being infected with malaria doesn't mean you are sick. People who did have symptoms were of course given treatment.
____________
Michael

Tom Philippart
Joined: Jun 25 06
Posts: 29
Credit: 220,888
RAC: 0

Thanks for the update and of course the work "behind the scenes"! This really sounds interesting and I'm looking forward to running the new app!

The insights from those calculations will help us to improve the main malariacontrol application in the future.


Will this only be an intermediate application to further improve the main app or will it stay as a stand alone application?

Thanks
____________

Michael
Volunteer moderator
Project scientist
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0


Will this only be an intermediate application to further improve the main app or will it stay as a stand alone application?

Thanks


thanx 4 the thanx :) The app will probably stay for quite some time, a year or so, depending on what comes out of it. There might be no work for some time and then it starts again. The actual task it performs might also change at some point later, but we would announce that.
____________
Michael

Profile KSMarksPsych
Joined: Mar 7 06
Posts: 28
Credit: 22,974
RAC: 0

Neat!

Is this the first time a java app has been used in the BOINC framework (across all projects, not just within MCDN)?
____________
Kathryn :o)
The BOINC FAQ Service
The Unofficial BOINC Wiki
The Trac System
More BOINC information than you can shake a stick of RAM at.

Profile The Gas Giant
Joined: Mar 7 06
Posts: 1213
Credit: 3,503,340
RAC: 1,667

Looking forward to getting a few of the Java WUs, just made sure I've got the latest Java installed!

Profile maire
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: Nov 7 05
Posts: 438
Credit: 118,258
RAC: 0

Neat!
Is this the first time a java app has been used in the BOINC framework (across all projects, not just within MCDN)?

Java has been on the BOINC agenda for some time, but as far as I know there's currently no public project that's running a Java app (please correct me if I'm wrong). A few BOINC-Java related activities:

  • There was a discussion group at the recent BOINC workshop in Geneva, mainly on how to ensure the presence of a virtual machine (JRE) on a client.
  • At the same workshop, members of the SZTAKI group presented some preliminary work on distributing a JRE together with the science application.
  • There's an uppercase sample application in the BOINC SVN repository that demonstrates how to call the BOINC API from Java.


As Michael said, for now we assume a Java runtime on the client, and use the wrapper approach.
Nick

____________
Nicolas Maire
Swiss Tropical and Public Health Institute
http://www.swisstph.ch

Scott Brown
Joined: Jul 14 06
Posts: 37
Credit: 2,661,753
RAC: 0


for more on that, see Sama etal.2006b


That\'s almost where we want to go...

Attempts to find a mathematical solution to this problem did not work out.. the equations become unsolvable. But there is a way out: instead of using equations, we can use individual based simulations, that means we simulate every single infection in a computer program, and see what parameters can best produce the data we have. The big drawback there is, this just takes too long to calculate on a single computer.


This sounds like a very interesting approach, but I am curious whether you would be willing to provide more details regarding the unsolvable equations. Given the MLE framework in the modeling of the article cited above, the individual simulations make sense. I am curious, however, whether you considered Bayesian approaches to the problem, since a set-up with strong priors would seem to make sense (at least given what I was able to get from a cursory skim of the article and Metropolis-Hastings approaches to Bayesian MCMC models).

Ageless
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

Calculation is done by a java program, contained within the standard boinc-"wrapper" application. You need to have java installed.. if not, you will be prompted to do so.

Will this stop BOINC dead in its tracks though, or are we to test that as well?
I've got the latest Java installed, but could try to uninstall it, if needed.
____________
Jord.

BOINC FAQ Service

j2satx
Joined: Jan 4 07
Posts: 12
Credit: 2,063,480
RAC: 745

Calculation is done by a java program, contained within the standard boinc-"wrapper" application. You need to have java installed.. if not, you will be prompted to do so.

Will this stop BOINC dead in its tracks though, or are we to test that as well?
I've got the latest Java installed, but could try to uninstall it, if needed.


I've had a hard time in the past installing Java on Ubuntu; it doesn't want to get past the "license". If I can't solve that, I'll have to remove my Linux machines from Malaria to run Java apps on my Windows machines. I do not know of a way to select running your test apps on a per OS basis.

Ageless
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

I do not know of a way to select running your test apps on a per OS basis.

Ever thought about using the different venues?

Set your Linux machines up with a different venue than your Windows machines. (default home location)
____________
Jord.

BOINC FAQ Service

j2satx
Joined: Jan 4 07
Posts: 12
Credit: 2,063,480
RAC: 745

I do not know of a way to select running your test apps on a per OS basis.

Ever thought about using the different venues?

Set your Linux machines up with a different venue than your Windows machines. (default home location)


Does that allow some computers to process "test" WUs and others not to?

I thought the option to run "test" WUs was global to the project.

Ageless
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

No, the test option is set by the project preferences, which work on a venue basis. So you can set up a separate venue without the test application for those machines that can't run them, and one (the default venue, for instance) for those machines that can run the test applications.



____________
Jord.

BOINC FAQ Service

j2satx
Joined: Jan 4 07
Posts: 12
Credit: 2,063,480
RAC: 745

No, the test option is set by the project preferences, which works on a venue basis. So you can set a separate venue without the test application for those machines that can't run them and one (the default venue for instance) for those machines that can run the test applications.




I do not see that the project preferences work on a venue basis. Where am I missing that setup? I go to my account and can set either general prefs or project prefs. I do not see any setting under general prefs that will let me use work, home or school and set project prefs differently for one of those venues.

Thanks.

j2satx
Joined: Jan 4 07
Posts: 12
Credit: 2,063,480
RAC: 745

No, the test option is set by the project preferences, which works on a venue basis. So you can set a separate venue without the test application for those machines that can't run them and one (the default venue for instance) for those machines that can run the test applications.




I do not see that the project preferences work on a venue basis. Where am I missing that setup. I go to my account and can either set general prefs or project prefs. I do not see any setting under general prefs that will let me use work, home or school and set project prefs differently for one of those venues.

Thanks.


Edit: found what you were saying. I did not have the venue settings set up under project prefs. Not sure I was ever aware that was available. I'll rethink the way I'm using venues.

Thanks for your help.

j2satx
Joined: Jan 4 07
Posts: 12
Credit: 2,063,480
RAC: 745

Calculation is done by a java program, contained within the standard boinc-"wrapper" application. You need to have java installed.. if not, you will be prompted to do so.

Will this stop BOINC dead in its tracks though, or are we to test that as well?
I've got the latest Java installed, but could try to uninstall it, if needed.


I did get the JRE 6 set up on my Linux boxes. I couldn't see how to "test" the Java, but I'll watch for some of the Malaria "test" WUs to see if they work.

Ageless
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

Thanks for your help.

You're welcome. Always glad to be of some help. :-)
____________
Jord.

BOINC FAQ Service

AnRM
Joined: Mar 7 06
Posts: 54
Credit: 2,130,571
RAC: 0

I do not know of a way to select running your test apps on a per OS basis.

Ever thought about using the different venues?

Set your Linux machines up with a different venue than your Windows machines. (default home location)


Jord, thanks for the tip! Will re-venue some of our boxes as well. Cheers, Rog.
____________

Profile The Gas Giant
Joined: Mar 7 06
Posts: 1213
Credit: 3,503,340
RAC: 1,667

I haven't seen a Java WU as yet. Has anyone?

AnRM
Joined: Mar 7 06
Posts: 54
Credit: 2,130,571
RAC: 0

I haven't seen a java wu as yet. Has anyone?

Not yet. We were getting worried that we had a problem, so thanks for the post. We haven't seen any mappredictors either. Cheers, Rog.
____________

Ageless
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

I haven't seen a java wu as yet. Has anyone?

Paul, they will be released on the 17th or thereafter.

See top post.
A new science application called "optimizer" will be launched from Monday 17.September onwards, or shortly there after.


Hmmm, I could have re-read this last night before telling j2satx about venues.
In addition, only windows hosts will get work.


Oh well. :-)
____________
Jord.

BOINC FAQ Service

j2satx
Joined: Jan 4 07
Posts: 12
Credit: 2,063,480
RAC: 745

I haven't seen a java wu as yet. Has anyone?

Paul, they will be release on the 17th or there after.

See top post.
A new science application called "optimizer" will be launched from Monday 17.September onwards, or shortly there after.


Hmmm, I could have re-read this last night before telling j2satx about venues.
In addition, only windows hosts will get work.


Oh well. :-)


Missed that myself...getting old eyes......

The venue info was and still is valuable. I'm considering using it to differentiate between my 32-bit and 64-bit computers.

Thanks again.

Profile The Gas Giant
Joined: Mar 7 06
Posts: 1213
Credit: 3,503,340
RAC: 1,667

Could have sworn it originally said the WUs would be released from Friday onwards... maybe I'm just too eager! Thanks for pointing it out though, Jord.

Live long and BOINC!

____________
Paul
(S@H1 8888)

Michael
Volunteer moderator
Project scientist
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

hi all,
We have now started sending out workunits of the new application. Their names start with opt_ ... We first had to resolve a problem with multicore processors, which we did. Then there was, and still is, a second problem that we have not yet been able to resolve:
- For every workunit, a temporary file temp<number>.jar is deposited in your C:\Documents and Settings\<username>\Local Settings\Temp\ directory (or whatever your temp directory for applications is; it may depend on your Windows configuration).
The file is only 3 KB in size and will be deleted on the next reboot. For most people this should not be a problem, since we are currently only sending work to hosts with at least 750 MB of free disk space. That means you can crunch 250,000 workunits before your disk fills up, and if you reboot once before that, you start again at zero. Still, we are not quite happy with this; please send us your comments if this is a problem for you :)
Otherwise, maybe there are people out there who have experience with JSmooth for launching Java apps. Have you come across this problem and found a solution for it?
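
The arithmetic behind that estimate, as a small Python sketch. It takes 1 MB as 1000 KB, which matches the round number quoted; the workunits-per-day figure in the example is a made-up value, only there to turn the count into a rough time span.

```python
def workunits_until_full(free_disk_mb, temp_file_kb):
    """How many leftover temp files fit into the free space (1 MB = 1000 KB)."""
    return (free_disk_mb * 1000) // temp_file_kb


def days_until_full(free_disk_mb, temp_file_kb, workunits_per_day):
    """Rough number of days without a reboot before the space is used up."""
    return workunits_until_full(free_disk_mb, temp_file_kb) / workunits_per_day


if __name__ == "__main__":
    print(workunits_until_full(750, 3))  # 250000
    # Hypothetical host that crunches 100 optimizer workunits per day.
    print(days_until_full(750, 3, workunits_per_day=100))
```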
thanks
Michael
____________
Michael

Profile The Gas Giant
Joined: Mar 7 06
Posts: 1213
Credit: 3,503,340
RAC: 1,667

Just noticed I have some in my cache....looking forward to crunching them! How long will they take?

Michael
Volunteer moderator
Project scientist
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0



This sounds like a very interesting approach, but I am curious if you would be willing to provide more details regarding the unsolvable equations. Given the MLE framework in the modeling of the article cited above, the individual simulations make sense. I am curious, however, whether you considered Bayesian approaches to the problem since a set-up with strong priors would seem to make sense (at least given what I was able to get from a cursory skim of the article and Metropolis-Hastings approaches to Bayesian MCMC models).



hi, we looked into Bayesian alternatives first, but I don't see a way to do it. First, introducing prior knowledge about the shape of the survival distribution of infections would be dangerous, because we know that the duration of infections depends on other factors (e.g. age -> more acquired immunity) but do not know exactly how (since we don't have data that could give prior information about immune people). You might get a result, but you would have no idea whether it is true, so you might as well leave it. Second, we have not found a way to introduce a shape parameter (which you could use to incorporate such prior knowledge) that still allows us to calculate the frequency of observed sequence patterns.
Suggestions are always welcome.
More detail about the mathematics of what we're doing is here.

have a nice weekend..

____________
Michael

Ageless
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

Nice, checking in Task Manager reveals that it is not optimizer_1.18 that is using the CPU, but something called transmission_11. How's that for confusion. :-)

First glances:
1. The estimated time to completion is way off (22m 1 sec and sticking there).
2. The progress bar isn't used, but perhaps that's because it's not optimizer doing the work, but the other app? (edit: again, this was mentioned in the first post, read up Jord... lol)
3. You say a *.jar file is made. I have to ask when: only at the end of the result? Nothing has been made thus far.
4. "Deadlines: in the testing phase 48h, after that three days." Mine has a 3.5 day deadline.

I'll report back when this one has finished its run, whenever that is. :-)
____________
Jord.

BOINC FAQ Service

Ageless
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

So it ran for all of 10 minutes. Finished all right.
ResultID

And I'm sorry, but there's no *.jar file anywhere on my system.
Perhaps just as well.
____________
Jord.

BOINC FAQ Service

Kabal
Joined: Jun 20 06
Posts: 4
Credit: 176,273
RAC: 187

Just noticed I have some in my cache....looking forward to crunching them! How long will they take?

My box has completed 2 wu so far: the first took 36 seconds, the other 5774 seconds.
____________

AnRM
Joined: Mar 7 06
Posts: 54
Credit: 2,130,571
RAC: 0

Processed 10 'opty' WUs: no problems encountered. Run time varied from 325 to 7183 secs. Ten corresponding *.jar files found as described. Cheers, Rog.
____________

ksba
Joined: Jul 13 06
Posts: 31
Credit: 13,957,650
RAC: 274

More than 30 PCs have been running for more than 6 months. I hope you don't do more of these "delete after next reboot" things with bigger temp files.
I really don't want these PCs to run out of HDD space ...
I hope you don't try to flood my HDD with this ;) Why should I reboot a PC if it can run without downtime? :)
____________

Profile The Gas Giant
Joined: Mar 7 06
Posts: 1213
Credit: 3,503,340
RAC: 1,667

I'm getting quite a few going into computational error due to max CPU time being exceeded...

Michael
Volunteer moderator
Project scientist
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

more than 30 PC run since more than 6 month. I hope you don't make more of this "delete after next reboot" things with bigger temps.
I really don\'t want that this pcs run out of hdd space ...
I hope you don\'t try to flood my hdd with this ;) Why should i reboot a pc if it can run without downtime :)


ksba, we're really sorry about this. In your case I understand that it's kind of tiresome to reboot 30 PCs if you don't have to. We will certainly fix this before leaving the testing stage, and we might have found a way to do it; we're working on it. In the meantime, I would recommend opting out of the optimizer app by checking the "no" box (for "run optimizer app") in your account -> project settings. That way you can still get other testing work, just not from this application. And long live Kantonsschule Baden.. although I was in Zofingen, that's also not bad.. maybe one day ksba will enjoy a similarly high international reputation as Zofingen, and this will then be highly merited :)


____________
Michael

Michael
Volunteer moderator
Project scientist
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

I'm getting a quite few going into computational error due to max cpu time being exceeded......


gas giant, that's strange, but I saw them. We raised the maximum CPU time a bit; please check if it helps. The application terminates normally after a maximum of 2 hours (physical time), and you get credit for it even if the calculation has not finished by then (explanation below). You must have extremely fast computers to "use up" so much CPU time within so little physical time :)

To all others, thank you for the feedback!

Note: Some of you noticed the huge variation in calculation time. This is normal: certain parameter combinations result in a lot of work, and others do not. For example, if the rate of new infections is very high and at the same time infections last a long time, individuals accumulate a high number of infections that all have to be tracked. In the opposite case, if the infection rate is very low and the clearance rate of infections is high, there will be almost no infections in a given individual, so the calculation is very quick.
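
A back-of-the-envelope way to see this effect, sketched in Python. It assumes infections arrive at a constant rate and clear independently of each other, in which case the long-run average number of simultaneous infections per person is simply the arrival rate times the mean duration. The parameter values below are made up for illustration and are not the ranges the optimizer actually searches.

```python
def expected_concurrent_infections(infection_rate_per_day, mean_duration_days):
    """Long-run average number of simultaneous infections in one person.

    Assumes a constant arrival rate and independent clearance, so the
    average equals rate times mean duration.
    """
    return infection_rate_per_day * mean_duration_days


if __name__ == "__main__":
    # Hypothetical "heavy" combination: frequent, long-lasting infections.
    heavy = expected_concurrent_infections(0.05, 200.0)
    # Hypothetical "light" combination: rare, quickly cleared infections.
    light = expected_concurrent_infections(0.001, 20.0)
    print("heavy:", heavy, "light:", light)
```

The work per simulated person grows with the number of infections that have to be tracked, so the first kind of combination is the one that produces the long-running workunits.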


The workunits that take very long are likely to lie outside realistic parameter combinations, so we set a cutoff, currently at two hours, which we might reduce if possible. They should appear mainly at the beginning of a new run; as the search algorithm comes closer to the solution, they become fewer and fewer.
____________
Michael

Profile Rebirther
Joined: Mar 7 06
Posts: 22
Credit: 13,176
RAC: 0

I will post it here because there has been no answer yet:
https://malariacontrol.net/forum_thread.php?id=524

darkpella
Joined: Jun 23 06
Posts: 7
Credit: 47,400
RAC: 27

Hi,

One of my 2 hosts has been running the optimizer application on this WU for some hours, but the CPU time counter is not advancing by a single second. Also, in Task Manager, the total CPU time for the optimizer process is 0.

It's a 3.00 GHz Xeon server, so it has 2 cores and both are running BOINC processes concurrently, but with other BOINC projects this has never led to one of the processes getting stuck.

Please tell me if I should abort this WU.

I've got another optimizer WU in the cache of my other BOINC host, which is a single-core machine without hyperthreading, so it will run alone when it runs. I will see whether the issue shows up on that one as well.

darkpella
____________

Rene Oskam
Joined: Mar 4 07
Posts: 7
Credit: 11,998
RAC: 0

Have had a couple of opt WUs on my Athlon (Windows) host now and all seemed to run fine.
Completion in about 160 seconds.

;-)
____________

Michael
Volunteer moderator
Project scientist
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

Hi,

one of my 2 hosts is running optimizer application on this wu since some hours, but the CPU time counter is not progressing any single second. Also, in task manager, the total CPU time devoted to the optimizer process is 0.

It's a 3,00 GHz Xeon Server, so it has 2 cores and both are running BOINC processes concurrently, but, with other BOINC projects, this never led to one of the processes getting stuck.

Please tell me if I should abort this WU.

darkpella


hi, abort it if it has taken longer than 2 hours. None of the things you mention indicates a failure, though. The "optimizer" process is the BOINC wrapper; transmission_<version_num>.exe is the Java process, so it's normal that optimizer doesn't use CPU. There is no progress bar because the application doesn't do checkpointing.
____________
Michael

Michael
Volunteer moderator
Project scientist
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

I will post it here because no answer yet
https://malariacontrol.net/forum_thread.php?id=524


Try resetting the project. If that doesn't work, reinstall Java. The absence of the output file means that the Java app did not run for some reason and therefore didn't produce an output file.
____________
Michael

Ageless
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

ksba, we're really sorry about this.. in your case I understand that its kind a tiresome to reboot 30 pc's if you don't have to..

Well, you don't have to. I just tested deleting the temp*.jar files from my WINNT\Temp folder by hand and it didn't hurt my system one bit. I had accumulated around 10 of them. Deleted them, cleaned out my recycle bin.

I don't see the need to reboot if there's no need to reboot. Just do a little housecleaning by hand once in a while. :-)
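
For those who would rather script that housecleaning, here is a minimal Python sketch. It assumes the leftovers are named exactly as Michael described (temp<number>.jar) and sit directly in the temp directory, and that no other program uses files with such names. It only lists the files unless asked to delete, and it is probably best run while no optimizer workunit is active. The function names are made up for this example.

```python
import re
import tempfile
from pathlib import Path

# Matches names like temp12345.jar (pattern taken from the description above).
JAR_PATTERN = re.compile(r"^temp\d+\.jar$", re.IGNORECASE)


def find_leftover_jars(temp_dir=None):
    """Return the temp<number>.jar files found directly in the temp directory."""
    directory = Path(temp_dir or tempfile.gettempdir())
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and JAR_PATTERN.match(p.name)
    )


def clean_leftover_jars(temp_dir=None, delete=False):
    """List the leftover jars; remove them only when delete=True."""
    jars = find_leftover_jars(temp_dir)
    if delete:
        for jar in jars:
            jar.unlink()
    return jars


if __name__ == "__main__":
    # Dry run: show what would be removed without touching anything.
    for jar in clean_leftover_jars():
        print("would delete", jar)
```

Calling `clean_leftover_jars(delete=True)` performs the actual removal; the dry run is the default so that nothing disappears by accident.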
____________
Jord.

BOINC FAQ Service

Profile Rebirther
Joined: Mar 7 06
Posts: 22
Credit: 13,176
RAC: 0

I will post it here because no answer yet
https://malariacontrol.net/forum_thread.php?id=524


try to reset the project. if it doesnt work, reinstall java. That the output file is being absent means that the java app was not running for some reason - and didn't produce an output file..


You say I must install the Java platform first? I don't have Java installed; I thought that BOINC does it all on its own and the app is integrated within?!

Profile mikey
Joined: Mar 23 07
Posts: 4120
Credit: 5,299,282
RAC: 1,702

I will post it here because no answer yet
https://malariacontrol.net/forum_thread.php?id=524


try to reset the project. if it doesnt work, reinstall java. That the output file is being absent means that the java app was not running for some reason - and didn't produce an output file..


You say I must install the java platform first? I havent any java installed, thought that BOINC is doing all alone and the app is integrated within?!


This is a new set of data that Malaria is sending out. It auto-senses whether you have Java installed and runs only if you have 'run test apps' turned on. It also only runs on Windows machines. In another section they just said that this run is over and the results have been sent to the scientists.
____________

RandyC
Joined: Jun 23 06
Posts: 2695
Credit: 850,101
RAC: 1,184

I will post it here because no answer yet
https://malariacontrol.net/forum_thread.php?id=524


try to reset the project. if it doesnt work, reinstall java. That the output file is being absent means that the java app was not running for some reason - and didn't produce an output file..


You say I must install the java platform first? I havent any java installed, thought that BOINC is doing all alone and the app is integrated within?!


This is a new set of data that Malaria is sending out. It auto senses if you have Java installed and is run only if you have the 'run test apps' turned on. It also only runs on Windows machines. In another section they just said that this run is over and the results have been sent to the scientists.


That is not correct. Map Predictor run 5.20 just closed out.

The Optimizer app is described in the first post of this thread, and it requires the user to have a valid Java environment available to run.

Profile Rebirther
Joined: Mar 7 06
Posts: 22
Credit: 13,176
RAC: 0

I will post it here because no answer yet
https://malariacontrol.net/forum_thread.php?id=524


try to reset the project. if it doesnt work, reinstall java. That the output file is being absent means that the java app was not running for some reason - and didn't produce an output file..


You say I must install the java platform first? I havent any java installed, thought that BOINC is doing all alone and the app is integrated within?!


This is a new set of data that Malaria is sending out. It auto senses if you have Java installed and is run only if you have the 'run test apps' turned on. It also only runs on Windows machines. In another section they just said that this run is over and the results have been sent to the scientists.


Ok, thx, I overlooked this ^^

Profile Rebirther
Joined: Mar 7 06
Posts: 22
Credit: 13,176
RAC: 0

It is working now, but I found something odd: I suspended a WU and after some time the WU finished successfully.

RandyC
Joined: Jun 23 06
Posts: 2695
Credit: 850,101
RAC: 1,184

First opt WU run here. It ran 0.05 secs and quit with 0 credit claimed.

[update] Looks like it may have been a firewall problem with the Java client. I've got another one queued to crunch in a while, so we'll see.

RandyC
Joined: Jun 23 06
Posts: 2695
Credit: 850,101
RAC: 1,184


[update]Looks like it may have been a firewall problem with the Java Client. I've got another one queued to crunch in a while...so we'll see.


Yup, it looks like it was.

Notes:
1. Javaw.exe runs at normal priority instead of low...according to task manager.
2. No progress indicator (has been noted before) and no Time to Completion

Still running...will continue to monitor.

RandyC
Joined: Jun 23 06
Posts: 2695
Credit: 850,101
RAC: 1,184


Still running...will continue to monitor.


It finished after about 45 mins. The stderr.out shows the same contents as the WU that ran for 5/100s of a second...claims 0 credit.

This is from the log:

Project Date Message
malariacontrol.net beta 9/25/2007 9:22:22 PM Starting opt_1_-44_5_390943439_3
malariacontrol.net beta 9/25/2007 9:22:22 PM Starting task opt_1_-44_5_390943439_3 using optimizer version 118
malariacontrol.net beta 9/25/2007 10:07:39 PM Computation for task opt_1_-44_5_390943439_3 finished
SETI@home 9/25/2007 10:07:39 PM Resuming task 07mr07aa.4995.15614.15.6.190_1 using setiathome_enhanced version 528
--- 9/25/2007 10:10:45 PM Resuming network activity
malariacontrol.net beta 9/25/2007 10:10:46 PM [file_xfer] Started upload of file opt_1_-44_5_390943439_3_0
malariacontrol.net beta 9/25/2007 10:10:51 PM [file_xfer] Finished upload of file opt_1_-44_5_390943439_3_0
malariacontrol.net beta 9/25/2007 10:10:51 PM [file_xfer] Throughput 696 bytes/sec
malariacontrol.net beta 9/25/2007 10:11:08 PM Sending scheduler request: Requested by user
malariacontrol.net beta 9/25/2007 10:11:08 PM Reporting 1 tasks
malariacontrol.net beta 9/25/2007 10:11:12 PM Scheduler RPC succeeded [server version 601]
malariacontrol.net beta 9/25/2007 10:11:12 PM Deferring communication for 11 sec
malariacontrol.net beta 9/25/2007 10:11:12 PM Reason: requested by project
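
The wall-clock run time can be read straight off the log. Below is a minimal Python sketch, assuming the timestamp format shown above (month/day/year with a 12-hour clock); the helper name `elapsed_seconds` is made up for illustration. Note that this is elapsed time between the two log lines, not the CPU time the task reports to the server.

```python
import re
from datetime import datetime

# Two lines copied from the log above.
LOG = """\
malariacontrol.net beta 9/25/2007 9:22:22 PM Starting task opt_1_-44_5_390943439_3 using optimizer version 118
malariacontrol.net beta 9/25/2007 10:07:39 PM Computation for task opt_1_-44_5_390943439_3 finished
"""

# Matches stamps such as "9/25/2007 9:22:22 PM".
STAMP = re.compile(r"(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M)")


def elapsed_seconds(log_text, task):
    """Seconds between the 'Starting task' and 'finished' lines for one task."""
    start = end = None
    for line in log_text.splitlines():
        match = STAMP.search(line)
        if match is None or task not in line:
            continue
        when = datetime.strptime(match.group(1), "%m/%d/%Y %I:%M:%S %p")
        if "Starting task" in line:
            start = when
        elif "Computation for task" in line and "finished" in line:
            end = when
    if start is None or end is None:
        raise ValueError("start or finish line not found for " + task)
    return (end - start).total_seconds()


seconds = elapsed_seconds(LOG, "opt_1_-44_5_390943439_3")
print(divmod(int(seconds), 60))  # (minutes, seconds)
```

For the task above this gives 45 minutes and 17 seconds, which matches the "about 45 mins" observation even though the result page claimed only 0.05 s.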

Matthias Lehmkuhl
Send message
Joined: Jan 4 07
Posts: 4
Credit: 240,572
RAC: 367

9/26/2007 11:15:45 CEST |malariacontrol.net beta|Starting task opt_3_-23_5_110673634_2 using optimizer version 118

CPU Time is now at 1:30 hours
Progress bar is 0% (as described above)
Time to Completion is 1:26 hours
report deadline 09/29/2007
running state on BOINC 5.10.20 \"running, High priority\"

It's the first one I have seen running in this state, but could it be caused by the progress bar staying at 0%?

Runtime for the other results are from 0:00:17 to 1:43:34 hours
They are not reported yet.
My result table from the involved client
hostid=20173

Matthias

edit: i forgot - found 8 jar-files in my temp-dir
____________
Matthias

Profile The Gas Giant
Avatar
Send message
Joined: Mar 7 06
Posts: 1213
Credit: 3,503,340
RAC: 1,667

The application uses the BOINC wrapper which means that there is no progress shown. If you want to report the results then just press the update button.

Live long and BOINC!

____________
Paul
(S@H1 8888)

Profile [EK] joe carnivore
Send message
Joined: Jan 6 07
Posts: 1
Credit: 409,253
RAC: 0

My Sempron 3000 stopped at 0.03 and claimed 0.00, but the time is all right, not the 0.03, and the client state is done.

My laptop is running well, but there is no checkpoint in Estimation.
I want to switch it off when I am at work, but the Estimation sometimes runs over 2 hours.
That time is wasted.

My X2 3800 doesn't stop Java when BOINC stops. My wife can't play games.

These PCs are not running Estimation. Too many errors.

Sorry for the bad english.

joe

RandyC
Avatar
Send message
Joined: Jun 23 06
Posts: 2695
Credit: 850,101
RAC: 1,184

hi all,
we have now started sending out workunits of the new application. their names start with opt_ ... We had to resolve a problem with multicore processors first.. which we did..

Michael


Hi Michael,

Perhaps I\'ve come to a wrong conclusion. My first 3 opt_ WUs each claimed close to 0.05 secs (although two of them ran 45 mins or longer) run time and 0.00 credit. When their wingmen came in I was awarded credit anyway.

I didn\'t see anything valid in the stderr_out section of each WU, so I assumed they returned garbage and I turned off the option to receive these WUs.

Do you want me to continue crunching opt_ for any debugging purposes? or did these actually return valid data I\'m unaware of?

WU1
WU2
WU3

Since WU1 only ran a few seconds and I had a Firewall message pop-up asking for permission for Java, it is probably no good. The others I just don\'t know about.

Randy

Michael
Volunteer moderator
Project scientist
Send message
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

hi all,
the bug with the temporary jar files is now fixed with version 1.19.
Further, we had some problems on the server side, which should also be solved now.

Because of those problems (one server-side component would crash every 10 hours or so), we have not been able to fully assess how long we would need to run the models we have now in order to fit them all. But I have to say, I'm quite impressed by the computing power you are generating just within the group of people who are crunching testing applications!
From what I have seen up to now, this is far more than what I expected.
We agreed that in order to have the application leave the testing status, we would definitely need to bundle a java virtual machine with it - as part of the application. Otherwise this would presumably lead to a lot of confusion, especially among crunchers who don\'t have java installed, don\'t want to install it, don\'t know what java is, don\'t know how to opt out, don\'t know that this forum exists (or maybe are not so aware of it)..
apart from that, we find that its really hard to track errors due to the very heterogeneous software environment.. we can never be sure that its not a certain java version, or a damaged java installation, custom-made stripped jvm or whatever.. which is causing them.
So we decided to leave it as a testing application for the moment, because we seem to have enough computing power to get the results, at least for now, and that we might as well improve the science part of the application instead..

We do get errors, but not too many. About three to five machines are responsible for about ninety percent of them, so I strongly suspect some misconfiguration there. Either reinstall Java, or if that doesn't help and you keep getting errors, just opt out of crunching the optimizer app.
Of course you are also welcome to post your problem in this forum, maybe we can find a solution.


____________
Michael

Profile mikey
Avatar
Send message
Joined: Mar 23 07
Posts: 4120
Credit: 5,299,282
RAC: 1,702

hi all,

We do get errors, but not too many. About three to five machines are responsible for about ninety percent of them, so I strongly suspect some misconfiguration there. Either reinstall Java, or if that doesn't help and you keep getting errors, just opt out of crunching the optimizer app.
Of course you are also welcome to post your problem in this forum, maybe we can find a solution.


Since you have our email addresses, wouldn't it be helpful to email those owners and give them the chance to fix the problem, or perhaps to stop sending units to those machines? If one of my machines has a problem and no one tells me, how would I know? What if I didn't check them every day? They could just be set to do their own thing and I could go months without doing anything to them. I have 19 machines online and take about 30 minutes or so every day to check and send the units on all of them. I have the time and ability; some don't, or choose not to take the time. I had one machine a while back returning nothing but junk results; a fellow user PM'd me, and I took it offline for a couple of days and then brought it back online. It now seems to be fine. If that other user hadn't contacted me, I might not have noticed for quite a while. It was returning units and getting new ones; it was just returning unusable results.
Just a thought.
____________

Profile Krunchin-Keith [USA]
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: Nov 10 05
Posts: 3047
Credit: 5,330,818
RAC: 4,054

From what I have seen up to now, this is far more than what I expected.

Glad to be helping.

I set my account for my home machines to run the test applications only, to help put more power into testing and leave the other work that can be run by everyone for them. I've had a few machines that seem to be more of a problem, so I just suspended MC.N on them, as they have other projects they can contribute to. It seems like we are doing well enough, so I will just leave them alone. I'm happy either way.

Sadir
Send message
Joined: Mar 8 06
Posts: 2
Credit: 3,589
RAC: 0

2 WUs (4492943 and 4492975) timed out within less than 2 hours, and another one ended in 36 seconds (4492945)
____________

diederiks
Send message
Joined: Jan 13 07
Posts: 4
Credit: 82,691
RAC: 61

23-10-2007 18:54:39|malariacontrol.net beta|[error] Can\'t rename output file opt_19_-26_5_9235911_2_0

Keep getting these errors after reinstalling the new Java 6.3

rbpeake
Send message
Joined: Oct 22 07
Posts: 5
Credit: 9,742
RAC: 0

Keep getting these errors after reinstalling the new Java 6.3

I just installed the new Java 6.3 and it works fine. Maybe there is another issue at work here....

____________
Regards,
Bob P.

Chris Sutton
Send message
Joined: Nov 10 05
Posts: 297
Credit: 4,941,683
RAC: 0

Keep getting these errors after reinstalling the new Java 6.3

I just installed the new Java 6.3 and it works fine. Maybe there is another issue at work here....

Something I forgot about....
You also need a BOINC client of version 5.50 or newer, because of the wrapper mechanism that is being used by MCDN.

This is just one issue, there could be others too.

diederiks
Send message
Joined: Jan 13 07
Posts: 4
Credit: 82,691
RAC: 61

Fixed the problem by rebooting the machine (didn't know Java needed this!).
The other strange thing is that the work units did work fine and I received credit for them.

Ageless
Avatar
Send message
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

These optimizers shouldn\'t get any faster: https://malariacontrol.net/workunit.php?wuid=4533448 :-)
____________
Jord.

BOINC FAQ Service

Michael
Volunteer moderator
Project scientist
Send message
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

hi all,
I just launched optimizer version 1.25 . It contains a bundled java runtime environment, so if everything goes well, you can run the optimizer app now without having java installed. That should pave the way to leave the \"testing\" status sooner or later.
regards
Michael
____________
Michael

[B^S] bigt0242000
Send message
Joined: Mar 7 06
Posts: 1
Credit: 27,241
RAC: 7

hi all,
I just launched optimizer version 1.25 . It contains a bundled java runtime environment, so if everything goes well, you can run the optimizer app now without having java installed. That should pave the way to leave the \"testing\" status sooner or later.
regards
Michael


So I am guessing it checks to see if java is installed? If java is not installed, how does the bundled jre run?
____________

Ageless
Avatar
Send message
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

So I am guessing it checks to see if java is installed?

No, the Java Runtime Environment file will run Java for you without you needing to install Java separately. Sort of a virtual Java for that application only.
____________
Jord.

BOINC FAQ Service

Michael
Volunteer moderator
Project scientist
Send message
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

Dear all,
the new application version of \"optimizer\" runs so well (ca. 1% error rate) that we decided to leave the testing state by mid this week. From Wednesday 21.Nov on we will start sending out workunits to everybody, not only the testing volunteers.

For more information about what the application is doing, see here

The java runtime environment is now being distributed as part of the application, and thus you will not need to have java installed in order to run \"optimizer\" workunits.


If you do not want to crunch workunits from this application, proceed as follows:

go to
Your Account -> malariacontrol.net preferences

set \"Run optimizer application\" to \"No\"

note that once the application has left the testing state, the setting \"run test applications\" will have no effect on whether you receive workunits from it anymore (though it does for the remaining testing applications).

thank you for staying with us
Michael

____________
Michael

Profile Krunchin-Keith [USA]
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: Nov 10 05
Posts: 3047
Credit: 5,330,818
RAC: 4,054

Dear all,
the new application version of \"optimizer\" runs so well (ca. 1% error rate) that we decided to leave the testing state by mid this week. From Wednesday 21.Nov on we will start sending out workunits to everybody, not only the testing volunteers.

The java runtime environment is now being distributed as part of the application, and thus you will not need to have java installed in order to run \"optimizer\" workunits.

thank you for staying with us
Michael


What platforms are supported, is it still only Windows XP ?

I tried to run on some Windows 98 hosts with not so good luck.
They either showed client compute error
or
as Done with 0.00 completion time and 0.00 credit but credit is still pending. About 3-4 out of 12 may have completed this way.

Michael
Volunteer moderator
Project scientist
Send message
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0


What platforms are supported, is it still only Windows XP ?

I tried to run on some Windows 98 hosts with not so good luck.
They either showed client compute error
or as Done with 0.00 completion time and 0.00 credit but credit is still pending. About 3-4 out of 12 may have completed this way.


Hi Keith,
I vaguely remember there was an issue with the wrapper approach, because win98 is not fully compatible with the windows api.. create_process() seemed to be a problem.. since we have taken the newest wrapper version of boinc, and we do get valid results from hosts with win98, I assume this issue has been resolved in the new wrapper versions - so no, it should be compatible also with win98.
The fact that some workunits terminate very quickly is because they realize early that the result is not going to be meaningful and terminate.. Unfortunately we can only do this at runtime, once the workunits are sent out.. and a wide range of durations is normal with this application, since it's a lot about exploring new types of models instead of calculating established ones.
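
The early-exit behaviour described above can be pictured with a small Python sketch. Everything in it is hypothetical (the function name `evaluate_candidate`, the running-error cutoff, the toy data); it is not the optimizer's actual code, only an illustration of why run times vary so widely when a candidate model is abandoned as soon as it is clearly worse than the best fit found so far.

```python
def evaluate_candidate(predict, data, best_error_so_far):
    """Accumulate squared error over the data; stop as soon as the
    candidate can no longer beat the best error seen so far.

    Returns (error, points_used). error is None if abandoned early.
    """
    error = 0.0
    for points_used, (x, observed) in enumerate(data, start=1):
        error += (predict(x) - observed) ** 2
        if error > best_error_so_far:
            # Hopeless candidate: quit now instead of finishing the run.
            return None, points_used
    return error, len(data)


# Toy data generated by y = 2x (an assumption for the example).
data = [(x, 2.0 * x) for x in range(100)]

good = evaluate_candidate(lambda x: 2.0 * x, data, best_error_so_far=10.0)
bad = evaluate_candidate(lambda x: 5.0 * x, data, best_error_so_far=10.0)
print(good)  # full run over all 100 points
print(bad)   # abandoned after a handful of points
```

In this toy case the well-fitting candidate uses all 100 data points while the poor one is dropped after 3, which is the same pattern as one workunit running for an hour and another finishing in seconds.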
Anyway, that\'s a personal opinion, but if I had old computers I think I wouldn\'t let it run on win98, I would take windows xp and tune it to the max (there\'s a lot of tutorials on the web), switch off all the things you don\'t need to run them as boinc client. i once did this with a 400Mhz laptop, and it actually worked ok, not slower than with something like win98, but much more reliable.. but as I said, that\'s my personal taste :)
cheers
Michael
____________
Michael

|MatMan|
Send message
Joined: Jul 17 06
Posts: 1
Credit: 144,597
RAC: 61

my first WU resulted in an error:

22.11.2007 10:55:33|malariacontrol.net beta|Starting opt_8_-392_5_429516145_0
22.11.2007 10:55:33|malariacontrol.net beta|Starting task opt_8_-392_5_429516145_0 using optimizer version 126
22.11.2007 10:56:10|malariacontrol.net beta|[error] Can\'t rename output file opt_8_-392_5_429516145_0_0
22.11.2007 10:56:12|malariacontrol.net beta|Computation for task opt_8_-392_5_429516145_0 finished
22.11.2007 10:56:12|malariacontrol.net beta|Output file opt_8_-392_5_429516145_0_0 for task opt_8_-392_5_429516145_0 absent

E6600, 2GB RAM, Windows XP, Boinc 5.10.28
____________

Profile Krunchin-Keith [USA]
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: Nov 10 05
Posts: 3047
Credit: 5,330,818
RAC: 4,054

I vaguely remember there was an issue with the wrapper approach, because win98 is not fully compatible with the windows api.. create_process() seemed to be a problem.. since we have taken the newest wrapper version of boinc, and we do get valid results from hosts with win98, I assume this issue has been resolved in the new wrapper versions - so no, it should be compatible also with win98.
The fact that some workunits terminate very quickly is because they realize early that the result is not going to be meaningful and terminate.. Unfortunately we can only do this at runtime, once the workunits are sent out.. and a wide range of durations is normal with this application, since it's a lot about exploring new types of models instead of calculating established ones.
Anyway, that\'s a personal opinion, but if I had old computers I think I wouldn\'t let it run on win98, I would take windows xp and tune it to the max (there\'s a lot of tutorials on the web), switch off all the things you don\'t need to run them as boinc client. i once did this with a 400Mhz laptop, and it actually worked ok, not slower than with something like win98, but much more reliable.. but as I said, that\'s my personal taste :)
cheers
Michael

Is there a minimum memory requirement ?

I\'ll check my machines, when I can, and see if I can determine if is just the tasks or if it is machine specific.

Most of those I have no choice in the o/s they run. I do not have reason to upgrade them,un-necessary expense, and some of the applications we run are Windows98 only, we do not have the WindowsXP version or there is not one available. Running in an emulation mode causes problems with some of the applications, so we just keep some Windows98 hosts around. We still have some software from 1989 that the boss likes and uses daily.

Profile The Gas Giant
Avatar
Send message
Joined: Mar 7 06
Posts: 1213
Credit: 3,503,340
RAC: 1,667

my first WU resulted in an error:

22.11.2007 10:55:33|malariacontrol.net beta|Starting opt_8_-392_5_429516145_0
22.11.2007 10:55:33|malariacontrol.net beta|Starting task opt_8_-392_5_429516145_0 using optimizer version 126
22.11.2007 10:56:10|malariacontrol.net beta|[error] Can\'t rename output file opt_8_-392_5_429516145_0_0
22.11.2007 10:56:12|malariacontrol.net beta|Computation for task opt_8_-392_5_429516145_0 finished
22.11.2007 10:56:12|malariacontrol.net beta|Output file opt_8_-392_5_429516145_0_0 for task opt_8_-392_5_429516145_0 absent

E6600, 2GB RAM, Windows XP, Boinc 5.10.28

I have seen this as well. Mainly when, for some reason, the hard disk is tied up by another application and BOINC can\'t do what it needs to do when a wu finishes.

glaesum
Send message
Joined: Nov 29 07
Posts: 7
Credit: 109,767
RAC: 1

Is there a minimum memory requirement ?
I\'ll check my machines, when I can, and see if I can determine if is just the tasks or if it is machine specific.

re: Windows98

has anyone reached a conclusion to the Win98 question?
I can see it\'s worth getting the latest java version first.

I\'m just setting up BOINC on an old 600mhz athlon with 384MB and it only runs win98.
the small WUs here at malariacontrol are attractive for such a limited system.

It\'s maddeningly frustrating to discover the minimum system requirements on many of the BOINC projects. I wish every project had a sticky forum thread in \'number crunching\' with that title - although Rosetta and WCG are explicit enough if you dig around. Sadly Rosetta went down just as I started to install so I haven\'t been able to try and test that one yet!

ed: ...ah, I\'ve just seen you can switch on/off each sub-application separately in one\'s own preferences page; if some are known to be ok then hopefully someone can advise which are good and which are doubtful. /pg

Profile Krunchin-Keith [USA]
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: Nov 10 05
Posts: 3047
Credit: 5,330,818
RAC: 4,054


What platforms are supported, is it still only Windows XP ?

I tried to run on some Windows 98 hosts with not so good luck.
They either showed client compute error
or as Done with 0.00 completion time and 0.00 credit but credit is still pending. About 3-4 out of 12 may have completed this way.


Hi Keith,
I vaguely remember there was an issue with the wrapper approach, because win98 is not fully compatible with the windows api.. create_process() seemed to be a problem.. since we have taken the newest wrapper version of boinc, and we do get valid results from hosts with win98, I assume this issue has been resolved in the new wrapper versions - so no, it should be compatible also with win98.
The fact that some workunits terminate very quickly is because they realize early that the result is not going to be meaningful and terminate.. Unfortunately we can only do this at runtime, once the workunits are sent out.. and a wide range of durations is normal with this application, since it's a lot about exploring new types of models instead of calculating established ones.
Anyway, that\'s a personal opinion, but if I had old computers I think I wouldn\'t let it run on win98, I would take windows xp and tune it to the max (there\'s a lot of tutorials on the web), switch off all the things you don\'t need to run them as boinc client. i once did this with a 400Mhz laptop, and it actually worked ok, not slower than with something like win98, but much more reliable.. but as I said, that\'s my personal taste :)
cheers
Michael


Can you explain why some hosts are showing ZERO time and claimed credit? It's not just Windows98. There is another thread started in which this seems to be an issue.

It appears to us that the hosts claiming zero time are all marked valid, but the effect it has is that it lowers the claimed credit, and possibly the CPU claiming zero is getting too much or being cheated out, but there is no way to tell without knowing the actual time.

I think the users would like this issue addressed.

Michael
Volunteer moderator
Project scientist
Send message
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0


Can you explain why some hosts are showing ZERO time and claimed credit? It's not just Windows98. There is another thread started in which this seems to be an issue.

It appears to us that the hosts claiming zero time are all marked valid, but the effect it has is it lowers the claimed credit and possibly the cpu claiming zero is getting too much or cheated out, but there is no way to tell without knowing the actual time.

I think the users would like this issue addressed.


hi keith and others,
first, sorry for the long delay.. I think something with the "subscribe to this thread" function does not work as it should; please report if somebody else doesn't get notified about new posts.. will look into this as well.
We investigated the issue, and it is solved now. From now on, people who use 0 cpu time do not get any credit, and don't pull down the granted credit for others. Since this is a sensitive issue, I would prefer not to explain exactly how it came to that and how we solved it. I can only say that it was most likely not intended by the involved users, but a bug, or say an "unfortunate" co-occurrence in our software and the boinc software, which together granted too much credit in some cases..
We will now try to see how much credit we have to subtract from - only a handful of probably unknowing but lucky - users, in order to restore justice ..
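
To see why a zero claim matters at all, here is a minimal Python sketch of quorum-based credit. It assumes that the granted credit is simply the lowest claim in the quorum, which is a simplification and not necessarily the rule malariacontrol.net used. The function names and the rule of skipping zero-CPU-time results are illustrative assumptions only, since the actual cause and fix are deliberately not described in this thread.

```python
def granted_credit_naive(claims):
    """Every result in the quorum is granted the lowest claim."""
    return min(claims)


def granted_credit_ignoring_zero(results):
    """results: list of (cpu_seconds, claimed_credit) pairs.

    Results with zero CPU time get nothing and are left out when the
    grant for the remaining results is computed.
    """
    valid_claims = [claim for cpu, claim in results if cpu > 0]
    grant = min(valid_claims) if valid_claims else 0.0
    return [grant if cpu > 0 else 0.0 for cpu, _ in results]


# Made-up quorum: two normal results and one reporting zero CPU time.
quorum = [(2700.0, 12.5), (0.0, 0.0), (2650.0, 12.0)]

print(granted_credit_naive([claim for _, claim in quorum]))
print(granted_credit_ignoring_zero(quorum))
```

Under the naive rule the single zero claim drags the grant for the whole quorum down to 0.0; with zero-time results excluded, the two normal hosts are granted 12.0 each and the zero-time host gets nothing.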

regards
Michael



____________
Michael

Profile mikey
Avatar
Send message
Joined: Mar 23 07
Posts: 4120
Credit: 5,299,282
RAC: 1,702


Can you explain why some hosts are showing ZERO time and claimed credit? It's not just Windows98. There is another thread started in which this seems to be an issue.

It appears to us that the hosts claiming zero time are all marked valid, but the effect it has is it lowers the claimed credit and possibly the cpu claiming zero is getting too much or cheated out, but there is no way to tell without knowing the actual time.

I think the users would like this issue addressed.


hi keith and others,
first, sorry for the long delay.. I think something with the "subscribe to this thread" function does not work as it should; please report if somebody else doesn't get notified about new posts.. will look into this as well.
We investigated the issue, and it is solved now. From now on, people who use 0 cpu time do not get any credit, and don't pull down the granted credit for others. Since this is a sensitive issue, I would prefer not to explain exactly how it came to that and how we solved it. I can only say that it was most likely not intended by the involved users, but a bug, or say an "unfortunate" co-occurrence in our software and the boinc software, which together granted too much credit in some cases..
We will now try to see how much credit we have to subtract from - only a handful of probably unknowing but lucky - users, in order to restore justice ..

regards
Michael


For me I would like to say THANKS for finding and solving the problem! We users do not always need to know the whys, sometimes it is of a sensitive nature and we just do not always need to know. The important point is that you figured it out and stopped it in the future, again THANK YOU!
____________

Michael
Volunteer moderator
Project scientist
Send message
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

hi all,
after major refactoring, fixing, beautifying, adding stuff etc. to the optimizer application, we are coming back online.. since the application has changed quite a bit we will put it back to testing status for some time (at least today, let's see how it goes).. Just to make sure we didn't introduce some horrible bugs. Once a sufficient number of workunits has come back and everything looks ok, we will leave testing status and send work to everybody who has checked the "run optimizer app" checkbox in the account -> malariacontrol.net preferences. As before, you will be able to opt out of getting such workunits by changing your account settings.
looking forward to new results (am very excited :) and thanks again for crunching
have a nice weekend
Michael
____________
Michael

Ageless
Avatar
Send message
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

looking forward to new results (am very excited :) and thanks again for crunching

So much for that then. 7 hours of No work from project, there was work but not for the applications you have allowed. I\'ve turned the Malariacontrol simulation back on and got fed from that immediately.

Will try again later this weekend.
____________
Jord.

BOINC FAQ Service

Profile The Gas Giant
Avatar
Send message
Joined: Mar 7 06
Posts: 1213
Credit: 3,503,340
RAC: 1,667

looking forward to new results (am very excited :) and thanks again for crunching

So much for that then. 7 hours of No work from project, there was work but not for the applications you have allowed. I\'ve turned the Malariacontrol simulation back on and got fed from that immediately.

Will try again later this weekend.

LOL...yeah I even allowed test applications, but nada. It\'ll be interesting to see when the first wu\'s come through.

Live long and BOINC!

____________
Paul
(S@H1 8888)

Michael
Volunteer moderator
Project scientist
Send message
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0


So much for that then. 7 hours of No work from project, there was work but not for the applications you have allowed. I\'ve turned the Malariacontrol simulation back on and got fed from that immediately.

Will try again later this weekend.

LOL...yeah I even allowed test applications, but nada. It\'ll be interesting to see when the first wu\'s come through.

Live long and BOINC!

I'm sorry guys, I closed the tap yesterday night, panicking a bit before going for a beer, because I had the impression something was wrong: One of my workunits finished three times but "without a finished file", as it said.. Did this happen to anybody else? Please check in your messages whether you keep on crunching the same workunit or not.. Might be just a problem with my computer, though.. ok, and maybe I have to send out some more workunits in order to be able to ask someone :)
thx

____________
Michael

Profile Krunchin-Keith [USA]
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: Nov 10 05
Posts: 3047
Credit: 5,330,818
RAC: 4,054

I'm sorry guys, I closed the tap yesterday night, panicking a bit before going for a beer, because I had the impression something was wrong: One of my workunits finished three times but "without a finished file", as it said.. Did this happen to anybody else? Please check in your messages whether you keep on crunching the same workunit or not.. Might be just a problem with my computer, though.. ok, and maybe I have to send out some more workunits in order to be able to ask someone :)
thx

This is for Optimizer 1.28, right ?
I see:
host #1 ran 1 task for 39 minutes, finished with success

host #2 ran 1 task for 16-3/4 hours and exited with \'resource limit exceeded\'
Log shows messages:
maximum CPU time exceeded
then [error] Can\'t rename output file X
task X finished
Output file for task X absent
host #2 is running a second task which has been running for 1-3/4 hours so far.

Ageless
Avatar
Send message
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

Please check in your messages whether you keep on crunching the same workunit or not..

That might be a problem with your BOINC version as well. I\'m alpha testing 6.1.8 just to see if I can get it to do those things, but that\'s difficult if all I get is the No work from project message. :-(

But I\'ll go back to waiting for an Optimizer.
____________
Jord.

BOINC FAQ Service

Michael
Volunteer moderator
Project scientist
Send message
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

Please check in your messages whether you keep on crunching the same workunit or not..

That might be a problem with your BOINC version as well. I\'m alpha testing 6.1.8 just to see if I can get it to do those things, but that\'s difficult if all I get is the No work from project message. :-(

But I\'ll go back to waiting for an Optimizer.


thanks for the information,
I have a wu running for more than 5 hours (2h should be the absolute maximum - abort them if they exceed 2h).. so something is wrong. There are at least 2 issues:
1. it doesn't stop after 2h, which means I have to go through the code again and make a new application version
2. I made a mistake when configuring the "estimated fpops" parameter for the workunit: one zero too many.. how stupid.. That means your clients get somewhat confused and think they are supercomputers; they get a very high opinion of themselves and think they are 10 times faster than they actually are, because they finished so much work in no time. This causes them to pick up more work than they can actually do before the deadline (they'll realize their mistake in time, though, and will come back down to earth).. It's not too bad for you, I think, just a bit for us, but not much, since we are still testing.. To get out of this, just do reset project.. easy to fix on our side

I generally think it's ok to abort the workunits, since there is a chance you won't get credit for them.. sorry about that. Have to start from scratch again.
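
A rough sketch of the arithmetic behind point 2, in Python. It assumes a simplified form of the client's estimate, namely run time = estimated floating-point operations / host speed x a per-project correction factor that the client adjusts from experience. The numbers and the function name are made up for illustration, and the real client's bookkeeping is more involved.

```python
def estimated_runtime_s(fpops_est, host_flops, correction_factor=1.0):
    """Simplified run-time estimate in seconds (an assumption, see above)."""
    return fpops_est / host_flops * correction_factor


HOST_FLOPS = 1.0e9   # hypothetical host speed: 1 GFLOPS
TRUE_FPOPS = 3.6e12  # work that really takes 1 hour on this host

# The workunit was configured with one zero too many.
wrong_estimate = estimated_runtime_s(10 * TRUE_FPOPS, HOST_FLOPS)

# The task actually finishes in 1 hour, so the client concludes that it
# needs only a tenth of the estimated time and lowers its factor.
actual_runtime = TRUE_FPOPS / HOST_FLOPS
learned_factor = actual_runtime / wrong_estimate

# A later, correctly configured 1-hour workunit now looks like 6 minutes,
# so the client requests far more work than it can finish in time.
later_estimate = estimated_runtime_s(TRUE_FPOPS, HOST_FLOPS, learned_factor)

print(wrong_estimate, actual_runtime, learned_factor, later_estimate)
```

Resetting the project discards the learned factor, which is why it is the suggested way out for hosts that already crunched some of the misconfigured workunits.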

so there will be no work for some time now until we\'re ready..

I\'ll be looking into this over the weekend
sorry for inconveniences
Michael


____________
Michael

Ageless
Avatar
Send message
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

I generally think it's ok to abort the workunits, since there is a chance you won't get credit for them.. sorry about that. Have to start from scratch again.

so there will be no work for some time now until we\'re ready..

Like we\'re here for the credits... I\'m not. (You\'re not either... :-))
But nice to see a developer say \"Hey, look, I messed up. I\'ll be available for a public flogging later this weekend!\".. refreshing. :-)

Just holler when the new new new Optimizers are ready to be tested.

Do you also tell your boss that he added a zero too many to your check? ;-)
____________
Jord.

BOINC FAQ Service

BobCat13
Send message
Joined: Jan 4 07
Posts: 6
Credit: 150,785
RAC: 246

I'm sorry guys, I closed the tap yesterday night, panicking a bit before going for a beer, because I had the impression something was wrong: One of my workunits finished three times but "without a finished file", as it said.. Did this happen to anybody else? Please check in your messages whether you keep on crunching the same workunit or not.. Might be just a problem with my computer, though.. ok, and maybe I have to send out some more workunits in order to be able to ask someone :)
thx


I had this task run for 13:46:56 and then received the following messages:

8:55:16 AM Aborting task opt_27_-18_5_662409262_0: exceeded CPU time limit 49609.375000
8:55:24 AM [error] Can\'t rename output file opt_27_-18_5_662409262_0_0
8:55:25 AM Computation for task opt_27_-18_5_662409262_0 finished
8:55:25 AM Output file opt_27_-18_5_662409262_0_0 for task opt_27_-18_5_662409262_0 absent

Rookie_69
Send message
Joined: Jan 26 08
Posts: 2
Credit: 55,416
RAC: 0

I guess I should go abort those two opt WU\'s that have been running for seven hours then...

How about the one that says estimated time to completion over 35 hours?

Michael
Volunteer moderator
Project scientist
Send message
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

I guess I should go abort those two opt WU\'s that have been running for seven hours then...

How about the one that says estimated time to completion over 35 hours?


abort them..
actually you should abort all of them now.. a new series will come in 1-2 hours..
you will get credit for those where all results of a workunit have been processed, but there are some where only 1 or 2 per wu were done.. am sorry about that, but I have to cancel them, since it is not guaranteed that another result of those would succeed.
the bug is fixed, and running well in our small-scale testing environment..
please wait for the new series..

regards
Michael
____________
Michael

Michael
Volunteer moderator
Project scientist
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

update:

- new set of wu's is out
- make sure you have optimizer version 1.29 (should happen automatically). if not, abort them and do "reset project".
- estimated time to completion is still too high for the first few wu's.. in order not to shock the clients too much which were crunching some of the wu's before.. ignore estimated time at first. The second row of wu's has correct time indicated.
- maximum duration is 2h, anything that runs (much) longer is likely to be flawed, report please..
- errors due to "time limit exceeded" will not happen anymore
- pray for good success to whoever you wish :)

michael
____________
Michael

Chris Sutton
Joined: Nov 10 05
Posts: 297
Credit: 4,941,683
RAC: 0

you will get credit for those where all results of a workunit have been processed, but there are some where only 1 or 2 per wu were done.. am sorry about that, but I have to cancel them, since it is not guaranteed that another result of those would succeed.

Michael,
I think this has had some unintentional consequences.
Please see this thread: WUs with errors "cancelled"

TIA

Ageless
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

- maximum duration is 2h, anything that runs (much) longer is likely to be flawed, report please..

10-Feb-08 17:48:51|malariacontrol.net beta|Starting opt_24_-103_5_41959952_2
10-Feb-08 17:48:51|malariacontrol.net beta|Starting task opt_24_-103_5_41959952_2 using optimizer version 129
10-Feb-08 19:49:20|malariacontrol.net beta|Computation for task opt_24_-103_5_41959952_2 finished

It runs for 2 hours 29 seconds. Well done. :-)
Task ID. Finished well on BOINC 6.1.8
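
The run time can also be read straight off the two log lines. A minimal sketch, assuming the `DD-Mon-YY HH:MM:SS|project|message` layout shown above and an English locale for the month abbreviation (`log_time` is a made-up helper name):

```python
from datetime import datetime

def log_time(line: str) -> datetime:
    """Parse the timestamp in the first '|'-separated field of a message-log line."""
    stamp = line.split("|", 1)[0]
    # %b matches "Feb" only under an English/C locale (assumption).
    return datetime.strptime(stamp, "%d-%b-%y %H:%M:%S")

start = log_time("10-Feb-08 17:48:51|malariacontrol.net beta|"
                 "Starting task opt_24_-103_5_41959952_2 using optimizer version 129")
end = log_time("10-Feb-08 19:49:20|malariacontrol.net beta|"
               "Computation for task opt_24_-103_5_41959952_2 finished")

print(end - start)  # 2:00:29
```

Note this is wall-clock time between the two messages, not CPU time, so it only approximates what the 2h limit is measured against.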
____________
Jord.

BOINC FAQ Service

The Gas Giant
Joined: Mar 7 06
Posts: 1213
Credit: 3,503,340
RAC: 1,667

Working well here.....

Live long and BOINC!

Ageless
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

And they also run for shorter times.

11-Feb-08 13:46:33|malariacontrol.net beta|Starting opt_37_-126_5_73795069_3
11-Feb-08 13:46:33|malariacontrol.net beta|Starting task opt_37_-126_5_73795069_3 using optimizer version 129
11-Feb-08 15:01:30|malariacontrol.net beta|Computation for task opt_37_-126_5_73795069_3 finished

About an hour and 15 minutes there. Looks like it ended normally, or its flop counter thought it was 2 hours already. ;-)
____________
Jord.

BOINC FAQ Service

RandyC
Joined: Jun 23 06
Posts: 2695
Credit: 850,101
RAC: 1,184

Run seems to be progressing (mostly) OK. Just reported a string of 25 WUs with only 1 bad unit:
malariacontrol.net beta 2/12/2008 1:02:11 AM [error] Can't rename output file opt_2_-6953_5_886055248_1_0

Error on this wu.

This project seems to have taken total control of my system...built up a debt of 54000 seconds since yesterday evening, so I've set no-new-work 'til the current load clears and my other project gets a shot at the machine.

Wayne Farmer
Joined: Nov 26 07
Posts: 10
Credit: 193,626
RAC: 47

Developers of the optimizer application might be interested in a problem where the optimizer fails to respond to a "suspend" request from BOINC.

RandyC
Joined: Jun 23 06
Posts: 2695
Credit: 850,101
RAC: 1,184

Had to abort this Opt WU. I just noticed it...it started at 10:41pm last night (after I went to bed) and ran until I aborted it at about 7:15pm tonight. MCN LTD is up to 63,700 now and will just have to work itself off against my other project.

I am disabling Opt WU processing on this machine, I can't afford this kind of tie-up.

[edit]Here's my message log:

Project Date Message
malariacontrol.net beta 3/5/2008 10:41:51 PM Starting opt_44_-519_5_144224089_2
malariacontrol.net beta 3/5/2008 10:41:51 PM Starting task opt_44_-519_5_144224089_2 using optimizer version 130
Einstein@Home 3/6/2008 10:33:56 AM Task h1_0804.45_S5R3__152_S5R3b_1 exited with a DLL initialization error.
Einstein@Home 3/6/2008 10:33:56 AM If this happens repeatedly you may need to reboot your computer.
malariacontrol.net beta 3/6/2008 10:33:56 AM Task opt_44_-519_5_144224089_2 exited with a DLL initialization error.
malariacontrol.net beta 3/6/2008 10:33:56 AM If this happens repeatedly you may need to reboot your computer.
malariacontrol.net beta 3/6/2008 10:33:56 AM Restarting task opt_44_-519_5_144224089_2 using optimizer version 130
malariacontrol.net beta 3/6/2008 7:16:55 PM [error] Can't rename output file opt_44_-519_5_144224089_2_0
malariacontrol.net beta 3/6/2008 7:17:03 PM Computation for task opt_44_-519_5_144224089_2 finished
malariacontrol.net beta 3/6/2008 7:17:03 PM Starting wu_139_415_104113_0_1204726367_2
malariacontrol.net beta 3/6/2008 7:17:03 PM Starting task wu_139_415_104113_0_1204726367_2 using malariacontrol version 556
[/edit] [edit2]corrected starting time[/edit]

Chris Sutton
Joined: Nov 10 05
Posts: 297
Credit: 4,941,683
RAC: 0

Another reported problem with Optimizer application (optimizer_1.32_windows_intelx86 - Entry point not found) here

glaesum
Joined: Nov 29 07
Posts: 7
Credit: 109,767
RAC: 1

Krunchin-Keith, myself and others were chatting about Win98 higher up this thread in late Nov/Dec - {here} - so I thought I'd try switching on the new optimiser 1.33 application. Well, four or five tasks went through and failed, so it's back off again. Herewith an example:

Task ID 23419473
Name opt_46_-2469_5_114887926_2
Workunit 7429277
Outcome Client error
Client state Compute error
Exit status -185 (0xffffff47)
Computer ID 66908
CPU time 0
stderr out
<core_client_version>5.10.30</core_client_version>
<![CDATA[
<message>
CreateProcess() failed - A device attached to the system is not functioning. (0x1f)
</message>
]]>

Validate state Invalid
_

meanwhile malariacontrol v.5.56 tasks still run fine on win98 although on the old rig they take quite a few hours.
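
In case it helps with reading the task output above: the two hex numbers are just other spellings of ordinary error codes. A small sketch of the arithmetic (variable names are mine):

```python
exit_status = -185                       # "Exit status -185 (0xffffff47)"
as_unsigned = exit_status & 0xFFFFFFFF   # same bits read as an unsigned 32-bit value
print(hex(as_unsigned))                  # 0xffffff47

win32_error = int("0x1f", 16)            # code at the end of the CreateProcess() message
print(win32_error)                       # 31
```

As far as I know, 31 is the Windows system error ERROR_GEN_FAILURE, whose standard text is exactly "A device attached to the system is not functioning", so the wording likely comes from the operating system rather than from the optimizer itself.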

Michael
Volunteer moderator
Project scientist
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0


CreateProcess() failed - A device attached to the system is not functioning. (0x1f)
</message>
]]>

Validate state Invalid


Hi, CreateProcess() is a function called in the wrapper application, so this problem must exist for projects using the wrapper approach - hmm, no, wrong, we used the wrapper earlier with the mapping application and it worked on win98 (except for creating a nasty console window for every task). The things different in this application are only 2, as far as I can think: unzipping of the java runtime environment, and running a java virtual machine.. but then why does CreateProcess() fail? I personally have no idea so far, any suggestions welcome. If you watch the slot directory, do you see a directory called "jre" created there? If it's not there, that would be an explanation. Any suggestions?
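
The "jre" check can be done by eye in the file browser while a task is running, or with a small script like the sketch below. It assumes the usual BOINC layout with a "slots" folder under the data directory; the path in the example is hypothetical and `slots_with_jre` is a made-up helper name.

```python
from pathlib import Path

def slots_with_jre(boinc_data_dir: str) -> dict:
    """Map each slot directory name to whether it contains a 'jre' subdirectory."""
    slots = Path(boinc_data_dir) / "slots"
    if not slots.is_dir():
        return {}
    return {
        slot.name: (slot / "jre").is_dir()
        for slot in sorted(slots.iterdir())
        if slot.is_dir()
    }

if __name__ == "__main__":
    # Hypothetical path; adjust it to your own BOINC installation.
    for name, has_jre in slots_with_jre(r"C:\Program Files\BOINC").items():
        print(f"slot {name}: jre {'present' if has_jre else 'missing'}")
```

Run it while an opt_ task is active; a slot that reports "missing" would point at the unzip step rather than at the java virtual machine.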
____________
Michael

Michael
Volunteer moderator
Project scientist
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

FYI

as of now, the maximum duration of opt_ workunits is set to 3h. We want to test this, since we came to realize that for some specific runs, calculating an hour more would provide crucial information and speed up the fitting process a lot. At the same time we have adjusted the (server side) parameters as well, in order to make the fitting process more efficient.
This should mean that those 3-hour wu's just appear in the beginning, and should quickly become very rare (as the server side algorithm is now also more efficient and will, after some learning time, steer away from creating such wu's in the first place..)
Just don't kill them when 2h are over, as it was before.
thx
____________
Michael

Ageless
Joined: Jun 29 06
Posts: 261
Credit: 146,907
RAC: 217

Best post that on the main page, News section as well. If only to reach those people that read by RSS only.

____________
Jord.

BOINC FAQ Service

Michael
Volunteer moderator
Project scientist
Joined: May 5 06
Posts: 79
Credit: 494
RAC: 0

Test of new optimizer app version

The "optimizer" application will re-enter testing status for a few days,
after some major changes to the application code. This means that during
this period, only users who allow work for test applications in
their project preferences will get work for this application. Once the
testing completes successfully, we will change the status back.
This is expected to happen from next week.
____________
Michael

Krunchin-Keith [USA]
Volunteer moderator
Volunteer tester
Joined: Nov 10 05
Posts: 3047
Credit: 5,330,818
RAC: 4,054

On one computer I got this message:
5/9/2008 10:17:46 AM|malariacontrol.net|Message from server: Estimation of parameters of infection dynamics (no progress bar, max 3h) needs 361.78MB more disk space. You currently have 353.48 MB available and it needs 715.26 MB.

This is wrong as I have 2+GB Free for BOINC, 4.56GB allocated and 2.17GB in use. This may be a client bug, but...

On another computer with about same disk usage, it got work, it had only 1.90GB in use so it had a little more room. And has 5 tasks but malariacontrol is only taking up 107MB of disk space for those 5 and any other previously downloaded application files.

So why does the application say it needs so much more ?
How much space does the application really require ?
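
For reference, the numbers in the server message are at least consistent with each other, and they disagree with what the client shows locally. A quick check using only the figures quoted above (variable names are mine):

```python
available_mb = 353.48   # "You currently have 353.48 MB available"
required_mb = 715.26    # "it needs 715.26 MB"
shortfall_mb = round(required_mb - available_mb, 2)
print(shortfall_mb)     # 361.78, the "needs ... more disk space" figure

allocated_gb = 4.56     # allocated to BOINC on this host
in_use_gb = 2.17        # currently in use
local_free_gb = round(allocated_gb - in_use_gb, 2)
print(local_free_gb)    # 2.39
```

So the shortfall is just required minus available, but the 353.48 MB "available" is far below the roughly 2.39 GB the host itself reports as free. This does not say where the smaller figure comes from, only that the two views differ.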


Copyright © 2013 africa@home