Version 6.15/6.16 of the malariacontrol science application


Message boards : Number crunching : Version 6.15/6.16 of the malariacontrol science application

maire
Forum moderator
Project administrator
Project developer
Project scientist
Joined: Nov 7 05
Posts: 426
Credit: 87,306
RAC: 289
Message 9695 - Posted 11 Apr 2009 16:23:14 UTC

    We’re now ready to start testing version 6.15 of the science application. It contains fixes for problems we discovered with 6.12/13, and some improvements to the part of the application that reads the input files (an XML parser). As a consequence of these improvements, we had to update our Linux build system to a newer distribution. It is recommended to build BOINC science applications on a system with an old Linux distribution installed, to prevent problems due to library incompatibilities, but getting the application to build on the Linux image suggested by the BOINC project team proved impossible. We’re now using the oldest Ubuntu distribution which is still supported (6.06 LTS), and hope this is not causing problems for Linux users.

    Nick

    ____________
    Nicolas Maire
    Swiss Tropical and Public Health Institute
    http://www.swisstph.ch

    The Gas Giant
    Joined: Mar 7 06
    Posts: 1185
    Credit: 1,783,971
    RAC: 2,140
    Message 9696 - Posted 11 Apr 2009 19:48:46 UTC

      Last modified: 11 Apr 2009 19:50:20 UTC

      Wow...long wu's! The estimated FLOPS are way off.
      ____________
      Paul
      (S@H1 8888)

      Gary Charpentier
      Joined: Aug 21 08
      Posts: 1
      Credit: 17,483
      RAC: 29
      Message 9697 - Posted 11 Apr 2009 19:56:07 UTC

        And this W/U core dumped
        http://www.malariacontrol.net/result.php?resultid=44621650

        ____________

        P . P . L .
        Joined: Aug 27 08
        Posts: 26
        Credit: 110,724
        RAC: 0
        Message 9699 - Posted 12 Apr 2009 7:33:10 UTC

          Hi.

          This one was a problem for everyone, so far.

          http://www.malariacontrol.net/workunit.php?wuid=16322596

          Outcome Client error
          Client state Compute error
          Exit status -177 (0xffffff4f)
          Computer ID 114782
          Report deadline 15 Apr 2009 17:02:12 UTC
          CPU time 0
          stderr out

          <core_client_version>6.2.14</core_client_version>
          <![CDATA[
          <message>
          Maximum disk usage exceeded
          </message>

          pete

          ]]>
          ____________
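A side note on the exit status quoted above: "-177 (0xffffff4f)" is one number shown two ways, as a signed integer and as the same 32 bits read as unsigned hexadecimal. The sketch below only illustrates that conversion; the helper names are made up for this example and are not part of BOINC.

```python
def as_unsigned32_hex(status: int) -> str:
    """Format a signed 32-bit exit status the way the result page prints it."""
    # Masking with 0xFFFFFFFF reinterprets a negative value as two's complement.
    return f"0x{status & 0xFFFFFFFF:08x}"


def as_signed32(hex_text: str) -> int:
    """Turn a hex string such as '0xffffff4f' back into the signed status."""
    value = int(hex_text, 16)
    # If the top bit is set, the value represents a negative number.
    return value - (1 << 32) if value & 0x80000000 else value


print(as_unsigned32_hex(-177))
print(as_signed32("0xffffff4f"))
```

Both directions are handy when searching logs, since some tools print the signed form and others the hex form.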

          Tore Zachariassen
          Joined: Oct 22 07
          Posts: 1
          Credit: 50,647
          RAC: 1
          Message 9700 - Posted 12 Apr 2009 8:25:35 UTC

            Last modified: 12 Apr 2009 8:29:16 UTC

            I've tested version 6.15 on both Ubuntu 8.04 and 8.10, and it's working perfectly :)
            Not only is it givin' me a lot of success-results (no errors so far - after they fixed that 'attribute nspore'-error yesterday), but it's also very fast compared to those guys running window$$ (as this work unit shows) :D
            Good work Nick :)

            maire
            Forum moderator
            Project administrator
            Project developer
            Project scientist
            Joined: Nov 7 05
            Posts: 426
            Credit: 87,306
            RAC: 289
            Message 9701 - Posted 12 Apr 2009 8:41:55 UTC

              The Windows version of 6.15 seemed to have a problem: the tasks were taking much too long to complete. We released a new version 6.16 for Windows, which should fix this.

              If you still have tasks running on Windows version 6.15, please abort those.

              Thanks
              Nick
              ____________
              Nicolas Maire
              Swiss Tropical and Public Health Institute
              http://www.swisstph.ch

              Mattia Verga
              Joined: Jan 19 07
              Posts: 6
              Credit: 24,634
              RAC: 52
              Message 9702 - Posted 12 Apr 2009 9:17:45 UTC

                Tested one WU on Fedora 10 x64, no problem: 44626930

                Chris Sutton
                Joined: Nov 10 05
                Posts: 295
                Credit: 4,941,683
                RAC: 0
                Message 9707 - Posted 12 Apr 2009 16:30:31 UTC - in response to Message 9695.

                  We’re now using the oldest Ubuntu distribution which is still supported (6.06 LTS), and hope this is not causing problems for Linux users.

                  My older linux client doesn't seem to have a problem starting the work, only with finishing it. :)

                  <message>
                  Maximum disk usage exceeded
                  </message>

                  John Clark
                  Joined: Feb 10 08
                  Posts: 867
                  Credit: 738,143
                  RAC: 1,030
                  Message 9709 - Posted 12 Apr 2009 23:48:39 UTC

                    My crunching of the Win 6.16 WU seems to take about 49 minutes
                    ____________
                    Go away, I was asleep

                    Said a Russell, 3 Shih-Tzus & a Bischeon Frize

                    Tom Philippart
                    Joined: Jun 25 06
                    Posts: 28
                    Credit: 218,769
                    RAC: 25
                    Message 9735 - Posted 15 Apr 2009 10:29:26 UTC

                      there's still a HUGE gap between linux and windows crunch times, please see this WU: http://www.malariacontrol.net/workunit.php?wuid=16324300

                      I crunched it with version 6.16
                      ____________

                      RandyC
                      Joined: Jun 23 06
                      Posts: 954
                      Credit: 127,798
                      RAC: 119
                      Message 9739 - Posted 15 Apr 2009 11:32:18 UTC - in response to Message 9735.

                        there's still a HUGE gap between linux and windows crunch times, please see this WU: http://www.malariacontrol.net/workunit.php?wuid=16324300

                        I crunched it with version 6.16


                        Lucky for you the higher credit was granted. It was >2x the smaller credit.

                        Keith [SETI.USA]
                        Joined: Feb 25 08
                        Posts: 1
                        Credit: 100,637
                        RAC: 0
                        Message 9745 - Posted 15 Apr 2009 17:06:35 UTC

                          How do you verify the version (6.15 vs 6.16) running?

                          Thanks
                          Keith

                          maire
                          Forum moderator
                          Project administrator
                          Project developer
                          Project scientist
                          Joined: Nov 7 05
                          Posts: 426
                          Credit: 87,306
                          RAC: 289
                          Message 9750 - Posted 16 Apr 2009 0:37:10 UTC - in response to Message 9745.

                            How do you verify the version (6.15 vs 6.16) running?

                            Thanks
                            Keith


                            Your client shows the version number for running tasks in the "Application" column on the "Task" panel (you may need to select advanced view for this). For completed tasks, you can go to "Your account" on this web page, "Tasks View", "Task ID click for details", and check the version at the bottom.
                            Nick

                            ____________
                            Nicolas Maire
                            Swiss Tropical and Public Health Institute
                            http://www.swisstph.ch

                            glaesum
                            Joined: Nov 29 07
                            Posts: 7
                            Credit: 41,409
                            RAC: 3
                            Message 9753 - Posted 16 Apr 2009 11:55:05 UTC

                              Last modified: 16 Apr 2009 11:55:52 UTC

                              hi K-Keith, if you are monitoring the forum...

                              we did the same wu together the other day, wu#16331148 #this workunit# sent apr14th.
                              we have almost identical systems, yours is a fraction faster, otherwise same cpu, same OS, same boinc version but mine took 2.5times longer to crunch than yours. any ideas why??

                              /pete

                              RandyC
                              Joined: Jun 23 06
                              Posts: 954
                              Credit: 127,798
                              RAC: 119
                              Message 9807 - Posted 24 Apr 2009 11:32:09 UTC

                                Some really off-the-wall credit claims for 6.16:
                                Using BOINC 5.10.30 on Win XP Pro-32

                                This WU took 8,638.09 secs and claimed 97.10 CS
                                This WU took 9,834.59 secs and claimed 110.55 CS (granted 43.81)
                                This WU took 5,956.16 secs and claimed 66.95 CS
                                This WU took 13,084.91 secs and claimed 147.09 CS

                                Now This WU took 13,940.41 secs and claimed/granted 41.87 CS. It was issued 22 Apr 2009 18:44:17 UTC, while all the above were issued 23 Apr 2009 2:21:57 UTC or later.
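To make the mismatch in these numbers easier to see, here is a small sketch that turns each run time and claimed credit from the list above into credit per hour. The figures are copied from the post; the function and variable names are only for illustration. The four later tasks come out at around 40 CS per hour and the earlier one at around 11, so the claim rate itself changed between the two batches rather than the work.

```python
# (run time in seconds, claimed credit) pairs copied from the post above.
later_tasks = [
    (8638.09, 97.10),
    (9834.59, 110.55),
    (5956.16, 66.95),
    (13084.91, 147.09),
]
earlier_task = (13940.41, 41.87)


def credit_per_hour(seconds: float, credit: float) -> float:
    """Claimed credit normalised to one hour of run time."""
    return credit / seconds * 3600.0


later_rates = [credit_per_hour(s, c) for s, c in later_tasks]
earlier_rate = credit_per_hour(*earlier_task)

for rate in later_rates:
    print(f"later task:   {rate:.2f} CS/hour")
print(f"earlier task: {earlier_rate:.2f} CS/hour")
print(f"ratio:        {later_rates[0] / earlier_rate:.2f}x")
```

A near-constant rate within one batch and a jump between batches is what one would expect if the claim is proportional to run time multiplied by some per-host factor that changed in between.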

                                Krunchin-Keith [USA]
                                Forum moderator
                                Volunteer tester
                                Joined: Nov 10 05
                                Posts: 1535
                                Credit: 2,287,789
                                RAC: 1,644
                                Message 9808 - Posted 24 Apr 2009 13:10:55 UTC - in response to Message 9753.

                                  hi K-Keith, if you are monitoring the forum...

                                  we did the same wu together the other day, wu#16331148 #this workunit# sent apr14th.
                                  we have almost identical systems, yours is a fraction faster, otherwise same cpu, same OS, same boinc version but mine took 2.5times longer to crunch than yours. any ideas why??

                                  /pete

                                  That work unit is no longer available.

                                  My guess would be that it depends on other things running on the system: disk speed, bus speed, memory bandwidth, available memory, and so on. Certain things take wall time away from BOINC tasks, and I have seen applications that measure elapsed wall time rather than actual CPU time used. Other things, such as a stuck Windows window (not responding), sometimes do the same thing. I don't know the exact reason here. Did you reboot in the middle of the run, or is your task switch interval short? When an application is suspended and then restarts, it sometimes has to go back a little or a lot and redo those calculations, adding more time to the time already used. I have a 90-minute switch interval set on my system. It may also depend on which other BOINC project's task is running; I have also seen tasks that use more than 50% CPU (in my case, running two, each should have 50%).

                                  Without observing myself the actual task running and noting what else is going on, it is very hard to guess at the actual cause.
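For anyone unsure about the wall time versus CPU time distinction mentioned above, here is a minimal sketch using only the Python standard library. It is not how the science application or the BOINC client measures time; it only shows that the two clocks can disagree widely when a process is waiting rather than computing. The `measure` helper is a made-up name.

```python
import time


def measure(work):
    """Run work() and return (wall_seconds, cpu_seconds) it consumed."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time used by this process only
    work()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


# Sleeping stands in for a task that is suspended or starved of CPU:
# wall time keeps running, CPU time barely moves.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"waiting:   wall={wall:.3f}s cpu={cpu:.3f}s")

# A busy loop stands in for real computation: both clocks advance.
wall_busy, cpu_busy = measure(lambda: sum(i * i for i in range(200_000)))
print(f"computing: wall={wall_busy:.3f}s cpu={cpu_busy:.3f}s")
```

If a run time is reported from the wall clock, anything that delays the task inflates it, even though the amount of computation is unchanged.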

                                  RandyC
                                  Joined: Jun 23 06
                                  Posts: 954
                                  Credit: 127,798
                                  RAC: 119
                                  Message 9825 - Posted 26 Apr 2009 21:21:01 UTC - in response to Message 9807.

                                    Mystery solved. Benchmarks were way skewed. Reran them and credit claims should normalize again.

                                    Some really off-the-wall credit claims for 6.16:
                                    Using BOINC 5.10.30 on Win XP Pro-32

                                    This WU took 8,638.09 secs and claimed 97.10 CS
                                    This WU took 9,834.59 secs and claimed 110.55 CS (granted 43.81)
                                    This WU took 5,956.16 secs and claimed 66.95 CS
                                    This WU took 13,084.91 secs and claimed 147.09 CS

                                    Now This WU took 13,940.41 secs and claimed/granted 41.87 CS. It was issued 22 Apr 2009 18:44:17 UTC, while all the above were issued 23 Apr 2009 2:21:57 UTC or later.

                                    glaesum
                                    Joined: Nov 29 07
                                    Posts: 7
                                    Credit: 41,409
                                    RAC: 3
                                    Message 9833 - Posted 27 Apr 2009 23:26:27 UTC - in response to Message 9808.

                                      Last modified: 27 Apr 2009 23:27:23 UTC

                                      That work unit is no longer available.

                                      My guess would be that it depends on other things running on the system: disk speed, bus speed, memory bandwidth, available memory, and so on. Certain things take wall time away from BOINC tasks, and I have seen applications that measure elapsed wall time rather than actual CPU time used. Other things, such as a stuck Windows window (not responding), sometimes do the same thing. I don't know the exact reason here. Did you reboot in the middle of the run, or is your task switch interval short? When an application is suspended and then restarts, it sometimes has to go back a little or a lot and redo those calculations, adding more time to the time already used. I have a 90-minute switch interval set on my system. It may also depend on which other BOINC project's task is running; I have also seen tasks that use more than 50% CPU (in my case, running two, each should have 50%).

                                      Without observing myself the actual task running and noting what else is going on, it is very hard to guess at the actual cause.

                                      ta Keith - just wondered. sorry I left it till just before the history of that wu cleared from the db. there's no obvious thing to pick out of your reply - I have an 80minute switch set between tasks and, whenever I look, things look pretty evenly split 50:50 between the two (virtual) cpus. I do have things set to run on both cores. I've always got cpdn running on one of them. otherwise there's probably loads of idle windows in background including a browser with millions of tabs open. I'll remember to blow the dust out of my cooling fins now the weather is warming up!!
                                      _

                                      oh, another thing for general amusement: I've tried some of the 6.16 test wus on my ancient Win98 boat-anchor and surprisingly they seem to run alright, albeit not very efficiently, judging from the credits earned for time used in the couple completed so far where the wingman has also reported. I'll only do a few more days of them. I doubt very many crunchers are still trying to keep their old pcs going or that the sysops will be very worried how well 6.16 runs on them - I just treat it as part of my winter-heating system!! :-)

                                      maire
                                      Forum moderator
                                      Project administrator
                                      Project developer
                                      Project scientist
                                      Joined: Nov 7 05
                                      Posts: 426
                                      Credit: 87,306
                                      RAC: 289
                                      Message 9849 - Posted 30 Apr 2009 13:44:41 UTC

                                        A short update on the current testing phase: We see relatively high error rates on both Linux and Windows platforms (the pass rate is only around 90%). Macs are doing OK with no errors, but the sample size is much smaller there. A lot of the errors seem to be caused by exceeding some of the resource bounds we set in the workunit template. Checkpoints in particular take a lot of disk space, more than the 250MB we were allowing so far. The reason is that the data is currently written to disk uncompressed for easier debugging. The next version will compress the data to save disk space.
                                        We have for now increased the disk limit to 350MB per workunit for a few more batches of workunits. This may prevent some computers from getting work. We'll lower this limit again as soon as possible.

                                        At the same time, we have increased the time period during which we keep finished results in the database.
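As a rough illustration of why compressing checkpoints helps with the disk bound, here is a sketch in Python. Everything in it is an assumption made for the example: the checkpoint content is an invented, repetitive XML-like text, gzip is just one possible compressor, and "MB" is taken as 10^6 bytes. It is not the format or the method used by the science application.

```python
import gzip

MB = 1000 * 1000                 # assumption: decimal megabytes
OLD_LIMIT_BYTES = 250 * MB       # limit mentioned in the post
NEW_LIMIT_BYTES = 350 * MB       # temporarily raised limit


def fits(size_bytes: int, limit_bytes: int) -> bool:
    """True if a file of this size stays within the disk bound."""
    return size_bytes <= limit_bytes


# Hypothetical checkpoint: many similar records, as simulation state often is.
record = "<human id='1' age='12.5' infected='false'/>\n"
raw = (record * 50_000).encode("utf-8")

packed = gzip.compress(raw)
assert gzip.decompress(packed) == raw   # compression is lossless

print(f"uncompressed: {len(raw):>9} bytes")
print(f"compressed:   {len(packed):>9} bytes")
print("300 MB fits old limit:", fits(300 * MB, OLD_LIMIT_BYTES))
print("300 MB fits new limit:", fits(300 * MB, NEW_LIMIT_BYTES))
```

How much real checkpoints shrink depends entirely on their content; repetitive text like the sample here compresses far better than typical binary data would.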
                                        Nick
                                        ____________
                                        Nicolas Maire
                                        Swiss Tropical and Public Health Institute
                                        http://www.swisstph.ch

                                        Copyright © 2010 africa@home