Computation errors



Message boards : Macintosh : Computation errors

markj
Joined: Jun 21 08
Posts: 3
Credit: 1,471,692
RAC: 96
Message 10559 - Posted 28 Jul 2009 14:12:44 UTC

    Dear All,
    I am seeing quite a few "Computation Errors" lately, and they occur after nearly one hour of processing, which is annoying.
    Any idea what the problem may be?
    I am using MacOSX Intel, but I am not sure if it is platform-specific, as I don't have other computers running Malariacontrol.
    Greetings,
    markj

    mikey
    Joined: Mar 23 07
    Posts: 4711
    Credit: 5,420,919
    RAC: 379
    Message 10565 - Posted 28 Jul 2009 16:32:43 UTC - in response to Message 10559.

      Last modified: 28 Jul 2009 16:36:56 UTC

      Dear All,
      I am seeing quite a few "Computation Errors" lately, and they occur after nearly one hour of processing, which is annoying.
      Any idea what the problem may be?
      I am using MacOSX Intel, but I am not sure if it is platform-specific, as I don't have other computers running Malariacontrol.
      Greetings,
      markj


      Those of us using pure Windows machines just went through this problem, because Malaria built the workunits on a machine that had a newer version of .NET installed. That meant our machines needed it too, but the new versions of the workunits don't need it. I am crunching units that show 6.24 under the Application tab in the BOINC Manager. What are you using? If that is not it, I have no other idea, sorry!

      I have posted a link to your thread in this thread: http://malariacontrol.net/forum_thread.php?id=883

      Hopefully this will get your problem solved.

      Chris Sutton
      Joined: Nov 10 05
      Posts: 297
      Credit: 4,941,683
      RAC: 0
      Message 10569 - Posted 28 Jul 2009 22:32:30 UTC

        The 'nearly one hour' you mention is very close to the default BOINC preference for switching between applications (60 minutes).

        The reported error about problems reading the checkpoint file strengthens this hypothesis.

        I didn't look too deeply, but it appears to affect all tasks for the given wu, so the problem is more than likely related to the input data.

        Until the project analyses this and possibly makes changes to the input data, your options will most likely be limited to:
        . change the "switch between applications" setting to more than the expected processing time for the task, so that the task is never switched out and completes in a single run;
        . suspend all projects except MCDN on the affected hosts, so that the box can hopefully run the MCDN units to completion without switching out.
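The first option above can also be applied locally through a preferences override file instead of the web preferences. The following is a minimal sketch, assuming the client reads a `global_prefs_override.xml` file from its data directory and that the switch interval is stored in an element named `cpu_scheduling_period_minutes`; both should be checked against the documentation for your BOINC version. The function name `build_override` is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_override(switch_minutes: float) -> str:
    """Return XML text for a preferences override that sets only the switch interval.

    Assumption: the element names below match what the BOINC client expects
    in global_prefs_override.xml. Verify them before relying on this.
    """
    root = ET.Element("global_preferences")
    period = ET.SubElement(root, "cpu_scheduling_period_minutes")
    # Written as a fixed-point number of minutes.
    period.text = f"{switch_minutes:.6f}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 120 minutes, i.e. longer than the roughly one hour the failing tasks ran.
    print(build_override(120))
```

The printed text would be saved as `global_prefs_override.xml` in the BOINC data directory, after which the client has to re-read its local preferences (or be restarted) for the change to take effect.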

        markj
        Joined: Jun 21 08
        Posts: 3
        Credit: 1,471,692
        RAC: 96
        Message 10576 - Posted 29 Jul 2009 7:34:38 UTC - in response to Message 10569.

          Thanks for the replies. I don't think the switching is the problem, as units also fail before 60 mins, at 55 mins. I'll try updating the BOINC program (currently I'm on 6.6.20 and I see 6.6.36 is recommended now) and setting the switching to 2 hours though, to see if that solves things.
          markj

          mikey
          Joined: Mar 23 07
          Posts: 4711
          Credit: 5,420,919
          RAC: 379
          Message 10577 - Posted 29 Jul 2009 8:02:23 UTC - in response to Message 10576.

            Thanks for the replies. I don't think the switching is the problem, as units also fail before 60 mins, at 55 mins. I'll try updating the BOINC program (currently I'm on 6.6.20 and I see 6.6.36 is recommended now) and setting the switching to 2 hours though, to see if that solves things.
            markj


            Actually, downgrading to 6.4.7 might be even better; the 6.6.? versions all have issues with the scheduler. If you only run one project it is fine, but if you run multiple projects a lot of users are having issues. NOT all users are having problems, just a bunch of them. If you use CUDA, i.e. use your GPU to crunch, then you will need at least version 6.5.0 of BOINC to do that through BOINC.

            Ageless
            Joined: Jun 29 06
            Posts: 261
            Credit: 150,583
            RAC: 1
            Message 10584 - Posted 29 Jul 2009 10:55:22 UTC - in response to Message 10577.

              Last modified: 29 Jul 2009 11:00:37 UTC

              Actually downgrading to 6.4.7 might be even better, the 6.6.? versions all have issues with the scheduler.

              A good example of "How misinformation gets into the world."
              If you want to explain things, explain them correctly. Ever since 6.6.20, BOINC has contained separate CPU and GPU schedulers, built into the client.

              Most of the problems with these have now been fixed. What people are tripping over at this time is that BOINC 6.6.38, the latest version available for Windows platforms at least, will now and then request GPU work from projects that have no GPU application. This is only done in case the project installs a GPU application from one day to the next, so that people who have a GPU and want that project to use it will get work. It's just a simple check, nothing broken.

              One thing fixed in 6.6.38 is that result uploads are now grouped together per project, meaning that when a project goes down, there is a single timer for retrying the upload of those results.
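To illustrate what "one timer per project" means, here is a minimal sketch. It is not BOINC's actual code: the class `ProjectUploadBackoff`, its methods, and the fixed 60-second retry interval are all hypothetical, chosen only to show the idea of keeping a single retry deadline per project rather than one per result file.

```python
class ProjectUploadBackoff:
    """Keep one shared retry deadline per project (illustrative only)."""

    def __init__(self, retry_seconds: float = 60.0):
        self.retry_seconds = retry_seconds
        # Maps a project name to the earliest time an upload may be retried.
        self._next_retry = {}

    def record_failure(self, project: str, now: float) -> None:
        """A failed upload pushes back the deadline for every result of that project."""
        self._next_retry[project] = now + self.retry_seconds

    def record_success(self, project: str) -> None:
        """A successful upload clears the project's deadline."""
        self._next_retry.pop(project, None)

    def may_upload(self, project: str, now: float) -> bool:
        """All results of a project consult the same deadline."""
        return now >= self._next_retry.get(project, 0.0)


if __name__ == "__main__":
    backoff = ProjectUploadBackoff(retry_seconds=60.0)
    backoff.record_failure("project-a", now=100.0)
    print(backoff.may_upload("project-a", now=120.0))
    print(backoff.may_upload("project-b", now=120.0))
```

With a per-result timer, every pending file of a project that is down would retry on its own schedule; with the shared deadline sketched here, they all wait for the same moment.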

              Furthermore, the problem of "won't finish in time" has been fixed in 6.6.38.

              The biggest problem with new BOINC versions is that people expect:
              a) that those bugs are fixed.
              b) that despite those bug fixes, the way they are accustomed to BOINC working won't change.

              That's a wrong expectation. Bug fixes will at times change the behaviour of the software. If you can't agree with that, do not run the newer software but stay on something older.

              Between 6.4 and 6.6 the way that debts are calculated changed. Warnings went up on several forums about this, and I eventually added it to the release notes. Did people heed the warning? No. They found their BOINC to work differently than before and so decided it was broken. That the older versions may have been broken is something that can't be true. Newer versions with bug fixes will inevitably break something that worked before... or so the reasoning goes.

              If you only run one project it is fine, but if you run multiple projects a lot of users are having issues. NOT all users are having problems, just a bunch of them.

              You are contradicting yourself here. First it's a lot, then it's just a bunch.
              It's not that many; it's mostly people posting at SETI, and the vocal ones number about 10. On a user base of 327 thousand active BOINC users, that is neither a lot nor a bunch.

              Just panic posting. I have a problem so I need to tell everyone not to use this BOINC.

              If you use CUDA, i.e. use your GPU to crunch, then you will need at least version 6.5.0 of BOINC to do that through BOINC.

              6.4.5, 6.4.6 and 6.4.7 do CUDA as well. They, as well as 6.5.0 by the way, do not have separate CPU and GPU schedulers, so all work will be requested by the CPU. In some cases this may leave the GPU without work, because of CPU debt problems on the project.

              But since we are in the Macintosh forum: there are no working 64-bit library sets yet for the latest drivers. The drivers may enable CUDA on Leopard, but without those libraries it won't work. BOINC for the Mac does have CUDA detection built in, though, so as soon as Nvidia releases those libraries, all will be well.
              ____________
              Jord.

              BOINC FAQ Service



              Copyright © 2013 africa@home