Was looking at IO writes... |
Message boards : Number crunching : Was looking at IO writes...
| Author | Message |
|---|---|
|
Looking in Windows XP TaskMgr, I noticed that, compared to other BOINC projects, Malaria was doing a lot of IO writes. | |
| ID: 1602 | Rating: 0 |
| |
"...compared to other BOINC projects, Malaria was doing a lot of IO writes." It's the same thing I wrote about here: 'The client generated too much traffic over the internal LAN.' ____________ | |
| ID: 1603 | Rating: 0 |
| |
|
Curiosity got the better of me, so I downloaded 'Process Monitor', the replacement for Filemon/Regmon, to see what all the writes were. | |
| ID: 1644 | Rating: 0 |
| |
|
No wonder I notice a lot of disk activity on one of my machines. I'm getting a lot of IO writes as well (27 million, to be exact). | |
| ID: 1655 | Rating: 0 |
| |
|
Hmmmmzzz this may explain the total crash of my PC's HD. | |
| ID: 1698 | Rating: 0 |
| |
|
I don't know the programmers on the project, except that marie and the rest of the team are obviously heavy dudes in the field of mathematical medical modelling and have managed to transfer the models to a working computer algorithm.
Or using this timer interrupt inefficiently? (How is the timer bound or dispatched to the BOINC client by the main BOINC program?) Could it be writing a checkpoint *every* time it gets a timing pulse, or is it skipping a few (but not enough) times before it writes? (What, after all, is 30 seconds (a nice window) of computation time lost because you don't write as often, to save speed in the OS?) Beware of the figures that tools like Regmon produce. They introduce delays of their own, and the figures they produce are often buffered and/or delayed. However, they do provide a fantastic insight into a running application. A network sniffing session might produce some interesting profiles too. /me is eagerly awaiting possible sniffs at the source code. BOINC sounds like an interesting architecture. Rock on, Finite State Machines! Merry Christmas all! | |
| ID: 1708 | Rating: 0 |
| |
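The question above, whether the application writes a checkpoint on every timing pulse or skips some, can be illustrated with a small sketch. This is not the project's code: the class `CheckpointGate`, the helper `count_checkpoints`, and the one-second tick are all hypothetical names and values chosen for illustration. It only shows the general idea of letting a checkpoint through when a minimum interval has passed since the last one.

```python
import time


class CheckpointGate:
    """Hypothetical helper: permit a checkpoint only after a minimum interval."""

    def __init__(self, min_interval_s, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self.clock = clock
        self.last_write = clock()  # start counting from construction time

    def should_checkpoint(self):
        now = self.clock()
        if now - self.last_write >= self.min_interval_s:
            self.last_write = now
            return True
        return False


def count_checkpoints(ticks, tick_s, min_interval_s):
    """Simulate `ticks` timer pulses and count how many lead to a checkpoint."""
    now = [0.0]  # fake clock so the simulation runs instantly
    gate = CheckpointGate(min_interval_s, clock=lambda: now[0])
    written = 0
    for _ in range(ticks):
        now[0] += tick_s
        if gate.should_checkpoint():
            written += 1
    return written


# One simulated hour of one-second pulses.
print(count_checkpoints(3600, 1.0, 300.0))  # gated to one checkpoint per 5 minutes
print(count_checkpoints(3600, 1.0, 0.0))    # no gating: a checkpoint on every pulse
```

With a 300-second interval the simulated hour produces 12 checkpoints; with no gating it produces one per pulse, which is the behaviour the post is worried about.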
|
I'm seeing 57 million I/O writes in 1hr40min per app. That's about 68 million I/O writes per hour in total for the two apps on my hyperthreaded 3.2GHz machine. A little too many for my liking! | |
| ID: 1717 | Rating: 0 |
| |
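The hourly figure in the post above can be checked with a little arithmetic. The sketch assumes two tasks were running side by side on the hyperthreaded CPU; the post does not state the task count, so that factor of two is an assumption, and `writes_per_hour` is just an illustrative helper name.

```python
def writes_per_hour(total_writes, hours, minutes):
    """Convert a write count over an elapsed time into an hourly rate."""
    elapsed_hours = hours + minutes / 60.0
    return total_writes / elapsed_hours


# 57 million writes in 1 h 40 min, as reported for one app.
per_app = writes_per_hour(57e6, 1, 40)

# Assumption: two apps running at once on the hyperthreaded machine.
both_apps = 2 * per_app

print(f"{per_app / 1e6:.1f} million writes/hour per app")
print(f"{both_apps / 1e6:.1f} million writes/hour for two apps")
```

Under that assumption the per-app rate is about 34 million writes per hour, and doubling it gives the roughly 68 million quoted.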
|
We're all back at work now, and ready to give this problem a closer look. Thank you for your input! I'll let you know as soon as we make progress or need further info. | |
| ID: 1740 | Rating: 0 |
| |
"We're all back at work now, and ready to give this problem a closer look. Thank you for your input! I'll let you know as soon as we make progress or need further info."

My BOINC stuff all occurs in a partition used by no other applications. Looking at IO rates for this partition gives the following, using iostat in Linux:

Time: 00:23:13
avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.91   98.47    0.61    0.00    0.01
Device:  rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s   rkB/s   wkB/s avgrq-sz avgqu-sz   await   svctm   %util
hda11      0.00   68.15    0.00    5.07    0.00  585.73    0.00  292.87   115.61     0.18   35.72    0.79    0.40

Time: 00:24:13
avg-cpu:  %user   %nice    %sys %iowait   %idle
           1.54   97.54    0.89    0.00    0.03
Device:  rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s   rkB/s   wkB/s avgrq-sz avgqu-sz   await   svctm   %util
hda11      0.00   54.59    0.00    4.23    0.00  470.59    0.00  235.29   111.18     0.17   39.72    0.87    0.37

Time: 00:25:13
avg-cpu:  %user   %nice    %sys %iowait   %idle
           1.71   96.27    1.99    0.00    0.03
Device:  rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s   rkB/s   wkB/s avgrq-sz avgqu-sz   await   svctm   %util
hda11      0.00   14.47    0.00    3.35    0.00  142.80    0.00   71.40    42.63     0.02    6.57    0.35    0.12

Time: 00:26:13
avg-cpu:  %user   %nice    %sys %iowait   %idle
           2.49   93.46    3.63    0.00    0.42
Device:  rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s   rkB/s   wkB/s avgrq-sz avgqu-sz   await   svctm   %util
hda11      0.00   93.67    0.00    7.12    0.00  806.53    0.00  403.27   113.33     0.38   52.90    0.80    0.57

That is essentially no reading and around 200 kB/second of writing. For me, that is not a big deal, but it could be for others. I have 8 GB of RAM, so those writes may just sit in the buffers for a while and not be written to disk all that often. The drive for that partition is a 7200 rpm EIDE drive with an 8 MB buffer. ____________ | |
| ID: 1815 | Rating: 0 |
| |
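For anyone wanting to put a number on the iostat samples above, here is a minimal sketch that averages the four `wkB/s` and `w/s` readings for `hda11`. The values are copied by hand from the post; the variable names are illustrative, and four one-minute samples are of course only a rough indication.

```python
# Values copied from the four iostat samples for hda11 in the post above.
wkb_per_s = [292.87, 235.29, 71.40, 403.27]  # kilobytes written per second
w_per_s = [5.07, 4.23, 3.35, 7.12]           # write requests issued per second

mean_wkb = sum(wkb_per_s) / len(wkb_per_s)
mean_w = sum(w_per_s) / len(w_per_s)

# Extrapolate the mean write rate to one hour (iostat kB are 1024 bytes).
mib_per_hour = mean_wkb * 3600 / 1024

print(f"mean write rate: {mean_wkb:.1f} kB/s")
print(f"mean write requests: {mean_w:.2f} per second")
print(f"extrapolated volume: {mib_per_hour:.0f} MiB per hour")
```

The mean lands a little above the "around 200 kB/second" estimate in the post. Note also that only a handful of write requests per second reach the device: the `wrqm/s` column counts write requests that were merged before being issued, so many small application writes can end up as a few larger device writes.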
"The current work unit has done, so far, in 3 hours (92% complete), 41.7 million IO writes and 1.87 billion IO write bytes. I was just wondering, what is all the IO it is doing?" Checkpointing in our case means writing the complete status of the simulation model to disk; with the current workunits that would usually be between 12 and 25MB of data. Given your host seems to write about 10 checkpoints per hour, that would be up to 250MB (possibly even a bit more) of data written per hour. Reducing the frequency at which you allow BOINC to write to disk could help. The new version 545 contains a few amendments to the IO buffering during checkpoint writes. I would be interested to know if this makes a difference; could you give us some feedback? Thanks a lot, Nick ____________ Nicolas Maire Swiss Tropical and Public Health Institute http://www.swisstph.ch | |
| ID: 1887 | Rating: 0 |
| |
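The figures in the reply above can be worked through in a few lines. The sketch uses only numbers quoted in the post (10 checkpoints per hour, 12 to 25 MB each, 41.7 million writes, 1.87 billion bytes); the variable names are illustrative.

```python
# Checkpoint volume per hour, from the figures in the reply above.
checkpoints_per_hour = 10
checkpoint_size_mb = (12, 25)  # usual range for the current workunits

low_mb, high_mb = (checkpoints_per_hour * size for size in checkpoint_size_mb)
print(f"checkpoint data per hour: {low_mb} to {high_mb} MB")

# Average size of a single write call, from the quoted TaskMgr counters.
io_writes = 41.7e6
io_write_bytes = 1.87e9
avg_bytes_per_write = io_write_bytes / io_writes
print(f"average write size: {avg_bytes_per_write:.1f} bytes")
```

The second figure is the telling one: an average of only a few dozen bytes per write suggests the data was going out in very small pieces, which fits with the IO buffering during checkpoint writes being the thing that version 545 amends.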
|
I have BOINC set to do disk writes every 5 minutes. Watching MalariaControl, it is now writing one checkpoint file every 5 minutes, instead of doing the 400 writes a second it previously was. | |
| ID: 1891 | Rating: 0 |
| |
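Why buffering can turn hundreds of writes a second into an occasional large one is easy to demonstrate. The sketch below is not what version 545 actually changed (the thread does not say); it is a generic illustration using Python's standard `io.BufferedWriter`. `CountingRaw` and `write_records` are hypothetical helpers, and the 45-byte record size is an assumption roughly matching the average write size implied by the TaskMgr figures quoted earlier in the thread (1.87 billion bytes over 41.7 million writes).

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical sink that counts write calls instead of touching a disk."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.bytes_written = 0

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.bytes_written += len(b)
        return len(b)


def write_records(stream, n_records, record):
    """Write the same small record many times, then flush."""
    for _ in range(n_records):
        stream.write(record)
    stream.flush()


record = b"x" * 45  # assumed small record size

# Unbuffered: every small write goes straight to the sink.
unbuffered = CountingRaw()
write_records(unbuffered, 10_000, record)

# Buffered: small writes are collected and passed on in large chunks.
raw = CountingRaw()
buffered = io.BufferedWriter(raw, buffer_size=64 * 1024)
write_records(buffered, 10_000, record)

print("unbuffered write calls:", unbuffered.calls)
print("buffered write calls:  ", raw.calls)
print("same bytes written:    ", unbuffered.bytes_written == raw.bytes_written)
```

The same number of bytes reaches the sink either way, but the buffered path needs only a handful of underlying write calls instead of one per record.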
|
| |
| ID: 1905 | Rating: 0 |
| |
That's good to hear, thanks! Along with that, the failure rate seems to have dropped a little more with 545. Nick ____________ Nicolas Maire Swiss Tropical and Public Health Institute http://www.swisstph.ch | |
| ID: 1929 | Rating: 0 |
| |