A question for Linux BOINC donors with multi-socket machines

tear

Per subject:

Do you guys use the client from Berkeley (that is, downloaded from http://boinc.berkeley.edu/download_all.php)
or a distribution package (that is, installed with apt-get/dpkg, yum/rpm, pacman or whatnot)?

I'm thinking some early Christmas gifts... *giggity*
 
I've used both in the past. Most recently, I think I just installed it from the Ubuntu Software Center.
 
Tear, are you building and giving away some boxen to be used as borgs?
 
Like WFeather, I just used the client included with Ubuntu, following Musky's install guide. The only real change I made was to download a lot more jobs into my queue so it doesn't run out. Since the work is not multi-threaded, you end up running 48 tasks at a time, and the server can run out of jobs for a while.
 
I haven't had any big rigs, but I have run a few Ubuntu VMs. I typically ran the client from the repository. However, it depends on the version of Ubuntu you end up using. I believe one of the Ubuntu versions shipped BOINC 7.0.24 or 7.0.27, which had known problems. If that is the version you have, I would certainly upgrade.
 
I used the client from the Ubuntu repo. It worked fine on multiple machines.
 
musky, I finally saw one of your error work units. This is the error and please correct me if I'm wrong: http://www.malariacontrol.net/result.php?resultid=183486261

<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
process got signal 11
</message>
<stderr_txt>

</stderr_txt>
]]>

If memory serves, that was a known issue at some projects with the BOINC versions in the Ubuntu repository: 7.0.24 and, I believe, 7.0.27 as well. I would suggest upgrading your BOINC install to a later version to see if that helps.

Did you ever get this issue resolved?
 
Did you ever get this issue resolved?

Yeah - I quit running Malaria Control on that machine... :p

That is the truth actually. I ended up running SIMAP+Rosetta on a 4p socket 2011 machine and SIMAP+Malaria Control on a 2p socket 2011. That worked fine. I have no idea why Malaria Control didn't like my 4p, but it certainly didn't.
 
OK, here's a NUMA-aware, affinity-setting BOINC client (goal: improve performance).

It is meant to replace the client from Ubuntu 12.04, so if you're running a standalone client
and want to run this one, you'll first need to stop using the standalone client.

Target audience: users of dedicated BOINC multisocket machines (w/Ubuntu 12.04) that run CPU jobs

1. If BOINC is already installed, verify that you're not using a version beyond 7.0.x; do not proceed if you see a version higher than 7.0.x.
Code:
dpkg-query -W boinc-client
If the version is 7.0.x or lower, go to step 5.
If no boinc-client is installed and you want to install it, go to step 2.

2. Remove cdrom sources from /etc/apt/sources.list, if any (the command below blanks every uncommented cdrom: line):
Code:
sudo sed -i 's/^[^#]*cdrom:.*$//' /etc/apt/sources.list

3. Update package indices
Code:
sudo apt-get update

4. Install boinc
Code:
sudo apt-get install boinc

5. Completely stop BOINC manager (the GUI app), if running.

6. Stop BOINC client:
Code:
sudo service boinc-client stop

7. Wait for the client to stop; it may take a few moments.

8. Go to your home directory
Code:
cd $HOME

9. Download custom BOINC
Code:
wget \
  http://darkswarm.org/ubuntu-boinc/boinc-client_7.0.65+dfsg-3~ubuntu12.04.1~a3_amd64.deb \
  http://darkswarm.org/ubuntu-boinc/boinc-manager_7.0.65+dfsg-3~ubuntu12.04.1~a3_amd64.deb \
  http://darkswarm.org/ubuntu-boinc/boinc_7.0.65+dfsg-3~ubuntu12.04.1~a3_all.deb \
  http://darkswarm.org/ubuntu-boinc/libboinc7_7.0.65+dfsg-3~ubuntu12.04.1~a3_amd64.deb

10. Install custom BOINC
Code:
sudo dpkg -i \
  boinc_7.0.65+dfsg-3~ubuntu12.04.1~a3_all.deb \
  boinc-client_7.0.65+dfsg-3~ubuntu12.04.1~a3_amd64.deb \
  boinc-manager_7.0.65+dfsg-3~ubuntu12.04.1~a3_amd64.deb \
  libboinc7_7.0.65+dfsg-3~ubuntu12.04.1~a3_amd64.deb

11. Start BOINC Manager: either click the dash icon and type "boinc" to find it, or type boincmgr in a terminal.
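
Steps 1 and 2 above can be sketched as two small shell helpers. This is only a rough illustration and not part of the packages: the function names version_is_70x_or_lower and comment_out_cdrom are made up, the version parsing assumes a Debian-style string such as 7.0.65+dfsg-3~ubuntu12.04.1~a3, and the sed variant comments cdrom lines out rather than blanking them the way the command in step 2 does.

```shell
#!/bin/sh
# Illustrative helpers only; the function names are hypothetical.

# Step 1: succeed (exit 0) if a Debian-style version string is 7.0.x or lower.
# Returns 2 if the string cannot be parsed. Variables are global (plain sh).
version_is_70x_or_lower() {
    v=${1#*:}                     # drop an epoch prefix such as "1:", if any
    case $v in *.*) ;; *) return 2 ;; esac
    major=${v%%.*}                # text before the first dot
    rest=${v#*.}
    minor=${rest%%.*}             # text between the first and second dot
    minor=${minor%%[!0-9]*}       # keep only the leading digits
    case $major in ''|*[!0-9]*) return 2 ;; esac
    case $minor in ''|*[!0-9]*) return 2 ;; esac
    [ "$major" -lt 7 ] || { [ "$major" -eq 7 ] && [ "$minor" -eq 0 ]; }
}

# Step 2 variant: comment out uncommented cdrom: lines instead of blanking them.
# Reads sources.list text on stdin and writes the result to stdout.
comment_out_cdrom() {
    sed 's/^\([^#]*cdrom:.*\)$/# \1/'
}

# Example on sample data (no system files are touched):
if version_is_70x_or_lower "7.0.65+dfsg-3~ubuntu12.04.1~a3"; then
    echo "7.0.x or lower: safe to continue"
fi
printf '%s\n' 'deb cdrom:[Ubuntu 12.04]/ precise main' | comment_out_cdrom
```

On a real system the version string would come from something like dpkg-query -W -f='${Version}' boinc-client, and comment_out_cdrom would be fed a copy of /etc/apt/sources.list so the original stays untouched until you've checked the result.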

Feedback is, as always, welcome :)

To check that it's working correctly, examine /var/lib/boinc-client/stdoutdae.txt; it should now print extra [numa] lines, e.g.
Code:
30-Nov-2013 18:16:56 [---] Starting BOINC client version 7.0.65 for x86_64-pc-linux-gnu
30-Nov-2013 18:16:56 [---] log flags: file_xfer, sched_ops, task
30-Nov-2013 18:16:56 [---] Libraries: libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
30-Nov-2013 18:16:56 [---] Data directory: /var/lib/boinc-client
30-Nov-2013 18:16:56 [---] Processor: 48 AuthenticAMD AMD Engineering Sample [Family 16 Model 9 Stepping 1]
30-Nov-2013 18:16:56 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid amd_dcm pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr npt lbrv svm_lock nrip_save pausefilter
30-Nov-2013 18:16:56 [---] OS: Linux: 3.2.0-45-generic
30-Nov-2013 18:16:56 [---] Memory: 31.42 GB physical, 32.00 GB virtual
30-Nov-2013 18:16:56 [---] Disk: 426.95 GB total, 398.63 GB free
30-Nov-2013 18:16:56 [---] Local time is UTC -8 hours
30-Nov-2013 18:16:56 [---] No usable GPUs found
30-Nov-2013 18:16:56 [---] Config: GUI RPCs allowed from:
30-Nov-2013 18:16:56 [World Community Grid] URL http://www.worldcommunitygrid.org/; Computer ID 2575649; resource share 100
30-Nov-2013 18:16:56 [World Community Grid] General prefs: from World Community Grid (last modified 22-Nov-2013 20:43:13)
30-Nov-2013 18:16:56 [World Community Grid] Computer location: home
30-Nov-2013 18:16:56 [---] General prefs: using separate prefs for home
30-Nov-2013 18:16:56 [---] Reading preferences override file
30-Nov-2013 18:16:56 [---] Preferences:
30-Nov-2013 18:16:56 [---]    max memory usage when active: 24128.98MB
30-Nov-2013 18:16:56 [---]    max memory usage when idle: 28954.78MB
30-Nov-2013 18:16:56 [---]    max disk usage: 100.00GB
30-Nov-2013 18:16:56 [---]    (to change preferences, visit a project web site or select Preferences in the Manager)
30-Nov-2013 18:16:56 [---] gui_rpc_auth.cfg is empty - no GUI RPC password protection
30-Nov-2013 18:16:56 [---] Not using a proxy
Initialization completed
30-Nov-2013 18:16:56 [---] [numa] /proc/cpuinfo: 48 CPUs
30-Nov-2013 18:16:56 [---] [numa] examining node 0
30-Nov-2013 18:16:56 [---] [numa] node 0: 6 CPUs (map: 0000003f)
30-Nov-2013 18:16:56 [---] [numa] examining node 1
30-Nov-2013 18:16:56 [---] [numa] node 1: 6 CPUs (map: 00000fc0)
30-Nov-2013 18:16:56 [---] [numa] examining node 2
30-Nov-2013 18:16:56 [---] [numa] node 2: 6 CPUs (map: 0003f000)
30-Nov-2013 18:16:56 [---] [numa] examining node 3
30-Nov-2013 18:16:56 [---] [numa] node 3: 6 CPUs (map: 00fc0000)
30-Nov-2013 18:16:56 [---] [numa] examining node 4
30-Nov-2013 18:16:56 [---] [numa] node 4: 6 CPUs (map: 3f000000)
30-Nov-2013 18:16:56 [---] [numa] examining node 5
30-Nov-2013 18:16:56 [---] [numa] node 5: 6 CPUs (map: fc0000000)
30-Nov-2013 18:16:56 [---] [numa] examining node 6
30-Nov-2013 18:16:56 [---] [numa] node 6: 6 CPUs (map: 3f000000000)
30-Nov-2013 18:16:56 [---] [numa] examining node 7
30-Nov-2013 18:16:56 [---] [numa] node 7: 6 CPUs (map: fc0000000000)
30-Nov-2013 18:16:56 [---] [numa] found 48 CPUs
30-Nov-2013 18:16:56 [World Community Grid] [numa] Found CPU 0 (at node 0)
30-Nov-2013 18:16:56 [World Community Grid] Restarting task FAHV_x3AVMbINfbA_0336649_0426_1 using fahv version 706 in slot 20
30-Nov-2013 18:16:56 [World Community Grid] [numa] Found CPU 1 (at node 0)
30-Nov-2013 18:16:56 [World Community Grid] Restarting task faah377324_ZINC00194283_x4GVMaINleA_00_0 using faah version 715 in slot 21
30-Nov-2013 18:16:56 [World Community Grid] [numa] Found CPU 2 (at node 0)
30-Nov-2013 18:16:56 [World Community Grid] Restarting task faah377324_ZINC00227031_x4GVMaINleA_00_0 using faah version 715 in slot 3
30-Nov-2013 18:16:56 [World Community Grid] [numa] Found CPU 3 (at node 0)
30-Nov-2013 18:16:56 [World Community Grid] Restarting task faah377324_ZINC23478308_x4GVMaINleA_00_0 using faah version 715 in slot 9
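
If you want to sanity-check the map values in the [numa] lines above, here is a minimal sketch that decodes one. It assumes the map is a plain hexadecimal bitmask where bit N stands for CPU N, which is how the sample log reads (6 CPUs per node, consecutive ranges). map_to_cpus is a made-up helper name, not something the client provides.

```shell
#!/bin/sh
# Illustrative only: map_to_cpus is a hypothetical helper, not part of BOINC.

# Print the CPU numbers whose bits are set in a hex map (bit 0 = CPU 0).
# Handles maps of up to 64 bits; variables are global (plain sh).
map_to_cpus() {
    mask=$((0x$1))
    cpu=0
    out=
    while [ "$cpu" -lt 64 ] && [ "$mask" -ne 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            out="$out${out:+ }$cpu"
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    printf '%s\n' "$out"
}

map_to_cpus 00000fc0        # node 1 in the log above: prints 6 7 8 9 10 11
```

Running it over all eight maps should give eight non-overlapping groups of six CPUs covering 0 through 47.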

Once it starts churning out units, you can also run:
Code:
ps -eopid=,pcpu=,psr=,comm= | awk '{ if ($2 > 10) print }'  | sort -n -k3,3

You should see a list of workers sorted by CPU number, e.g.:
Code:
33292 99.4   0 wcgrid_faah_7.1
34038 98.8   1 wcgrid_fahv_vin
34061 98.5   2 wcgrid_fahv_vin
32450 99.9   3 wcgrid_faah_7.1
33931 99.8   4 wcgrid_mcm1_7.2
33188 99.9   5 wcgrid_faah_7.1
32443 99.9   6 wcgrid_faah_7.1
34055 98.4   7 wcgrid_fahv_vin
33848 99.9   8 wcgrid_mcm1_7.2
32436 99.9   9 wcgrid_faah_7.1
25708 98.7  10 wcgrid_fahv_vin
32365 99.9  11 wcgrid_faah_7.1
33634 99.9  12 wcgrid_mcm1_7.2
33492 99.9  13 wcgrid_faah_7.1
33787 99.9  14 wcgrid_mcm1_7.2
34372 85.8  15 wcgrid_fahv_vin
32347 99.9  16 wcgrid_faah_7.1
19265 98.4  17 wcgrid_fahv_vin
31950 99.9  18 wcgrid_faah_7.1
31433 99.9  19 wcgrid_faah_7.1
31648 99.9  20 wcgrid_faah_7.1
33919 99.9  21 wcgrid_mcm1_7.2
34067 98.5  22 wcgrid_fahv_vin
34367 93.7  23 wcgrid_fahv_vin
34043 98.6  24 wcgrid_fahv_vin
34071 98.8  25 wcgrid_fahv_vin
33941 99.9  26 wcgrid_mcm1_7.2
33494 99.9  27 wcgrid_mcm1_7.2
31219 99.9  28 wcgrid_faah_7.1
31058 99.9  29 wcgrid_faah_7.1
34041 99.6  30 wcgrid_faah_7.1
33991 99.9  31 wcgrid_mcm1_7.2
29909 99.8  32 wcgrid_cep2_qch
33903 99.9  33 wcgrid_faah_7.1
34046 98.6  34 wcgrid_fahv_vin
19797 98.8  35 wcgrid_fahv_vin
33927 99.9  36 wcgrid_mcm1_7.2
34069 98.1  37 wcgrid_fahv_vin
32682 99.9  38 wcgrid_faah_7.1
30129 99.9  39 wcgrid_faah_7.1
34063 97.9  40 wcgrid_fahv_vin
33899 99.9  41 wcgrid_mcm1_7.2
34077 91.8  42 wcgrid_fahv_vin
33682 99.9  43 wcgrid_mcm1_7.2
33759 99.7  44 wcgrid_mcm1_7.2
32668 99.8  45 wcgrid_faah_7.1
31905 99.9  46 wcgrid_faah_7.1
33797 99.9  47 wcgrid_mcm1_7.2
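
Eyeballing 48 lines gets old, so here is a rough sketch that flags any CPU carrying more than one busy worker. It assumes the goal is one worker per core, as in the sample output above, and reuses the same 10% CPU threshold as the ps/awk command. shared_cpus is a made-up name.

```shell
#!/bin/sh
# Illustrative only: shared_cpus is a hypothetical helper.

# Read "pid pcpu psr comm" lines on stdin and print "cpu count" for every
# CPU that hosts more than one busy (>10% CPU) process. No output means
# each busy worker has a core to itself.
shared_cpus() {
    awk '$2 > 10 { n[$3]++ }
         END { for (c in n) if (n[c] > 1) print c, n[c] }' | sort -n
}

# Sample data: two workers on CPU 0, one on CPU 1.
printf '%s\n' '33292 99.4 0 wcgrid_faah_7.1' \
              '34038 98.8 0 wcgrid_fahv_vin' \
              '34061 98.5 1 wcgrid_fahv_vin' | shared_cpus    # prints: 0 2
```

On the live machine you would pipe the real listing in instead: ps -eopid=,pcpu=,psr=,comm= | shared_cpus. An empty result is the good case.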

Special thanks go to Grandpa_01 for letting me test the code on his machine :)
 
Thanks tear. Just upgraded and hope to see a boost soon.
 
You're welcome!

One thing Grandpa noticed is that sometimes [?] the installation defaults CPU usage preference to 60%.

A solution/workaround is to either change the preference to 100% or configure BOINC to "run always"
-- you may want to double check how you're set up.

[Attachment: boinc-run-always.png]
 
60% was chosen as the default so that it would be more friendly towards laptops. However, I have seen various default settings over the many installs I have done using the same install file. So, I'm not sure how it makes its decisions. I also recommend setting Network activity to always available even if you have BOINC pause while active. This way the client can still upload or download work while work units are paused.
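
For headless boxes, the "run always" and "network always available" settings mentioned above can probably be applied from a terminal with boinccmd instead of the Manager. This is a sketch under assumptions, not a tested recipe for this custom build: boinc_run_always is a made-up wrapper name, and boinccmd may need to be run from the BOINC data directory or given --passwd to authenticate, depending on your setup.

```shell
#!/bin/sh
# Illustrative only: boinc_run_always is a hypothetical wrapper.
# BOINCCMD can be overridden (for example with "echo") for a dry run.
BOINCCMD=${BOINCCMD:-boinccmd}

boinc_run_always() {
    # Set the run mode to "always".
    "$BOINCCMD" --set_run_mode always || return 1
    # Keep network activity always available for uploads and downloads.
    "$BOINCCMD" --set_network_mode always
}
```

Setting BOINCCMD=echo before calling boinc_run_always just prints the two commands, which is a cheap way to check what would be sent before touching a running client.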
 
I added a walkthrough for installing BOINC on Ubuntu 13.10 to the BOINC installation guide: http://hardforum.com/showthread.php?t=1768558

The newer the version of Ubuntu you install, the newer the version of BOINC that makes it into the repository. If you are using older versions of Ubuntu, then you may want to try using the above method. Otherwise, if you are running the newest version of Ubuntu, the one in the repository should do just fine.

I also don't have a multi-socket machine, so I can't attest to whether the repository method is really any better or worse than the method posted above.
 
It's not, I just picked the version from 12.04 as the base 'cause it's LTS.
When 14.04 LTS comes out, the patch will be re-spun.

If you have a multisocket box, stick with 12.04 for now....
 
hey, just wanted to leave a thanks to tear and everyone else at hardforum for all their awesome work and guides, thanks everyone for being awesome ;).
 