Too fast bigadv-

Qinsp

TPF should be 31:00

Code:
[07:58:39] Completed 40000 out of 250000 steps  (16%)
[08:22:16] Completed 42500 out of 250000 steps  (17%)
[08:27:06] Project: 8101 (Run 12, Clone 2, Gen 211)


[08:27:06] + Attempting to send results [April 6 08:27:06 UTC]
[08:45:54] Completed 45000 out of 250000 steps  (18%)
[08:51:56] + Results successfully sent
[08:51:56] Thank you for your contribution to Folding@Home.
[08:51:56] + Number of Units Completed: 4

[09:09:33] Completed 47500 out of 250000 steps  (19%)
[09:33:08] Completed 50000 out of 250000 steps  (20%)
[09:56:46] Completed 52500 out of 250000 steps  (21%)
[10:20:21] Completed 55000 out of 250000 steps  (22%)
[10:43:59] Completed 57500 out of 250000 steps  (23%)
[11:07:33] Completed 60000 out of 250000 steps  (24%)
[11:31:10] Completed 62500 out of 250000 steps  (25%)
 
You're doing them in ~33 minutes.

I do 8101s on my 3.2GHz SR-2 in ~26:20 to 27:40.

These WUs tend to spread out. Try a reboot; you may get faster.
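
For anyone who wants to read TPF straight off the log instead of eyeballing timestamps, here is a rough sketch. It assumes the `[HH:MM:SS]` timestamps and the `Completed N out of M steps` wording shown in the log above; the function name `tpf_from_log` is made up for illustration.

```shell
# tpf_from_log FILE
# Print the time (MM:SS) between consecutive "Completed ... steps" lines.
# Assumption: each line starts with a [HH:MM:SS] timestamp and no gap
# between two frames is longer than 24 hours.
tpf_from_log() {
    awk '
    /Completed [0-9]+ out of [0-9]+ steps/ {
        # $1 looks like "[08:22:16]"; take the 8 characters after "["
        split(substr($1, 2, 8), t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (have_prev) {
            d = now - prev
            if (d < 0) d += 86400      # timestamp wrapped past midnight
            printf "%s -> %s  %02d:%02d\n", prev_pct, $NF, int(d / 60), d % 60
        }
        prev = now; prev_pct = $NF; have_prev = 1
    }' "$1"
}
```

Run it as `tpf_from_log FAHlog.txt`; each output line shows two consecutive percentages and the time between them. Other log lines (uploads, project info) are ignored.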
 
This is not an 8101 you're folding now.


The messages you see are from autosend (background upload of previously completed units).

The fact that you see them mid-WU suggests that the initial upload of said 8101 unit did NOT succeed.
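
A quick way to spot such mid-WU uploads in a log, as a sketch only: it assumes the `Completed ... (N%)` and `Attempting to send results` wording from the log above, and `find_midwu_uploads` is a made-up name.

```shell
# find_midwu_uploads FILE
# Print upload attempts logged while a WU is already in progress,
# i.e. after a "Completed ... (N%)" line with N below 100.
find_midwu_uploads() {
    awk '
    /Completed [0-9]+ out of [0-9]+ steps/ {
        pct = $NF
        gsub(/[^0-9]/, "", pct)        # "(17%)" -> "17"
        in_wu = (pct + 0 < 100)
    }
    /Attempting to send results/ && in_wu { print }
    ' "$1"
}
```

`find_midwu_uploads FAHlog.txt` printing anything is only a hint that an earlier upload had to be retried; it does not say why the first attempt failed.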

Please download and run fahdiag so we can check if there's any misconfiguration.

fahdiag setup:
Code:
cd $HOME
rm -f fahdiag.sh
wget http://darkswarm.org/horde/fahdiag/fahdiag.sh
chmod +x fahdiag.sh

running fahdiag:
Code:
./fahdiag.sh | pastebinit
 
I could be wrong, but it looks like it sent an 8101 and is now on maybe an 8103, TPF-wise.

Look further back when the WU started and see what it is.
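
One way to look further back, sketched under the assumption that the client logs a `Project: ...` line for each unit as in the log above (`list_projects` is just an illustrative name):

```shell
# list_projects FILE
# Show every "Project: NNNN ..." line in the log, prefixed with its
# line number, in the order the client wrote them.
list_projects() {
    grep -n 'Project: [0-9]' "$1"
}
```

Try `list_projects FAHlog.txt`. Going by the excerpt above, a `Project:` line sitting right next to an `Attempting to send results` line belongs to the unit being uploaded, not the one currently folding, so the line to look for is the one logged when the current WU started.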
 
Just got back from planting trees in a nature park. I'll take a look after I rest up. Too old for that stuff.
 
But it's interesting that it had problems sending the WU. A 31-minute machine doesn't have a lot of spare hours.
The P8101 must have missed the deadline.
 
Have you had a chance to run fahdiag yet, Qinsp?
 
Sorry, but no. I haven't even been able to get to the machine yet. I have to be somewhere at 5:30am, when I come back, I'll go play with it. Kids/Handicapped Wife/Life is crowded right now.

I feel like I was drug behind a horse this morning. Getting old ain't for pussies. :D

Coffee time!

It must be a different WU like you said. I've never seen a WU run more than perhaps 2% faster. I didn't even think about it. Just saw the 8101, posted quick and took off.
 
Is this something we actually complain about now?

:D

It was early, I had to get some tools, I looked at the machine briefly, saw it wasn't running normal, posted and left.

I'm either going to figure out how to monitor these remotely, ignore them, or turn things down to a manageable level. I'm at the ignore-them stage right now.
 
Feather put together a guide for HFM and there is a part that tells you how to have stats upload to a webpage.
 
What bites is that this is the only machine that has wicked keyboard lag. The kind that makes you want to put your fist through the screen. I tried to fix it and failed, so I'll try again. It ran sweet before I put it in a case, and it has been a PITA ever since. First it wouldn't see CPU3/4, then it wouldn't communicate, etc. So far, it won't take a GPU card like before. Pretty sure it's a busted card. All the rest of the servers have GPU cards. This box has got more hours into it than all the rest combined. And it's the slowest. Getting close to pulling the plug. :D

I'll look for the HFM guide when I get back also.
 
Tear:

Does this look right?

Code:
quad8389@quad8389-H8QM3:~$ ./fahdiag.sh
fahdiag.sh 0.2
Brought to you by The [H]orde
Copyright (c) 2013
Dave Corfman <[email protected]>
Kris Rusocki <[email protected]>

Detecting FAH directory.
Using /home/quad8389/fah

Hostname: quad8389-H8QM3
IP: 192.168.0.113

Client running:
quad8389  2434  0.0  0.0 241276  1036 pts/2    Sl+  Apr05   0:11 ./fah6 -smp -bigbeta

tmpfs:
tmpfs  /home/quad8389/fah  tmpfs  rw,uid=1000,gid=1000  0  0
tmpfs          tmpfs  7.9G  266M  7.6G   4% /home/quad8389/fah
drwxrwxrwt 3 quad8389 quad8389 380 Apr  5 18:37 /home/quad8389/fah

Crontab entry:
00 * * * * fahbackup > /dev/null 2>&1

Restore -- rc.local:
sudo -u quad8389 fahrestore #[H]ardOCP

Backup data:
current
total 272140
-rw-rw-r-- 1 quad8389 quad8389 278671360 Apr  7 06:00 2013-04-07-0600.tar
previous
total 272132
-rw-rw-r-- 1 quad8389 quad8389 278661120 Apr  7 05:00 2013-04-07-0500.tar

Backup-on-shutdown:
-rwxr-xr-x 1 root root 39 Mar 31 15:48 /etc/init.d/fahbackup-rc
lrwxrwxrwx 1 root root 24 Mar 31 15:48 /etc/rc0.d/K10fahbackup-rc -> /etc/init.d/fahbackup-rc
lrwxrwxrwx 1 root root 24 Mar 31 15:48 /etc/rc1.d/K10fahbackup-rc -> /etc/init.d/fahbackup-rc
lrwxrwxrwx 1 root root 24 Mar 31 15:48 /etc/rc6.d/K10fahbackup-rc -> /etc/init.d/fahbackup-rc

Kraken version:
thekraken: The Kraken 0.7-pre15 (compiled Sun Mar 31 15:49:36 PDT 2013 by root@quad8389-H8QM3)

Kraken wrap:
-rwxr-xr-x 1 quad8389 quad8389   60224 Mar 31 15:49 /home/quad8389/fah/FahCore_a3.exe
-rwxr-xr-x 1 quad8389 quad8389   60224 Mar 31 15:49 /home/quad8389/fah/FahCore_a4.exe
-rwxr-xr-x 1 quad8389 quad8389   60224 Mar 31 15:49 /home/quad8389/fah/FahCore_a5.exe
-rwxr-xr-x 1 quad8389 quad8389 6272504 Mar 31 15:48 /home/quad8389/fah/thekraken-FahCore_a3.exe
-rwxr-xr-x 1 quad8389 quad8389 6272504 Mar 31 15:48 /home/quad8389/fah/thekraken-FahCore_a4.exe
-rwxr-xr-x 1 quad8389 quad8389 6272504 Mar 31 15:49 /home/quad8389/fah/thekraken-FahCore_a5.exe

Kraken running:
 2464  2465 99.6   0 thekraken-FahCo
 2464  2468 99.8   1 thekraken-FahCo
 2464  2469 99.9   2 thekraken-FahCo
 2464  2470 99.8   3 thekraken-FahCo
 2464  2471 99.7   4 thekraken-FahCo
 2464  2472 99.8   5 thekraken-FahCo
 2464  2473 99.8   6 thekraken-FahCo
 2464  2474 99.7   7 thekraken-FahCo
 2464  2475 99.8   8 thekraken-FahCo
 2464  2476 99.8   9 thekraken-FahCo
 2464  2477 99.9  10 thekraken-FahCo
 2464  2478 99.8  11 thekraken-FahCo
 2464  2479 99.8  12 thekraken-FahCo
 2464  2480 99.6  13 thekraken-FahCo
 2464  2481 99.8  14 thekraken-FahCo
 2464  2482 99.6  15 thekraken-FahCo

Langouste version:
./fahdiag.sh: line 83: langouste3: command not found

Langouste running:

Langouste -- client.cfg:
active=no
host=localhost
port=8080

Langouste script version:
md5sum: /home/quad8389/fah/langouste-helper.sh: No such file or directory

Langouste script permission:
ls: cannot access /home/quad8389/fah/langouste-helper.sh: No such file or directory

Langouste -- rc.local:

NUMA -- memory:
/sys/devices/system/node/node0/meminfo:Node 0 MemTotal:        4193528 kB
/sys/devices/system/node/node0/meminfo:Node 0 MemFree:         2827692 kB
/sys/devices/system/node/node1/meminfo:Node 1 MemTotal:        4194304 kB
/sys/devices/system/node/node1/meminfo:Node 1 MemFree:         3227392 kB
/sys/devices/system/node/node2/meminfo:Node 2 MemTotal:        4194304 kB
/sys/devices/system/node/node2/meminfo:Node 2 MemFree:         2766448 kB
/sys/devices/system/node/node3/meminfo:Node 3 MemTotal:        4194304 kB
/sys/devices/system/node/node3/meminfo:Node 3 MemFree:         3346960 kB

Scaling governor: scaling disabled

vmstat -- this will take 10 seconds
16  0      0 12169424  81540 1439848    0    0     0     0 4388 1397 90 10  0  0
16  0      0 12173548  81548 1439848    0    0     0    44 4362 1339 89 11  0  0
quad8389@quad8389-H8QM3:~$

Should I put this script on all machines? I assume the answer is yes.
 
Your setup is consistent with not using Langouste -- looks all right to me.

It must have been an internet connection or server glitch that caused issues with that 8101...
 
I've yet to finish the guide on uploading to a webpage; that's on the chopping block this evening. First will be an FTP guide, then one for uploading a local file.
 
Thanks!

Is there a reason for using Langouste?

I wasn't around to see what happened. Wish the logs were persistent, i.e. append, not truncate. Or are they? I have lots of HDD space, I don't mind a 200,000,000 byte sequential text file. I can process that easily.
 
Re Langouste -- opinions are mixed.

I'd probably recommend it for slow uplinks and/or if you need bandwidth management (I limit my upload rate a little bit so WU uploads don't "kill my internet").

Re logs -- the client's logic is pretty odd/unintuitive -- if you restart the client and (I think) FAHlog.txt exceeds a certain size, the log gets rotated to FAHlog-Prev.txt and FAHlog.txt gets truncated.

Every time I need to restart the client I usually rename FAHlog-Prev.txt to FAHlog-Prev-XX.txt just before starting the client [XX = 2, 3, 4, ...].
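
That manual rename could be scripted roughly like this. It is only a sketch of the habit described above, not anything the client provides; `archive_prev_log` is a hypothetical helper name and the directory argument is whatever folder holds your logs.

```shell
# archive_prev_log [DIR]
# Rename DIR/FAHlog-Prev.txt to the first unused DIR/FAHlog-Prev-N.txt
# (N = 2, 3, 4, ...) so the next rotation cannot overwrite it.
# DIR defaults to the current directory.
archive_prev_log() {
    dir=${1:-.}
    [ -f "$dir/FAHlog-Prev.txt" ] || return 0   # nothing to preserve
    n=2
    while [ -e "$dir/FAHlog-Prev-$n.txt" ]; do
        n=$((n + 1))
    done
    mv "$dir/FAHlog-Prev.txt" "$dir/FAHlog-Prev-$n.txt"
}
```

Call it with the client stopped, just before starting it again, for example `archive_prev_log "$HOME/fah"` followed by your usual start command. If the logs live on tmpfs, the renamed files only survive a reboot if they are included in whatever backup you already run.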
 
"...if you restart the client and (I think) FAHlog.txt exceeds a certain size, the log gets rotated to FAHlog-Prev.txt and FAHlog.txt gets truncated."
Yep. Size threshold is 50 KB.
If FAHlog.txt is larger than that when the client is restarted, it's renamed to FAHlog-Prev.txt.
A new FAHlog.txt is created and the current FAHlog-Prev.txt disappears into the ether.
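
To check ahead of a restart whether the log is over that threshold, something like the following might do. The exact byte count is an assumption (50 KB read as 50 * 1024 = 51200 bytes), and `log_will_rotate` is a made-up name.

```shell
# log_will_rotate FILE [THRESHOLD_BYTES]
# Exit 0 if FILE is larger than the threshold, 1 if it is not,
# 2 if FILE does not exist.
# ASSUMPTION: the default of 51200 bytes is a guess at what "50 KB"
# means to the client; pass a different value if you know better.
log_will_rotate() {
    limit=${2:-51200}
    [ -f "$1" ] || return 2
    size=$(wc -c < "$1")
    [ "$((size))" -gt "$limit" ]
}
```

For example: `log_will_rotate "$HOME/fah/FAHlog.txt" && echo "FAHlog-Prev.txt will be replaced on the next restart"`.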

"The P8101 must have missed deadline."
Looks that way. Bummer.
 