Unstable Machine

Wheresatom

I think I got a bad work unit. I hadn't had any problems until this morning. I won't rule out the possibility that my machine is unstable, though. Here is the log to help you guys advise my next step.

Code:
[01:00:53] + Attempting to get work packet
[01:00:53] - Connecting to assignment server
[01:00:54] - Successful: assigned to (171.67.108.11).
[01:00:54] + News From Folding@Home: GPU folding beta
[01:00:54] Loaded queue successfully.
[01:00:55] + Could not connect to Work Server
[01:00:55] - Attempt #2  to get work failed, and no other work to do.
Waiting before retry.
[01:01:11] + Attempting to get work packet
[01:01:11] - Connecting to assignment server
[01:01:11] - Successful: assigned to (171.67.108.11).
[01:01:11] + News From Folding@Home: GPU folding beta
[01:01:12] Loaded queue successfully.
[01:01:14] + Closed connections
[01:01:14] 
[01:01:14] + Processing work unit
[01:01:14] Core required: FahCore_11.exe
[01:01:14] Core found.
[01:01:14] Working on queue slot 04 [January 14 01:01:14 UTC]
[01:01:14] + Working ...
[01:01:14] 
[01:01:14] *------------------------------*
[01:01:14] Folding@Home GPU Core - Beta
[01:01:14] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[01:01:14] 
[01:01:14] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[01:01:14] Build host: amoeba
[01:01:14] Board Type: Nvidia
[01:01:14] Core      : 
[01:01:14] Preparing to commence simulation
[01:01:14] - Looking at optimizations...
[01:01:14] - Created dyn
[01:01:14] - Files status OK
[01:01:14] - Expanded 43861 -> 252912 (decompressed 576.6 percent)
[01:01:14] Called DecompressByteArray: compressed_data_size=43861 data_size=252912, decompressed_data_size=252912 diff=0
[01:01:14] - Digital signature verified
[01:01:14] 
[01:01:14] Project: 5766 (Run 0, Clone 288, Gen 0)
[01:01:14] 
[01:01:14] Assembly optimizations on if available.
[01:01:14] Entering M.D.
[01:01:20] Working on Protein
[01:01:21] Client config found, loading data.
[01:01:21] mdrun_gpu returned 
[01:01:21] NANs detected on GPU
[01:01:21] 
[01:01:21] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:01:24] CoreStatus = 7A (122)
[01:01:24] Sending work to server
[01:01:24] Project: 5766 (Run 0, Clone 288, Gen 0)
[01:01:24] - Read packet limit of 540015616... Set to 524286976.
[01:01:24] - Error: Could not get length of results file work/wuresults_04.dat
[01:01:24] - Error: Could not read unit 04 file. Removing from queue.
[01:01:24] - Preparing to get new work unit...
[01:01:24] + Attempting to get work packet
[01:01:24] - Connecting to assignment server
[01:01:25] - Successful: assigned to (171.67.108.11).
[01:01:25] + News From Folding@Home: GPU folding beta
[01:01:25] Loaded queue successfully.
[01:01:27] + Closed connections
[01:01:32] 
[01:01:32] + Processing work unit
[01:01:32] Core required: FahCore_11.exe
[01:01:32] Core found.
[01:01:32] Working on queue slot 05 [January 14 01:01:32 UTC]
[01:01:32] + Working ...
[01:01:32] 
[01:01:32] *------------------------------*
[01:01:32] Folding@Home GPU Core - Beta
[01:01:32] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[01:01:32] 
[01:01:32] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[01:01:32] Build host: amoeba
[01:01:32] Board Type: Nvidia
[01:01:32] Core      : 
[01:01:32] Preparing to commence simulation
[01:01:32] - Looking at optimizations...
[01:01:32] - Created dyn
[01:01:32] - Files status OK
[01:01:32] - Expanded 43861 -> 252912 (decompressed 576.6 percent)
[01:01:32] Called DecompressByteArray: compressed_data_size=43861 data_size=252912, decompressed_data_size=252912 diff=0
[01:01:32] - Digital signature verified
[01:01:32] 
[01:01:32] Project: 5766 (Run 0, Clone 288, Gen 0)
[01:01:32] 
[01:01:32] Assembly optimizations on if available.
[01:01:32] Entering M.D.
[01:01:38] Working on Protein
[01:01:40] Client config found, loading data.
[01:01:40] Starting GUI Server
[01:01:40] mdrun_gpu returned 
[01:01:40] NANs detected on GPU
[01:01:40] 
[01:01:40] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:01:42] CoreStatus = 7A (122)
[01:01:42] Sending work to server
[01:01:42] Project: 5766 (Run 0, Clone 288, Gen 0)
[01:01:42] - Read packet limit of 540015616... Set to 524286976.
[01:01:42] - Error: Could not get length of results file work/wuresults_05.dat
[01:01:42] - Error: Could not read unit 05 file. Removing from queue.
[01:01:42] - Preparing to get new work unit...
[01:01:42] + Attempting to get work packet
[01:01:42] - Connecting to assignment server
[01:01:43] - Successful: assigned to (171.67.108.11).
[01:01:43] + News From Folding@Home: GPU folding beta
[01:01:43] Loaded queue successfully.
[01:01:44] + Closed connections
[01:01:49] 
[01:01:49] + Processing work unit
[01:01:49] Core required: FahCore_11.exe
[01:01:49] Core found.
[01:01:49] Working on queue slot 06 [January 14 01:01:49 UTC]
[01:01:49] + Working ...
[01:01:50] 
[01:01:50] *------------------------------*
[01:01:50] Folding@Home GPU Core - Beta
[01:01:50] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[01:01:50] 
[01:01:50] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[01:01:50] Build host: amoeba
[01:01:50] Board Type: Nvidia
[01:01:50] Core      : 
[01:01:50] Preparing to commence simulation
[01:01:50] - Looking at optimizations...
[01:01:50] - Created dyn
[01:01:50] - Files status OK
[01:01:50] - Expanded 43861 -> 252912 (decompressed 576.6 percent)
[01:01:50] Called DecompressByteArray: compressed_data_size=43861 data_size=252912, decompressed_data_size=252912 diff=0
[01:01:50] - Digital signature verified
[01:01:50] 
[01:01:50] Project: 5766 (Run 0, Clone 288, Gen 0)
[01:01:50] 
[01:01:50] Assembly optimizations on if available.
[01:01:50] Entering M.D.
[01:01:56] Working on Protein
[01:01:57] Client config found, loading data.
[01:01:57] mdrun_gpu returned 
[01:01:57] NANs detected on GPU
[01:01:57] 
[01:01:57] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:02:00] CoreStatus = 7A (122)
[01:02:00] Sending work to server
[01:02:00] Project: 5766 (Run 0, Clone 288, Gen 0)
[01:02:00] - Read packet limit of 540015616... Set to 524286976.
[01:02:00] - Error: Could not get length of results file work/wuresults_06.dat
[01:02:00] - Error: Could not read unit 06 file. Removing from queue.
[01:02:00] - Preparing to get new work unit...
[01:02:00] + Attempting to get work packet
[01:02:00] - Connecting to assignment server
[01:02:00] - Successful: assigned to (171.67.108.11).
[01:02:00] + News From Folding@Home: GPU folding beta
[01:02:01] Loaded queue successfully.
[01:02:02] + Closed connections
[01:02:07] 
[01:02:07] + Processing work unit
[01:02:07] Core required: FahCore_11.exe
[01:02:07] Core found.
[01:02:07] Working on queue slot 07 [January 14 01:02:07 UTC]
[01:02:07] + Working ...
[01:02:08] 
[01:02:08] *------------------------------*
[01:02:08] Folding@Home GPU Core - Beta
[01:02:08] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[01:02:08] 
[01:02:08] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[01:02:08] Build host: amoeba
[01:02:08] Board Type: Nvidia
[01:02:08] Core      : 
[01:02:08] Preparing to commence simulation
[01:02:08] - Looking at optimizations...
[01:02:08] - Created dyn
[01:02:08] - Files status OK
[01:02:08] - Expanded 43861 -> 252912 (decompressed 576.6 percent)
[01:02:08] Called DecompressByteArray: compressed_data_size=43861 data_size=252912, decompressed_data_size=252912 diff=0
[01:02:08] - Digital signature verified
[01:02:08] 
[01:02:08] Project: 5766 (Run 0, Clone 288, Gen 0)
[01:02:08] 
[01:02:08] Assembly optimizations on if available.
[01:02:08] Entering M.D.
[01:02:14] Working on Protein
[01:02:15] Client config found, loading data.
[01:02:15] mdrun_gpu returned 
[01:02:15] NANs detected on GPU
[01:02:15] 
[01:02:15] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:02:18] CoreStatus = 7A (122)
[01:02:18] Sending work to server
[01:02:18] Project: 5766 (Run 0, Clone 288, Gen 0)
[01:02:18] - Read packet limit of 540015616... Set to 524286976.
[01:02:18] - Error: Could not get length of results file work/wuresults_07.dat
[01:02:18] - Error: Could not read unit 07 file. Removing from queue.
[01:02:18] - Preparing to get new work unit...
[01:02:18] + Attempting to get work packet
[01:02:18] - Connecting to assignment server
[01:02:18] - Successful: assigned to (171.67.108.11).
[01:02:18] + News From Folding@Home: GPU folding beta
[01:02:19] Loaded queue successfully.
[01:02:21] + Closed connections
[01:02:26] 
[01:02:26] + Processing work unit
[01:02:26] Core required: FahCore_11.exe
[01:02:26] Core found.
[01:02:26] Working on queue slot 08 [January 14 01:02:26 UTC]
[01:02:26] + Working ...
[01:02:26] 
[01:02:26] *------------------------------*
[01:02:26] Folding@Home GPU Core - Beta
[01:02:26] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[01:02:26] 
[01:02:26] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[01:02:26] Build host: amoeba
[01:02:26] Board Type: Nvidia
[01:02:26] Core      : 
[01:02:26] Preparing to commence simulation
[01:02:26] - Looking at optimizations...
[01:02:26] - Created dyn
[01:02:26] - Files status OK
[01:02:26] - Expanded 43861 -> 252912 (decompressed 576.6 percent)
[01:02:26] Called DecompressByteArray: compressed_data_size=43861 data_size=252912, decompressed_data_size=252912 diff=0
[01:02:26] - Digital signature verified
[01:02:26] 
[01:02:26] Project: 5766 (Run 0, Clone 288, Gen 0)
[01:02:26] 
[01:02:26] Assembly optimizations on if available.
[01:02:26] Entering M.D.
[01:02:33] Working on Protein
[01:02:33] Client config found, loading data.
[01:02:34] mdrun_gpu returned 
[01:02:34] NANs detected on GPU
[01:02:34] 
[01:02:34] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:02:37] CoreStatus = 7A (122)
[01:02:37] Sending work to server
[01:02:37] Project: 5766 (Run 0, Clone 288, Gen 0)
[01:02:37] - Read packet limit of 540015616... Set to 524286976.
[01:02:37] - Error: Could not get length of results file work/wuresults_08.dat
[01:02:37] - Error: Could not read unit 08 file. Removing from queue.
[01:02:37] EUE limit exceeded. Pausing 24 hours.
[04:04:59] + Working...
[10:04:56] + Working...

So, I figure that since it didn't even get out of the gates, that's a bad work unit, right? What do I do now? Restart the client? Do I have to delete the queue and work folder? It has been a while since I had a problem, so I forget how to handle it.
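For anyone reading a log like the one above, one way to see how many failures the client went through before pausing is to count the UNSTABLE_MACHINE shutdowns ahead of each "EUE limit exceeded" line. This is only a minimal Python sketch, not part of the client. The function name failures_before_pause is made up for illustration, and it assumes that any other "Core Shutdown:" status resets the streak.

```python
def failures_before_pause(log_lines):
    """Count UNSTABLE_MACHINE shutdowns leading up to each 24-hour pause.

    Returns one count per "EUE limit exceeded" line found in the log.
    Assumption: any Core Shutdown status other than UNSTABLE_MACHINE
    ends the current streak of failures.
    """
    counts = []
    streak = 0
    for line in log_lines:
        if "Core Shutdown:" in line:
            if "UNSTABLE_MACHINE" in line:
                streak += 1
            else:
                streak = 0
        elif "EUE limit exceeded" in line:
            counts.append(streak)
            streak = 0
    return counts
```

Fed the log above, this should report five failures (queue slots 04 through 08) before the pause.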

 
Yeah, it's a bad project. It just kept retrying the same project. After 3 tries I think it finally gives up and downloads a new one. If it doesn't, I believe you can delete the projects/data folder (someone correct me on which folder it is exactly) and it should download a new one. Or try restarting it if you haven't done that already.
 
Yes, but delete the 'work' directory first along with queue.dat
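If you would rather script that cleanup, here is a rough Python sketch. It is not an official tool: the function name reset_client_state and the backup folder naming are invented for illustration, and it assumes the client has been stopped first and that "work" and queue.dat sit directly in the client's folder. By default it moves the files into a timestamped backup folder instead of deleting them outright.

```python
import shutil
import time
from pathlib import Path


def reset_client_state(client_dir, backup=True):
    """Clear the 'work' directory and queue.dat from a client folder.

    With backup=True the items are moved into a timestamped backup
    folder inside client_dir; otherwise they are deleted.
    Returns the names of the items that were handled.
    """
    client_dir = Path(client_dir)
    targets = [client_dir / "work", client_dir / "queue.dat"]
    backup_dir = client_dir / f"backup-{time.strftime('%Y%m%d-%H%M%S')}"
    handled = []
    for target in targets:
        if not target.exists():
            continue  # nothing to clean up for this item
        if backup:
            backup_dir.mkdir(exist_ok=True)
            shutil.move(str(target), str(backup_dir / target.name))
        elif target.is_dir():
            shutil.rmtree(target)
        else:
            target.unlink()
        handled.append(target.name)
    return handled
```

Call it with the path to your own client folder, for example reset_client_state(r"C:\path\to\client") where the path is a placeholder, then restart the client so it can fetch fresh work.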
 
There must be a bad batch of units out with 353 points of credit.
Code:
[13:34:42] 
[13:34:42] *------------------------------*
[13:34:42] Folding@Home GPU Core - Beta
[13:34:42] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[13:34:42] 
[13:34:42] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[13:34:42] Build host: amoeba
[13:34:42] Board Type: Nvidia
[13:34:42] Core      : 
[13:34:42] Preparing to commence simulation
[13:34:42] - Looking at optimizations...
[13:34:42] - Created dyn
[13:34:42] - Files status OK
[13:34:42] - Expanded 46790 -> 252912 (decompressed 540.5 percent)
[13:34:42] Called DecompressByteArray: compressed_data_size=46790 data_size=252912, decompressed_data_size=252912 diff=0
[13:34:42] - Digital signature verified
[13:34:42] 
[13:34:42] Project: 5768 (Run 3, Clone 245, Gen 2)
[13:34:42] 
[13:34:42] Assembly optimizations on if available.
[13:34:42] Entering M.D.
[13:34:49] Working on Protein
[13:34:49] Client config found, loading data.
[13:34:49] Starting GUI Server
[13:34:49] mdrun_gpu returned 
[13:34:49] NANs detected on GPU
[13:34:49] 
[13:34:49] Folding@home Core Shutdown: UNSTABLE_MACHINE
[13:34:53] CoreStatus = 7A (122)
[13:34:53] Sending work to server
[13:34:53] Project: 5768 (Run 3, Clone 245, Gen 2)
[13:34:53] - Read packet limit of 540015616... Set to 524286976.
[13:34:53] - Error: Could not get length of results file work/wuresults_06.dat
[13:34:53] - Error: Could not read unit 06 file. Removing from queue.
[13:34:53] EUE limit exceeded. Pausing 24 hours.
 
I noticed that the only WUs that fail with "UNSTABLE_MACHINE" and EUE out come from the same IP address. They are also all from project 5766 and a couple of others.
Let me stress, my machine is 100% folding stable. I have confirmed that this is not MY problem. This problem is specific to the 5766 Project.

If Stanford won't fix their WUs, I won't fold them.

I used my router to block 171.67.108.11. Now if I get assigned to that server, I go through a series of "cannot connect to server" errors for about 5 minutes, and then the assignment server puts me onto another server.

Razor, yes there are, and they fold like a charm on a GTX260, faster even than the old 480 pointers.
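To check a hunch like this against your own log rather than by eye, you could tally UNSTABLE_MACHINE shutdowns by work server and project number. The snippet below is a minimal Python sketch under a few assumptions: the log lines look like the excerpts in this thread, and the function name tally_failures and the regular expressions are mine, not anything from the client.

```python
import re
from collections import Counter

# Patterns modelled on the log lines quoted in this thread.
SERVER_RE = re.compile(r"assigned to \((\d+\.\d+\.\d+\.\d+)\)")
PROJECT_RE = re.compile(r"Project: (\d+) \(Run \d+, Clone \d+, Gen \d+\)")


def tally_failures(log_lines):
    """Count UNSTABLE_MACHINE shutdowns per (server IP, project number).

    The server is None when the excerpt starts after the assignment line.
    """
    server = None
    project = None
    failures = Counter()
    for line in log_lines:
        match = SERVER_RE.search(line)
        if match:
            server = match.group(1)
            continue
        match = PROJECT_RE.search(line)
        if match:
            project = int(match.group(1))
            continue
        if "Core Shutdown: UNSTABLE_MACHINE" in line:
            failures[(server, project)] += 1
    return failures
```

A tally dominated by one server or one project would support the bad-batch theory, while failures spread evenly across projects would point back at the card.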

 
Since yesterday, it seems we have a few bad WUs floating around. I got hit by 3 bad units between yesterday evening and this morning.
 
Some of these new WUs stress the GPUs quite a bit. I backed off my OC on my 8800GTX (not a whole lot, just a couple) when I was getting EUEs left and right. I've not had an EUE in months.
 
I have a whole mess of those WUs in my rigs and no issues that I can see. When I go home for lunch I'll check.
 
Nice, you guys can at least see that. I don't see anything like that with my SETI crunching, as far as I know :(
 
Any time you can back off the OC on your card and get stable performance, don't blame Stanford. It's the NV team coding stuff to the limit and not leaving enough room outside default settings.

Edit: I did have a couple of EUEs on my boxen. None were OC-ed.
 
Razor, yes there are, and they fold like a charm on a GTX260, faster even than the old 480 pointers.

Yeah, that one happened on my 8800 GTS 640MB (G80); my GTX 260s are busy doing other things atm. They'll get them figured out soon, I'm sure.
It's the most stable card out of the many I have, so I'm quite sure the card isn't the problem. It's probably just a batch of glitchy units they'll have to pull and fix. :)
 
I think I am still being hit by bad units. I backed off my OC on my shaders from 1512 to 1404 and I still can't seem to keep the darn things away.

Code:
[01:21:30] - Ask before connecting: No
[01:21:30] - User name: Wheresatom (Team 33)
[01:21:30] - User ID: 527BB9385EEE6437
[01:21:30] - Machine ID: 2
[01:21:30] 
[01:21:30] Work directory not found. Creating...
[01:21:30] Could not open work queue, generating new queue...
[01:21:30] Initialization complete
[01:21:30] - Preparing to get new work unit...
[01:21:30] + Attempting to get work packet
[01:21:30] - Connecting to assignment server
[01:21:31] - Successful: assigned to (171.67.108.11).
[01:21:31] + News From Folding@Home: GPU folding beta
[01:21:31] Loaded queue successfully.
[01:21:32] + Closed connections
[01:21:32] 
[01:21:32] + Processing work unit
[01:21:32] Core required: FahCore_11.exe
[01:21:32] Core found.
[01:21:32] Working on queue slot 01 [January 22 01:21:32 UTC]
[01:21:32] + Working ...
[01:21:32] 
[01:21:32] *------------------------------*
[01:21:32] Folding@Home GPU Core - Beta
[01:21:32] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[01:21:32] 
[01:21:32] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[01:21:32] Build host: amoeba
[01:21:32] Board Type: Nvidia
[01:21:32] Core      : 
[01:21:32] Preparing to commence simulation
[01:21:32] - Looking at optimizations...
[01:21:32] - Created dyn
[01:21:32] - Files status OK
[01:21:32] - Expanded 46647 -> 252912 (decompressed 542.1 percent)
[01:21:32] Called DecompressByteArray: compressed_data_size=46647 data_size=252912, decompressed_data_size=252912 diff=0
[01:21:32] - Digital signature verified
[01:21:32] 
[01:21:32] Project: 5766 (Run 0, Clone 458, Gen 2)
[01:21:32] 
[01:21:32] Assembly optimizations on if available.
[01:21:32] Entering M.D.
[01:21:38] Working on Protein
[01:21:39] Client config found, loading data.
[01:21:39] mdrun_gpu returned 
[01:21:39] NANs detected on GPU
[01:21:39] 
[01:21:39] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:21:42] CoreStatus = 7A (122)
[01:21:42] Sending work to server
[01:21:42] Project: 5766 (Run 0, Clone 458, Gen 2)
[01:21:42] - Read packet limit of 540015616... Set to 524286976.
[01:21:42] - Error: Could not get length of results file work/wuresults_01.dat
[01:21:42] - Error: Could not read unit 01 file. Removing from queue.
[01:21:42] - Preparing to get new work unit...
[01:21:42] + Attempting to get work packet
[01:21:42] - Connecting to assignment server
[01:21:43] - Successful: assigned to (171.67.108.11).
[01:21:43] + News From Folding@Home: GPU folding beta
[01:21:43] Loaded queue successfully.
[01:21:43] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.

Folding@Home Client Shutdown.


--- Opening Log file [January 22 01:22:16 UTC] 


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.20r1

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Users\Adam\AppData\Roaming\Folding@home-gpu


[01:22:16] - Ask before connecting: No
[01:22:16] - User name: Wheresatom (Team 33)
[01:22:16] - User ID: 527BB9385EEE6437
[01:22:16] - Machine ID: 2
[01:22:16] 
[01:22:16] Work directory not found. Creating...
[01:22:16] Could not open work queue, generating new queue...
[01:22:17] Initialization complete
[01:22:17] - Preparing to get new work unit...
[01:22:17] + Attempting to get work packet
[01:22:17] - Connecting to assignment server
[01:22:17] - Successful: assigned to (171.67.108.11).
[01:22:17] + News From Folding@Home: GPU folding beta
[01:22:17] Loaded queue successfully.
[01:22:22] + Closed connections
[01:22:22] 
[01:22:22] + Processing work unit
[01:22:22] Core required: FahCore_11.exe
[01:22:22] Core found.
[01:22:22] Working on queue slot 01 [January 22 01:22:22 UTC]
[01:22:22] + Working ...
[01:22:23] 
[01:22:23] *------------------------------*
[01:22:23] Folding@Home GPU Core - Beta
[01:22:23] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[01:22:23] 
[01:22:23] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[01:22:23] Build host: amoeba
[01:22:23] Board Type: Nvidia
[01:22:23] Core      : 
[01:22:23] Preparing to commence simulation
[01:22:23] - Looking at optimizations...
[01:22:23] - Created dyn
[01:22:23] - Files status OK
[01:22:23] - Expanded 46740 -> 252912 (decompressed 541.1 percent)
[01:22:23] Called DecompressByteArray: compressed_data_size=46740 data_size=252912, decompressed_data_size=252912 diff=0
[01:22:23] - Digital signature verified
[01:22:23] 
[01:22:23] Project: 5765 (Run 13, Clone 450, Gen 4)
[01:22:23] 
[01:22:23] Assembly optimizations on if available.
[01:22:23] Entering M.D.
[01:22:29] Working on Protein
[01:22:29] Client config found, loading data.
[01:22:30] mdrun_gpu returned 
[01:22:30] NANs detected on GPU
[01:22:30] 
[01:22:30] Folding@home Core Shutdown: UNSTABLE_MACHINE

What do I need to do? Is it a client thing? Do I just keep getting bad units? Something is up, because I was really close to 10k PPD a week ago, and now with all of this crap I have fallen pretty drastically. What are my options?

 
It's probably not you, Wheresatom.
I had been error free for about a week, but the bad units seem to be back as of today on my machines. :(
Code:
[00:53:46] 
[00:53:46] *------------------------------*
[00:53:46] Folding@Home GPU Core - Beta
[00:53:46] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[00:53:46] 
[00:53:46] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[00:53:46] Build host: amoeba
[00:53:46] Board Type: Nvidia
[00:53:46] Core      : 
[00:53:46] Preparing to commence simulation
[00:53:46] - Looking at optimizations...
[00:53:46] - Created dyn
[00:53:46] - Files status OK
[00:53:46] - Expanded 46763 -> 252912 (decompressed 540.8 percent)
[00:53:46] Called DecompressByteArray: compressed_data_size=46763 data_size=252912, decompressed_data_size=252912 diff=0
[00:53:46] - Digital signature verified
[00:53:46] 
[00:53:46] Project: 5765 (Run 7, Clone 392, Gen 2)
[00:53:46] 
[00:53:46] Assembly optimizations on if available.
[00:53:46] Entering M.D.
[00:53:52] Working on Protein
[00:53:53] Client config found, loading data.
[00:53:53] Starting GUI Server
[00:53:53] mdrun_gpu returned 
[00:53:53] NANs detected on GPU
[00:53:53] 
[00:53:53] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:53:57] CoreStatus = 7A (122)
[00:53:57] Sending work to server
[00:53:57] Project: 5765 (Run 7, Clone 392, Gen 2)
[00:53:57] - Read packet limit of 540015616... Set to 524286976.
[00:53:57] - Error: Could not get length of results file work/wuresults_04.dat
[00:53:57] - Error: Could not read unit 04 file. Removing from queue.
[00:53:57] Trying to send all finished work units
[00:53:57] + No unsent completed units remaining.
[00:53:57] - Preparing to get new work unit...
[00:53:57] + Attempting to get work packet
[00:53:57] - Will indicate memory of 3070 MB
[00:53:57] - Connecting to assignment server
[00:53:57] Connecting to http://assign-GPU.stanford.edu:8080/
[00:54:01] Posted data.
[00:54:01] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[00:54:01] + News From Folding@Home: GPU folding beta
[00:54:01] Loaded queue successfully.
[00:54:01] Connecting to http://171.67.108.11:8080/
[00:54:03] Posted data.
[00:54:03] Initial: 0000; - Receiving payload (expected size: 47275)
[00:54:03] Conversation time very short, giving reduced weight in bandwidth avg
[00:54:03] - Downloaded at ~92 kB/s
[00:54:03] - Averaged speed for that direction ~125 kB/s
[00:54:03] + Received work.
[00:54:03] Trying to send all finished work units
[00:54:03] + No unsent completed units remaining.
[00:54:03] + Closed connections
[00:54:08] 
[00:54:08] + Processing work unit
[00:54:08] Core required: FahCore_11.exe
[00:54:08] Core found.
[00:54:08] Working on queue slot 05 [January 22 00:54:08 UTC]
[00:54:08] + Working ...
[00:54:08] - Calling '.\FahCore_11.exe -dir work/ -suffix 05 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 5204 -version 623'
[00:54:08] 
[00:54:08] *------------------------------*
[00:54:08] Folding@Home GPU Core - Beta
[00:54:08] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[00:54:08] 
[00:54:08] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[00:54:08] Build host: amoeba
[00:54:08] Board Type: Nvidia
[00:54:08] Core      : 
[00:54:08] Preparing to commence simulation
[00:54:08] - Looking at optimizations...
[00:54:08] - Created dyn
[00:54:08] - Files status OK
[00:54:08] - Expanded 46763 -> 252912 (decompressed 540.8 percent)
[00:54:08] Called DecompressByteArray: compressed_data_size=46763 data_size=252912, decompressed_data_size=252912 diff=0
[00:54:08] - Digital signature verified
[00:54:08] 
[00:54:08] Project: 5765 (Run 7, Clone 392, Gen 2)
[00:54:08] 
[00:54:08] Assembly optimizations on if available.
[00:54:08] Entering M.D.
[00:54:15] Working on Protein
[00:54:15] Client config found, loading data.
[00:54:15] mdrun_gpu returned 
[00:54:15] NANs detected on GPU
[00:54:15] 
[00:54:15] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:54:19] CoreStatus = 7A (122)
[00:54:19] Sending work to server
[00:54:19] Project: 5765 (Run 7, Clone 392, Gen 2)
[00:54:19] - Read packet limit of 540015616... Set to 524286976.
[00:54:19] - Error: Could not get length of results file work/wuresults_05.dat
[00:54:19] - Error: Could not read unit 05 file. Removing from queue.
[00:54:19] Trying to send all finished work units
[00:54:19] + No unsent completed units remaining.
[00:54:19] - Preparing to get new work unit...
[00:54:19] + Attempting to get work packet
[00:54:19] - Will indicate memory of 3070 MB
[00:54:19] - Connecting to assignment server
[00:54:19] Connecting to http://assign-GPU.stanford.edu:8080/
[00:54:19] Posted data.
[00:54:19] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[00:54:19] + News From Folding@Home: GPU folding beta
[00:54:19] Loaded queue successfully.
[00:54:19] Connecting to http://171.67.108.11:8080/
[00:54:21] Posted data.
[00:54:21] Initial: 0000; - Receiving payload (expected size: 47275)
[00:54:21] Conversation time very short, giving reduced weight in bandwidth avg
[00:54:21] - Downloaded at ~92 kB/s
[00:54:21] - Averaged speed for that direction ~122 kB/s
[00:54:21] + Received work.
[00:54:21] Trying to send all finished work units
[00:54:21] + No unsent completed units remaining.
[00:54:21] + Closed connections
[00:54:26] 
[00:54:26] + Processing work unit
[00:54:26] Core required: FahCore_11.exe
[00:54:26] Core found.
[00:54:26] Working on queue slot 06 [January 22 00:54:26 UTC]
[00:54:26] + Working ...
[00:54:26] - Calling '.\FahCore_11.exe -dir work/ -suffix 06 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 5204 -version 623'
[00:54:26] 
[00:54:26] *------------------------------*
[00:54:26] Folding@Home GPU Core - Beta
[00:54:26] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[00:54:26] 
[00:54:26] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[00:54:26] Build host: amoeba
[00:54:26] Board Type: Nvidia
[00:54:26] Core      : 
[00:54:26] Preparing to commence simulation
[00:54:26] - Looking at optimizations...
[00:54:26] - Created dyn
[00:54:26] - Files status OK
[00:54:26] - Expanded 46763 -> 252912 (decompressed 540.8 percent)
[00:54:26] Called DecompressByteArray: compressed_data_size=46763 data_size=252912, decompressed_data_size=252912 diff=0
[00:54:26] - Digital signature verified
[00:54:26] 
[00:54:26] Project: 5765 (Run 7, Clone 392, Gen 2)
[00:54:26] 
[00:54:26] Assembly optimizations on if available.
[00:54:26] Entering M.D.
[00:54:33] Working on Protein
[00:54:34] Client config found, loading data.
[00:54:34] mdrun_gpu returned 
[00:54:34] NANs detected on GPU
[00:54:34] 
[00:54:34] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:54:37] CoreStatus = 7A (122)
[00:54:37] Sending work to server
[00:54:37] Project: 5765 (Run 7, Clone 392, Gen 2)
[00:54:37] - Read packet limit of 540015616... Set to 524286976.
[00:54:37] - Error: Could not get length of results file work/wuresults_06.dat
[00:54:37] - Error: Could not read unit 06 file. Removing from queue.
[00:54:37] Trying to send all finished work units
[00:54:37] + No unsent completed units remaining.
[00:54:37] - Preparing to get new work unit...
[00:54:37] + Attempting to get work packet
[00:54:37] - Will indicate memory of 3070 MB
[00:54:37] - Connecting to assignment server
[00:54:37] Connecting to http://assign-GPU.stanford.edu:8080/
[00:54:37] Posted data.
[00:54:37] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[00:54:37] + News From Folding@Home: GPU folding beta
[00:54:37] Loaded queue successfully.
[00:54:37] Connecting to http://171.67.108.11:8080/
[00:54:40] Posted data.
[00:54:40] Initial: 0000; - Receiving payload (expected size: 47275)
[00:54:40] Conversation time very short, giving reduced weight in bandwidth avg
[00:54:40] - Downloaded at ~92 kB/s
[00:54:40] - Averaged speed for that direction ~118 kB/s
[00:54:40] + Received work.
[00:54:40] Trying to send all finished work units
[00:54:40] + No unsent completed units remaining.
[00:54:40] + Closed connections
[00:54:45] 
[00:54:45] + Processing work unit
[00:54:45] Core required: FahCore_11.exe
[00:54:45] Core found.
[00:54:45] Working on queue slot 07 [January 22 00:54:45 UTC]
[00:54:45] + Working ...
[00:54:45] - Calling '.\FahCore_11.exe -dir work/ -suffix 07 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 5204 -version 623'
[00:54:45] 
[00:54:45] *------------------------------*
[00:54:45] Folding@Home GPU Core - Beta
[00:54:45] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[00:54:45] 
[00:54:45] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[00:54:45] Build host: amoeba
[00:54:45] Board Type: Nvidia
[00:54:45] Core      : 
[00:54:45] Preparing to commence simulation
[00:54:45] - Looking at optimizations...
[00:54:45] - Created dyn
[00:54:45] - Files status OK
[00:54:45] - Expanded 46763 -> 252912 (decompressed 540.8 percent)
[00:54:45] Called DecompressByteArray: compressed_data_size=46763 data_size=252912, decompressed_data_size=252912 diff=0
[00:54:45] - Digital signature verified
[00:54:45] 
[00:54:45] Project: 5765 (Run 7, Clone 392, Gen 2)
[00:54:45] 
[00:54:45] Assembly optimizations on if available.
[00:54:45] Entering M.D.
[00:54:51] Working on Protein
[00:54:52] Client config found, loading data.
[00:54:52] mdrun_gpu returned 
[00:54:52] NANs detected on GPU
[00:54:52] 
[00:54:52] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:54:55] CoreStatus = 7A (122)
[00:54:55] Sending work to server
[00:54:55] Project: 5765 (Run 7, Clone 392, Gen 2)
[00:54:55] - Read packet limit of 540015616... Set to 524286976.
[00:54:55] - Error: Could not get length of results file work/wuresults_07.dat
[00:54:55] - Error: Could not read unit 07 file. Removing from queue.
[00:54:55] Trying to send all finished work units
[00:54:55] + No unsent completed units remaining.
[00:54:55] - Preparing to get new work unit...
[00:54:55] + Attempting to get work packet
[00:54:55] - Will indicate memory of 3070 MB
[00:54:55] - Connecting to assignment server
[00:54:55] Connecting to http://assign-GPU.stanford.edu:8080/
[00:54:56] Posted data.
[00:54:56] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[00:54:56] + News From Folding@Home: GPU folding beta
[00:54:56] Loaded queue successfully.
[00:54:56] Connecting to http://171.67.108.11:8080/
[00:54:57] Posted data.
[00:54:57] Initial: 0000; - Receiving payload (expected size: 47275)
[00:54:57] Conversation time very short, giving reduced weight in bandwidth avg
[00:54:57] - Downloaded at ~92 kB/s
[00:54:57] - Averaged speed for that direction ~115 kB/s
[00:54:57] + Received work.
[00:54:57] Trying to send all finished work units
[00:54:57] + No unsent completed units remaining.
[00:54:57] + Closed connections
[00:55:02] 
[00:55:02] + Processing work unit
[00:55:02] Core required: FahCore_11.exe
[00:55:02] Core found.
[00:55:02] Working on queue slot 08 [January 22 00:55:02 UTC]
[00:55:02] + Working ...
[00:55:02] - Calling '.\FahCore_11.exe -dir work/ -suffix 08 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 5204 -version 623'
[00:55:03] 
[00:55:03] *------------------------------*
[00:55:03] Folding@Home GPU Core - Beta
[00:55:03] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[00:55:03] 
[00:55:03] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[00:55:03] Build host: amoeba
[00:55:03] Board Type: Nvidia
[00:55:03] Core      : 
[00:55:03] Preparing to commence simulation
[00:55:03] - Looking at optimizations...
[00:55:03] - Created dyn
[00:55:03] - Files status OK
[00:55:03] - Expanded 46763 -> 252912 (decompressed 540.8 percent)
[00:55:03] Called DecompressByteArray: compressed_data_size=46763 data_size=252912, decompressed_data_size=252912 diff=0
[00:55:03] - Digital signature verified
[00:55:03] 
[00:55:03] Project: 5765 (Run 7, Clone 392, Gen 2)
[00:55:03] 
[00:55:03] Assembly optimizations on if available.
[00:55:03] Entering M.D.
[00:55:09] Working on Protein
[00:55:10] Client config found, loading data.
[00:55:10] mdrun_gpu returned 
[00:55:10] NANs detected on GPU
[00:55:10] 
[00:55:10] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:55:13] CoreStatus = 7A (122)
[00:55:13] Sending work to server
[00:55:13] Project: 5765 (Run 7, Clone 392, Gen 2)
[00:55:13] - Read packet limit of 540015616... Set to 524286976.
[00:55:13] - Error: Could not get length of results file work/wuresults_08.dat
[00:55:13] - Error: Could not read unit 08 file. Removing from queue.
[00:55:13] EUE limit exceeded. Pausing 24 hours.
[00:57:32] - Autosending finished units... [January 22 00:57:32 UTC]
[00:57:32] Trying to send all finished work units
[00:57:32] + No unsent completed units remaining.
[00:57:32] - Autosend completed
[00:57:32] + Working...
 
OK, what do I need to do? I continue to be plagued by these problems, and to be honest it is getting old to have a rock-solid SMP client while my GPU2 client is unstable all of the time. The water tastes funny in bizarro world.

Until very recently I was folding with my shaders set to 1512. Since then I have clocked down to 1304. That has not seemed to account for any uptick in stability as I am still getting unstable machine errors right out of the gate on several projects a day. This weekend I was able to restart my clients whenever anything happened, which accounts for a decent uptick in points, but really I am getting tired of it.

Someone mentioned blocking a certain server that has been giving out bad work often, however I am not sure that is allowed by Stanford. I would really like to go back to set it and forget it.

Is there something about my setup that needs to change if I am still getting these errors?
 
I keep having the same problems on my GPU, except only on my second GPU core. Weird.
 
I've still been getting them too. Last night and today have been pretty bad.
 
This has been going on for a few weeks according to the negative feedback on FF. It's making me really think twice about buying a few GPUs next week as planned and going with more CPU instead. :(
 
Well no offense, but misery loves company. I hadn't seen much about it on our boards so I figured I was the only one having these problems. I am glad to see I am not the only one having problems. I'll quit my bitchin' and go about my business.
 
Same problem here about two days ago. It lasted for 1-2 days and only affected one card. I did not think to check which server I was being assigned to, but I too downclocked my card with no improvement. Thinking it was my setup, I moved to the 181.22 drivers. About this time things got better, so I thought that had fixed it. Sounds to me like there were some bad WUs.
 
See, I am getting this same error, but it has nothing to do with the WUs, at least in my case.

I can run on my first GPU, but when I try to run on my second core I get this error.
 