Lost WU due to "Can't restore state." error.

Aix.

I was away from home for a couple of days and came home to my SR-2 happily folding away on a 2684 BigAdv WU @ 85%. I stopped the client to perform a couple of trivial tasks and when I restarted it THIS happened:

Code:
[01:39:08] Project: 2684 (Run 7, Clone 11, Gen 33)
[01:39:08] 
[01:39:09] Entering M.D.
[01:39:15] Using Gromacs checkpoints
[01:39:20] fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.Resuming from checkpoint
[01:39:20] fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=010FCF50, varsize=21120
[01:39:20] Can't restore state.CoreStatus = C0000029 (-1073741783)
[01:39:39] Client-core communications error: ERROR 0xc0000029
[01:39:39] Deleting current work unit & continuing...
[01:39:42] Killing all core threads
[01:39:42] Could not get process id information.  Please kill core process manually

WTF!? I looked at the log file and there's NOTHING there except for what I posted above. Anyone had this happen before?

Edit: I should also mention that restarting the client causes a "fah_core_a3.exe has stopped working and needs to close" notice.
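For anyone reading the log: the two numbers on the CoreStatus line are the same 32-bit value, once in hex and once as a signed integer. The sketch below only shows that conversion; the helper name `to_signed32` is made up for illustration and is not part of the folding client. The top two bits being set is the pattern Windows uses for error-severity status codes, which fits with the core crashing rather than exiting cleanly.

```python
def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


# Value copied from the "CoreStatus = C0000029 (-1073741783)" log line.
core_status = int("C0000029", 16)

print(core_status)               # unsigned view: 3221225513
print(to_signed32(core_status))  # signed view:   -1073741783
print(core_status >> 30)         # top two bits:  3 (error severity)
```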
 
Could you share how you stopped the client? (Just so I don't repeat that :0)
 
I clicked the red X in the top right-hand corner...which is how I have closed it every single other time without issue. I think I've got an issue with my RAM, honestly, but I've never seen this kind of error before...and of course it happens after I let the thing fold 85% of a 2684. What a piss-off.
 
Yeah, sounds like something got corrupted when the data was written from RAM to disk. You may want to stress test the RAM just to make sure it's not a problem that will happen again.
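To illustrate the general idea of catching a damaged state file on restart, here is a minimal sketch that stores a SHA-256 digest in front of the saved data and checks it on load. This is not how the folding core saves its state (the thread doesn't say); `save_state` and `load_state` are hypothetical names. Note the limit: if bad RAM flips bits before the digest is computed, the digest will happily match the corrupted data, so this mainly catches truncated or damaged files.

```python
import hashlib
import os
import tempfile

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def save_state(path: str, payload: bytes) -> None:
    """Write digest + payload to a temp file, then atomically swap it into place."""
    digest = hashlib.sha256(payload).digest()
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(digest + payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, path)  # the old checkpoint survives until this succeeds
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path: str) -> bytes:
    """Return the payload, or raise ValueError if the file fails its integrity check."""
    with open(path, "rb") as f:
        blob = f.read()
    digest, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    if len(digest) < DIGEST_SIZE or hashlib.sha256(payload).digest() != digest:
        raise ValueError("checkpoint failed its integrity check")
    return payload
```

Writing to a temporary file and renaming means an interrupted save leaves the previous checkpoint untouched instead of a half-written one.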
 
On the last mile of a 2684... ow, just ow. :eek:

I thought it was better to shut down with Ctrl-C? - although I can't remember where I read that.
 
It turns out I have a bad stick of RAM: I thought I was just a victim of the "disappearing RAM" bug that SR-2 owners have from time to time, but after trying new B2B values, loosening my timings, and even resetting my CMOS I couldn't get my 2GB back. The system originally had no trouble displaying 12GB, but recently started showing 10GB...I thought it was something I had changed in the BIOS that had affected it, but after trying all the sticks in various configurations I have narrowed it down to a single stick.

And yes, the worst part is that the 2684 would have finished without issue if I had just let it continue, but I came home and my girlfriend said she wasn't getting any sound from the sound card, so I stopped folding while I fiddled with settings and tried out a few different media files. Sigh.

RMA time.
 
They both do the same thing. Ctrl+C is just the shortcut for closing it.

That isn't exactly true. Ctrl+C stops the application running in cmd.exe. The "X" stops cmd.exe, which causes the application running in it to also stop. To see the difference, run cmd.exe, navigate to your folding directory, and start your folding executable. If you then hit Ctrl+C, folding will stop but the command window stays open. If you start folding by double-clicking on the folding executable or a shortcut to it, Ctrl+C and the "X" look the same. Ctrl+C stops the app, which stops cmd.exe. The "X" stops cmd.exe, which forces the app to stop as well. So they are different. Which is better? No idea. I always use Ctrl+C myself.
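As a rough sketch of why the difference can matter: a console program can catch Ctrl+C and finish what it is doing before it exits. The Python below shows that pattern only. It says nothing about what the folding client actually does on either path, and `run`, `request_stop`, and `install_ctrl_c_handler` are hypothetical names. Closing the window with the "X" is delivered to the program as a different console event, which this sketch does not handle.

```python
import signal
import threading

# Set by the Ctrl+C handler; the work loop checks it between steps.
stop_requested = threading.Event()


def request_stop(signum, frame):
    """Signal handler: just record the request, do no I/O here."""
    stop_requested.set()


def install_ctrl_c_handler():
    """Route Ctrl+C (SIGINT) to request_stop. Must be called from the main thread."""
    return signal.signal(signal.SIGINT, request_stop)


def run(steps, do_step, write_checkpoint):
    """Run up to `steps` units of work, stopping early if a stop was requested.

    write_checkpoint(completed) is always called once on the way out, so the
    saved progress matches the last fully completed step.
    """
    completed = 0
    try:
        for _ in range(steps):
            if stop_requested.is_set():
                break
            do_step()
            completed += 1
    finally:
        write_checkpoint(completed)
    return completed


if __name__ == "__main__":
    install_ctrl_c_handler()
    run(3, lambda: None, lambda n: print(f"checkpoint written after {n} steps"))
```

The handler only sets a flag, so the step in progress always finishes before the checkpoint is written.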
 
Women! If you're not doing their type of "folding" they muck everything up :D

You gave her the best HTPC ever and this is how she thanks you?

Can you run in dual-channel mode for now, or are you going to RMA the whole kit? Sorry for your luck.


 
I've never actually had to RMA any RAM before, but I assume I'll have to RMA the whole kit. The whole "optimized for dual/triple channel" marketing thing is debatable, but when I look at the labels on the sticks I see that one kit's serial numbers end in 25/26/27 and the other's end in 56/57/58, so perhaps there's something to it.

I'll try to run in single-channel mode for now, I guess.
 