Weird Error

extide

2[H]4U
Joined
Dec 19, 2008
Messages
3,494
Anyone seen this before?




NOTE: Turning on dynamic load balancing

[22:51:47] Completed 205000 out of 500000 steps (41%)

step 73706199: Water molecule starting at atom 166260 can not be settled.
Check for bad contacts and/or reduce the timestep if appropriate.

Step 73706200:
The charge group starting at atom 166260 moved than the distance allowed by the domain decomposition (1.429701) in direction X
distance out of cell 5.516729
Old coordinates: 11.755 9.418 0.131
New coordinates: 19.081 7.671 2.350
Old cell boundaries in direction X: 9.179 13.564
New cell boundaries in direction X: 9.105 13.564

-------------------------------------------------------
Program Gromacs, VERSION 4.5.3
Source code file: /vspm58/VM/fah-converted/mnt/fah_windows_build/LinuxBuilds/gromacs-4.5.3/src/mdlib/domdec.c, line: 4117

Fatal error:
A charge group moved too far between two domain decomposition steps
This usually means that your system is not well equilibrated
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

Thanx for Using GROMACS - Have a Nice Day

[22:53:55] mdrun returned 255
[22:53:55] Going to send back what have done -- stepsTotalG=500000
[22:53:55] Work fraction=356.4518 steps=500000.
[22:53:59] logfile size=29453 infoLength=29453 edr=25 trr=1
[22:53:59] logfile size: 29453 info=29453 bed=25 hdr=1
[22:53:59] - Writing 29991 bytes of core data to disk...
[22:53:59] Done: 29479 -> 6644 (compressed to 22.5 percent)
[22:53:59] ... Done.
[22:55:37]
[22:55:37] Folding@home Core Shutdown: UNSTABLE_MACHINE
[22:55:37] CoreStatus = 7A (122)
[22:55:37] Sending work to server
[22:55:37] Project: 6099 (Run 8, Clone 47, Gen 147)


[22:55:37] + Attempting to send results [April 9 22:55:37 UTC]
[22:55:37] - Reading file work/wuresults_02.dat from core
[22:55:37] (Read 7156 bytes from disk)
[22:55:37] Connecting to http://128.143.231.202:8080/
[22:55:37] Posted data.
[22:55:37] Initial: 0000; Conversation time very short, giving reduced weight in bandwidth avg
[22:55:37] - Uploaded at ~15 kB/s
[22:55:37] - Averaged speed for that direction ~191 kB/s
[22:55:37] + Results successfully sent
[22:55:37] Thank you for your contribution to Folding@Home.
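For a sense of scale, the numbers in the domain decomposition message can be checked by hand. This is only a rough sketch: it assumes the coordinates are in nm (the GROMACS default length unit) and that "distance out of cell" is measured from the new upper cell boundary in X. The variable names are illustrative and do not come from GROMACS.

```python
import math

# Values copied from the log above (assumed to be in nm).
old = (11.755, 9.418, 0.131)
new = (19.081, 7.671, 2.350)
new_cell_x = (9.105, 13.564)   # new cell boundaries in direction X
allowed = 1.429701             # move allowed by the domain decomposition

# Per-axis displacement between the two domain decomposition steps.
delta = tuple(n - o for n, o in zip(new, old))
dx = delta[0]

# How far the new X coordinate lies beyond the upper cell boundary.
out_of_cell = new[0] - new_cell_x[1]

# Straight-line displacement, ignoring any periodic wrapping.
total = math.sqrt(sum(d * d for d in delta))

print(f"dx = {dx:.3f}, out of cell = {out_of_cell:.3f}, total = {total:.3f}")
print("exceeds allowed move:", out_of_cell > allowed)
```

Under those assumptions the overshoot comes out at about 5.517, which agrees with the logged 5.516729 to the precision of the printed coordinates, and the X displacement of about 7.3 is several times the allowed 1.43.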
 
Either a legitimate WU problem or an OC/memory corruption issue.

If you feel like investigating and have a spare copy of the client, you could re-run the WU
and see if it gives the error in the same spot.

If so -- bad WU, if not -- hardware issue.
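The comparison above could be scripted roughly as follows. This is a sketch, not part of the FAH client: `first_failure` and `verdict` are hypothetical helper names, and the regular expression is based only on the "can not be settled" line quoted in this thread, so other log formats may need a different pattern.

```python
import re

# Pattern for the SETTLE failure line as it appears in the log in this thread.
SETTLE_RE = re.compile(
    r"step (\d+): Water molecule starting at atom (\d+) can not be settled"
)

def first_failure(log_text):
    """Return (step, atom) of the first SETTLE failure, or None if absent."""
    match = SETTLE_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def verdict(first_run_log, rerun_log):
    """Apply the rule of thumb above to the text of two logs."""
    a = first_failure(first_run_log)
    b = first_failure(rerun_log)
    if a is None:
        return "no failure in first run"
    if a == b:
        return "same spot: likely bad WU"
    return "different outcome: suspect hardware"

# Example with the line from the original post and a hypothetical clean re-run.
original = "step 73706199: Water molecule starting at atom 166260 can not be settled."
clean = "[22:51:47] Completed 500000 out of 500000 steps (100%)"
print(first_failure(original))
print(verdict(original, original))
print(verdict(original, clean))
```

A matching step and atom on the re-run only points toward the WU; it is a rule of thumb, not proof.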

Alternatively, you can ask in the FF (or ask ChelseaOilman) if this WU has been completed
by someone else.

EDIT: oops, it took me 4 hours to click "Post" :eek:
 
That is on a box that is not overclocked. I will ask in the FF forum then.

Usually when it's an unstable OC I have seen different errors anyway, so I thought this was kinda interesting.
 
Yes. Today. Right after the bigadv server being down for a while. I thought it was just my machine but apparently not.

Fortunately my crash was after only 4%, not after 41% like you.



 
Whoa, this also happened on my bigadv box. It happened to grab that WU when the server was down, crashed, then picked up another 6901.

 
Hi extide (team 33),
Your WU (P6099 R8 C47 G147) was added to the stats database on 2012-04-09 16:07:44 for 0 points of credit.

No one else yet.
 
Awesome, I will keep an eye on this box. I just put it together on Saturday, but most of the parts I have been using for a while. So far it has completed one 6901 successfully, failed on that 6099, and now picked up a second 6901 and is working on that.
 
Last time I remember getting an error like that it said something like 'your system has exploded' :eek:

As I recall, it turned out to be a bad WU.
 
The WU has now been finished by someone else. It's not a bad WU.

Your WU (P6099 R8 C47 G147) was added to the stats database on 2012-04-10 07:07:41 for 12665.9 points of credit.
 
Well that box has been crunching a 6901 all night without error so I am not sure, hrmm. I'll still keep watch of it though.
 
oh no ...


Making 2D domain decomposition 6 x 4 x 1
starting mdrun 'Overlay'
15000000 steps, 60000.0 ps (continuing from step 14973250, 59893.0 ps).
[22:28:41] Resuming from checkpoint
[22:28:44] Verified work/wudata_08.log
[22:28:45] Verified work/wudata_08.trr
[22:28:46] Verified work/wudata_08.xtc
[22:28:46] Verified work/wudata_08.edr
[22:28:47] Completed 223250 out of 250000 steps (89%)

NOTE: Turning on dynamic load balancing

[22:45:37] Completed 225000 out of 250000 steps (90%)
[23:09:37] Completed 227500 out of 250000 steps (91%)
[23:33:41] Completed 230000 out of 250000 steps (92%)
Warning: 1-4 interaction between 119064 and 1242424 at distance 8.000 which is larger than the 1-4 table size 2.700 nm
These are ignored for the rest of the simulation
This usually means your system is exploding,
if not, you should increase table-extension in your mdp file
or with user tables increase the table size
 