Strange problem with core

Nathan_P

[H]ard DCOTM x3
Joined
Mar 2, 2010
Messages
3,522
As the title says, and as you can see from the log, a WU will start, run for about 15 minutes, and then the core restarts. Any idea what could be causing this? The machine runs Ubuntu 10.10 set up as per the guide, with kernel 2.6.35-30-generic-ck, the v6 client and thekraken 0.6.


Code:
[14:55:03] Preparing to commence simulation
[14:55:03] - Looking at optimizations...
[14:55:03] - Created dyn
[14:55:03] - Files status OK
[14:55:08] - Expanded 57213893 -> 71843392 (decompressed 50.5 percent)
[14:55:08] Called DecompressByteArray: compressed_data_size=57213893 data_size=71843392, decompressed_data_size=71843392 diff=0
[14:55:08] - Digital signature verified
[14:55:08] 
[14:55:08] Project: 6904 (Run 1, Clone 18, Gen 103)
[14:55:08] 
[14:55:08] Assembly optimizations on if available.
[14:55:08] Entering M.D.
                         :-)  G  R  O  M  A  C  S  (-:

                   Groningen Machine for Chemical Simulation

                            :-)  VERSION 4.5.3  (-:

        Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
      Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra, 
        Gerrit Groenhof, Peter Kasson, Per Larsson, Pieter Meulenhoff, 
           Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schulz, 
                Michael Shirts, Alfons Sijbers, Peter Tieleman,

               Berk Hess, David van der Spoel, and Erik Lindahl.

       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
            Copyright (c) 2001-2010, The GROMACS development team at
        Uppsala University & The Royal Institute of Technology, Sweden.
            check out http://www.gromacs.org for more information.


                               :-)  Gromacs  (-:

Reading file work/wudata_07.tpr, VERSION 4.5.4-dev-20110530-cc815 (single precision)
[14:55:16] Mapping NT from 24 to 24 
Starting 24 threads
Making 2D domain decomposition 6 x 4 x 1
starting mdrun 'Overlay'
26000000 steps, 104000.0 ps (continuing from step 25750000, 103000.0 ps).
[14:55:21] Completed 0 out of 250000 steps  (0%)
[15:11:48] ng M.D.
[15:11:54] Using Gromacs checkpoints
                         :-)  G  R  O  M  A  C  S  (-:

                   Groningen Machine for Chemical Simulation

                            :-)  VERSION 4.5.3  (-:

        Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
      Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra, 
        Gerrit Groenhof, Peter Kasson, Per Larsson, Pieter Meulenhoff, 
           Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schulz, 
                Michael Shirts, Alfons Sijbers, Peter Tieleman,

               Berk Hess, David van der Spoel, and Erik Lindahl.

       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
            Copyright (c) 2001-2010, The GROMACS development team at
        Uppsala University & The Royal Institute of Technology, Sweden.
            check out http://www.gromacs.org for more information.


                               :-)  Gromacs  (-:

[15:11:58] Mapping NT from 24 to 24 
Reading file work/wudata_07.tpr, VERSION 4.5.4-dev-20110530-cc815 (single precision)
Starting 24 threads

Reading checkpoint file work/wudata_07.cpt generated: Fri Apr 13 16:10:28 2012


Making 2D domain decomposition 6 x 4 x 1
starting mdrun 'Overlay'
26000000 steps, 104000.0 ps (continuing from step 25750875, 103003.5 ps).
[15:12:27] Resuming from checkpoint
[15:12:28] Verified work/wudata_07.log
[15:12:28] Verified work/wudata_07.trr
[15:12:28] Verified work/wudata_07.xtc
[15:12:28] Verified work/wudata_07.edr
[15:12:31] Completed 875 out of 250000 steps  (0%)

NOTE: Turning on dynamic load balancing

[15:39:16] Completed 2500 out of 250000 steps  (1%)
[16:20:30] Completed 5000 out of 250000 steps  (2%)
[17:01:46] Completed 7500 out of 250000 steps  (3%)
[17:43:02] Completed 10000 out of 250000 steps  (4%)
[18:24:14] Completed 12500 out of 250000 steps  (5%)
[19:05:30] Completed 15000 out of 250000 steps  (6%)
[19:46:57] Completed 17500 out of 250000 steps  (7%)
[20:28:23] Completed 20000 out of 250000 steps  (8%)
[21:09:41] Completed 22500 out of 250000 steps  (9%)
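
For anyone who wants to spot this pattern without reading the whole log by eye, here is a rough Python sketch. It assumes the log lines look exactly like the ones pasted above ("Resuming from checkpoint" and "Completed N out of M steps"). The function name `summarize_log` is made up for illustration and is not part of the client or of thekraken.

```python
import re
from datetime import datetime

# Line formats are copied from the log above; other client versions may differ.
PROGRESS = re.compile(r"^\[(\d\d:\d\d:\d\d)\] Completed (\d+) out of (\d+) steps")
RESUME = re.compile(r"^\[(\d\d:\d\d:\d\d)\] Resuming from checkpoint")


def summarize_log(log_text):
    """Return (restart_times, gaps) for a client log.

    restart_times: timestamps of every "Resuming from checkpoint" line.
    gaps: seconds between consecutive progress lines where the step
          count went up (a rough time-per-frame figure).
    """
    restart_times = []
    frames = []  # (timestamp, completed steps)
    for line in log_text.splitlines():
        resumed = RESUME.match(line)
        if resumed:
            restart_times.append(resumed.group(1))
            continue
        progress = PROGRESS.match(line)
        if progress:
            stamp = datetime.strptime(progress.group(1), "%H:%M:%S")
            frames.append((stamp, int(progress.group(2))))

    gaps = []
    for (t0, s0), (t1, s1) in zip(frames, frames[1:]):
        if s1 > s0:
            # The log has no dates, so wrap around midnight with a modulo.
            gaps.append(int((t1 - t0).total_seconds() % 86400))
    return restart_times, gaps
```

On the log above this should report one restart at 15:12:27 and gaps of roughly 41 minutes between frames afterwards, which is a quick way to confirm the unit runs steadily once the restart is out of the way.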
 
The kraken starting DLB

Check thekraken's config file to see if you have autorestart=1 in there.

If it is, thekraken is restarting the client, like Grandpa said.
The restart generally causes enough imbalance to get DLB to engage.
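
If you would rather check that setting with a script, something like the following Python sketch would do it. It only assumes the config is plain `key=value` text, as the `autorestart=1` setting suggests. Treating `#` as a comment marker is my assumption, and `autorestart_enabled` is a made-up helper name, not anything shipped with thekraken.

```python
def autorestart_enabled(config_text):
    """Return True if the config text sets autorestart to 1."""
    for raw in config_text.splitlines():
        # Assumption: anything after '#' is a comment.
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.lower() == "autorestart":
            return value == "1"
    return False
```

To use it, read the config file into a string (the file name and location depend on your install, so check where thekraken wrote it) and pass the contents to `autorestart_enabled`.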
 
I'll have a look, but if that is the case then DLB is doing its job: as you can see in the log, there is virtually no load imbalance.
 
I use the Ubuntu operating system on a Core motherboard, and so far I have not faced this kind of problem. Whenever I find a solution to your problem, I will share it with you. Thanks.
 
Yes, that looks exactly like the auto restart function of The Kraken. It was a surprise to me when I first saw it as well. If you used musky's guide to set up a new machine recently, you will notice that he updated it a bit ago with The Kraken v0.6, which includes the auto restart function.
 