
holy crap what happened

ICE_9

2[H]4U
Joined
Feb 28, 2005
Messages
3,459
I am currently running F@H on two processors. One is an HT 4.11 using two programs, the other is a 2.6 Northwood. They dropped 11 WU today alone. Does this happen normally?

 
How far are you trying to overclock the systems? If you have enough RAM available, you may be getting multiple instances of QMD cores, and yes you will crash often, in this scenario.

 
no crashing. just 11WU in a day from 2 logical processors and one HT processor.

I was averaging about 1-2 a day. did my computer suddenly find a cure or something?
 
No, it is probably choking on work units. Your computer doesn't need to be freezing up for it to fail from an overclock with FAH. If you are overclocking there is probably something that is either failing, or some element has changed (your case getting dusty, temperature of room is higher, etc.)

FAH is usually the first thing to be irritated by an overclock.


 
ok. when I mean "dropped" 11 WU I mean "completed". sorry for the confusion. :)
 
How many points did that 11 WU account for? If a WU fails prematurely, but sends back its results, it still shows up for completed but maybe only as a few points. What you are describing looks like this kind of case.
 
I've taken a look at your daily production graph and things appear to be somewhat in order. It looked like, from what I can tell, that you've had a couple of tinker units that completed correctly. The thing that concerns me is yesterday at 3PM you turned in 6 WU's for 276 points. I know Stanford has been pushing some smaller units, but that seems a little out of sorts. You might want to look at FAHlog.txt in the same directory as your console and see if there is any information pertaining to premature unit end. If there is, you might consider running some system diagnostics like Prime95, or memtest. I'd give it a little longer before that though, seeing as how you have posted units to completion, and Stanford on occasion produces units that will blow up regardless of the circumstance. FYI, the graphs of your production I used are here.

Also, welcome to the team! :D

 
[16:10:31]
[16:10:31] Loaded queue successfully.
[16:10:32] Initialization complete
[16:10:32] + Benchmarking ...
[16:10:35]
[16:10:35] + Processing work unit
[16:10:35] Core required: FahCore_78.exe
[16:10:35] Core found.
[16:10:35] Working on Unit 06 [April 13 16:10:35]
[16:10:35] + Working ...
[16:10:36]
[16:10:36] *------------------------------*
[16:10:36] Folding@Home Gromacs Core
[16:10:36] Version 1.80 (March 16, 2005)
[16:10:36]
[16:10:36] Preparing to commence simulation
[16:10:36] - Looking at optimizations...
[16:10:37] - Files status OK
[16:10:37] - Expanded 383470 -> 1904957 (decompressed 496.7 percent)
[16:10:37]
[16:10:37] Project: 1411 (Run 7, Clone 55, Gen 4)
[16:10:37]
[16:10:37] Assembly optimizations on if available.
[16:10:37] Entering M.D.
[16:10:57] (Starting from checkpoint)
[16:10:57] Protein: p1411_Q26x3 in water
[16:10:57]
[16:10:57] Writing local files
[16:10:57] Completed 1782363 out of 2500000 steps (71)
[16:10:58] Extra SSE boost OK.
[16:18:45] Gromacs cannot continue further.
[16:18:45] Going to send back what have done.
[16:18:45] logfile size: 166400
[16:18:46] - Writing 166936 bytes of core data to disk...
[16:18:46] ... Done.
[16:18:46]
[16:18:46] Folding@home Core Shutdown: EARLY_UNIT_END
[16:18:48] CoreStatus = 72 (114)
[16:18:48] Sending work to server


[16:18:48] + Attempting to send results
[16:18:54] + Results successfully sent
[16:18:54] Thank you for your contribution to Folding@Home.

Here is a snippet I found. I guess I was having an overclock issue. I have backed it off since then due to some minor instability.

Nice boost though! :D
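
For anyone who wants to check a whole FAHlog.txt rather than eyeball it, here is a minimal sketch in Python. It assumes every log line has the same `[HH:MM:SS]` prefix and the same "Project:" and "Core Shutdown:" wording as the snippet above; the function names are made up for illustration, and other client or core versions may format things differently.

```python
import re

# Patterns modeled on the log snippet above, for example:
#   [16:10:37] Project: 1411 (Run 7, Clone 55, Gen 4)
#   [16:18:46] Folding@home Core Shutdown: EARLY_UNIT_END
PROJECT_RE = re.compile(r"^\[\d{2}:\d{2}:\d{2}\] Project: (.+)$")
SHUTDOWN_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\] Folding@home Core Shutdown: (\w+)"
)


def find_shutdowns(lines):
    """Return (time, project, status) for every core shutdown line."""
    results = []
    project = None  # most recent "Project:" line seen so far
    for raw in lines:
        line = raw.strip()
        match = PROJECT_RE.match(line)
        if match:
            project = match.group(1)
            continue
        match = SHUTDOWN_RE.match(line)
        if match:
            results.append((match.group(1), project, match.group(2)))
    return results


def early_ends(lines):
    """Keep only the shutdowns reported as EARLY_UNIT_END."""
    return [r for r in find_shutdowns(lines) if r[2] == "EARLY_UNIT_END"]


def scan_file(path="FAHlog.txt"):
    """Scan a log file on disk; the default path is an assumption."""
    with open(path, errors="replace") as handle:
        return early_ends(handle)
```

Calling `scan_file()` from the client directory would list each early unit end with its timestamp and project. A couple of hits may just be bad units; a steady stream is a hint to go run a stability test.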
 
If you're not stable, DON'T FOLD.

Go run Prime95 for 24 hours on the maximum heat and power dissipation test.

If you can't make it work at your current clock, turn it down.

Again, I repeat:

If you're not stable, DON'T FOLD.

You are screwing with our science.
 
In my experience, Prime95 won't even get everything, neither will running FAH as some WUs stress different parts of the CPU.

It sucks that you sent in a couple crappy WUs. They will be covered by other people as the FAH servers will send them out to other systems, but you have corrected the problem and are contributing again.

The important thing is to correct any problem as quickly as possible so that you aren't wasting cycles in search of science. People who overclock (me included) have a responsibility to be more vigilant and ensure that they aren't sending back cr4p.
 
Hito Bahadur said:
In my experience, Prime95 won't even get everything, neither will running FAH as some WUs stress different parts of the CPU.

Prime95 will get a lot more than Folding will. Naysay all you want. This is a pretty standard stability test. Prime 95 consistently gets my cpu temps 3C hotter than folding does. Rigs that are 24 hour Folding stable are not necessarily 24 hour Prime95 stable, however I've never seen it the other way around. Every rig that I've gotten thru 24 hours of Prime has never once coughed, burped or went weewee over folding. I have seen rigs that could fold for days on end choke on prime95 after five minutes. God KNOWS what sort of crap they were turning in.

The point here is, use something other than folding to test your stability.
 
ICE_9 said:
ok. when I mean "dropped" 11 WU I mean "completed". sorry for the confusion. :)

here we go using this new age hippity hop language on the message boards again. confusing everyone..... :rolleyes:


 
Another test program you can use is "StressCPU" that you can download from Larry's page at www.em-dc.com. It focuses more on the SSE instructions, so it should do a better job of simulating the stress that folding puts on a processor.
 
I've seen people with like 1000+WU "completed" and only ~3000points, kinda sad when every single WU dies on the first frame :(

 
Actually back in the early days, and BillR or someone can confirm this, the values of WU's were drastically lower than they are currently, so that's part of the reason why you've seen that.

 
p[H]ant0m said:
Actually back in the early days, and BillR or someone can confirm this, the values of WU's were drastically lower than they are currently, so that's part of the reason why you've seen that.


Yes, a couple years back, WU's were only worth about 1-3 points apiece. When Stanford changed the point scheme, the change pretty much threw out all of the previous WU's completed (in terms of points anyway). Especially since you could match your point total with 10 more WU's or less.
 
Yup.... the point values have grown a good bit... But if you just started, and you have 40 WUs in a day, for 5 points.... there's a problem somewhere......


Keep on Folding!!
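
The sanity check people are doing by eye in this thread is just points divided by work units. A rough sketch in Python, where the function names and the `min_avg` cutoff are made up for illustration; point values vary by project and have changed over the years, so treat the cutoff as a guess, not a rule.

```python
def points_per_wu(points, work_units):
    """Average points credited per completed work unit."""
    if work_units <= 0:
        raise ValueError("work_units must be positive")
    return points / work_units


def looks_suspicious(points, work_units, min_avg=1.0):
    """Flag a day whose average falls below min_avg.

    min_avg is a hypothetical cutoff picked for this example only.
    """
    return points_per_wu(points, work_units) < min_avg


# Figures quoted in this thread:
# 6 WUs for 276 points averages 46 points per WU.
# 40 WUs for 5 points averages 0.125 points per WU.
```

A low average only says the units were small or ended early. FAHlog.txt is still the place to confirm which one it was.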

 
Everything seems to be running correctly now. The log output is recording everything. I believe a couple of factors, too high an overclock and a recent temperature rise (AZ getting hot again), were probably the issue. Either way, the problem is corrected. Folding again without crapping out the system.

Thanks to everyone for all your suggestions.:D

 
Good to hear you're squared away! And keep quiet about the temps rising, we don't want those Aussie spies knowing it's getting too warm up here... /looks around suspiciously :p

 
p[H]ant0m said:
Good to hear you're squared away! And keep quiet about the temps rising, we don't want those Aussie spies knowing it's getting too warm up here... /looks around suspiciously :p

Warm??!! It just snowed a week ago here in SE Utah. Granted it was about an inch of powder, and melted by noon but it still snowed. :)
 
Logikality said:
Warm??!! It just snowed a week ago here in SE Utah. Granted it was about an inch of powder, and melted by noon but it still snowed. :)

how about the fact that less than two weeks ago we(cleveland) got 6-7inches. about 350,000 people lost power. humph.

finally nice though, hopefully we'll see some leaves by the end of next week.
 
Dark Ember said:
Yes, a couple years back, WU's were only worth about 1-3 points apiece. When Stanford changed the point scheme, the change pretty much threw out all of the previous WU's completed (in terms of points anyway). Especially since you could match your point total with 10 more WU's or less.
the prob is that there are still a few people turning in 1-2 points/WU
 
yes it looks like my A64 wasn't playing nice with some of the amber and gromacs cores. I switched to the console client last night, we will see when I get home if that solves my problems. I even dropped my O/C to nothing and still got errors, so I think I was having problems with the graphical client itself. At the previous O/C, I was prime torture test stable for as long as I'd let the client run, so I really don't think the problem was my O/C, but it may have had something to do with actually using my A64 for playing games without shutting down F@H. I'll know in about an hour and a half.



see what he means?
 
actually that may be it, I looked at all the stat pages, I haven't turned in a WU today, which is actually a good thing since I'm not expecting one to be done before about 2 am tomorrow morning :)
 
Ronbo said:
yes it looks like my A64 wasn't playing nice with some of the amber and gromacs cores. I switched to the console client last night, we will see when I get home if that solves my problems. I even dropped my O/C to nothing and still got errors, so I think I was having problems with the graphical client itself. At the previous O/C, I was prime torture test stable for as long as I'd let the client run, so I really don't think the problem was my O/C, but it may have had something to do with actually using my A64 for playing games without shutting down F@H. I'll know in about an hour and a half.



see what he means?
didn't mean to pick on you. You were just the first non-dead member I found on the list with funky looking stats. Good luck getting things worked out.
 
Worth mentioning: I had a recent problem with stability too, and RAM that had been seated just fine in its slots for about a year or a year and a half ended up being the culprit. No overclock at all, and I tried everything. My last step before ordering a new pair of 512's was taking the sticks out and putting them back in.

It worked.

FAH is happy, Memtest is happy, I'm happy. :D

I have a feeling there's a dozen little things that could drive FAH cores crazy, just like that.

 
NP Neisius, I had noticed the problem last night so after my CoH session I changed over to the console client, so far so good.

 