Trouble with Project 3062?

Anybody having an early_unit_end error with the 3062 work unit? My system ran through about 43% and gave a "warning: long 1-4 interactions" message.
[01:45:05] CoreStatus = 7B (123)
[01:45:05] Client-core communications error: ERROR 0x7b
After that it loaded up a 2653 unit and went back to work.

I did recently boost the OC on my Q6600 from 2.88 to 3.2 GHz (Asus Commando mobo, 9x356). BIOS Vcore is about 1.325V and Everest reads about 1.25V. Temp is about 52C. The memory timings are pretty slack at 5-5-5-15 (the RAM is rated at 4-4-4-12).
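For anyone sanity-checking those figures, the core clock is just FSB times multiplier; a quick sketch using only the numbers from this post (nothing tool-specific):

# Core clock = FSB x multiplier, using the settings quoted above.
fsb_mhz = 356          # the "9x356" setting on the Asus Commando
multiplier = 9
core_mhz = fsb_mhz * multiplier
print(f"{multiplier} x {fsb_mhz} MHz = {core_mhz} MHz (~{core_mhz / 1000:.2f} GHz)")
# 3204 MHz is the ~3.2 GHz mentioned; the old 2.88 GHz setting works out to roughly 9 x 320.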

 
I've had some funky errors on my E4500.

I should have copied them down.

After 5 reboots and 24 hours (I have to sleep and work), I cleaned the USB stick and it got a new WU and is humming along fine.

 
I had some SMP funkiness as well. I just deleted the work and got a new WU because it kept getting stuck. Chugging along again now.

 
Some of the new SMP WUs are unstable but most of them seem to run OK, if a little low on the PPD scale.

 
Some of the new SMP WUs are unstable but most of them seem to run OK, if a little low on the PPD scale.


no kidding :p

My headless Q6600 had one of its two instances hung when I came home from work. I knew it wasn't the hardware :p

 
no kidding :p

My headless Q6600 had one of its two instances hung when I came home from work. I knew it wasn't the hardware :p
If hardware has been running for months without a hitch and then Stanford releases new WUs that start crashing, we know where the problem likely lies. BTW, what is headless?
 
Glad to know I'm not the only one having problems on occasion. I would hate to slow down that CPU any more, as Kendrak is already putting the hurt on me as it is. I see I am not even on his threat list anymore...

 
Can you paste the run, clone, and gen numbers? I can check this on FCF to see if others have the same issue. Some WUs are inherently bad and never complete on any computer (the 2652 units are notorious for that).

 
Glad to know I'm not the only one having problems on occasion. I would hate to slow down that CPU any more, as Kendrak is already putting the hurt on me as it is. I see I am not even on his threat list anymore...
If your system(s) have been running well for a while, chances are that the problem has its origins with the WU. There's no reason to suspect hardware unless your CPU cooling devices have started to malfunction. If everything tests well, then it's the WU.

Some WUs are inherently bad and never complete on any computer (the 2652 units are notorious for that).
The P2652 was very bad. Luckily, Stanford doesn't issue these anymore.

The P3062 WUs seem to be great for PPD, comparable to or even better than the P2653 WUs. I only have one P3062 currently being processed by my machines. If there is indeed a problem, I hope Stanford will get it to run better.
 
Those 3062s have been completing just fine for me. It's a PITA since it drops PPD dramatically, but no EUEs here.
 
Those 3062s have been completing just fine for me. It's a PITA since it drops PPD dramatically, but no EUEs here.
It's really weird. I have one P3062 running on one of my socket-A AMD systems, and the PPD increased slightly vs that of a P2653. On my Intel dual quad, the PPD dropped immensely. Is there a vast variation between P3062 WUs? My 2.2GHz dual Barton machine is getting an average of 39 minutes per frame vs 22 minutes on my 2.5GHz dual quad. With the P2653s, it's 45 minutes vs 12 minutes, respectively. Considering the IPC gulf between the two architectures, that's weird. Very weird. :confused:
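Just to put those numbers side by side, here are the frame-time ratios worked out from the figures above (nothing beyond what's already in this post):

# Minutes per frame quoted above for the two machines.
barton_3062, quad_3062 = 39, 22   # P3062: dual Barton vs dual quad
barton_2653, quad_2653 = 45, 12   # P2653: dual Barton vs dual quad
print(f"P3062: the Barton box is {barton_3062 / quad_3062:.1f}x slower than the dual quad")
print(f"P2653: the Barton box is {barton_2653 / quad_2653:.1f}x slower than the dual quad")
# ~1.8x vs ~3.8x, which is why the PPD swing looks so lopsided between projects.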
 
I have 2 identical E2180 systems running right now: one has a 2653 protein and is at 1076 PPD, and the other, running the 3062, is at 705 PPD. Running XP and the FAH affinity changer on both.
 
I have 2 identical E2180 systems running right now: one has a 2653 protein and is at 1076 PPD, and the other, running the 3062, is at 705 PPD. Running XP and the FAH affinity changer on both.

Yep... some of the SMP WUs have different point values.

 
All SMP work units are benchmarked at 200 PPD per GHz.
If you can run faster than that, it's a bonus, so be happy ............ :)

If your box/CPU has a faster/better subsystem than the benchmark box, then you'll gain in terms of PPD per GHz.
It could be CPU cache, memory speed, etc.
But it all adds up to a bigger PPD gain with some proteins on some hardware, as against other proteins on the same hardware or the same proteins on other hardware.

Luck ............ :D
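As a rough illustration of that rule of thumb (the clock speeds below are just made-up example values, not benchmark claims):

# Baseline PPD if a machine exactly matched the benchmark box, per the 200 PPD-per-GHz figure above.
def baseline_ppd(clock_ghz, ppd_per_ghz=200):
    return clock_ghz * ppd_per_ghz

for ghz in (2.4, 3.0, 3.6):
    print(f"{ghz} GHz -> ~{baseline_ppd(ghz):.0f} PPD baseline")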
 
Crap crap crap... I just picked up 2 3065 projects and they are F'n slow!!! 18 minutes per 1%, and valued at 2144 points each... it's projected to take 1d 4hrs at current rates.... :( That's on my 3.6GHz quad... hopefully the first 3% is not indicative of how the rest of the project will go... oye...

Ah well, it's for a good cause :) Definitely hope they speed up; SSE boost is on :) Whoa, 2.5M steps... hehehe
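For what it's worth, the projection works out roughly like this from the figures above (18 min per 1% and 2144 points; the client's own ETA may differ a bit since a few frames were already done):

# Back-of-the-envelope ETA and PPD from the numbers in this post.
min_per_frame = 18
points = 2144
total_hours = min_per_frame * 100 / 60      # 100 frames -> 30 hours
ppd = points / (total_hours / 24)           # scale the WU's points to a 24-hour day
print(f"ETA ~{total_hours:.0f} h (~{total_hours / 24:.1f} days), roughly {ppd:.0f} PPD")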
 
Crap crap crap... I just picked up 2 3065 projects and they are F'n slow!!! 18 minutes per 1%, and valued at 2144 points each... it's projected to take 1d 4hrs at current rates.... :( That's on my 3.6GHz quad... hopefully the first 3% is not indicative of how the rest of the project will go... oye...
It's my experience that some of the new projects process faster as the WU progresses, while others don't change much. The P2653s by comparison seem to be very consistent from the first frame on. YMMV.

Ah well, it's for a good cause :) Definitely hope they speed up; SSE boost is on :) Whoa, 2.5M steps... hehehe
Speaking of which, anyone try Intel Penryn-generation architecture with these new WUs?
 
I just had an EUE on one of the P3062 WUs my systems were working on. Fortunately, the WU was only 7% complete. This system has been stable with the P2653 WUs. So, it would appear that there is some problem with stability. I hope it's not another P2652 situation.

 
I just had an EUE on one of the P3062 WUs my systems were working on. Fortunately, the WU was only 7% complete. This system has been stable with the P2653 WUs. So, it would appear that there is some problem with stability. I hope it's not another P2652 situation.


Well, that sucks. I hope you all are reporting the issue to Stanford so they can fix it.
 
I just had an EUE on one of the P3062 WUs my systems were working on. Fortunately, the WU was only 7% complete. This system has been stable with the P2653 WUs. So, it would appear that there is some problem with stability. I hope it's not another P2652 situation.


Is the machine Prime stable? I always check that first before I start folding. 12-24 hours of Prime can save me a lot of time later on, as I don't have to worry about my overclock and I won't get EUEs unless it is a bad protein.

I realize that Prime is not the end-all, be-all, but it usually catches most bad overclocks.

I'm not saying your system isn't stable, as the 2653s and 2605s are tougher on machines from what I've seen. If you watch your temps, more than likely you will see higher temps when processing those projects. On my old E6400 I would notice a minimum of 3C higher temps when running those projects versus any of the other ones. I would think that if your rig can handle them with no problem, then it's likely stable. I would still try some stress testing if you have not already done so, though.

 
Is the machine Prime stable? I always check that first before I start folding. 12-24 hours of Prime can save me a lot of time later on, as I don't have to worry about my overclock and I won't get EUEs unless it is a bad protein.

I realize that Prime is not the end-all, be-all, but it usually catches most bad overclocks.

I'm not saying your system isn't stable, as the 2653s and 2605s are tougher on machines from what I've seen. If you watch your temps, more than likely you will see higher temps when processing those projects. On my old E6400 I would notice a minimum of 3C higher temps when running those projects versus any of the other ones. I would think that if your rig can handle them with no problem, then it's likely stable. I would still try some stress testing if you have not already done so, though.
Thanks for the advice, SmokeRngs. I might test some of my systems for stability in the near future. This system is a dual quad and it's running several SMP clients simultaneously. Also, it has been running without a glitch for months since I built it. The EUE might be the very first occurrence on this particular machine. It has proved quite stable at its current OC frequency, but all I had been running on it until recently were P2653 WUs.
 
Also, if it is air-cooled, check the temps... and if you have a log of the temps, even better. I log my temps via Core Temp. That way, if it errors, I can see whether it was running hot or what caused it.
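If anyone wants to automate that cross-check, here's a minimal sketch; the temps.csv name and its "HH:MM:SS,temperature" layout are assumptions for illustration, and Core Temp's real log format may differ:

# Find the temperature reading closest to an EUE timestamp from the log.
import csv
from datetime import datetime

def secs(t):
    # Seconds since midnight, good enough for a same-day comparison.
    return t.hour * 3600 + t.minute * 60 + t.second

error_time = datetime.strptime("01:45:05", "%H:%M:%S").time()   # timestamp from the FAHlog entry
closest = None
with open("temps.csv", newline="") as f:                        # hypothetical "HH:MM:SS,temp_C" log
    for row in csv.reader(f):
        t = datetime.strptime(row[0], "%H:%M:%S").time()
        diff = abs(secs(t) - secs(error_time))
        if closest is None or diff < closest[0]:
            closest = (diff, row[0], float(row[1]))
if closest:
    print(f"Nearest reading: {closest[2]:.1f} C at {closest[1]} (EUE at {error_time})")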

 
I am having nothing but grief with p2653 (4-194-77).
After about 5 EUEs, I set everything back to just about stock speeds; still EUE.
Update: I give up. A couple more tries, same result. Flushing the WU.
 
Well, I just got handed 3 of these, so I will tell you by the end of today how they run. So far I'm into them about 38% and it's pretty smooth; slower than a 2653, but no issues so far.

 
If you get a work unit that EUEs, then look in your work folder for a results file.
If there is one in there, then you can use Qfix to try to repair the queue.dat file so you can send it in.
Not only will you get some points for the work, but you'll also tell Stanford that there may be a problem with that work unit.
Full instructions for using Qfix are Here.
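A quick sketch of that first step, just to show the idea; the work folder path and the filename pattern below are assumptions, so follow the linked Qfix instructions for the real procedure:

# Look for anything results-like in the client's work folder before running Qfix.
import glob, os

work_dir = "work"                                                   # hypothetical path to the work folder
matches = sorted(glob.glob(os.path.join(work_dir, "*results*")))    # assumed pattern, not an official name
if matches:
    print("Possible result files to salvage with Qfix:")
    for path in matches:
        print("  ", path)
else:
    print("No candidate result files found; nothing for Qfix to repair.")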

I also report them in This section of the support forums.

Luck .............. :D
 
A few days ago I started this thread talking about the following WU crapping out on me: 3062 (Run 3, Clone 15, Gen 55). It ran 43% and gave a "warning: long 1-4 interactions" message.
[01:45:05] CoreStatus = 7B (123)
[01:45:05] Client-core communications error: ERROR 0x7b
After that it loaded up a 2653 unit and went back to work.

After it got done with that 2653, it reloaded the exact same 3062 (Run 3, Clone 15, Gen 55) and again stalled at 43% with the same error message. I thought if it was going to reload the same WU, it would have done it right away and not alternated with something else in between. Since the deadline has now passed for that WU, am I now in the clear from getting it again?

 
Usually, you get 3 retries with the same WU before getting the next.

 
After it got done with that 2653, it reloaded the exact same 3062 (Run 3, Clone 15, Gen 55) and again stalled at 43% with the same error message. I thought if it was going to reload the same WU, it would have done it right away and not alternated with something else in between. Since the deadline has now passed for that WU, am I now in the clear from getting it again?
Always check the deadline date on the WU when your client downloads a new one after it crashes or produces an error. One of the monitoring tools will help in this regard. It has happened to me a few times on slower systems where a client would download the same WU after an error, but there was simply not enough time left to complete it on those particular systems.
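A back-of-the-envelope version of that check; the frame time and deadline below are made-up example values, not from any WU in this thread:

# Will the remaining frames finish before the deadline?
from datetime import datetime, timedelta

frames_left = 57                     # e.g. a WU that stalled at 43%
min_per_frame = 30                   # hypothetical frame time on a slow box
deadline = datetime(2008, 3, 1)      # hypothetical preferred deadline

eta = datetime.now() + timedelta(minutes=frames_left * min_per_frame)
print("Projected finish:", eta)
print("Makes the deadline" if eta <= deadline else "Will miss the deadline; consider dumping the WU")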
 
...................
After it got done with that 2653, it reloaded the exact same 3062 (Run 3, Clone 15, Gen 55) and again stalled at 43% with the same error message. I thought if it was going to reload the same WU, it would have done it right away and not alternated with something else in between. Since the deadline has now passed for that WU, am I now in the clear from getting it again?

'Fraid not.
See This thread or This one about some of the multiple run-ins I've had with the same bad work units.
My record is 6x over 2 months with the same result each time.
So you will have a chance of downloading the same bad work unit until it's taken off the servers.
Can you get Qfix.exe to send a partial result in, to tell Stanford that there may be a bad work unit in the wild?
I've just reported my first EUE in 2 months on the support forums, p3051 (3,26,63), again to give Stanford a heads-up that something might be wrong with it.

Luck .............. :D
 
I completed all three 3062s without issue, and I picked up 1 more plus a 3064 on my E6600... 1414 PPD on project 3064 on a machine that normally cranks out 2200? /shakes fist at the random WUs! LOL

 