Trouble with Project 3062?

Anybody having an early_unit_end error with the 3062 work unit? My system ran through about 43% and gave a "warning: long 1-4 interactions" message.
[01:45:05] CoreStatus = 7B (123)
[01:45:05] Client-core communications error: ERROR 0x7b
After that it loaded up a 2653 unit and went back to work.

I did recently boost the OC on my Q6600 from 2.88 to 3.2 GHz (Asus Commando mobo, 9x356). BIOS Vcore is about 1.325V and Everest reads about 1.25V. Temp is about 52C. The memory timings are pretty slack at 5-5-5-15 (the RAM is rated at 4-4-4-12).
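For anyone sanity-checking those figures, the core clock is just FSB times multiplier; a quick sketch using only the numbers from this post (nothing tool-specific):

# Core clock = FSB x multiplier, using the settings quoted above.
fsb_mhz = 356          # the "9x356" setting on the Asus Commando
multiplier = 9
core_mhz = fsb_mhz * multiplier
print(f"{multiplier} x {fsb_mhz} MHz = {core_mhz} MHz (~{core_mhz / 1000:.2f} GHz)")
# 3204 MHz is the ~3.2 GHz mentioned; the old 2.88 GHz setting works out to roughly 9 x 320.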

 
I've had some funky errors on my E4500.

I should have copied them down.

After 5 reboots and 24 hours (I have to sleep and work), I cleaned the USB stick and it got a new WU and is humming along fine.

 
I had some SMP funkiness as well. I just deleted the work and got a new WU because it kept getting stuck. Chugging along again now.

 
Some of the new SMP WUs are unstable but most of them seem to run OK, if a little low on the PPD scale.

 
Some of the new SMP WUs are unstable but most of them seem to run OK, if a little low on the PPD scale.


no kidding :p

My headless Q6600 had one of its two instances hung when I came home from work. I knew it wasn't the hardware :p

 
no kidding :p

My headless Q6600 had one of its two instances hung when I came home from work. I knew it wasn't the hardware :p
If hardware has been running for months without a hitch and then Stanford releases new WUs that start crashing, we know where the problem likely lies. BTW, what is headless?
 
Glad to know I'm not the only one having problems on occasion. I would hate to slow down that CPU any more, as Kendrak is already putting the hurt on me as it is. I see I am not even on his threat list anymore...

 
Can you paste the run, clone, and gen numbers? I can check this on FCF to see if others have the same issue. Some WUs are inherently bad and never complete on any computer (the 2652 units are notorious for that).

 
Glad to know I'm not the only one having problems on occasion. I would hate to slow down that CPU any more, as Kendrak is already putting the hurt on me as it is. I see I am not even on his threat list anymore...
If your system(s) have been running well for a while, chances are that the problem has its origins with the WU. There's no reason to suspect hardware unless your CPU cooling devices have started to malfunction. If everything tests well, then it's the WU.

Some WUs are inherently bad and never complete on any computer (the 2652 units are notorious for that).
The P2652 was very bad. Luckily, Stanford doesn't issue these anymore.

The P3062 WUs seem to be great for PPD, comparable to or even better than the P2653 WUs. I only have one P3062 currently being processed by my machines. If there is indeed a problem, I hope Stanford will get it to run better.
 
Those 3062s have been completing just fine for me. It's a PITA since it drops PPD dramatically, but no EUEs here.
 
Those 3062s have been completing just fine for me. It's a PITA since it drops PPD dramatically, but no EUEs here.
It's really weird. I have one P3062 running on one of my socket-A AMD systems, and the PPD increased slightly vs that of a P2653. On my Intel dual quad, the PPD dropped immensely. Is there a vast variation between P3062 WUs? My 2.2GHz dual Barton machine is getting an average of 39 minutes per frame vs 22 minutes on my 2.5GHz dual quad. With the P2653s, it's 45 minutes vs 12 minutes, respectively. Considering the IPC gulf between the two architectures, that's weird. Very weird. :confused:
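Just to put those numbers side by side, here are the frame-time ratios worked out from the figures above (nothing beyond what's already in this post):

# Minutes per frame quoted above for the two machines.
barton_3062, quad_3062 = 39, 22   # P3062: dual Barton vs dual quad
barton_2653, quad_2653 = 45, 12   # P2653: dual Barton vs dual quad
print(f"P3062: the Barton box is {barton_3062 / quad_3062:.1f}x slower than the dual quad")
print(f"P2653: the Barton box is {barton_2653 / quad_2653:.1f}x slower than the dual quad")
# ~1.8x vs ~3.8x, which is why the PPD swing looks so lopsided between projects.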
 
I have 2 identical E2180 systems running right now: one has a 2653 protein and is at 1076 PPD, and the other, running the 3062, is at 705 PPD. Running XP and the FAH affinity changer on both.
 
I have 2 identical E2180 systems running right now: one has a 2653 protein and is at 1076 PPD, and the other, running the 3062, is at 705 PPD. Running XP and the FAH affinity changer on both.

Yep... some of the SMP WUs have different point values.

 
All SMP work units are benchmarked at 200 PPD per GHz.
If you can run faster than that, it's a bonus, so be happy ............ :)

If your box/CPU has a faster/better subsystem than the benchmark box, then you'll gain in terms of PPD per GHz.
It could be CPU cache, memory speed, etc.
But it all adds up to a bigger PPD gain with some proteins on some hardware, as against other proteins on the same hardware or the same proteins on other hardware.

Luck ............ :D
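As a rough illustration of that rule of thumb (the clock speeds below are just made-up example values, not benchmark claims):

# Baseline PPD if a machine exactly matched the benchmark box, per the 200 PPD-per-GHz figure above.
def baseline_ppd(clock_ghz, ppd_per_ghz=200):
    return clock_ghz * ppd_per_ghz

for ghz in (2.4, 3.0, 3.6):
    print(f"{ghz} GHz -> ~{baseline_ppd(ghz):.0f} PPD baseline")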
 
Crap crap crap... I just picked up 2 3065 projects and they are F'n slow!!! 18 minutes per 1%, and valued at 2144 points each... it's projected to take 1d 4hrs at current rates.... :( That's on my 3.6GHz quad... hopefully the first 3% is not indicative of how the rest of the project will go... oye...

Ah well, it's for a good cause :) Definitely hope they speed up; SSE boost is on :) Whoa, 2.5M steps... hehehe
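For what it's worth, the projection works out roughly like this from the figures above (18 min per 1% and 2144 points; the client's own ETA may differ a bit since a few frames were already done):

# Back-of-the-envelope ETA and PPD from the numbers in this post.
min_per_frame = 18
points = 2144
total_hours = min_per_frame * 100 / 60      # 100 frames -> 30 hours
ppd = points / (total_hours / 24)           # scale the WU's points to a 24-hour day
print(f"ETA ~{total_hours:.0f} h (~{total_hours / 24:.1f} days), roughly {ppd:.0f} PPD")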
 
Crap crap crap... I just picked up 2 3065 projects and they are F'n slow!!! 18 minutes per 1%, and valued at 2144 points each... it's projected to take 1d 4hrs at current rates.... :( That's on my 3.6GHz quad... hopefully the first 3% is not indicative of how the rest of the project will go... oye...
It's my experience that some of the new projects process faster as the WU progresses, while others don't change much. The P2653s by comparison seem to be very consistent from the first frame on. YMMV.

Ah well, it's for a good cause :) Definitely hope they speed up; SSE boost is on :) Whoa, 2.5M steps... hehehe
Speaking of which, anyone try Intel Penryn-generation architecture with these new WUs?
 
I just had an EUE on one of the P3062 WUs my systems were working on. Fortunately, the WU was only 7% complete. This system has been stable with the P2653 WUs. So, it would appear that there is some problem with stability. I hope it's not another P2652 situation.

 
I just had an EUE on one of the P3062 WUs my systems were working on. Fortunately, the WU was only 7% complete. This system has been stable with the P2653 WUs. So, it would appear that there is some problem with stability. I hope it's not another P2652 situation.


Well, that sucks. I hope you all are reporting the issue to Stanford so they can fix it.
 
I just had an EUE on one of the P3062 WUs my systems were working on. Fortunately, the WU was only 7% complete. This system has been stable with the P2653 WUs. So, it would appear that there is some problem with stability. I hope it's not another P2652 situation.


Is the machine Prime stable? I always check that first before I start folding. 12-24 hours of Prime can save me a lot of time later on, as I don't have to worry about my overclock and I won't get EUEs unless it is a bad protein.

I realize that Prime is not the end-all, be-all, but it usually catches most bad overclocks.

I'm not saying your system isn't stable, as the 2653s and 2605s are tougher on machines from what I've seen. If you watch your temps, more than likely you will see higher temps when processing those projects. On my old E6400 I would notice a minimum of 3C higher temps when running those projects versus any of the other ones. I would think that if your rig can handle them with no problem, then it's likely stable. I would still try some stress testing if you have not already done so, though.

 
Is the machine Prime stable? I always check that first before I start folding. 12-24 hours of Prime can save me a lot of time later on, as I don't have to worry about my overclock and I won't get EUEs unless it is a bad protein.

I realize that Prime is not the end-all, be-all, but it usually catches most bad overclocks.

I'm not saying your system isn't stable, as the 2653s and 2605s are tougher on machines from what I've seen. If you watch your temps, more than likely you will see higher temps when processing those projects. On my old E6400 I would notice a minimum of 3C higher temps when running those projects versus any of the other ones. I would think that if your rig can handle them with no problem, then it's likely stable. I would still try some stress testing if you have not already done so, though.
Thanks for the advice, SmokeRngs. I might test some of my systems for stability in the near future. This system is a dual quad and it's running several SMP clients simultaneously. Also, it has been running without a glitch for months since I built it. The EUE might be the very first occurrence on this particular machine. It has proved quite stable at its current OC frequency, but all I had been running on it until recently were P2653 WUs.
 
Also, if it is air-cooled, check the temps... and if you have a log of the temps, even better. I log my temps via Core Temp. That way, if it errors, I can see whether it was running hot or what caused it.
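If anyone wants to automate that cross-check, here's a minimal sketch; the temps.csv name and its "HH:MM:SS,temperature" layout are assumptions for illustration, and Core Temp's real log format may differ:

# Find the temperature reading closest to an EUE timestamp from the log.
import csv
from datetime import datetime

def secs(t):
    # Seconds since midnight, good enough for a same-day comparison.
    return t.hour * 3600 + t.minute * 60 + t.second

error_time = datetime.strptime("01:45:05", "%H:%M:%S").time()   # timestamp from the FAHlog entry
closest = None
with open("temps.csv", newline="") as f:                        # hypothetical "HH:MM:SS,temp_C" log
    for row in csv.reader(f):
        t = datetime.strptime(row[0], "%H:%M:%S").time()
        diff = abs(secs(t) - secs(error_time))
        if closest is None or diff < closest[0]:
            closest = (diff, row[0], float(row[1]))
if closest:
    print(f"Nearest reading: {closest[2]:.1f} C at {closest[1]} (EUE at {error_time})")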

 
I am having nothing but grief with p2653 (4-194-77).
After about 5 EUEs, I set everything back to just about stock speeds; still EUE.
Update: I give up. A couple more tries, same result. Flushing the WU.
 
Well, I just got handed 3 of these, so I will tell you by the end of today how they run. So far I'm into them about 38% and it's pretty smooth; slower than a 2653, but no issues so far.

 
If you get a work unit that EUEs, then look in your work folder for a results file.
If there is one in there, then you can use Qfix to try to repair the queue.dat file so you can send it in.
Not only will you get some points for the work, but you'll also tell Stanford that there may be a problem with that work unit.
Full instructions for using Qfix are Here.
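A quick sketch of that first step, just to show the idea; the work folder path and the filename pattern below are assumptions, so follow the linked Qfix instructions for the real procedure:

# Look for anything results-like in the client's work folder before running Qfix.
import glob, os

work_dir = "work"                                                   # hypothetical path to the work folder
matches = sorted(glob.glob(os.path.join(work_dir, "*results*")))    # assumed pattern, not an official name
if matches:
    print("Possible result files to salvage with Qfix:")
    for path in matches:
        print("  ", path)
else:
    print("No candidate result files found; nothing for Qfix to repair.")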

I also report them in This section of the support forums.

Luck .............. :D
 
A few days ago I started this thread talking about the following WU crapping out on me: 3062 (Run 3, Clone 15, Gen 55). It ran 43% and gave a "warning: long 1-4 interactions" message.
[01:45:05] CoreStatus = 7B (123)
[01:45:05] Client-core communications error: ERROR 0x7b
After that it loaded up a 2653 unit and went back to work.

After it got done with that 2653, it reloaded the exact same 3062 (Run 3, Clone 15, Gen 55) and again stalled at 43% with the same error message. I thought if it was going to reload the same WU, it would have done it right away and not alternated with something else in between. Since the deadline has now passed for that WU, am I now in the clear from getting it again?

 
Usually, you get 3 retries with the same WU before getting the next.

 
After it got done with that 2653, it reloaded the exact same 3062 (Run 3, Clone 15, Gen 55) and again stalled at 43% with the same error message. I thought if it was going to reload the same WU, it would have done it right away and not alternated with something else in between. Since the deadline has now passed for that WU, am I now in the clear from getting it again?
Always check the deadline date on the WU when your client downloads a new one after it crashes or produces an error. One of the monitoring tools will help in this regard. It has happened to me a few times on slower systems where a client would download the same WU after an error, but there was simply not enough time left to complete it on those particular systems.
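A back-of-the-envelope version of that check; the frame time and deadline below are made-up example values, not from any WU in this thread:

# Will the remaining frames finish before the deadline?
from datetime import datetime, timedelta

frames_left = 57                     # e.g. a WU that stalled at 43%
min_per_frame = 30                   # hypothetical frame time on a slow box
deadline = datetime(2008, 3, 1)      # hypothetical preferred deadline

eta = datetime.now() + timedelta(minutes=frames_left * min_per_frame)
print("Projected finish:", eta)
print("Makes the deadline" if eta <= deadline else "Will miss the deadline; consider dumping the WU")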
 
...................
After it got done with that 2653, it reloaded the exact same 3062 (Run 3, Clone 15, Gen 55) and again stalled at 43% with the same error message. I thought if it was going to reload the same WU, it would have done it right away and not alternated with something else in between. Since the deadline has now passed for that WU, am I now in the clear from getting it again?

'Fraid not.
See This thread or This one about some of the multiple run-ins I've had with the same bad work units.
My record is 6x over 2 months with the same result each time.
So you will have a chance of downloading the same bad work unit until it's taken off the servers.
Can you get Qfix.exe to send a partial result in, to tell Stanford that there may be a bad work unit in the wild?
I've just reported my first EUE in 2 months on the support forums, p3051 (3,26,63), again to give Stanford a heads-up that something might be wrong with it.

Luck .............. :D
 
I completed all three 3062s without issue, and I picked up 1 more plus a 3064 on my E6600... 1414 PPD on project 3064 on a machine that normally cranks out 2200? /shakes fist at the random WUs! LOL

 