3-5 day call to arms for Rosetta@home (CPUs)

Gilthanis

We are participating in the BOINC Pentathlon. The final project has been announced, and it is the shortest category of the event: Rosetta@home will last only 3 days. You can "bunker" work now if you prefer, but only work that validates during those 3 days will count. So it is fine to finish work before the challenge starts, but you won't want it to validate until those dates begin.

Dates - 6/15-6/18
Team page - [H]ard|OCP

Pentathlon thread:
7th Annual BOINC Pentathlon (2016)

Work units typically want ~512MB RAM
 
I ran two Rosetta units and the quorum is only 1, meaning no validation is required, unlike the other four challenges. Is this true for all WUs during the Pentathlon, or does it depend on the WU?
If no quorum is needed, let's load up the CPUs, and may the best team with the most hardware win!
 
As of this moment, Rosetta has 1.3M WUs "In progress". That's a hell of a lot of WUs in progress, more than the other three remaining challenges combined. So there is a lot of bunkering going on now, especially since no validation is required, if I'm correct. This is my first time running Rosetta, so I could be wrong.

Our anchor person Linden is "happily" crunching away at 65K points per day right now. Wondering if Linden would like to bunker....:sneaky:

If Universe is not able to validate my WUs at all later tonight, I'm planning to switch to Rosetta. I'm aware that what we do here is for science in general, but it's fun doing a bit of competition too.:)


Rosetta server status:
upload_2016-6-12_9-5-49.png





Primegrid server status:

upload_2016-6-12_9-22-40.png



CSG server status:
upload_2016-6-12_9-23-26.png



Universe server status:
upload_2016-6-12_9-25-6.png
 
Just started up Rosetta but a total noob at this. Joined the [H] team and got the client running and it seems to have picked up work.
9OCRkz0.jpg

Does CPU clock speed make a significant difference in Rosetta? Right now I'm running a 4930K at 4.5 GHz but can bump it to 4.875 GHz if I lower the RAM frequency to 1600 MHz.
 
Glad that you can join the Pentathlon. I'm still learning as I go along. Here is what I do, though there are other methods. I use the "advanced view" in the BOINC Manager. See below for how to get to the "advanced view".
upload_2016-6-12_12-8-10.png

In the "advanced view", if you want to bunker, click on "Suspend network activity". Remember to turn network activity back on to upload the completed tasks when the challenge starts on Jun 15 at 00:00:00 UTC.
But before you do this, you should (a) have only one project (i.e. Rosetta) running on that PC, otherwise tasks from other projects will not be able to upload/download, and (b) have downloaded enough WUs to keep you busy until the challenge starts.
upload_2016-6-12_12-9-50.png
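
Since the start time is given in UTC, it can help to work out what that means on your own clock before re-enabling the network. A minimal Python sketch, assuming the year is 2016 (taken from the Pentathlon thread title); the helper names are made up for illustration:

```python
from datetime import datetime, timezone

# Challenge start from the post above: Jun 15 00:00:00 UTC.
# The year 2016 is an assumption taken from the Pentathlon thread title.
CHALLENGE_START_UTC = datetime(2016, 6, 15, 0, 0, 0, tzinfo=timezone.utc)


def hours_until_start(now_utc):
    """Hours from now_utc until the challenge starts (negative once it has begun)."""
    return (CHALLENGE_START_UTC - now_utc).total_seconds() / 3600.0


def start_in_local_time():
    """Challenge start converted to this machine's local time zone."""
    return CHALLENGE_START_UTC.astimezone()


# Example: at noon UTC on Jun 14 there are 12 hours left.
example_now = datetime(2016, 6, 14, 12, 0, 0, tzinfo=timezone.utc)
print(hours_until_start(example_now))  # 12.0
print(start_in_local_time().isoformat())
```

The second print shows the moment at which it should be safe to resume network activity, expressed in whatever time zone the PC is set to.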
 

If you are bunkering, you will also want to make sure that you have enough work to last you for 2 days before disabling your networking. And glad to have you on board with us. Clock speed will matter, as the more work units you complete, the more points you will have. RAM speed is less important than CPU speed if you have to make a trade-off. Just make sure that the system is stable running Rosetta work at that clock speed before fully trusting it.
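
A rough way to size a bunker is threads times hours to cover, divided by hours per work unit. A minimal sketch of that arithmetic; the 12 threads and 4 hours per WU below are made-up illustration values, not measurements from this thread:

```python
import math


def wus_needed(threads, hours_to_cover, hours_per_wu):
    """Work units needed to keep every thread busy for hours_to_cover.

    Assumes one WU per thread at a time and a constant run time per WU,
    which is a simplification.
    """
    return math.ceil(threads * hours_to_cover / hours_per_wu)


# Hypothetical 12-thread machine, 2 days (48 h) to cover, 4 h per WU.
print(wus_needed(12, 48, 4))  # 144
```

If the real run time per WU turns out shorter than assumed, the cache drains faster, so it is worth timing a few finished WUs first.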
 
Thanks. This was pretty helpful. I set it for 5 days worth of work which came out to about 150 WU. Hopefully that'll be enough.

Went up to 4.7. Thanks for the advice.
 
Oh my. Almost 2.2M WUs are in progress. That's a lot of bunkering since yesterday. It appears that those who bunker the most may win (note: no WU validation is required).

Welcome to the bunkering rather than the sprinting competition.

Hoping to see more [H]ard|OCP members joining in for the bunker, I mean sprint, competition.

upload_2016-6-13_12-50-43.png
 
I should have set my Rosetta preference to longer CPU hours. Now two of my PCs have just finished crunching their WUs. There is a limit of 100 WUs per CPU (not cores). I underestimated the timing. Those with VMs can bunker a lot, I think.
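
To see why a 100 WU cap can run dry early, divide the total work by the number of threads. A rough sketch, assuming the cap applies per host and using illustrative numbers (8 threads, 2 hours per WU) rather than anything confirmed by the project:

```python
def hours_of_work(wu_cap, hours_per_wu, threads):
    """Wall-clock hours a host stays busy on a capped cache, one WU per thread.

    Assumption: the cap is per host and every WU takes the same time.
    """
    return wu_cap * hours_per_wu / threads


# Hypothetical 8-thread host, 100 WU cap, 2 hours per WU.
print(hours_of_work(100, 2.0, 8))  # 25.0
```

With those numbers the cache lasts barely more than a day, well short of the time remaining before the challenge starts.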
 
I was not aware of the limitation, so I will add that to our guide documentation. When you said CPU, I am assuming you meant per host? And yes, in that case, VMs will be very handy.
 
I'm going to switch over this evening. The number of WUs pending validation is climbing quickly.
 
Just moved over ~12 systems to this. As expected, I "lost" Thing01, but still have Thing02... now if I can only get it to behave with this project. Started spitting out invalid WUs almost right away, so have reset it. Weird thing is, Thing02 is reporting ~23hrs per WU, whereas all the other systems are reporting 7-8hrs/per. Seems to be going down though. After seven minutes they're down to 21h48m. Do think the benchmark portion failed since things finished too fast; perhaps that's part of the inaccurate projection. :)
 
I think I know what you meant. When I started downloading, it said the estimated remaining time was 9+ hours, but after crunching through, it only took about 2 hrs. I should have waited for a few WUs to be validated first just to confirm the timing. Glad that you can still run Thing02.
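
When the client overestimates run time like this, a cache requested in days holds proportionally less real work. A minimal sketch of that arithmetic, using the 9 hour estimate and 2 hour actual from above; the 5-day request and the function name are only for illustration, and it assumes the client fills the cache purely from its own estimate:

```python
def real_cache_days(requested_days, estimated_hours_per_wu, actual_hours_per_wu):
    """Days of real work in a cache sized from an inflated run-time estimate."""
    return requested_days * actual_hours_per_wu / estimated_hours_per_wu


# 5 days requested, client estimates 9 h per WU, WUs actually take 2 h.
print(round(real_cache_days(5, 9.0, 2.0), 2))  # 1.11
```

In practice the client adjusts its estimate as WUs complete, so this is a worst-case view of a freshly attached host.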

Now 2.8M WUs are in progress! This is one big crazy bunker competition.

upload_2016-6-13_19-45-7.png
 
Hmm, 1M WUs in progress just disappeared in about 1.5 hours. I wonder what's happening?
upload_2016-6-13_21-16-3.png
 
I think... possibly... that some Android WU's may have been sent out to Windows/Linux machines. I saw something that seemed odd; but could be wrong.

Will say I'm getting quite frustrated trying to build up a bunker of these WUs. Got a couple systems that will HOPEFULLY stay busy for the next ~15 hours... after extensive baby sitting to manually request WUs every four minutes. Thing02 was being a ****ing bastard, but after resetting and then completely nuking BOINC from the system, it seems to be OK. (ETA for new WUs is only ~250% of reality @ 6hrs vs 2+ days before) One quad socket isn't getting new WUs at all, despite being 75% idle; am really hoping I don't have to do the nuke process on it as well.
 
6/13/2016 9:51:55 PM | rosetta@home | Rosetta Mini for Android is not available for your type of computer.

Just noticed one system has 535 tasks, so it's more than ready! Yet other systems are struggling to get enough to keep all cores busy. How the servers running the project decide how many WUs to send out, and to which computers, makes little to no sense.
 
Please go to your Rosetta account and open the "Computers on this account" view.

upload_2016-6-13_21-58-52.png


Then click on Thing02 in the Computer ID column. In my 2600K, there is a line stating "maximum daily WU quota per CPU" = 100/day. I wonder if this might be it considering that you have something like 128 threads? Just guessing.
upload_2016-6-13_21-59-52.png
 
Full stats for you. Am thinking about turning HT back on and setting it to 128 cores and seeing what happens.


Thing02-64.JPG
 
Oh, and the server has requested WUs more than six times. Other odd thing: some systems are getting 25-50 WUs at a go, this one is getting 5-10. Strange.

*edit* Had to engage in a little trickery, yet the results paid off. Obtained ~175 more WUs in short order. Much as I wish this box had at least 400WUs in the hopper right now, I'll take it.
 
Yes, turn HT on. You will see higher output at Rosetta with it on. Not sure what is happening with your rigs. If you run into more issues, you may want to copy and paste some event logs.
 
I noticed that one of my rigs has the 100 WU limit and the other does not.
Seems like you are ready to do B52-type of carpet bombing in less than 8hrs time (y)(y)(y)
 
I'm not sure if this "bunkering" is a good idea, but I'm all aboard doing some crunching for science! I've participated in Rosetta@Home for several years now and glad to help.

Public service announcement: Please make sure your rig is stable at 100% CPU (or GPU) before joining this effort. Corrupt data or fried CPUs are of no value to the project. :)
 
"bunkering" isn't something the team supports or condones officially. Most projects (especially the large ones) roll with the punches ok. And for the Pentathlon's it is pretty much a staple of the challenge. I myself am not a fan, but the projects don't typically say one way or another on their concerns with it.
 
This is the first, perhaps the second, time I've tried to "bunker". Truthfully it's been a pain in my ass. ;) Several systems went AWOL while I was trying to deal with them remotely, but will see if I can whip them into shape once I have a chance. Normally I just throw a crap load of sockets at my project of choice and simply let things go as they will. However since this is the last, and shortest, of all the events... I'm willing to try and play along by bunkering... just this once.
 
Bunkering is an art that requires knowledge of how the project behaves. It also helps to have run work before bunkering so that your client is smart enough to know how to handle the work units and how long to estimate for that work. Some projects let you download tons of work, to where you could get 10 days of cached work. Others only allow a few work units per core to prevent a lot of work being downloaded and then abandoned. If a project has multiple applications, that can throw your client off as well, as the client learns to estimate based on project rather than each work unit (unless something has changed). So, if a project has small work units and then you download some monster work units, your client may overcommit and think it can download a ton and finish on time. Then it panics and starts pausing other work in hopes of completing on time even when all hope is gone. This (by design) will even cause the client to suspend GPU work in hopes of squeezing every bit of CPU out for CPU work units to complete.

Planning the bunkering is also an art. You have to make sure that you have the right time, as most challenges are based on UTC. Then you don't want to drop the bunker right off the bat. You want to kinda stagger it if possible so that others don't crash the server while you are trying to upload yours. Some people will edit their cc_config files to limit their transfers to just 1 at a time (and so on) to limit the possibility of losing their bunker. Larger machines would certainly take a lot longer to offload, but that is part of the game. Small projects are more likely to puke during bunkering. Large projects typically roll pretty well with it.

The bad part is when people use bunkering in a shady way. One example is preventing others from validating during the challenge. Some members will load up VMs with full caches at the beginning of the challenge and then shut the VMs down so that those that did bunker have a harder time validating. They may continue to do that if they are ahead to prevent others from validating new work. Some will even do this on a project where it is hard to get work just so that others can't complete it. Meanwhile their host keeps chugging away on what work it can. Very shady biz, but all part of the strategies.
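
For the transfer limiting mentioned above, the relevant settings are, as far as I know, `max_file_xfers` and `max_file_xfers_per_project` in the `<options>` section of cc_config.xml; double-check the names against the BOINC client configuration docs before relying on them. A small Python sketch that builds the file contents as a string (the function name is hypothetical, and nothing here touches a real BOINC install):

```python
import xml.etree.ElementTree as ET


def build_cc_config(max_file_xfers=1, max_file_xfers_per_project=1):
    """Return cc_config.xml contents that limit simultaneous file transfers.

    Option names are given from memory of the BOINC client configuration
    docs; verify them before use.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "max_file_xfers").text = str(max_file_xfers)
    ET.SubElement(options, "max_file_xfers_per_project").text = str(
        max_file_xfers_per_project
    )
    return ET.tostring(root, encoding="unicode")


# One transfer at a time, overall and per project.
print(build_cc_config())
```

The output would have to be saved as cc_config.xml in the BOINC data directory and the client told to re-read its config files before it takes effect.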
 
We started this portion of the Pentathlon in 10th place. We really need more people to jump in. Rosetta does not need wingmen, so if any work is turned in during the next <3 days, it should count.
 
However many of my WUs are slowly being uploaded with a limit of 100KB. 3,000 maybe? Am still trying to look into a few systems that went AWOL now that it's the end of the day. One of the quads did nothing all night and another dual system had some odd issue... now it's only getting a few WUs.
 
BTW, I've seen BOINC report that communication has been deferred for days. Might be worth checking your boxes for.
 
So another one of my quad sockets was stuck at bogus RAID controller BIOS error all this time. Been fixed, but no bunker to speak of from it. :( Another system won't let me login; can only surmise I made a stupid typo last night while I was tired. Still tracking down problem system #3 while juggling field issues. :p (EDIT - Found system #3... uploaded another 235 WUs in addition to the ones below)

What bunker I was able to build up last night with ~10 systems was good for around 140,000 points. Not sure if that's 100% accurate or how that stacks up, but there you have it. :)

Edit #2 - finally got my locked out little piggy back online. What a serious PITFA! System is set to UEFI, which doesn't allow my favorite password nuker to boot. Reset system, set to Legacy, boot nuker, bitches about Windows not being cleanly shut down, reboot to UEFI, shut server down via single press of power button, reboot to Legacy, nuker boots, nuke password, reboot to UEFI and finally logged in! Very very carefully type in new password... whew.

Tweaked allowable HDD space on yet another quad socket and it's running at full load. Think all is well now and all systems are hammering away. :p

Edit #3 - So I lied! Took me a while to find the server that wasn't coming online... I have a lot of systems in this small area and the fact it wasn't powered on threw me off. Seems the $1K+ RAID controller in this brand new system has apparently bit the dust. Has a capacitor backup, so takes a while to discharge. Will let it sit overnight and hopefully it'll be back online tomorrow. If not, well, I have a few other things I can throw at this, time permitting.
 
This is the highest position (4th) that we have achieved as a team since the 2016 Pentathlon started. :D

Very rare, so I took a snapshot of this to motivate those still on the fence.

Can we keep the momentum going?:rolleyes:

Will there be bunker dumping by other teams soon?

Any [H]ard|OCP members with a B52 type of machine stored in a closet? A wingman is not needed this time.

Let's see how we do tomorrow...

upload_2016-6-14_21-26-49.png
team needs you!



upload_2016-6-14_21-13-6.png
 
As a test I re-enabled HT on the quad E7 box, then told BOINC to use all 128 "CPUs", yet Rosetta refused to use more than 64 of them. Weird. :(
 
The BOINC event log? It reported 128 CPUs and that it was emulating 128 CPUs.
 
Well, that is good. Did Rosetta's server replies give any clues, like not requesting tasks (not needed, not highest priority, etc.)?
 
This morning I was unable to run all 24 threads on the 2695 rig. I didn't have time to fix this (had to go to work) until now. I'm guessing that some of the Rosetta Mini work units require almost 1GB of disk space. See below. Believe me, I have plenty of disk space.
upload_2016-6-15_21-43-35.png


What I did was increase the max disk usage, and all the threads are now able to run. Just reporting in case this might be of help to you. BTW, I think the default value is 10 GB. Since you have 128 threads, you need more than 10GB; note that BOINC applies whichever limit in this options menu is the most restrictive.
upload_2016-6-15_21-47-43.png
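
The disk limit arithmetic can be sketched as below. It treats the "almost 1GB" guess as a worst case for every running task, which is an assumption; many WUs likely need less:

```python
def disk_needed_gb(threads, gb_per_task=1.0):
    """Worst-case disk space if every thread runs a task needing gb_per_task."""
    return threads * gb_per_task


def max_concurrent_tasks(disk_limit_gb, gb_per_task=1.0):
    """How many such tasks fit under a given BOINC disk limit."""
    return int(disk_limit_gb // gb_per_task)


print(max_concurrent_tasks(10))  # 10 tasks fit under a 10 GB limit
print(disk_needed_gb(24))        # 24.0 GB for a 24-thread rig
print(disk_needed_gb(128))       # 128.0 GB for a 128-thread box
```

Under that worst case, a 10 GB limit would stall a 24-thread rig at around 10 running tasks, which matches the symptom of idle threads despite plenty of free disk.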
 
pututu: I went through some similar issues with disk space as well. On a few of my quad boxes the C: drive is only 100GB, so I told BOINC to leave a small amount of space free and use the rest. Problem solved. :p

Just did some tinkering on Thing02 and, for the moment at least, it's utilizing 97% of 128 processors. We'll see how long that holds up. :)
 
fastgeek: One would expect a 4p system to have more disk space. Your RAM is 1TB and disk space is 0.1TB :jawdrop:

I never expected some Rosetta WUs to consume 1GB of disk space. :confused:

We have passed the halfway mark of the sprint challenge. We got off the blocks pretty well (at 4th) but are kind of stumbling at 9th position now.
 
Heh. Thing02 has 400GB OS and 22TB data, so it's fine. :p Some different 4P systems only have 100GB OS and "only" have 256GB RAM. Of course, now that I think about it, I could make a RAM disk... but meh. :bag: Normally they have 40-80TB of space attached to them, but been lazy! :shame:
 