BOINC how to set priority for GPUs over CPU WU's

  • Thread starter Deleted member 88227
I am running a CPU-only project that I want to run only on CPUs (obviously), but I am also running a CPU/GPU project for which I only want to run the GPU WUs. I have a few rigs with multiple GPUs, and sometimes the CPU projects take up all the CPU slots and the GPUs are left idling.

How can I set priority for GPU WUs and only allow the CPU projects to run on idle CPUs (those that aren't being used by the GPU project)?
 
One, Process Lasso. I use this to keep my E@H tasks separate from CPU tasks. Letting Win7 handle it causes the threads to jump around too much and I get poor GPU utilization.

Two, in the GPU project's app_config you can set how many CPU threads to reserve per GPU task. Say you have 8 CPU threads and want to keep two threads free for two GPU tasks. Set the GPU app_config to use 1 CPU per task and any CPU project will get 6 CPU threads to crunch on.

Three, in the CPU project's app_config file you can set max_concurrent. Doing #2 will basically do this by limiting the CPU threads.
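
For #3, a minimal sketch of what a CPU project's app_config.xml could look like, assuming the 8-thread example above. It uses project_max_concurrent, which caps running tasks for the whole project, so no app name is needed. The value 6 is only an assumption carried over from that example, and the file would go in that project's own folder.

```xml
<app_config>
  <!-- Assumed value: 8 threads minus 2 kept free for GPU tasks -->
  <project_max_concurrent>6</project_max_concurrent>
</app_config>
```

This still has to be repeated in each CPU project's folder, since the file is per project.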
 
I can't manage to get WCG to do anything less than 100% resource usage, even after setting 2 different projects to 50% in BAM.
 
One, Process Lasso. I use this to keep my E@H tasks separate from CPU tasks. Letting Win7 handle it causes the threads to jump around too much and I get poor GPU utilization.

Seems this is Windows only? I need something that will work with Linux also.

Two, in the GPU project's app_config you can set how many CPU threads to reserve per GPU task. Say you have 8 CPU threads and want to keep two threads free for two GPU tasks. Set the GPU app_config to use 1 CPU per task and any CPU project will get 6 CPU threads to crunch on.

This doesn't seem to do what I want. This just tells the project how many CPU cores to use for the GPU, right? Which means there is still a chance that the CPU projects will take over and not leave anything for the GPU project to run.

Three, in the CPU project's app_config file you can set max_concurrent. Doing #2 will basically do this by limiting the CPU threads.

Seems I would have to do this for every single CPU project? There isn't an overall setting I can use to cover them all?

Basically, for example: I have a 6-core, 12-thread CPU setup with 3x 980 Ti GPUs. Right now, there are 12 CPU tasks running for project A and 0 GPU tasks running for project B.
Project A is CPU only and project B is GPU only. This means I have 3x 980 Ti GPUs sitting idle doing nothing.

I know Project B can see and use those GPUs because if I pause, remove, detach Project A then the 3 GPUs will run without issue.
 
Basically, for example: I have a 6-core, 12-thread CPU setup with 3x 980 Ti GPUs. Right now, there are 12 CPU tasks running for project A and 0 GPU tasks running for project B.
Project A is CPU only and project B is GPU only. This means I have 3x 980 Ti GPUs sitting idle doing nothing.

The two projects should be playing nicely together and sharing resources 50/50, so this shouldn't be happening unless you have previously altered the resource share of one of the projects.

The resource share can be altered on the BAM > My Projects page or alternatively on the project's website under your account > preferences for this project. To force what you want you would set the cpu project resource share to zero which, in theory, tells it to only use unutilised threads.

From distant memory, the exception to this rule is WCG which just does whatever it wants unless you change it on their website. To change on the WCG site from your account go to settings > device manager > device profile > default (assumed) and then right at the bottom of the page change 'project weight' to zero.

You may have to alter the 'switch application every 60 minutes' but, sorry, I'm struggling to remember.
 
Resource share is set to 100 on all projects. I am running Collatz on the GPUs and TN-Grid on the CPUs. If I just let it run wild then eventually TN-Grid will take over Collatz and the rigs with multiple GPUs will sometimes only run 1 or 2 GPU tasks. I had to take TN-Grid (and all other CPU projects) off the GPU rigs so the GPUs will be fully utilized.

GPUs are way more important to keep running than the CPUs, and I just can't figure out how to get projects to use only the idle CPUs that the GPUs aren't using.
 
Also, as an afterthought, check that the setting to suspend GPU work while the computer is in use isn't enabled on the GPU project's website. This one's just a shot in the dark, as I'm not sure that BOINC considers itself as being 'in use'.
 
Try setting the TN-Grid resource share to zero.

Is it possible to do this on a per-host basis? I am running the project on CPU-only rigs, and while it's the only project running, I do intend to add other projects once I go through the list of all projects in FB to ensure they're all working and set up so I can switch to them when needed. If I set the resource share to 0 then they'll run with 0 resource share on those rigs as well, and I don't want that.

Also, as an afterthought, check that the setting to suspend GPU work while the computer is in use isn't enabled on the GPU project's website. This one's just a shot in the dark, as I'm not sure that BOINC considers itself as being 'in use'.

Both settings are set to "no".
Suspend work while computer is in use? no
Suspend GPU work while computer is in use? Enforced by version 6.6.21+ no

I tried setting CPU usage to 75% on the 12-thread rig so that the CPU projects would only run on 9 threads. This worked and a GPU task ran along with 9 TN-Grid tasks, but that still left 2 GPUs idling.
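
For reference, a sketch of how that same 75% limit could be applied to a single host only, assuming a local global_prefs_override.xml file in the BOINC data directory. The 75.0 figure is just the value from the test above (12 threads x 0.75 = 9). Whether it frees enough threads for all three GPUs is not something this sketch settles.

```xml
<global_preferences>
  <!-- Assumed value: use at most 75% of the CPUs on this host only -->
  <max_ncpus_pct>75.0</max_ncpus_pct>
</global_preferences>
```

Because the file lives on the host, the CPU-only rigs would keep using the account-wide preferences.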
 
From distant memory, the exception to this rule is WCG which just does whatever it wants unless you change it on their website. To change on the WCG site from your account go to settings > device manager > device profile > default (assumed) and then right at the bottom of the page change 'project weight' to zero.

You may have to alter the 'switch application every 60 minutes' but, sorry, I'm struggling to remember.

This fixed my issue. Now both WCG & POGs are at 50%.
 
Seems this is Windows only? I need something that will work with Linux also.

This doesn't seem to do what I want. This just tells the project how many CPU cores to use for the GPU, right? Which means there is still a chance that the CPU projects will take over and not leave anything for the GPU project to run.

Seems I would have to do this for every single CPU project? There isn't an overall setting I can use to cover them all?

Basically, for example: I have a 6-core, 12-thread CPU setup with 3x 980 Ti GPUs. Right now, there are 12 CPU tasks running for project A and 0 GPU tasks running for project B.
Project A is CPU only and project B is GPU only. This means I have 3x 980 Ti GPUs sitting idle doing nothing.

I know Project B can see and use those GPUs because if I pause, remove, detach Project A then the 3 GPUs will run without issue.

1 - Linux isn't stupid, and a Process Lasso type program isn't needed. I've run 2x as many CPU tasks as there are CPU threads (64 on a 2p 2760v1) along with many NCI apps and 4x GPU tasks. Zero issues feeding the GPUs. They get what they need.

2 - No. If you reserve 2 for GPUs then there are only 6 left for CPU tasks, no matter how many you download. Managing affinity is beyond BOINC though.

3 - Yes, that's why it's #3.

I had no issues in Win7 running 2x Collatz and full CPUs. Process Lasso can also set priority: GPU apps at Normal or Below Normal, and CPU apps below that at Idle.
 
Is it possible to do this on a per-host basis? I am running the project on CPU-only rigs, and while it's the only project running, I do intend to add other projects once I go through the list of all projects in FB to ensure they're all working and set up so I can switch to them when needed. If I set the resource share to 0 then they'll run with 0 resource share on those rigs as well, and I don't want that.

Both settings are set to "no".
Suspend work while computer is in use? no
Suspend GPU work while computer is in use? Enforced by version 6.6.21+ no

I tried setting CPU usage to 75% on the 12-thread rig so that the CPU projects would only run on 9 threads. This worked and a GPU task ran along with 9 TN-Grid tasks, but that still left 2 GPUs idling.

You can create several locations. Usually each project has Home/School/Work locations, and each computer gets the preferences for the location it is assigned to. Some projects have more available locations than those 4 (including default).
 
Can you give an example of setting up the GPU project's app_config?
 
Is it possible to do this on a per-host basis? I am running the project on CPU-only rigs, and while it's the only project running, I do intend to add other projects once I go through the list of all projects in FB to ensure they're all working and set up so I can switch to them when needed. If I set the resource share to 0 then they'll run with 0 resource share on those rigs as well, and I don't want that.

You can set it by location, but altering the resource share shouldn't stop the CPU rigs if you are only running one project on them. It affects the relative priority against other projects. If TN-Grid is the only project they are running then it shouldn't affect them. You can set it to 10 to be safe. This would mean that other projects (if they exist) would have ten times more priority if they were set to 100.
 
It's the only project for now. I am going through all the projects one by one to get them set up so I can easily switch between them when needed. Once I am done, I'll let most of them run and let BOINC control who gets what. This also means that my current problem will come up then, too. Since there will most likely be a mix of CPU and GPU projects running at once, I don't want idle GPUs when this happens.

Seems mmonnin might have the solution. I may have misunderstood him, but it seems the GPU project's app_config might allow me to reserve the number of CPU cores I need for my GPUs. While I may have to do this for each individual GPU project, if that's what it takes then I shall do it. Kind of hoping the app_config has the same contents for each project so I can easily create a script to download it from my web site.
 
Here is one for E@H that will reserve 1 CPU core for each GPU task and will run 2 GPU tasks per GPU.
Code:
<app_config>
  <app>
    <name>hsgamma_FGRPB1G</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

If you save it into the Collatz folder, the event log will tell you the correct app name. Although I think it might still be collatz_sieve, if my old app_config file works.
 
I also use just 0% and 100%. The 0% one is a backup project: in case the one I am running goes down for maintenance or runs out of work, the 0% project will only get work when there is nothing else to do. Although I've heard of others using a wide range of percentages, for example if they want to run twice as much of one project as another.

I believe 2 projects at 100% will share the same as 2 projects at 50%. Eventually, at least. It may take some time to get them split evenly by CPU time, especially if they have different deadlines.
 
So if I wanted to just run 1 GPU task per CPU and GPU I'd do this?

Code:
<app_config>
  <app>
    <name>hsgamma_FGRPB1G</name>
    <gpu_versions>
      <gpu_usage>1</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

What does "<name>hsgamma_FGRPB1G</name>" mean, and is it specific to a project? Meaning, can I copy this file to all projects where I want a GPU task to run on 1 GPU and 1 CPU? Actually, I think this is what you're referring to as the app name, huh?

Also, will this force the CPU slot to always be available to the GPU? Because when I look in BoincTasks I can see that the Collatz project is only utilizing around 0.9 CPU per GPU task.
 
Yes, 1 and 1.

The <name> field is specific to each project's application, in this case E@H's. For Collatz, replace hsgamma_FGRPB1G with collatz_sieve. Save the XML file as app_config.xml in the project folder for Collatz. Here are the locations for a default install on Win7 and on Debian:
C:\ProgramData\BOINC\projects\boinc.thesonntags.com_collatz
/var/lib/boinc-client/projects/boinc.thesonntags.com_collatz

Tell BOINC Manager to reread the config files and it will find the new app_config file. Check the event log to make sure that it found your app_config file. If the <name> was incorrect, the log will mention the correct name. Update the file if needed. Then any new tasks will say 1 CPU and 1 GPU.

This will limit CPU tasks to N-1 CPU threads in total, since one thread is reserved for the GPU task whether the GPU task uses it or not.
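
Putting that together for Collatz, a minimal sketch of the file might look like this. The app name collatz_sieve is the one suggested earlier in the thread and is not confirmed here, so check it against the event log. With one task per GPU and one CPU thread per task, a 12-thread rig with 3 GPUs would be expected to budget 3 threads for GPU tasks and leave 9 for CPU tasks.

```xml
<app_config>
  <app>
    <!-- App name as suggested in the thread; confirm it in the event log -->
    <name>collatz_sieve</name>
    <gpu_versions>
      <!-- One task per GPU, one CPU thread budgeted per GPU task -->
      <gpu_usage>1</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```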
 
Also, keep in mind that BOINC Manager, by design, will give priority to CPU over GPU. What this means is that if you have a large cache built up and BM overreaches its capability to complete the work on time, it may very well suspend GPU work units in an attempt to squeeze every bit of CPU time it can into the "panic mode" work units. I have not kept up with app_configs to test whether they will override that behavior, but I have seen GPUs suspended in the past because of it.
 
Strange that you mention that because those tasks were highlighted in red and their status was "Running, High Priority" when it occurred.
 
Yeah, what Gilthanis mentioned can happen. I don't know either whether app_config will overrule it. You were in panic mode, though.
 