Work Graphs coming to a DirectX update near us soon!

Lakados

https://overclock3d.net/news/gpu-di...mance-gains-using-work-graphs-with-radeon-gpu
https://developer.nvidia.com/blog/advancing-gpu-driven-rendering-with-work-graphs-in-direct3d-12/
https://developer.nvidia.com/blog/work-graphs-in-direct3d-12-a-case-study-of-deferred-shading/
https://github.com/NVIDIAGameWorks/donut_examples/tree/main/examples/work_graphs
https://overclock3d.net/news/softwa...ure-could-make-future-games-less-cpu-limited/

With the proliferation of GPU-driven rendering techniques – such as Nanite in Unreal Engine 5 – the role of the CPU is trending towards primarily resource management and hazard tracking, with only a fraction of time spent generating GPU commands. Prior to D3D12 Work Graphs, it was difficult to perform fine-grained memory management on the GPU, which meant it was practically impossible to support algorithms with dynamic work expansion. Even simple long chains of sequential compute work could result in a significant synchronization and memory overhead.

GPU-driven rendering was accomplished by the CPU having to guess what temporary allocations were needed by the GPU, often over-allocating to the worst case and using previous-frame readback for refinement. Any workload with dynamic expansion meant either issuing worst-case dispatches from the CPU and having the GPU early out of unnecessary work, or using non-portable techniques like persistent threads.
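That older pattern looks roughly like the sketch below; it's a made-up example (the buffer and struct names are hypothetical, not from the linked articles) just to illustrate the worst-case-dispatch-plus-early-out shape.

```hlsl
// Sketch of the pre-Work-Graphs pattern: worst-case dispatch + early-out.
// All names (WorkItem, gWorkCount, gWorkItems, gResults) are hypothetical.

struct WorkItem
{
    uint meshletIndex;
};

StructuredBuffer<uint>     gWorkCount : register(t0); // actual count, written by a previous GPU pass
StructuredBuffer<WorkItem> gWorkItems : register(t1);
RWStructuredBuffer<uint>   gResults   : register(u0);

[numthreads(64, 1, 1)]
void ProcessWorstCase(uint3 dtid : SV_DispatchThreadID)
{
    // The CPU dispatched enough groups for the worst case it could guess at,
    // so most threads may have nothing to do and simply exit.
    if (dtid.x >= gWorkCount[0])
        return;

    // Placeholder "real" work on the item this thread owns.
    gResults[dtid.x] = gWorkItems[dtid.x].meshletIndex;
}
```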

With Work Graphs, complex pipelines that are highly variable in terms of overall “shape” can now run efficiently on the GPU, with the scheduler taking care of synchronization and data flow. This is especially important for producer-consumer pipelines, which are very common in rendering algorithms. The programming model also becomes significantly simpler for developers, as complex resource and barrier management code is moved from the application into the Work Graph runtime.
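For a sense of what that producer/consumer model looks like in shader code, here is a rough sketch loosely following the Work Graphs HLSL node syntax; the node names, record layouts, and the fake visibility test are all invented for illustration and aren't taken from the linked samples, so treat it as a shape rather than a drop-in implementation.

```hlsl
// Hypothetical producer/consumer node pair; names and record contents are invented.

struct CullRecord  { uint baseMeshlet; };
struct ShadeRecord { uint meshletIndex; };

[Shader("node")]
[NodeLaunch("broadcasting")]
[NodeDispatchGrid(64, 1, 1)]
[NumThreads(32, 1, 1)]
void CullMeshlets(
    DispatchNodeInputRecord<CullRecord> inputRecord,
    [MaxRecords(32)] NodeOutput<ShadeRecord> ShadeMeshlet, // feeds the node named ShadeMeshlet
    uint gtid : SV_GroupIndex)
{
    // Stand-in visibility test; a real node would cull against frustum/occlusion data.
    bool visible = ((inputRecord.Get().baseMeshlet + gtid) & 1) != 0;

    // Each thread requests exactly the output records it needs; the Work Graph
    // runtime handles the allocation and scheduling, not the CPU.
    ThreadNodeOutputRecords<ShadeRecord> outRec =
        ShadeMeshlet.GetThreadNodeOutputRecords(visible ? 1 : 0);
    if (visible)
        outRec.Get().meshletIndex = inputRecord.Get().baseMeshlet + gtid;
    outRec.OutputComplete();
}

[Shader("node")]
[NodeLaunch("thread")]
void ShadeMeshlet(ThreadNodeInputRecord<ShadeRecord> inputRecord)
{
    // Consumer node: only launched for records the producer actually emitted.
    uint meshlet = inputRecord.Get().meshletIndex;
    // ... shading / output work would go here ...
}
```

The point is that the consumer node only launches for records the producer actually emitted, so the dynamic expansion happens on the GPU without the CPU pre-allocating for the worst case.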

We have been advocating for something like this for a number of years, and it is very exciting to finally see the release of Work Graphs.

DX12 with the latest Agility SDK update supports them, but it will be a good while before games start using them, as they have to be built with this rendering method in mind.
 
That's great, but games are mostly GPU-limited these days. Maybe this will reduce sudden frame rate drops?

I expect this to have about as much impact as DX12 did (make slow systems less slow, no impact on mid-fast systems), or that DirectStorage has.
 
That's great, but games are mostly GPU-limited these days. Maybe this will reduce sudden frame rate drops?

I expect this to have about as much impact as DX12 did (make slow systems less slow, no impact on mid-fast systems), or that DirectStorage has.
Well, a 5800X paired with a 7900 XTX got a 39% performance increase in the AMD demo, and the Nvidia demos out there show significantly better frame times.
 
That's great, but games are mostly GPU-limited these days. Maybe this will reduce sudden frame rate drops?

I expect this to have about as much impact as DX12 did (make slow systems less slow, no impact on mid-fast systems), or that DirectStorage has.

This is a good thing, we (usually) want the GPU doing as much work as it possibly can.

It's way easier to optimize for being GPU bound than it is CPU bound - and it's straight up easier to replace the actual hardware.

Half the time you're CPU bound you're practically screwed. Reducing the graphics is going to have little to no effect. You can run at a lower resolution, turn off post processing, etc. etc. and you're still stuck. Versus if you're GPU bound as shit more often than not you dial upscaling a notch and oh you have an immediate performance gain.
 
That's great, but games are mostly GPU-limited these days. Maybe this will reduce sudden frame rate drops?
In the dragon game (Dragon's Dogma 2), the CPU seems to have quite an effect, and you don't need a 4090 to see it.
https://www.pcgamer.com/games/rpg/dragons-dogma-2-performance-analysis/#section-1440p-performance

At 1440p with a 7800 XT you go from 55 fps average / 38 fps 1% lows up to 73/55 with a 7900X; a lot of the newest games on a lot of CPUs could be leaving 50% of what the GPU could do on the table.

A 4070 Ti goes from 57/25 to 89/69 moving from a 9700K to a 14700KF, nearly tripling your lows. In the city, where a lot of the people-stuff happens, it is even more dramatic.

And some stuff that should have made the situation even more GPU-bound, like ray tracing, seems to be asking a lot of the CPU.
 
This is a good thing, we (usually) want the GPU doing as much work as it possibly can.

It's way easier to optimize for being GPU bound than it is CPU bound - and it's straight up easier to replace the actual hardware.

Half the time you're CPU bound you're practically screwed. Reducing the graphics is going to have little to no effect. You can run at a lower resolution, turn off post processing, etc. etc. and you're still stuck. Versus if you're GPU bound as shit more often than not you dial upscaling a notch and oh you have an immediate performance gain.
Also, the CPU doesn't necessarily know what the GPU is asking for or what it is trying to do; it's guessing 99% of the time, and it's rarely right.

You want the GPU to be as busy as possible, and it will always keep itself busy because of how the pipelines work. Still, it is often busy with filler, literal dead noise, just to keep things flowing correctly because the CPU provided too little information. Or sometimes the CPU sends too much, forcing the GPU to stop, ditch something it is halfway through, issue a new request for less, and then start again.
One scenario is leaving performance on the table, and the other is how you get frame hiccups.
In either case the GPU is relying on the CPU to send it resources and allocate work; this further removes the CPU from that equation and lets the GPU handle its own tasks.

This is good; it further separates the work and decouples the respective performance impacts. In theory it also leads to an easier programming methodology for game engines, as they self-optimize their workflows: it becomes less of a developer-side process and goes back into the hands of the GPU driver teams. This could ultimately simplify their jobs as well, because it should lessen the amount of per-game optimization they need to do.
It's a win-win and I completely understand why developers have been asking for this to be a thing for the past few years.
 