CS:GO Multithreading

TheEschaton

Weaksauce
Joined
Aug 21, 2015
Messages
108
Just thought I'd post this here because it was a bit of an eye-opener for me, and I'm sure it probably will be for at least a few other people who have played Source games.

CS:GO is now heavily multithreaded. It used to be the case that CS:GO really used one big fat thread, with one or two supporting threads, and that was the end of it. But now this game most assuredly uses, if not all available threads, then at least a majority of them. I came by this information during my own testing.

Test Rig 1:
2x Intel Xeon E5345 (2x4 cores @ 2.33GHz, Core 2-era arch)
HP OEM mobo
10GB FB-DIMM RAM (dual channel)
Nvidia GeForce GTX 760 2GB
2TB WD Green drive, SATA III (5400RPM)
Windows 10 Preview

Test Rig 2:
Intel Core i3-4130 (2C/4T @ 3.4GHz, Haswell)
MSI H81 mobo
8GB DDR3-1600 (dual channel)
AMD Radeon R7 260X 2GB (XFX Ghost edition)
500GB WD SATA II drive (7200RPM)
ZorinOS Core 64-bit (Ubuntu Linux spin w/ proprietary drivers installed)

Both test rigs performed well with this game; FPS never dropped below 60. Test Rig 1 actually had less stuttering than Test Rig 2, possibly due to the more powerful graphics configuration, while Test Rig 2 reached higher maximum framerates. The game showed usage of at least 6 cores on the dual Xeon setup (a little difficult to tell because the load jumped around quite a bit in Windows Task Manager), and on the Haswell i3 all four threads were utilized fairly heavily, though not exactly equally (something like a 40/40/60/60% load as a baseline, with some portions of the game pushing all threads up near 80-90%).
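
If anyone wants to reproduce the load readings without squinting at Task Manager, this is the sort of thing I mean (a minimal Python sketch using the third-party psutil package; nothing about it is game-specific):

Code:
import psutil  # third-party: pip install psutil

# Print per-core load once a second for 30 seconds while the game is running.
# A fairly even spread across all cores is what the multithreading claim predicts.
for _ in range(30):
    per_core = psutil.cpu_percent(interval=1.0, percpu=True)
    print(" | ".join(f"core {i}: {load:5.1f}%" for i, load in enumerate(per_core)))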

That this happened on two very different platforms, with very different CPUs and very different operating systems, without any config file tweaking on my part, pretty much proves that the game uses many threads to improve performance, and that load is distributed fairly evenly across them to promote smoother gameplay.

The results of this testing are totally the opposite of my original hypothesis, which was that this game would perform much better on the Core i3 platform.
 
This is actually pretty cool. It's nice to see games starting to use all the power that is available on PCs nowadays. I wish more games would follow this.

Edit: have you tried manually setting the core affinity to see what effect that has on performance, and compared that to this? I'd be interested to see that test on the Xeon rig. Also, what about other Source games with "Multicore threading"?
 
This is actually pretty cool. It's nice to see games starting to use all the power that is available on PCs nowadays. I wish more games would follow this.

Edit: have you tried manually setting the core affinity to see what effect that has on performance, and compared that to this? I'd be interested to see that test on the Xeon rig. Also, what about other Source games with "Multicore threading"?

For the most part they are. Most of the big new titles released since at least mid-2014, and lots of the little ones, use most or all of a computer's threads. It's been a big sea-change that I don't think most people have cottoned on to yet, but I've been following it closely because I am a big advocate of Xeon and AMD octo-core CPUs, banking on the rise of exactly this kind of programming zeitgeist ever since I heard the specs the PS4 was releasing with and saw the first octo-core phones come out.

The weird thing here is that an older game had another pass done on it to improve its multicore support. Valve is absolutely terrible at releasing products, but let it not be said that they do not support the crap out of whatever they do release!
 
Edit: have you tried manually setting the core affinity to see what effect that has on performance, and compared that to this? I'd be interested to see that test on the Xeon rig. Also, what about other Source games with "Multicore threading"?

I should have done that, but I didn't. I also have not tested other source games. Would you be willing to contribute to the effort? The Xeon rig is disassembled right now, so I don't really have the ability to do it myself, and the i3 rig is going to a customer within the next few days.

I could set it up on my main rig, but I'm guessing that my OC means it would not really matter how many cores I gave the game. I do know you can still play the game great on something like an Athlon X2 370K.

I should think a test of HL2, TF2, HL:Source, and CS:S would be in order if we wanted to be complete about this.
 
Yeah, I might be able to help out. I can run some passes this weekend when I have a bit more time.
 
I went from an OC'd Q6600 to a 6700K with the same GPU (7870) and my FPS basically tripled.

I went from probably ~130 average to near ~400. My FPS is stupid.
 
I'm going to first admit that I didn't read the details because my attention span is currently limited due to alcohol consumption.

With that said, I saw CS:GO and CPU threading mentioned, so I'll shoot my mouth off and say that Source engine games have typically been very CPU dependent. The engine is at least eleven years old, and it probably wasn't until around six years ago that multithreading was properly implemented. From what I've seen, it's capped at 300 FPS (the engine's default fps_max setting, if I remember right), yet the more CPU cores available, the better it performs.

I recently bought a 144Hz monitor and have [slightly] rekindled my interest in playing Team Fortress 2 since it can easily be rendered at 144 FPS. Any modern GPU and 4-core CPU should be capable of running Source-based games at high framerates, if not 300 FPS.
 
I went from an OC'd Q6600 to a 6700K with the same GPU (7870) and my FPS basically tripled.

I went from probably ~130 average to near ~400. My FPS is stupid.

You have four additional threads, but also a much faster core, and wildly different RAM and probably I/O as well. Can you tell us what the load per thread was, and the number of threads used?
 
You have four additional threads, but also a much faster core, and wildly different RAM and probably I/O as well. Can you tell us what the load per thread was, and the number of threads used?

Sitting in CT spawn on Dust2, I have over 500 FPS with ~25% total CPU load.

You can explicitly force the number of worker threads it tries to use with the -threads launch parameter. 8 is a solid chunk slower than the default (which is either 2 or 3).

Source 2 scales much more gracefully with more cores. I'd see nearly 100% on my Q6600 in DOTA 2 after the Reborn patch and my framerate probably close to _doubled_ on average.
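
If anyone wants to check the -threads claim for themselves, something like this would cycle through the values (a rough Python sketch; the Steam path is an assumption for a default Windows install, and you would run the same timedemo on each pass and note the FPS yourself):

Code:
import subprocess

STEAM = r"C:\Program Files (x86)\Steam\steam.exe"  # assumption: default install path
APPID = "730"  # CS:GO's Steam app id

for n in (2, 3, 4, 8):
    # -applaunch starts the game with the remaining arguments as launch options
    subprocess.run([STEAM, "-applaunch", APPID, "-threads", str(n)])
    input(f"-threads {n}: run your timedemo, note the FPS, quit, then press Enter")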
 
25% total load tells us very little, unfortunately. That could be two threads at 100%, all 8 of your threads at some fraction, or something in between.
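
For what it's worth, here is one way to get the actual per-thread breakdown instead of a single total (a Python sketch using psutil; matching on the process name "csgo" is an assumption, so check yours in Task Manager):

Code:
import time
import psutil  # third-party: pip install psutil

# Find the running game process by name (assumption: the exe name contains "csgo")
game = next(p for p in psutil.process_iter(["name"])
            if p.info["name"] and "csgo" in p.info["name"].lower())

before = {t.id: t.user_time + t.system_time for t in game.threads()}
time.sleep(5.0)
after = {t.id: t.user_time + t.system_time for t in game.threads()}

# Busy percentage of one core, per thread, over the 5-second window
for tid in sorted(after, key=lambda k: after[k] - before.get(k, 0), reverse=True):
    busy = (after[tid] - before.get(tid, 0)) / 5.0 * 100
    if busy >= 1:
        print(f"thread {tid}: ~{busy:.0f}% of one core")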

All evidence points to CSGO not being on the Source 2 engine, but I can tell you right now that the game looks like it uses all 12 of my threads:

This is at the menu, where everything looks normal:
https://drive.google.com/file/d/0B9g6EyuXHTnkYU5EelVZbWNMZFk/view?usp=sharing


This was at the peak of a bot fight on Dust2:
https://drive.google.com/file/d/0B9g6EyuXHTnkQ3d5NjBzVzZXVk0/view?usp=sharing

This was at the peak of some casual online play on a full server (16v16 or something near that):
https://drive.google.com/file/d/0B9g6EyuXHTnkYXRwTmVaSml0d0k/view?usp=sharing

As you can surmise from these pictures, the evidence that the CPU is using all of its threads to run this game is pretty good. The GPU's the limit, and there are no changes in background programs between the main menu pic and the actual gameplay shots. Either CSGO hops threads from one core to another so often that it looks like multithreaded programming, or else (more likely IMHO) the game genuinely scales to core/thread count. That would explain the excellent gameplay experience I had on the Socket 771 Xeons (2.33GHz) anyway.
 
You can rest safely assured that it's not - you're seeing the OS scheduler doing its thing.
 
You can rest safely assured that it's not - you're seeing the OS scheduler doing its thing.

Not only that, there is so little in common between the rigs that comparing them is impossible. One is using Windows and an Nvidia card, the other Linux with an AMD card.

The reason the one with the AMD card is not performing as well is that AMD's Linux drivers are notoriously shitty. Even if they were decent drivers, he's still comparing a completely different GPU and an OS that handles graphics protocols and CPU scheduling differently.

OP needs to retry these tests with his Nvidia 760 and Windows on both. Then he will get some idea of what's going on.
 
Not only that, there is so little in common between the rigs that comparing them is impossible. One is using Windows and an Nvidia card, the other Linux with an AMD card.

The reason the one with the AMD card is not performing as well is that AMD's Linux drivers are notoriously shitty. Even if they were decent drivers, he's still comparing a completely different GPU and an OS that handles graphics protocols and CPU scheduling differently.

OP needs to retry these tests with his Nvidia 760 and Windows on both. Then he will get some idea of what's going on.

Actually, the OS differences help the test, showing us that this is unlikely to be a scheduling artifact - two different scheduling mechanisms producing basically the same result is not a quirk of one particular piece of software.

FPS was not really important for this test, since both rigs were GPU-limited and over 60 FPS.
 
Actually, the OS differences help the test, showing us that this is unlikely to be a scheduling artifact - two different scheduling mechanisms producing basically the same result is not a quirk of one particular piece of software.

FPS was not really important for this test, since both rigs were GPU-limited and over 60 FPS.
You are missing the point.
I'm telling you to change only one variable at a time to keep the testing consistent and eliminate the possibility of other variables messing with your conclusion. This is basic testing methodology.

If you really want to prove your point, you would set up the tests like this:
-Both rigs on Win10 with the same GPU... see what difference the CPU makes
-Both rigs on Linux with the same GPU... compare to the previous test to see if the OS makes any further difference
 
You are missing the point.
I'm telling you to change only one variable at a time to keep the testing consistent and eliminate the possibility of other variables messing with your conclusion. This is basic testing methodology.

If you really want to prove your point, you would set up the tests like this:
-Both rigs on Win10 with the same GPU... see what difference the CPU makes
-Both rigs on Linux with the same GPU... compare to the previous test to see if the OS makes any further difference

I think I see the confusion now. I should not have used the word "test" so liberally in the OP. The real reason I ran the game on these three systems was to get a general idea of the performance of each rig as I was purposing them, not to see what kind of engine was under CSGO's hood. I only posted the results because I was so surprised that the game ran well with or without strong single-core performance, when my hypothesis had been that it would be the single-core-bound part of my ad-hoc test.

The result was enough for me to believe it, but I will be the first to admit I don't have the time or inclination to do it rigorously. Others are welcome to present more rigorous results here regardless of their conclusions, of course.
 
The easier way to do it is, as mentioned, to use the Xeon rig:

Set the affinity to 2/4/6/8 cores, et cetera, for the exe on the Nvidia card - test each
Set the affinity to 2/4/6/8 cores, et cetera, for the exe on the AMD card - test each

Done, and it eliminates all the variables with different OSes/CPUs and everything else.
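
If it helps, the pinning itself can be scripted instead of clicked through Task Manager on every pass (a Python sketch using psutil; matching on the process name "csgo" is an assumption):

Code:
import psutil  # third-party: pip install psutil

# Find the running game process (assumption: the exe name contains "csgo")
game = next(p for p in psutil.process_iter(["name"])
            if p.info["name"] and "csgo" in p.info["name"].lower())

for n in (2, 4, 6, 8):
    game.cpu_affinity(list(range(n)))  # pin the process to the first n logical cores
    input(f"Pinned to {n} cores - run the same timedemo, note the FPS, then press Enter")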
 
The easier way to do it is, as mentioned, to use the Xeon rig:

Set the affinity to 2/4/6/8 cores, et cetera, for the exe on the Nvidia card - test each
Set the affinity to 2/4/6/8 cores, et cetera, for the exe on the AMD card - test each

Done, and it eliminates all the variables with different OSes/CPUs and everything else.

However, as mentioned, that rig is no longer in existence.
 