Intel Core i9-9900KS Review: The Fastest Gaming CPU Bar None

Let's use my Cinebench screenshot as an example of a multi-threaded application vs a single-threaded application. If you were to render that image using a single core with the same level of performance, you would need a processor running at a speed that is never going to happen on silicon technology. We're at the limits, and both makes of processor offer impressive single-threaded performance - in the real world there's very little between them.

We have no choice but to overcome the current scheduler issues and begin splitting applications into threads if we want to keep ramping speeds. As shown in my example regarding Doom 2016, game developers are already beginning to apply such logic to games using modern APIs/engines - something that really wasn't possible with older versions of DirectX/OpenGL.
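To make the idea concrete, here's a minimal sketch (illustrative only, not Doom's actual code; the names are mine): independent chunks of per-frame work get farmed out to worker tasks and joined before submission, instead of one thread preparing the whole frame.

// Illustrative sketch only - "DrawBatch", "build_batch", and "build_frame"
// are hypothetical names for this example.
#include <future>
#include <vector>

struct DrawBatch { /* vertex ranges, material, etc. */ };

void build_batch(DrawBatch& batch) { /* record commands for this batch */ }

void build_frame(std::vector<DrawBatch>& frame_batches) {
    std::vector<std::future<void>> jobs;
    jobs.reserve(frame_batches.size());
    for (auto& b : frame_batches)
        // Each batch is independent, so any core can build it.
        jobs.push_back(std::async(std::launch::async, build_batch, std::ref(b)));
    for (auto& j : jobs)
        j.wait();  // join everything before submitting the frame
}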

I feel you think I'm saying everything should be single-threaded. That's not at all what I'm saying. I am saying that per-thread performance is still quite important, and improvements are absolutely the focus of quite a lot of research spending.

This is complementary to having more cores, improving schedulers, and increasing parallelism in software, not contradictory.
 
Let's use my Cinebench screenshot as an example of a multi-threaded application vs a single-threaded application.

This is a really bad idea - that type of rendering is about as close as you're going to get to a real-world infinitely parallelizable workload. As in, there's very little branching logic, and work like that should really be done on specialized hardware, i.e. SSE, AVX, or GPUs.

That benchmark exists solely to provide a real-world workload that runs float code.

What we're talking about with single-thread performance is code that branches and has dependency chains, where the workload cannot logically be done in parallel. Game logic is one example, and an accessible one, which is why we're talking about it relative to the 9900KS.
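A toy contrast (my own illustration, not from any real engine) of the two kinds of work being discussed:

#include <cstddef>
#include <vector>

// Rendering-style work: every element is independent, so more cores
// (or SIMD/GPU hardware) scale almost perfectly.
void shade_pixels(std::vector<float>& pixels) {
    for (std::size_t i = 0; i < pixels.size(); ++i)
        pixels[i] = pixels[i] * 0.5f + 0.1f;   // no iteration depends on another
}

// Game-logic-style work: each step needs the previous result, so only
// per-thread speed (clocks, IPC) makes it finish sooner.
float simulate(float state, int steps) {
    for (int i = 0; i < steps; ++i)
        state = state * 1.0001f + 0.01f;       // step i+1 depends on step i
    return state;
}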
 
This is a really bad idea - that type of rendering is about as close as you're going to get to a real-world infinitely parallelizable workload. As in, there's very little branching logic, and work like that should really be done on specialized hardware, i.e. SSE, AVX, or GPUs.

That benchmark exists solely to provide a real-world workload that runs float code.

What we're talking about with single-thread performance is code that branches and has dependency chains, where the workload cannot logically be done in parallel. Game logic is one example, and an accessible one, which is why we're talking about it relative to the 9900KS.

And yet while Cinebench's utilization is, naturally, better - its implementation is literally identical to my Doom 2016 example. Furthermore, as stated, I'm getting great performance in a CPU-limited scenario at a speed well under 4GHz.

People are missing the point - there's no point arguing that single-core performance is still paramount when it's obvious it's not going to scale much beyond where it currently is, due to the limitations of the technology used.
 
Your biggest problem with mechanical HDDs and performance in most scenarios is NTFS and the NT kernel - both are overdue for retirement. It's also one of the biggest problems with multi-threaded applications for most people.

Remember, you said I could discuss it. ;)

Oh good, here you are again claiming that hard drives run fast under Linux and slow under Windows. But hard drives are slow, period, regardless of the OS or file system.
 
No, even a supercomputer would be slow with a single HDD when loading and storing data into memory.
A single HDD is a major bottleneck even on low-power and embedded systems at this point, let alone on a workstation with a powerful CPU.

Also, you can totally discuss it here. :p
Once the data has been loaded from the HDD into RAM, the "experience" can be fast; it is the wait-time for the data to move from the HDD to RAM that is abysmal.

At 2TB and below, HDDs are completely obsolete, even on cost.
At 3TB and above, HDDs are still good for bulk storage, but not for the OS and day-to-day usage, let alone databases or enterprise use outside of WORM media.

Well, I still use 2 x 1TB HDDs and 3 x 2TB HDDs in two of my three computers. I have had those for years and see no reason to stop using them, since they still work fine. They are not boot drives, however.
 
I feel you think I'm saying everything should be single-threaded. That's not at all what I'm saying. I am saying that per-thread performance is still quite important, and improvements are absolutely the focus of quite a lot of research spending.

This is complementary to having more cores, improving schedulers, and increasing parallelism in software, not contradictory.
No one is saying IPC isn't important. But it's a fact that IPC is at its limits.

The same designer behind Ryzen has now been hired by Intel to drive things further, but you have to look at the total picture. Sure, 5% more IPC seems huge, but not if you have an adjacent core sitting there idle.

As a person who spent at least five years of my life developing games, I can tell you there's a lot of low-hanging fruit just sitting there in multi-core systems.

It's really the next path until a serious change in x86 architecture.

I can tell you for a fact that ray tracing depends on this more than ever, unless some major development happens between now and then.
 
I haven't had a ten-minute boot-up on a computer since the IMSAI 8080.
Something is seriously wrong if your 8080 with an 8" FDD (or HDD) was taking 10 minutes to boot.
Those normally booted within 10-15 seconds, absolute max, after full checks and booting CP/M or any other OS.

Do you still have the IMSAI? I'll make you an offer! ;)
Not if I get to it first! :D
 
And yet while Cinebench's utilization is, naturally, better - its implementation is literally identical to my Doom 2016 example. Furthermore, as stated, I'm getting great performance in a CPU-limited scenario at a speed well under 4GHz.

People are missing the point - there's no point arguing that single-core performance is still paramount when it's obvious it's not going to scale much beyond where it currently is, due to the limitations of the technology used.
I don't think they are going to get it. It was literally only two years ago that people thought 4 cores were enough. 6 cores is now the sweet spot, and in another year 8 cores will be standard.

There's a reason for this and that reason is that right now it's easier to add cores than to increase IPC.

If Intel could have done it, it would have. But it isn't, for reasons that have to do with current packaging technology. Even Intel knows this, which is why it's working on stacking tech until x86 is dead.

Intel has been hinting at it since Itanium.

Google's work on quantum computing (and Intel's, btw) is aimed at addressing the IPC issue.

4-way SMT addresses a very specific workload; it is not free IPC. There's just no panacea there unless something major happens.
 
Not if I get to it first!

Ohhh, competition! ;D

Oh good, here you are again claiming that hard drives run fast under Linux and slow under Windows. But hard drives are slow, period, regardless of the OS or file system.

Actually, you were the only one here mentioning Linux. I was just quoting the facts. Once again, do you seriously have to turn everything into an us-vs-them argument? Can we just discuss the facts like mature, technically savvy adults?

I've got nothing against you, ManOdGod, but this seriously has to stop.
 
Something is seriously wrong if your 8080 with an 8" FDD (or HDD) was taking 10 minutes to boot.
Those normally booted within 10-15 seconds, absolute max, after full checks and booting CP/M or any other OS.

You could get them to boot in 10-15 seconds using a decently fast paper tape reader! Cassette would be a different scenario, however - mind you, I don't think anyone booted CP/M off cassette?
 
I haven't had a ten-minute boot-up on a computer since the IMSAI 8080.
Yeah, I used to get my Win98SE K6-2 382MHz laptop with a 4200rpm(?) spinner to boot in 26 seconds with serious tweaking of Windows and clearing out of crapware junk.
 
Cassette would be a different scenario, however - mind you, I don't think anyone booted CP/M off cassette?
OK, that is a valid point.
I was assuming an 8080 booting from an 8" FDD or HDD, but from cassette... I could see that taking 10+ minutes to boot.

Yeah, hopefully no one had to do that, yikes! :eek:
 
These articles are so sensationalised. In general terms, for max-clock gaming, Intel have had the best CPU - the 8700K, then the 8086K, then the 9900K, and now the 9900KS - all with the same "fastest ever gaming CPU bar none" line, which is getting old. For me the disappointment with Intel is how they took an efficient Sandy Bridge architecture, which didn't need high clocks to be impressive and was light on power usage, and turned it into a "5GHz high-octane gas guzzler". In many games that I have an interest in, the lower-clocked, lower-power, higher-core-count AMD parts are within touching distance of the 9900s, so I don't really give a crap about 5GHz 300W stuff anymore.

I will be building an SFF Ryzen 3900X setup for HTPC, casual gaming (which is basically realism-modded Fallout 4), and music rendering, and the 3900X is about 3K cheaper where I am. The only area where Intel actually impresses or interests me is notebooks.
 
Shit, guys. In a pro editing workflow on modern hardware, even within the same application like DaVinci Resolve, some parts work best on a single thread (so a 9900KS), some love Threadrippers but Epycs even more, and some looooove Radeon VIIs more than even Quadros. In fact, fuck Quadros.

So if there is a lot of compositing then you will probably be happiest with a 9900KS. If you just shit out a lot of videos and render all day, then a Threadripper will render in the background while you start trimming your new hotness.

For anything with a lot of resolution, especially for grading, you want four or eight VIIs or 2080s on Linux, ideally on a TR or Epyc.

It kills me to see 63 cores sitting on their hands, but Fusion and After Effects etc. are just like that.

The worst of all possible worlds is a Xeon. Suck it, Mac Pro.
 
I know that you guys are only going on about the 9900 series, but seeing the fumbles of the 8600, 9600, and 9700, I would guess that they are showing some form of hyper-aggressive branch prediction that doesn't leave anything in reserve in case it misses, so that when a mistake is made the program screeches to a full halt until everything gets going again.
 
I have to say, the 12C/24T AMD processors are the only devices to tempt me into upgrading in a very, very long time.

Are they planning on hanging onto the current socket as AMD tend to do? That's the one thing that really pisses me off about Intel.
 
And yet while Cinebench's utilization is, naturally, better - its implementation is literally identical to my Doom 2016 example.

It isn't.
Furthermore, as stated, I'm getting great performance in a CPU-limited scenario at a speed well under 4GHz.

This is your opinion. I won't argue that you're not getting good performance, but the game isn't CPU-limited to begin with, and you're not making scientific comparisons to back up your opinion.
 
Ohhh, competition! ;D



Actually, you were the only one here mentioning Linux. I was just quoting the facts. Once again, do you seriously have to turn everything into an us-vs-them argument? Can we just discuss the facts like mature, technically savvy adults?

I've got nothing against you, ManOdGod, but this seriously has to stop.

Nah, I was just basing my point on what you said in the past, which I should have pointed out in my post but missed; it happens. Also, there is nothing wrong with the NT kernel; it has been improved upon over the years and is not what you had back in the NT 3.1 days. NTFS does not need to be replaced either, since it has also been improved upon since its original release.
 
Meh, bugger! Perhaps I'll hold off just a little while longer.

Cheers for the reply.

Why hold off? We are talking the X370, X470, and X570, and I would imagine, for Zen 3, the X670. The only reason you would not be able to use the 3950X in those is if DDR5 is released alongside it.
 
RDR2 shows that if you code highly threaded games this is not the case: chasing high fps, the 9700K and the 9600K end up screwing themselves up and creating a stuttering mess, forcing you to cap the frame rate until it's low enough for those non-HT CPUs to handle (and this is only the latest game to show that; someone has already mentioned that FC5 had similar performance quirks/bias against these CPUs).

RDR2 only shows an odd behavior on a poorly coded game engine.

It seems to be somehow reliant on some aspect of HT, not on total thread count.

An old 7700K with only 4C/8T has no issues with stuttering, yet a much faster 9700K 8C/8T CPU stutters all over the place.

This odd behavior is only seen in Rockstar's game engine and nowhere else.

Clearly not the result of good multi-threaded coding, but the total opposite: poor multi-threaded coding.

This is NOT the future of game engines. It's some kind of glitchy imbalance in the game engine.
 
So, poorly coded, same as in Far Cry 5, right... It isn't that it is overly aggressive behavior, hmm? If you wanna believe that, do so; your choice.

Those are two totally different game engines, btw. And I explained what could be the cause: if the processor microcode is too aggressive with prediction/caching, a miss can mean that it has to go all the way back to system RAM for the data it should have had at hand but didn't.

I wonder going forward how many "bugged" game engines will show similar patterns.
 
So, poorly coded, same as in Far Cry 5, right... It isn't that it is overly aggressive behavior, hmm? If you wanna believe that, do so; your choice.

Those are two totally different game engines, btw. And I explained what could be the cause: if the processor microcode is too aggressive with prediction/caching, a miss can mean that it has to go all the way back to system RAM for the data it should have had at hand but didn't.

I wonder going forward how many "bugged" game engines will show similar patterns.

Mispredictions and cache misses do not cause multi-millisecond delays. They lower performance, of course, but do not cause human-visible fits and starts.

The most likely explanation is that the engines which exhibit this issue expect a given thread to complete a task within a certain time quantum (for a full frame render), and it is not quite doing so all the time. This causes full-frame misses, which of course are human-visible. You could blame that on the processor, but since it is not widespread, I'd be hesitant to do so.
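A hypothetical sketch of that failure mode (names and numbers are mine, not from either engine): the frame loop gives a worker a fixed budget, and when the worker occasionally overruns it, a whole frame is missed, which is what shows up as a visible stutter rather than a few nanoseconds of mispredict penalty.

// Hypothetical illustration, not actual engine code.
#include <chrono>
#include <future>

bool frame_work_done(std::future<void>& worker_job,
                     std::chrono::milliseconds frame_budget) {
    // Wait at most one frame budget (e.g. ~16 ms at 60 fps) for the worker.
    return worker_job.wait_for(frame_budget) == std::future_status::ready;
}

// If this returns false, the engine either re-presents the previous frame or
// blocks into the next vsync interval - a multi-millisecond, human-visible
// hitch, unlike a branch mispredict or cache miss (nanoseconds in cost).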
 
So, poorly coded, same as in Far Cry 5, right... It isn't that it is overly aggressive behavior, hmm? If you wanna believe that, do so; your choice.

Those are two totally different game engines, btw. And I explained what could be the cause: if the processor microcode is too aggressive with prediction/caching, a miss can mean that it has to go all the way back to system RAM for the data it should have had at hand but didn't.

I wonder going forward how many "bugged" game engines will show similar patterns.

Where is Far Cry 5 doing this? I checked some FC5 CPU reviews and they are not exhibiting this behavior.


Here is what is happening in RDR2:

9700K Stock (8C/8T):
Avg FPS: 138.2 FPS
1% Low: 32.9 FPS
0.1% Low: 6.5 FPS (Huge stutters)


7700K Stock (4C/8T):
Avg FPS: 118.2 FPS
1% Low: 77.8 FPS
0.1% Low: 68.0 FPS (Butter smooth)

These are both 8-thread CPUs, so this isn't a difference in thread count. But on the MUCH faster 9700K the threads are stalling out and stuttering.

That is an indication they didn't balance their load properly and/or they have some kind of race condition. It seems like they stall some threads while stuck working on ones that don't actually contribute to pushing out the current frame.

There is no question. This is poor coding. They didn't properly balance/prioritize their threads.
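For what it's worth, here's a rough sketch (a toy example of mine, nothing to do with Rockstar's actual code) of the kind of balancing being described: instead of a naive static split, where one unlucky thread becomes the long pole the frame waits on, idle threads keep pulling work from a shared counter.

// Toy example of dynamic load balancing via a shared work counter.
#include <atomic>
#include <thread>
#include <vector>

void process(int task) { /* variable-cost per-frame work item */ }

void run_balanced(int task_count, unsigned thread_count) {
    std::atomic<int> next{0};
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < thread_count; ++t)
        pool.emplace_back([&] {
            // Threads pull tasks until none remain, so no thread sits idle
            // while another is stuck with a long tail of expensive tasks.
            for (int i = next.fetch_add(1); i < task_count; i = next.fetch_add(1))
                process(i);
        });
    for (auto& th : pool)
        th.join();
}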
 
Nah, I was just basing my point on what you said in the past, which I should have pointed out in my post but missed; it happens. Also, there is nothing wrong with the NT kernel; it has been improved upon over the years and is not what you had back in the NT 3.1 days. NTFS does not need to be replaced either, since it has also been improved upon since its original release.
Yeah, the NT kernel is shit compared to other OS kernels, especially with thread scheduling and threads jumping all over the damn place.
NTFS was released in 1993, still has a 255-character limitation, and on HDDs still needs to be defragmented; no other modern file system in the last 15 years needs to be defragmented.

ReFS was going to replace NTFS, but Microsoft has now backtracked on that statement and has stalled all development and future support for it.
For what they are, the NT kernel and NTFS do get the job done, but the NT kernel just exhibits strange behavior compared to other modern kernels (and has for years, ffs Microsoft), and NTFS needed to be replaced 20 years ago.

There are far better and more robust solutions available for both of these, just not for any variant of the Windows OS; that's what everyone gets for being a loyal Microsoft customer. :meh:
 
Why hold off? We are talking the X370, X470, and X570, and I would imagine, for Zen 3, the X670. The only reason you would not be able to use the 3950X in those is if DDR5 is released alongside it.

In that case I may start planning an upgrade.
 
Snowdog

Check Gamers Nexus' Far Cry 5 1080p normal CPU benchmark: the 9700, 9600, and 8600 all exhibit excessively low 0.1% and 1% frame rates versus the other processors, besides the Threadripper 2990WX in Creator mode.
Tests were done with a 2080 Ti at ultra; I'm currently on mobile in the boondocks and can't directly chase the link down myself.

Gamersnexus.net


Actually, do check that test out; there's something very interesting going on. It hints at the issues being instabilities due to the processor being run too near its limit, since, although lower than the others, the stock 9600K runs better than when overclocked, falling from 58 fps to 19 fps (every processor other than the problem ones runs much more smoothly).


So it could be that there's nothing coded wrong, just that the code hits parts of the processor that most other games don't and triggers some bug, or maybe a form of speed regression.

Edit: this isn't the specific one I have downloaded on the phone, but it is an example:
https://www.gamersnexus.net/hwreviews/3407-intel-i5-9600k-cpu-review-vs-2700-2600-8700k
 
Too bad there was no 9700K in that test. If you only have the 9600K, which is the lowest-thread-count part there, then it's hard to see a pattern. It could just be a low-thread-count issue for FC5.

But the GN RDR2 CPU test includes multiple parts without HT. 4c/4t, 6c/6t, and 8c/8t all have issues with RDR2, yet even 4c/8t parts don't.

That kind of points to some kind of dependency on HT, more than thread count.
 
I don't understand the controversy. Games = human I/O = there's going to be a critical path somewhere = that path is going to be single-threaded. Ergo, we absolutely want fast cores to deal with that problem. I have not seen any trend suggesting that 8 really fast cores *for games* (no "but but productivity", RTF-Title) are slower than 12/16/? modestly slower cores. It may be academic in the end result, but I haven't seen any data to the contrary. It could happen in a game with an insane number of independent AI-controlled elements, I don't know.
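To put rough numbers on that critical-path argument, here's Amdahl's law with an assumed, purely illustrative split of 70% parallelizable frame work:

// Amdahl's law: with fraction p of the work parallelizable across n cores,
//   speedup = 1 / ((1 - p) + p / n)
// e.g. p = 0.7 gives ~2.6x on 8 cores and only ~2.9x on 16 - the serial
// critical path quickly dominates, which is why fast cores still matter.
double amdahl_speedup(double p, int n) {
    return 1.0 / ((1.0 - p) + p / n);
}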

I'd honestly be surprised if we keep seeing much of a scale-up in cores in the consumer market (including laptops), unless there's specialization for different workloads. Heck, even in my professional world of CAD and FEA, scaling is atrocious, so if someone made an equally fast 8-core part with 4-channel memory, it'd probably be the fastest part for my work (*for solving a single problem at a time; batching out FEA into multiple simultaneous configurations takes better advantage of the cores).

I'm happily running a first-gen Ryzen, and AMD's value is hard to beat, but why can't we call a spade a spade?
 
Something is seriously wrong if your 8080 with an 8" FDD (or HDD) was taking 10 minutes to boot.
Those normally booted within 10-15 seconds, absolute max, after full checks and booting CP/M or any other OS.

HDD? FDD? We were using cassette tape to boot.

Not if I get to it first! :D

It was one of the computers we used in my high school electronics class before said school had a 'real' computer class. We would boot it up before class started so we wouldn't use all of class time waiting for it. Fun times!

Sorry for the thread derailment.:)
 