Riddle me this: Better IPC?

Discussion in 'Intel Processors' started by TXE36, Nov 29, 2019.

  1. TXE36

    TXE36 [H]Lite

    Messages:
    80
    Joined:
    Jun 14, 2018
    There has been a lot of hype lately surrounding the new Ryzens "beating Intel", but after reading several reviews of both AMD and Intel offerings I can't help thinking "all sizzle and no steak". This is with a few personal caveats:

    1) I generally don't play modern games, and only one that I do play that is multicore won't max out a 4c/8t CPU.
    2) Games I do play tend to be simulations and are single thread heavy.
    - in my experience, the 32M SuperPI bench below tracks differences quite nicely.
    3) Generally don't do multicore stuff.

    So, on multicore stuff, definite improvement in performance and cost since the lowly 2600K circa 2011. However, on single core, there really doesn't seem to be much improvement at all. I've got a 2600K on an Asus Maximus IV Extreme with 16GB of Samsung low latency DDR-1600 memory that can do 2133. I looked around for some SuperPI32 benchmarks on the web and benchmarked my machine as well as my Dell 8700 at work:

    402.52 : 2600K Asus MIVE 5.1G 1866 Low Latency Memory (not stable for every day)
    403.18 : 2600K Asus MIVE 5.0G 2133 Auto Memory
    404.14 : 2600K Asus MIVE 5.0G 1866 Low Latency Memory
    410.19 : 2600K Asus MIVE 5.0G 1866 Auto Memory
    412.39 : OC I9-9900K
    412.46 : Bjorn3d I9-9900K
    424.57 : 2600K Asus MIVE 4.7G 2133 Auto Memory
    437.81 : Bjorn3d I7-8700K
    463.58 : Dell Optiplex 5060 I7-8700
    529.08 : OC 3700x

    Bjorn3d from here: https://bjorn3d.com/2018/10/intel-core-i9-9900k-review/8/
    OC from here: https://www.overclockers.com/amd-ryzen-9-3900x-and-ryzen-7-3700x-cpu-review/
    1866 Low Latency is 9-9-9-27
    2133 Auto I believe is 11-11-11-33, I just left the setting on auto
    2600K air cooled with some Thermalright TRUE, variable vcore max 1.4V (1.437V to hit 5.1G).
    Video card is 1070TI even though that really doesn't matter here.
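
    For anyone who would rather see the gaps in percent than in seconds, here is a rough Python sketch that uses only the times listed above. The choice of the 5.0G / 1866 low latency run as the baseline and the helper name `percent_slower` are assumptions made for illustration, not part of any benchmark tool.

```python
# SuperPI 32M times in seconds, copied from the list above (lower is better).
results = {
    "2600K 5.1G 1866 LL": 402.52,
    "2600K 5.0G 2133 Auto": 403.18,
    "2600K 5.0G 1866 LL": 404.14,
    "2600K 5.0G 1866 Auto": 410.19,
    "OC I9-9900K": 412.39,
    "Bjorn3d I9-9900K": 412.46,
    "2600K 4.7G 2133 Auto": 424.57,
    "Bjorn3d I7-8700K": 437.81,
    "Dell Optiplex 5060 I7-8700": 463.58,
    "OC 3700x": 529.08,
}

# Assumption: use the everyday-stable 5.0G low latency run as the reference.
BASELINE = "2600K 5.0G 1866 LL"


def percent_slower(time_s: float, baseline_s: float) -> float:
    """Return how much longer a run took than the baseline, in percent."""
    return (time_s / baseline_s - 1.0) * 100.0


for name, seconds in results.items():
    delta = percent_slower(seconds, results[BASELINE])
    print(f"{seconds:7.2f} s  {delta:+6.2f}%  {name}")
```

    A negative number means the run finished faster than the baseline. Swapping `BASELINE` for any other key in the dictionary re-centres the comparison.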

    My 2600K is just a game machine running Win7, so I'm not running any of the Spectre or Meltdown "fixes" and perhaps this is why the results above show an 8 year old system beating modern HW on a single threaded benchmark. I also don't see AMD beating Intel just yet.

    If I ever get hooked on some demanding multi-core game perhaps my attitude will change, but with an upgrade at this point requiring motherboard, memory, and CPU the bang for the buck just isn't there and the wallet stays closed. To me, the CPU market feels very much like when Intel was recalling 1.1G P3s and the year or so before AMD launched the 1.4G T-Bird.

    Feel free to pick at the thinking behind this as I'm curious what others think. At this point, I have no reason to care about multithread beyond 4 cores, so not so curious about 4+ cores at this time.

    -Mike
     
  2. N4CR

    N4CR [H]ardness Supreme

    Messages:
    4,129
    Joined:
    Oct 17, 2011
    If you don't use modern games or apps you don't need to upgrade.
    My 2600k @ 4.4 also does fine for 1440 60... Most demanding game I have is no mans sky lol. That said it hitches a little on some loading points, so a new CPU would help there.
    But I do video so a zen2 or zen3 is the smartest choice going forward, as good in games and excellent in MT.

    AMD is a no compromise solution these days unless you are playing 240p 3r337 #1 esports champion in da worlstar (which everyone always likes to think they are).
     
    Last edited: Nov 29, 2019
    Sulphademus, IdiotInCharge and TXE36 like this.
  3. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    It is worthwhile reading the following article
    https://www.anandtech.com/show/1404...el-core-i7-2600k-testing-sandy-bridge-in-2019

    The key thing the intel 2xxx series doesn’t have is AVX-256, which when used can speed things up by a fair (5-25%) amount, sometimes more depending on the workload.

    So for single thread performance if you have similar clocks and don’t use it, then you won’t see a huge difference. (5%-10% tops)

    What this is not is a commentary on the speed of the I/O, e.g. memory, storage, etc., which has become faster by leaps and bounds.
     
    Last edited: Nov 30, 2019
    TXE36 and Dan_D like this.
  4. Mav451

    Mav451 [H]ardness Supreme

    Messages:
    4,549
    Joined:
    Jul 23, 2004
    I don't put much weight in AVX - and that's coming as a Haswell user who is *very* cognizant of the unprecedented thermal loads that AVX creates.
    Based on the OP's typical workloads (simulation, ST-heavy), I don't believe AVX or Ryzen's strengths are at all applicable to him.

    That said, while IPC has basically stagnated post-CFL, I think the combination of both high clocks (5GHz) and the IPC bump on current Intel platform could present some value. I wouldn't buy new, but used. But even then, there's no rush to upgrade either. I'm only on 4/4, and NGL, that Ryzen 3000-series roll out was disappointing. Maybe Ryzen refresh (Zen 3, 4000-series) gives Ryzen the clock headroom for the undisputed lead for good.

    And Keljian is right - there is still a big benefit to staying on a modern platform.
     
    TXE36 likes this.
  5. Dan_D

    Dan_D [H]ard as it Gets

    Messages:
    55,081
    Joined:
    Feb 9, 2002
    The lead AMD establishes will never be something we can consider to be permanent. Intel simply has too much money and too many resources to allow this to go on indefinitely. Although, Ryzen 3000 was anything but disappointing. Other than those points, I generally agree with you.

    The point is, processors are getting faster, they are just going about it differently than they did several years ago. I wouldn't dream of going 4c/8t for gaming today, and frankly, there is no need to. While going beyond 8c/16t now doesn't make much sense, this may change in the future. We aren't getting any faster in terms of clock speeds with AMD slipping in clocks from Bulldozer and Intel likely doing the same when they get off 14nm. But today, compared to a 2600K, you end up with clocks that are the same or slightly better on a 9900K, four more cores, eight more threads, more memory bandwidth and a platform with far better I/O and more PCIe lanes.
     
  6. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    AVX may not be big on your list of wants, but without it you can exclude yourself from some modern software. It is a pretty big deal from a technical standpoint
     
    TXE36 likes this.
  7. Randall Stephens

    Randall Stephens Limp Gawd

    Messages:
    504
    Joined:
    Mar 3, 2017
    Mike, just stick with a phone.
     
    sabrewolf732 likes this.
  8. TXE36

    TXE36 [H]Lite

    Messages:
    80
    Joined:
    Jun 14, 2018
    Thanks for the thoughtful and interesting replies. In reflecting back, it's not only an old platform I'm running, but the gaming SW I'm running is quite old as well - very unlikely using things like the newest AVX instructions. No doubt the newer processors and platforms have made improvements, but I think the YMMV part of this equation is a lot larger than it used to be. Going from P3-800 to 1400 T-Bird to 3.0C to Core2Duo to Sandybridge yielded improvements across the board. I'm pretty well convinced that the stuff I care about will see near zero improvement with the latest offerings from Intel and AMD.

    The biggest performance issue I have with my current machine is saturating the 1070TI and having to dial back settings. I figure I need about a 40% in-game performance boost which means I'm skeptical a 1080TI/2080 could deliver it. This drives me to consider SLI, as a second 1070TI is definitely within reach for less than $250. Upside potential is nearly a 100% performance boost, downside potential is it doesn't work and I sell off the extra 1070TI. 2080TI isn't a consideration as I'm not investing a grand in a video card.

    IMHO, the latest Intel and AMD releases are overhyped and the result of a market begging for a real upgrade path. I also agree with the opinion that the demise of sites like [H]ardOCP is likely due to the lack of truly interesting new product coming down the pike at realistic prices. The IPC/raw speed problem is a tough nut to crack these days. AMD and Intel have spent at least a decade on it and I know personally that Texas Instruments' DSP business has been gutted by it.

    -Mike
     
  9. thesmokingman

    thesmokingman [H]ardness Supreme

    Messages:
    5,357
    Joined:
    Nov 22, 2008
    Seriously? o_O
     
  10. TXE36

    TXE36 [H]Lite

    Messages:
    80
    Joined:
    Jun 14, 2018
    In the context of this discussion, yes.

    8 year old 2600K 32MSuperPI: 404.14s
    Brand spanking new 9900K 32MSuperPI: 412.46s

    Seriously? o_O

    I'll give you that I don't see as much hype for Intel as I do AMD. Still neither camp is offering me much of anything. Of course, YMMV.

    -Mike
     
  11. Dan_D

    Dan_D [H]ard as it Gets

    Messages:
    55,081
    Joined:
    Feb 9, 2002
    IPC is quite a bit further ahead of where it was when your 2600K was new. Sure, older software might not be much faster, but everything else is. With your 1070 Ti, I'd wager a 9900K at 5.0GHz would be a huge step forward.

    As for sites like HardOCP dying off, it has nothing to do with not having interesting product. We never had a shortage of articles to do. True, the money isn't what it used to be but you can blame YouTube and the ADD millennial crowd for that. Still, sites like Anandtech and even TheFPSReview are viable because you can go deeper in an article than you can in a video. It's also easier to consume while you are at work.

    HardOCP shut down for different reasons and Kyle has already covered that so I won't get into that.
     
    TXE36 likes this.
  12. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    Basing performance solely on superpi is very nearsighted. Really. If calculating digits of pi is all you do with a computer, then sure, go for it.


    I don’t, and won’t.
     
    TheSlySyl likes this.
  13. TXE36

    TXE36 [H]Lite

    Messages:
    80
    Joined:
    Jun 14, 2018
    Reading is fundamental:

    My measurement techniques are very tied to what I actually do with this machine, never suggested they were appropriate for other use cases.

    Which is my point for my use case, I can't go back and add new instructions to my single threaded apps -> I want my platform to run my older software faster as newer software isn't realistically on the table, thus no IPC nor performance bump for me.

    This is a big reason YMMV is so significant this time around. Core2Duo made everything faster, Sandy Bridge made everything faster, no new software required. This time around the software needs to use the new instructions and/or needs to be multi-threaded to really get a lift.

    I wrote that poorly, I didn't mean to imply that was the only reason for the demise of [H]ardOCP and I'm aware of what happened. That said, a lot of other sites have disappeared as well. Perhaps you can blame YouTube and shifting demographics, but I don't think that is the whole story. I, for one, can't stand video reviews. For today, Anandtech is decent, but IMHO, it ain't what it used to be. For the record, I already miss the [H]ardOCP reviews.

    Again, thank you all for the replies.

    -Mike
     
  14. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    Reading may be fundamental, and I do read, but I cannot accept that software for calculating pi is analogous to single thread performance in a game, regardless of what said game is.

    Your assertions beg the following questions: what performance increase are you trying to achieve and why?

    Certainly some games will benefit from storage/IO improvements, certainly some single thread software is storage bound. Without knowing what these games are, we can’t advise- so what are they?

    It really sounds to me like you are trying to justify spend on a new processor, but if things are “fast enough” you really don’t need to.
     
    sabrewolf732 likes this.
  15. Dan_D

    Dan_D [H]ard as it Gets

    Messages:
    55,081
    Joined:
    Feb 9, 2002
    Well, unless your 2600K can hit 5.0GHz, a 9900K will be faster. It took a lot to get a 2600K to 5.0GHz but it's relatively easy on a 9900K. You can do it with a simple decent AIO. There are also other advances that help. Cache improvements and so on. In any case,

    As for review sites, there are a number of reasons why many of them dried up. That would be its own post if I wanted to get into that. Each site would have its own story, but in general YouTube was a major factor for the shift away from the traditional review site. Many of them died off because they were mismanaged.
     
  16. TXE36

    TXE36 [H]Lite

    Messages:
    80
    Joined:
    Jun 14, 2018
    I'm actually going the other way :). I don't think the spend on a cpu is worth it at this time. I also wasn't really asking for game advice. I didn't want to clutter the thread with game titles that are not likely of interest here, but here it goes:

    My monitor is a Samsung 40" 4K TV.

    Auran/NV3 Games Trainz TS12 and Trainz: A New Era (TANE). Both of these games are real train/model train simulators that have a very small market and certainly are never part of any review site's selection of games for benchmarking. Years ago I found SuperPI is very good for predicting TS12 performance when CPU bound and it was a common benchmark seen in reviews. I have yet to find a common benchmark for the video card; Heaven tracks video bound performance decently, but not as well as SuperPI tracks CPU.

    TS12 is extremely CPU bound running code that isn't all that much different than the first version in 1999. Never been able to run it over 30 FPS smoothly. 30 FPS is okay for this but not great. To be smooth, it cannot drop below 30 FPS - it is just the way the game engine works. It can run into limits both in video and CPU. An X% improvement in 32M Super Pi pretty much means an X% improvement in TS12 if it is CPU limited. Thus, if TS12 is capping at 25 FPS and 100% CPU, a 20% improvement in SPI score will get it up to 30 FPS. I've been playing this single threaded SPI/Trainz thing for a very long time going back to the P4 and previous versions of the game.
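
    The rule of thumb above is simple enough to write down as arithmetic. This is only a sketch of that rule as stated, assuming TS12 is fully CPU limited and that an "X% improvement" means X% more work per second; since SuperPI reports a time, the ratio of old time to new time is taken as the speedup. The function names are made up for illustration.

```python
def speedup_from_times(old_seconds: float, new_seconds: float) -> float:
    """Throughput ratio implied by two SuperPI times (lower time is faster)."""
    return old_seconds / new_seconds


def projected_fps(current_fps: float, speedup: float) -> float:
    """CPU-limited frame rate after applying a speedup factor.

    Assumes frame rate scales linearly with single-thread throughput,
    which is the rule of thumb described above, not a measured law.
    """
    return current_fps * speedup


# The example from the post: capped at 25 FPS, 20% improvement.
print(round(projected_fps(25.0, 1.20), 2))

# Same idea starting from two benchmark times instead of a percentage.
old_time, new_time = 480.0, 400.0  # placeholder times, not measurements
print(round(projected_fps(25.0, speedup_from_times(old_time, new_time)), 2))
```

    The moment the video card becomes the limit instead, the linear scaling no longer applies and the projection overstates the gain.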

    Ironically, that 30 FPS limitation helped the video card because I like lots of antialiasing. When TANE came around, NV3 Games actually made some serious improvements in the CPU side of things, and running TS12 assets in TANE is no longer CPU limited. TANE will hold a constant 60 FPS and might intermittently load one core to 70% at times, while the others bobble around 35%, and most of the time the primary thread is at about 48% load with the 2600K running at 4.9 GHz. Game is clearly not limited by the 2600K.

    That flips me back to the video card, because now I run the game at 50 FPS and there are places on the map the 1070TI cannot keep up. It also raises another problem, the antialiasing on TANE in DX11 is horrid compared to DX9. Also note that all those nice forced antialiasing modes in Nvidia Inspector don't work in DX11. The game content is still quite dated, so DX11 doesn't look all that better. TANE is quite capable at maxing out the 1070TI at 4K. If I don't run it at 4K, the image quality is terrible which I suspect is from DX11, as TS12 in DX9 looks great at 1080P.

    Makes me wish I could get enough CPU to run TS12 at 50 FPS and that would require about a 125% improvement in IPC. For those old enough, a doubling of performance (100%) used to happen in a few years, so waiting for 125% more didn't use to be such a big deal.

    NV3 Games has gotten into DLC/DRM in a big way while being extremely slow to address fundamental issues, so my spend with them is done as I don't expect much from them to improve DX11 antialiasing. So it's either TANE or TS12 with enough CPU.

    Aside from Trainz, I'd also like to see FSX run like FS9 does today. While P3D represents further development past FSX, FSX has some features that P3D does not. FSX was designed single threaded and I believe they targeted a 10GHz CPU.

    So there's the why. What I was really hoping for was somebody to pipe up that the Meltdown and Spectre fixes were holding back SuperPI scores. It looks like my next spend will be in the video card department.

    As for current games, I've got a backlog of now old games to still try and life gets in the way of the rest.

    -Mike
     
  17. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    Years ago doesn't work for today based on

    Which suggests that the software is minimally but actually multithreaded, thus single threaded pi32 isn't going to give you a reasonable benchmark. This is based on the 2600k having 8 threads, 7 threads at 35% + one thread at 48% = 293% not 100%. (Rough calcs)
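
    Spelling out the rough calcs as a sketch: the per-thread figures are the ones quoted from TXE36's TANE description (primary thread at about 48%, the other seven logical threads around 35%), and the helper name is just for illustration.

```python
def total_cpu_load(per_thread_loads):
    """Sum per-thread utilisation, where 100 means one fully busy thread."""
    return sum(per_thread_loads)


# 2600K: 4 cores / 8 threads. One primary thread at ~48%, seven at ~35%.
loads = [48] + [35] * 7

total = total_cpu_load(loads)
print(total)                       # combined load, in percent of one thread
print(total / (100 * len(loads)))  # fraction of the whole CPU in use
```

    Anything well above 100 in the first number means the work cannot all be sitting on a single thread, which is the point being made about TANE.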

    Which means that while it is not fully optimised, incremental improvements may be made by having more actual cores (rather than just HT) to throw threads at. You may see a significant enough difference with a CPU with 8 cores.

    Here we get to the real story, the 1070Ti is not really a 4k card of any measure. DX9 and DX11 are completely different - DX11/12 are both much better threaded than DX9 in general. The NVidia drivers do a good job of threading in general, AMD didn't at last count but I haven't had an AMD card lately to compare. To get 60FPS@4k generally you need the equivalent of a 1080ti, especially if you want AA. The 2070 is pretty good as an alternative.

    NVcontrol panel will potentially give you some extra features like TSAA/FXAA etc which may give you some benefits, there's also some shader addons for games you may be interested in.

    Nvidia drivers use AVX256 if it is available, regardless of DX version.

    Per previous posts AND this post, superpi cannot be used as a yardstick.


    To summarise, a more modern processor with more cores would give you:
    -AVX256 to use with Nvidia drivers
    -more actual cores therefore better core population based on the load
    -more performance
     
    Last edited: Dec 2, 2019
    sram, Dan_D and IdiotInCharge like this.
  18. TXE36

    TXE36 [H]Lite

    Messages:
    80
    Joined:
    Jun 14, 2018
    DX11 antialiasing is noticeably poorer than DX9. TSAA/FXAA and shader options don't come close and there is little hope for improvement:
    https://www.nvidia.com/en-us/geforc...for-directx-11-anti-aliasing-driver-profiles/

    TANE suffers from this decontenting problem. DX11 has taken away features that were in DX9.

    It sure can for TS12. In TANE I'm not seeing the CPU limiting performance, steady 50 FPS is all I want. However, I'd really, really like to run TS12 at 50FPS, but that isn't likely to happen at all until one of these processor companies actually improves IPC without resorting to new instructions.

    Don't make the mistake of discounting the results of SuperPi simply because you don't like what it is telling you. Improvement in SuperPI scores has flatlined, that doesn't make them irrelevant due to all the existing SW. I've actually got 1M SPI scores going back to Tualatin P3 that I grabbed when convenient.

    This is how IPC used to change with generations, using the venerable E8400 as a reference:

    upload_2019-12-2_7-54-57.png

    The 2600K is only 28.5% faster than the E8400, but the E8400 is a whopping 236.3% faster than the P4 Prescott. All of those scores are at reasonable max overclocks for those processors.

    Nvidia driver performance based on AVX256? <---- Now that's interesting to me.

    -Mike
     

  19. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    Ok rather than speculate and use superpi, I have a system as per my signature (4k monitor, 1080ti, 9900k). I am prepared to buy TS12 and benchmark it if you would like a comparison point. Just tell me what settings and what I need to do, and I will run it.

    I'm downloading it now. Note there is a 2019 "version" which may be the simple rail to performance if it has the same features.

    Also, I am not making mistakes re superpi:
    You are comparing results that you said scaled with superpi back in the 00s, for something that was released in 2012, that uses graphics (therefore graphics loop), that has variable load, and is by your admission multithreaded but you insist on single threaded pi.

    It’s like comparing an apple with a plum because it is spherical. :D
     
    Last edited: Dec 2, 2019
    Dan_D likes this.
  20. vick1000

    vick1000 [H]ard|Gawd

    Messages:
    1,929
    Joined:
    Sep 15, 2007
    I just "upgraded" from a 2600K@4.0ghz, Z68X, 1600DDR3, on Win7, to a 9600K, z390, DDR3200 on Win10. I ran Heaven 4.0 just before the "upgrade", and saw a small drop in score. So I was concerned that I might have a "sidegrade" on my hands, but that would be ridiculous right?

    It really depends on the software. Heaven is old code, runs really well on old tech. Deus Ex MD was a completely different story, I was able to go full Ultra settings with the new system, partially due to DX12 I am sure. I hate the idea of Win10, but after a lot of tweaking, it's a good experience so far.

    I think the slight increase in single thread performance is not really significant enough for me, but the platform I/O is a significant upgrade. I was able to get a 660p in there, and I find that PCIe storage matters, it's a lot faster than SATA, I doubt the old system would handle Win10 as well as this one, and I had no choice but to change, since I want to play some DX12 titles.

    Plus, I have yet to OC this thing. That will probably reveal the biggest difference, since the 9600K will do ~5.0ghz with ease, where the 2600K maxed out at ~4.6ghz.
     
    TXE36 likes this.
  21. TXE36

    TXE36 [H]Lite

    Messages:
    80
    Joined:
    Jun 14, 2018
    Hmm, interesting as I'd be really curious how TS12 would run on a 9900K and 1080TI. I don't do Steam, but I'd be willing to for the $2.50 to buy TS12 and then we would be running the exact same version of TS12. I could then setup a save game file for you and we could run some fraps benchmarks that I have found to be very repeatable. So if you are willing, I'm willing. I think the results would be very interesting both from the CPU and GPU perspective. Let me know ASAP and I'll put the test files together.


    Just make sure you are looking at the right version. This is TS12 I'm referring to:

    http://www.trainzportal.com/product/view/trainz_simulator_12

    On Steam it's only $2.50 for the next 17 hours:

    TS12 is Trainz Simulator 12. The other versions are Trainz: A New Era (TANE) and TRS19. I'm not interested in TRS19. TANE would be difficult to compare as I run jailbroken DLC content on it that I don't feel comfortable sharing. I had to jailbreak the content because NV3 broke it in TANE - long story and one of the reasons for disinterest in TRS19.

    Interesting benchmarks for TS12:
    1) With low video settings, CPU limited framerate is very apparent, this would remove differences between our video cards.
    2) At high video settings it would be interesting to me to see the difference between the 1070TI and 1080TI. If the performance increase is more than expected due to core count/memory bandwidth/published benches then one could assume that is due to improved CPU instructions helping the Nvidia driver.

    There is a piece here that you are not understanding in my previous replies: I'm actually talking about two very different versions of the same game:

    TS12 is old code, going back to 1998 or so. It is single threaded. Some slight improvements have been made, but they are minor. Trainz customers have been complaining about this for a very long time and the underlying code base of TS12 is 21 years old. CPU limited performance of this version is closely tracked by SuperPI differences. When I'm referring to SuperPI and performance, I'm referring to this version only and it is very much apples to apples.

    In TS12, one can turn on an extreme amount of anti-aliasing and get really good video quality, albeit in DX9. TS12 only does DX9.

    IMO, the most important thing is for the game to run smoothly - 100's of tons don't roll by jittering. Thus, VSYNC or frame rate limiting is done to cap the frame rate. 50 FPS is sufficient - this is not a first person shooter. Triple buffering is also used. TS12 will run smooth at a given frame rate with the CPU at nearly 90% and the video card at 95% at the same time as the video card is rendering what the CPU came up with during the previous frame. It is very easy to tell just what is maxing out when the frame rate dips. TANE is a totally different animal.

    TANE truly is a new era as the engine code was rewritten and is now 64 bit. It has its own issues, but I've found it is a pretty good simulator base to run TS12 assets (long story). TANE is multi-threaded, and thus, SuperPI does not relate to it at all. TANE's primary issue for me is that it is only DX11 and has fairly poor anti-aliasing that appears to use a lot of GPU horsepower. It appears to run well on my 4c/8t 2600K and has never appeared to be CPU bound. I'd be very happy with it if the anti-aliasing worked as well as TS12 under DX9.

    I've always had two questions about these simulators:

    1) Is there a CPU powerful enough to run TS12 at 50FPS, and if so, how much video card is required to run at high image quality settings?

    2) How much video card is required to run TANE at 4K with high video settings to maintain 50 FPS?

    I doubt there is an existing CPU that can satisfy 1) today or in the near future. I think 2) may be satisfied by a 2080TI or maybe SLI'ed 1070TIs if SLI can be made to work in TANE.

    I hope I've clarified this and let me know. I think the results could be very interesting - just how much does the newest CPU arch help on old games? I'm thinking not a lot, but I would love to be proven wrong.

    -Mike
     
  22. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    That is the version I have, I am good to run it whenever you have a save ready
     
  23. TXE36

    TXE36 [H]Lite

    Messages:
    80
    Joined:
    Jun 14, 2018
    Ok, I've got it too. In all likelihood it will probably take a week to get this set up in my spare time. Previous versions of TS12 could be tricky to set up, I'm hoping the Steam version runs well enough out of the box to use one of the built in sessions and FRAPS.

    Meanwhile, how about a SuperPI32M from your machine?

    https://www.techpowerup.com/download/super-pi/

    Grab Super Pi Mod v1.5 XS and run the 32M version. Note the last two scores.

    -Mike
     
  24. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    9900k
    Total time 7m 6.744s - I do note my memory isn't the fastest on the planet, but I need quantity over speed for what I do. (3200CL16 CR2) and memory makes a big difference to pi

    So the previous two were
    6m 51.077s (24)
    6m 36.085s (23)

    Note I changed settings to -2 AVX offset from 0, and ramped up all cores to 5.0ghz
     
    Last edited: Dec 3, 2019
  25. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    TANE is $13.95 AUD (on special) on steam, that's more than I'm willing to commit to this just to prove a point - but if you want to paypal some funds I'll run that too.
     
  26. Dan_D

    Dan_D [H]ard as it Gets

    Messages:
    55,081
    Joined:
    Feb 9, 2002
    I'll fire the Super Pi benchmark off on a 5.0GHz 9900K and a 4.7GHz 10980XE and report back.

    EDIT: Here are the numbers:

    Core i9 10980XE @ 4.7GHz (All core)
    6m 49.669s

    Last two numbers:
    6m 20.790s
    6m 35.146s

    Core i9 9900K @ 5.0GHz (All core)
    6m 48.140s

    Last two numbers:
    6m 20.360s
    6m 34.590s

    I'm not sure why TXE36 cares about those numbers as that's not the reported result, just what it reports for those last two loops.
     
    Last edited: Dec 3, 2019
    Keljian likes this.
  27. TXE36

    TXE36 [H]Lite

    Messages:
    80
    Joined:
    Jun 14, 2018
    The number I always use for a 32M SuperPI is from loop 24. The PI value output seems to be dependent on disk speed so I don't care so much about it.

    TANE isn't worth the trouble.

    In my experience, memory speed and latency impact SPI scores. Low latency memory is equivalent to about another 100MHz in clock speed:

    upload_2019-12-3_7-40-34.png

    Getting into the 6:20's is a good 5% improvement in the benchmark.
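
    To put the posted times on the same footing as the seconds in the first post, a quick conversion sketch follows. The time strings are copied from the replies above and 404.14 s is the 2600K 5.0G / 1866 low latency result; the function names are illustrative, and the sketch does not decide which printed line is loop 24 on each machine.

```python
import re


def to_seconds(text: str) -> float:
    """Convert a SuperPI style time such as '6m 20.360s' to seconds."""
    match = re.fullmatch(r"\s*(\d+)m\s*(\d+(?:\.\d+)?)s\s*", text)
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def percent_less_time(new_seconds: float, old_seconds: float) -> float:
    """Reduction in run time relative to the old result, in percent."""
    return (1.0 - new_seconds / old_seconds) * 100.0


BASELINE_S = 404.14  # 2600K 5.0G 1866 low latency, from the first post

# Times posted in this thread for the 9900K and 10980XE runs.
posted = ["6m 20.360s", "6m 34.590s", "6m 20.790s", "6m 35.146s",
          "6m 36.085s", "6m 51.077s"]

for text in posted:
    seconds = to_seconds(text)
    change = percent_less_time(seconds, BASELINE_S)
    print(f"{text:>11} = {seconds:7.3f} s  ({change:+.2f}% vs 2600K)")
```

    A positive percentage means less time than the 2600K run, a negative one means more.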

    -Mike
     
  28. sabrewolf732

    sabrewolf732 2[H]4U

    Messages:
    4,044
    Joined:
    Dec 6, 2004
    lol

    You're complaining about antiquated/poorly optimized game code running poorly on new hardware being indicative that new hardware hasn't advanced?

    Also, super pi isn't totally indicative of IPC

    i.e as per your posts the athlon 64 was comparable to the northwood and prescott in superpi times but was drastically faster in games from that time period.

    Silly post is silly. Enjoy your old games.
     
  29. Dan_D

    Dan_D [H]ard as it Gets

    Messages:
    55,081
    Joined:
    Feb 9, 2002
    Agreed. Super Pi is a computation and a benchmark. The kind of task that it is doesn't resemble gaming engines in the slightest and is a poor metric to go by.

    You are also picking the number you like for........................reasons. The final number is the number it reports as being the actual result. It's also meaningless. As much as I trash 3D Mark for not using an actual game engine and being a poor metric to use for game performance, it's far better than using Super Pi for the same thing.
     
    Last edited: Dec 3, 2019
    Keljian and sabrewolf732 like this.
  30. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    The reason the 10xxx processor ranks so fast is number of memory channels and speed. WRT the difference between my 9900k and Dan’s it is memory speed/latency.

    All of that said, I have yet to find software that is limited in some way by the 9900k in my use
     
  31. Dan_D

    Dan_D [H]ard as it Gets

    Messages:
    55,081
    Joined:
    Feb 9, 2002
    When I ran that test, I had switched to some G.Skill TridentZ NEO RGB DDR4 3600MHz modules. These are 2x16GB modules with CL16 timings vs. 4x8GB CL18 overclocked to 3866MHz. Those modules are on the test bench in the 10980XE system.
     
    Keljian likes this.
  32. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    Yah - regardless, I still think it's a pointless benchmark for this use case.
     
    Dan_D likes this.
  33. TXE36

    TXE36 [H]Lite

    Messages:
    80
    Joined:
    Jun 14, 2018
    Leaving out the PI write at the end makes the BM more repeatable because it removes the uncertainty of the disk access. Nothing wrong with comparing one machine's loop 24 to another's.

    Is it really that hard for you to believe that one game engine could see the same performance deltas between CPU clock speeds and X86 architectures when one program is that game engine while CPU limited and the other a benchmark of CPU throughput? Especially for a game you're very likely not familiar with? Really??? Can't *ever* happen?!??!

    Think about it...everything a computer does is a computation. I never suggested extending SPI results past this single game when it is being CPU limited.

    -Mike
     
  34. TXE36

    TXE36 [H]Lite

    Messages:
    80
    Joined:
    Jun 14, 2018
    Ok, I've got it downloaded and running. Never have tried this particular version, but it is acting as expected. I'll put some sessions together to gather Fraps framerate data for the classic a la [H]ardOCP "maximum playable settings" type graph. In the meantime, here is a quick and dirty test that can be used to synchronize settings:

    Keep the 1080TI as fast as possible, don't enable any extra anti-aliasing settings, the goal is to see how the CPU is limiting frames. No vsync either to cap frames. Make sure the settings in the Nvidia driver for the game are at defaults. I'm using Win7 64 and Nvidia driver 436.02, but I don't think that will matter.

    1) Start the game and click Options
    2) Don't need to change General or Planet Auran Tabs.
    3) On the Display Settings tab select Directx, 1920x1080, 32 bit, fullscreen, Aspect Ratio Auto, Antialias Mode 2
    4) On the Advanced Options tab select Vertical Sync Auto, Frequency Auto, and uncheck Shadows.
    5) On the Developer Tab set the Asset Backups to 0
    6) Click Ok
    7) Click Start (The first time you run this it will do a database rebuild)
    8) Click Select Route
    9) Select Route Norfolk & Western - Appalachian Coal
    10) Select Eastbound Coal Train
    11) Close the dialog boxes
    12) Go into the menu in the upper left corner
    13) Select video settings and Set them to:
    Max Draw Distance: 4000m
    Scenery Detail: High
    Tree Detail: Ultra
    Texture Detail: High
    Anisotropy: 16 Highest
    Close the settings
    14) Wait for the opening scene to fully load. Once the view stops populating, what is your frame rate? What is your video card load?

    As the scene loads the frame rate will drop. I end up at 32FPS with a video card load of 20% at 1607MHz on my 1070TI. 1080TI should be able to do this no sweat.

    I'm curious what you get with the 9900K and ~35% more video card. Thank you for doing this.

    -Mike
     
  35. Dan_D

    Dan_D [H]ard as it Gets

    Messages:
    55,081
    Joined:
    Feb 9, 2002
    Since most people are likely executing this off an NVMe drive, I don't think that matters. Of course, that negatively impacts you because you're running a 2600K and won't have access to bootable NVMe drives on that platform.

    No, I don't think it works that way. There are a ton of variables that impact a game's frame rates. We've seen memory speed and latency impact frame rates depending on a CPU's architecture. The PCI-Express bus has some impact, although it's often a small one. However, in the era of 2600K's we saw chips on motherboards that multiplexed the PCIe lanes, handling switching and "adding" lanes while still being constricted to the same x16 lanes the CPU had to offer. The penalty was latency. Newer systems lack those as they fell out of fashion years ago. You can't correlate a Super Pi result by itself into a meaningful data point as it relates to any one game.

    I have no interest in the games you're talking about, but I'd wager that at a higher resolution, or on significantly more modern hardware and higher end graphics cards, the experience would be better. It pretty much always is. Sure, CPU's like the 2600K were remarkably relevant for an extremely long period of time relative to any preceding processors I can think of. Even with an older game, I still think you're going to get better performance on a more modern processor. If not for the CPU itself, for all the other benefits a modern platform has to offer. Again, memory bandwidth, I/O bandwidth and so on all factor in.

    IPC improved roughly 3% per generation after Sandy Bridge. You seem to forget that. Intel made stronger claims of anywhere from 7-11% a lot of times, but this did often require a "best case scenario" where additional instructions or specific workloads were required. But the IPC improvements were there even without that due to optimizations in cache design and other architectural improvements. The real reason why CPU's stagnated so long is because Intel concentrated on performance per watt and not raw performance. This allowed IPC gains to occur at lower clock speeds and with less power consumption as that benefits both mobile and server markets which are Intel's bread and butter.

    We would see a 2-3% performance improvement in IPC, but lose 200MHz of clock speed each generation resulting in a wash for performance. The platform changes were also incremental. However, in case you hadn't noticed we are either at or beyond Sandy Bridge level clock speeds. Those used to clock to 5.0GHz and in some cases more. I've got a 9600K that can do 5.1GHz. The 9900K or 9900KF can achieve 5.0GHz pretty easily. 9900KS's can achieve upwards of 5.2GHz. Previously, I hadn't seen 5.1GHz or anything on that level since Intel's utterly craptastic Core i7 7740X. Since Skylake's release, Intel has done many process improvements to bring the clock speeds up. IPC hasn't changed much if at all since, but the clocks have only improved so now we have architectures that are significantly faster than Sandy Bridge with the same or better clock speeds.
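    The "wash" arithmetic above can be sketched with the usual first-order model where single-thread performance is roughly IPC times clock speed. This is only an illustration: the function name, the 5.0GHz starting clock and the generation counts are assumptions chosen to mirror the figures in this post, not measurements.

```python
def relative_performance(ipc_gain_per_gen, start_clock_ghz, clock_step_ghz, generations):
    """Return performance relative to the starting chip (1.0 = no change).

    First-order model only: performance ~ IPC * clock. It ignores memory,
    cache and platform effects, which are discussed elsewhere in the thread.
    """
    ipc = 1.0                      # IPC of the starting generation, normalized
    clock = start_clock_ghz
    for _ in range(generations):
        ipc *= 1.0 + ipc_gain_per_gen   # compound the per-generation IPC gain
        clock += clock_step_ghz         # negative step = clocks went down
    return (ipc * clock) / start_clock_ghz


# One generation: +3% IPC but -200MHz from an assumed 5.0GHz overclock.
one_gen = relative_performance(0.03, 5.0, -0.2, 1)
print(f"+3% IPC, -200MHz: {one_gen:.3f}x")

# Six generations of +3% IPC with clocks back at the same 5.0GHz.
six_gen = relative_performance(0.03, 5.0, 0.0, 6)
print(f"6 gens of +3% IPC at equal clocks: {six_gen:.3f}x")
```

    In this model a 3% IPC gain is cancelled by a 200MHz drop at 5.0GHz, while the same gains compounded at equal clocks add up to a noticeable difference.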

    You also discount the other benefits an upgrade offers. Namely, the additional cores and threads improve overall system performance. You'll have more resources for background tasks and Windows optimizations for newer OS'es are in place for newer processors and platforms and the features they provide.

    Look, I don't know what you're trying to do here. Are you trying to convince us that there is no point in upgrading, or yourself? Spend the money, or don't. We don't really care. If all you do is play those old ass simulators then maybe it's not worth it to you, but any notion that it isn't worth it based on Super Pi results is misguided at best. You are misinformed if you think that stupid benchmark has anything in common with a game engine, even an old one.

    Real quick, I wanted to address this point since I test CPU's with high end graphics cards at both low and high resolutions literally all the time for reviews. While keeping AA off would make the game more CPU limited, if you are truly held back by a CPU, this will be evident at higher resolutions as well. I've done the tests and games like Ghost Recon Breakpoint and Destiny 2 run substantially worse in some configurations using a high end graphics card, at both 1920x1080 and 3840x2160 with different CPU's.
     
    PhaseNoise likes this.
  36. Masfeo

    Masfeo n00b

    Messages:
    10
    Joined:
    Dec 3, 2019
    We will see son...
     
  37. TXE36

    TXE36 [H]Lite

    Messages:
    80
    Joined:
    Jun 14, 2018
    Don't care, never asked about speeding up storage, and for the question at hand, loading time is a non-issue.
    How would you know? Have you ever tweaked *this* game - I'm assuming no by your expressed disinterest.
    All I really asked was a simple question: I have this game here that when it is CPU limited changes in performance track this common benchmark. When I look up scores for this common benchmark on the latest CPUs, the scores aren't really any better than what I've got. Is this what I should expect? I asked nothing about other upgrade advice, I was curious about this one thing, that's all. I haven't discounted the other benefits, I haven't said anything about them because I don't care about them. I haven't suggested that anybody else not upgrade.

    Don't need the Intel sales pitch for features I'm not looking for, just a simple, direct answer to the question. I really don't mean to threaten your security, but that response quoted above makes me think I hit a raw nerve. I mean all of that, because I asked:
    Again, really?

    Patience grasshopper, that is only a starting point. I have been tuning this game for a very long time, what works is tuning the CPU w/o GPU limits, cranking up the resolution and anti-aliasing, then tweaking a thing here and there. The most difficult part is finding the parts of the game that really load up either the CPU or GPU as it can max out both at different times.

    With Keljian's offer, we have the potential to find out.

    I'm genuinely curious and willing to let the chips fall where they may. Are you?

    -Mike
     
  38. Dan_D

    Dan_D [H]ard as it Gets

    Messages:
    55,081
    Joined:
    Feb 9, 2002
    You brought up storage as the reason not to look at the actual number Super Pi gives you as your result and instead view loop 24. That's the only reason I brought it up. I'm not even sure that's accurate as I ran the test from a mechanical hard drive in my case.

    You mean; How do I know Super Pi doesn't translate to game performance? Because it doesn't. There is a big difference between what's done in a game engine and calculating Pi over and over again. Also, experience tells me that modern processors are faster than older ones even if Super Pi doesn't really showcase this. There are plenty of other benchmarks and applications that do.

    For example, cache sizes and cache design greatly impact game performance. This is literally how AMD raised the performance of its Ryzen 3000 series to be much more in line with Intel's gaming performance than its 2000 series was. AMD even calls the increased L3 cache; "Gamecache."

    Changes in that benchmark and changes in Super Pi are coincidental. I'd be willing to bet they do not correlate into anything useful. Meaning, you can't use Super Pi as an indicator for how many FPS you're going to get under xyz circumstance. It's the same for something far more sophisticated like 3D Mark, which is actually designed for that purpose. The variables that impact it do not necessarily translate to games. You can't say I get 8056 3D Marks, so I can get 120FPS in CoD:MW 2019. It simply doesn't work that way. Again, calculating Pi isn't the same as running a game engine. That goes for any engine.
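    Whether the two actually track each other can be checked rather than argued. A minimal sketch, assuming someone collects paired measurements (a Super Pi time and a frame rate at each of several CPU settings on the same machine): a Pearson coefficient near -1 would mean lower Super Pi times go with higher FPS, and a value near 0 would mean no linear relationship. The function name is hypothetical, no data from this thread is included, and a strong correlation on one machine still would not show that the relationship holds across architectures.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sample has no defined correlation")
    return cov / math.sqrt(var_x * var_y)
```

    Usage would be along the lines of pearson(superpi_seconds, fps) with one entry per tested clock speed; with only three or four points the result is suggestive at best.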

    For Super Pi? Yeah, I'd think so. Clock speed is going to matter here more than anything and the architectural changes made since Sandy Bridge are probably not going to have much of an impact here. Calculating Pi is a pretty simple task. It's hard on the CPU in a sense, but it doesn't utilize that much of it. Again, this is why you can't look at Super Pi results as an indicator of game performance. You are comparing Apples and 1973 Mustang II's. There is nothing meaningful in the comparison.

    I brought up the other things because those other things you aren't interested in can and do directly impact game performance. Again, Super Pi doesn't make sense as an indicator of game performance. You are putting way too much stock in it. Again, you're misinformed in thinking that Super Pi in any way, shape or form has any bearing on how different CPU's behave in games. My Super Pi times are similar on my 10980XE and the 9900K. However, the latter is far better at actually playing games.

    What I said wasn't an "Intel sales pitch." First off, I wouldn't recommend an Intel processor in most cases right now. Secondly, I am trying to make a point that Super Pi doesn't mean jack shit in the realm of gaming. How do I know this? It's simple. I've literally been reviewing and working with this hardware for more than two decades. I've had all of these generations of CPU's on my test bench and I can tell you that the benchmarks showcase the differences and how far we've come. Sure, it's not the same as taking a Pentium III at 233MHz and comparing it to one at 1GHz, but there have been major advancements since Sandy Bridge. Many of those advances will impact your gaming experience.

    It seems I'm the one who hit the raw nerve.

    Tuning every game is pretty much the same. You find out what settings have the most impact on performance and decide what trade offs to make regarding performance vs. visual fidelity. As for the hardware side, it's not that complicated. You get your CPU, RAM and GPU running as fast as possible while being stable. The only part of the equation that's sometimes difficult to manage is the game settings themselves. Certain shadow options or other features may impact one engine more than another in terms of performance, but due to implementation, may or may not impact visuals very much.

    The data is what it is. I'm curious to see what his findings are. Either way, I'm still fairly certain that Super Pi can't be used as a meaningful benchmark for determining game performance. This is where you've seriously gone wrong here. Even if the 9900K isn't much faster with this game than your 2600K, it still wouldn't prove that Super Pi is a good metric to go by. Again, the Super Pi benchmark results are virtually the same between my 10980XE and the 9900K. Yet, for gaming, the latter is considerably better in most cases. While average frame rates may report the same, getting into the lows, maximums and frame times, you'll see that the 10980XE is in some cases, vastly inferior to the 9900K. In other words, if all I get out of you two is an average frame rate for each system, I won't be convinced because that by itself is virtually meaningless.
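    To illustrate why an average frame rate alone says little, here is a minimal sketch that summarizes a list of per-frame times the way a frame-time log (for example from Fraps) could be summarized. The two sample runs are made-up numbers chosen so that both average exactly 50 FPS, and the "1% low" definition used here (mean of the slowest 1% of frames) is an assumption, since tools define it differently.

```python
def frame_stats(frame_times_ms):
    """Summarize per-frame render times (milliseconds) as FPS figures."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    # Instantaneous FPS of each frame, slowest first.
    fps = sorted(1000.0 / t for t in frame_times_ms)
    # Average FPS is frames rendered divided by total elapsed time.
    avg_fps = 1000.0 * len(frame_times_ms) / sum(frame_times_ms)
    # Assumed definition: mean FPS over the slowest 1% of frames (at least one).
    worst = max(1, len(fps) // 100)
    return {
        "avg_fps": avg_fps,
        "min_fps": fps[0],
        "max_fps": fps[-1],
        "one_percent_low_fps": sum(fps[:worst]) / worst,
    }


# Hypothetical runs, not measurements: same average, very different lows.
steady = [20.0] * 100                 # every frame takes 20 ms
spiky = [18.0] * 90 + [38.0] * 10     # mostly faster, with regular hitches

for name, run in (("steady", steady), ("spiky", spiky)):
    stats = frame_stats(run)
    print(name, {k: round(v, 1) for k, v in stats.items()})
```

    Both runs report the same average, but the second one hitches on every tenth frame, which is exactly what the lows and frame times expose.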

    I actually test performance for a living and I don't limit my scope to a single ancient game or ancient hardware. I don't judge hardware's viability for gaming by a benchmark that has nothing to do with gaming and that wasn't designed as a metric for gaming performance in the first place.
     
    Last edited: Dec 4, 2019
    Keljian likes this.
  39. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    I have had a particularly hectic day. I will get to this as soon as I can.
     
  40. Keljian

    Keljian Gawd

    Messages:
    701
    Joined:
    Nov 7, 2006
    For the record doing the things asked for above:
    13.5% CPU usage (if it were single threaded it would be <10% like Super Pi; 100% / 16 threads = 6.25% per thread)
    15-16% GPU usage
    30-31 FPS in fraps using the same setup
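
    The per-thread arithmetic above can be written out as a small sketch. It converts an overall CPU usage figure into "fully busy thread equivalents" on a 16-thread CPU. The function name is hypothetical, and the result is only a rough indicator: the OS moves threads between cores and the usage figure is an average over time, so about two thread equivalents does not prove exactly two busy threads.

```python
def busy_thread_equivalents(total_cpu_percent, logical_threads):
    """Convert overall CPU usage into the number of fully loaded threads
    that would produce the same reading."""
    if logical_threads <= 0:
        raise ValueError("logical_threads must be positive")
    per_thread_percent = 100.0 / logical_threads   # 6.25% on 16 threads
    return total_cpu_percent / per_thread_percent


# Figures quoted in this post: 13.5% total usage on a 16-thread 9900K.
print(busy_thread_equivalents(13.5, 16))
```

    By the same formula, a purely single-threaded load on this CPU would read about 6.25%, so 13.5% suggests a bit more than two threads' worth of work in total.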

    Based on what I see on the screen, I think what is happening is that it is trying to continuously, serially decompress textures/objects, and that is locked to some kind of cycle in the game loop which limits the performance. Whether it does this on purpose is another thing; it may be that it's locking the frame rate to ~30 FPS on purpose for game mechanic reasons, e.g. physics.

    It is highly possible that the draw distance or some other feature is limiting it also.

    I don’t believe that this is actually cpu limited, at least no more than I believe UT3 was at release where it had a leak which pegged as many threads as it could at 100% but most of that was engine waiting on things to happen, this was fixed by later versions.

    Based on my many years of game/programming knowledge, I truly believe the issue here is bad coding and/or lack of optimisation due to locking code - not lack of cpu power.

    In the extremely unlikely event that I am incorrect, it is possible that a Ryzen chip of decent clock (say 4.4-4.6GHz single thread) would beat out the Intel equivalent for this, since they do have more execution pipelines/resources per core, and therefore potentially fewer opportunities to stall. This does rely on what the compiler did when it compiled the program though.

    I also ran the game off the NVMe drive (per sig); disk usage was negligible.
     
    Last edited: Dec 4, 2019
    TXE36 likes this.