Optane Memory - Is it for you? <Opinion/Review>

Keljian

Over the last month or so, as a result of a thread on the Optane 900p, I have been experimenting with Optane memory and Primocache.

What was established in that thread is that Optane has very low latency and high performance at low queue depths, which generally means that desktop database-style workloads (including things like browsers and games that use SQLite) might benefit from having some Optane cache.

I have not been sponsored by anyone; all costs for these tests have come out of my own pocket.

The system I'm running is:
  • Ryzen 1700, with a mild overclock (3.7 GHz on all cores)
  • Asus Prime X370-Pro
  • 16 GB of DDR4-3000 running at 14-14-14-34
  • Zotac AMP 1080ti (for completeness - not that it matters)

In terms of what disks I have to test with - I have the following:
  • 950 -- 512 GB Samsung 950 Pro with a cheap and cheerful heatsink from eBay
  • 850 -- 512 GB Samsung 850 Evo
  • MomentusXT -- 512 GB Seagate Momentus XT (the original, with SD23 firmware). Note: updating the firmware on this has been a royal pain in the behind; I have spent hours trying on multiple machines without success. The drive itself has been reliable.
  • Tosh128 -- 128 GB Toshiba which I pulled out of a Dell Inspiron 11 3000 (THNSNK128GCS8)

Software-wise, I'm using the Samsung NVMe driver for the 950.

As for the Optane, it's an original 32 GB Optane Memory drive, sitting on a riser in the second x16 PCIe slot.

Regarding the configuration of PrimoCache:
  • Block size is set to 4 KB
  • RAM cache (L1) is set to 1 GB, as this is the minimum someone would likely use in a general use case
  • Both the L1 (RAM) and L2 (Optane) caches are set to shared, so the entire cache space serves both reads and writes
  • Deferred Write is on (10 sec)
  • Prefetch is on
  • L2 size is set to maximum

FAQ
Why haven't I used ATTO/IOMeter/HDTach?

Because they're not relevant for the majority of people, and the sheer volume of data they produce with just four devices is obscene. They wouldn't add anything relevant to the results.

What am I looking for?

Something that will justify to Joe/Jane Average that Optane Memory is worth it for them, instead of or alongside an NVMe drive.

Why do I care?

Because the combination of 32 GB of Optane and PrimoCache is a relatively inexpensive investment, probably in the price range of a 256 GB SSD, so it's accessible. It's also new technology, and I like playing with new technology.

Why am I doing this?

Because no reviewer to date has put together a test like this. They've tested SATA SSDs and SATA HDDs, but they have not:
  • Run a low block size cache (at block level)
  • Tested with a reasonably fast NVMe drive being cached
  • Tested on a Ryzen system

...and few have tested with commodity SATA SSDs.

Will you Bench X/Y/Z?

In short, yep, I'll keep adding to this thread as I go - suggest a test and I'll give it a whirl.



On to the testing...

CrystalDiskMark

For CrystalDiskMark, I wanted the dataset to be larger than the RAM cache, so that the test focuses specifically on the Optane. I therefore raised the dataset size from the original 1 GB setting to 2 GB.
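The reasoning behind the dataset size can be put as a bit of arithmetic. This is only a sketch: the function name is made up for illustration, and it assumes the best case for the RAM cache (it is completely filled with test data and nothing else).

```python
def min_fraction_past_l1(dataset_gb: float, l1_cache_gb: float) -> float:
    """Smallest share of the test data that cannot be resident in the RAM (L1) cache.

    Hypothetical helper for illustration; assumes the L1 cache holds only test data.
    """
    if dataset_gb <= 0:
        raise ValueError("dataset size must be positive")
    overflow_gb = max(dataset_gb - l1_cache_gb, 0.0)
    return overflow_gb / dataset_gb


# Settings from this post: 1 GB RAM cache, dataset raised from 1 GB to 2 GB.
for dataset_gb in (1.0, 2.0):
    share = min_fraction_past_l1(dataset_gb, l1_cache_gb=1.0)
    print(f"{dataset_gb:.0f} GB dataset: at least {share:.0%} must come from beyond the RAM cache")
```

With the original 1 GB dataset the whole test could in principle be served from RAM; at 2 GB at least half of it has to come from the Optane (L2) or the underlying drive.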

CrystalmarkBaseline.JPG


upload_2018-5-31_15-44-31.png


So a few things become immediately clear:

  • The MomentusXT's speeds increase dramatically (as expected), as do those of the other SATA drives, including the Samsung 850 Evo
  • The 950's read speeds don't change much at the top end, but in the 4k range they improve dramatically
  • The 950's write speeds drop a lot at the top end (to roughly a third of the uncached performance)

AS SSD Bench

For the AS SSD benchmark I did a similar thing: I set the working set size to 3 GB (2 GB isn't an option) so that it would exceed the 1 GB RAM cache. I also rearranged the results for readability, and ran each test a few times to make sure the results were consistent.
AS SSD Baseline.JPG
AS SSD Results.JPG


AS SSD Access times.JPG



The things that stand out here:
  • Threaded 4k access: essentially a huge jump in performance for anything that uses a database.
  • Access times: they don't quite double, but direct access does get slower with the cache in place. I assume this is because you're adding some smarts to the storage layer.

So in Conclusion
SATA
If you have a SATA drive as your primary data drive, no matter what it is, the combination of Optane and PrimoCache is worthwhile. You'll notice it in loading times and Windows boot times, and you'll see a jump in write speeds.

NVME
On the surface, things are a little less clear with NVME. In addition to the tests I've run here, I've run several different cache sizes and the results are essentially the same. Write speeds go down. On the other hand you see read speeds, especially for 4k threaded loads, rocket up. The answer to what is better here depends on your use patterns.

If you play games, browse the web, and are generally a "light" storage user, then Optane will give your system a kick in the pants. If you are doing heavier work where lots of files continuously need to be copied, written and read, then stick to not using cache. That said, I can't see where you'd have this circumstance in a home environment.
If you are doing this, it makes sense to limit the cache to 2-3 GB or so, so that it fills quickly and the overflow goes directly to the drive, while you still get the benefit when using other desktop software.

I guess this all says that even if you have NVME drives, it is worthwhile having the Optane/Primocache combo.
 
I've always wanted an Optane for some reason. I think the theory is neat.

Looks like I could possibly justify one for my 960 Evo, based on your results.

How easy is the software to set up and tweak? I have multiple drives I'd like to try it on. Will that be an issue?

The software is reasonably easy to tweak and you can cache multiple drives with one Optane (I have).

upload_2018-6-1_13-4-38.png upload_2018-6-1_12-56-18.png upload_2018-6-1_12-58-19.png

The key is the 4kb block size which plays to the Optane's strengths.

Regarding setup: you have to set up the L2 storage space (format it as NTFS, then allocate it) and reduce the preset L1 cache size. I like deferred write and am prepared to shoulder the risk, as the L1 cache size isn't too big. Prefetching the last cache makes booting a bit quicker.
 
Appreciate the effort, but this needs real life tests.
 
I have the Momentus XT 750 and it is a good drive. It's too bad they never expanded on this in any meaningful way. The newer 3.5" drives still have only a small amount of flash (8GB).

I used the SanDisk ReadyCache 32GB on my work computer while it still had a spinner, and that was a pretty good implementation of caching, especially as I could see the difference daily between my computer opening programs and my co-workers' machines with their hard drives thrashing away.

I do like the idea of caching. It would be great if the OS was aware of it and could optimize like Apple's Fusion Drive does (which also has no limit on SSD size). I don't know if it's really applicable to power users like many of us who have all-SSD storage, but for regular people who want an OS drive with 1 - 2 terabytes, this is a great solution.
 
Ok with 950 pro - FINAL FANTASY XIV: Stormblood Benchmark

cache on 12.283 sec
cache off 15.287 sec

This correlates reasonably well with the results that Tweaktown managed, essentially showing that there's a 20% gain in performance with the optane caching the 950.
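For anyone who wants to check the percentage, here is a small sketch using the two timings above. The function name is made up for illustration. Note that the figure depends on which way round you measure: time saved relative to the uncached run, or speedup relative to the cached run.

```python
def load_time_change(baseline_s: float, cached_s: float) -> tuple[float, float]:
    """Return (fraction of time saved, fractional speedup) for a cached run."""
    time_saved = (baseline_s - cached_s) / baseline_s
    speedup = baseline_s / cached_s - 1.0
    return time_saved, speedup


cache_off_s = 15.287  # from the post: load time without cache
cache_on_s = 12.283   # from the post: load time with the Optane cache

time_saved, speedup = load_time_change(cache_off_s, cache_on_s)
print(f"time saved: {time_saved:.1%}")
print(f"speedup:    {speedup:.1%}")
```

The load time drops by a little under 20%, which is the figure quoted; expressed as a speedup it is a little over 24%.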
 
After which run?
 
I thought Optane didn’t run with Ryzen? That it needed a 7th gen Intel? What voodoo is this?
 
After which run?

Run 2 (is it surprising that cache only works on subsequent runs? isn't that the idea?) - single run speeds are identical to no cache.

I thought Optane didn’t run with Ryzen? That it needed a 7th gen Intel? What voodoo is this?

Optane works with Ryzen; it just acts like a normal disk.
 
Run 2 (is it surprising that cache only works on subsequent runs? isn't that the idea?) - single run speeds are identical to no cache.
It's not, but it is hard to be sure of the benefits if the cached data gets swapped out often. Then there is the question of how much of a game's assets you actually load multiple times, and similar factors.
 
Yeah, that's why a 16/32 gig cache makes sense, you'd think it would swap less frequently
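To illustrate the swapping point, here is a minimal sketch that replays the same sequence of block reads twice against a cache of a given size. It assumes a plain LRU (least recently used) replacement policy, which is an assumption for illustration only and not PrimoCache's documented behaviour; the function names and block counts are made up.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, capacity_blocks):
    """Fraction of accesses served from an LRU cache holding capacity_blocks blocks."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(accesses)


def two_runs(working_set_blocks):
    """The same blocks read in the same order twice, like running a benchmark two times."""
    one_run = list(range(working_set_blocks))
    return one_run + one_run


accesses = two_runs(1000)
for capacity in (500, 1000, 2000):
    print(f"cache of {capacity} blocks: hit rate {lru_hit_rate(accesses, capacity):.0%}")
```

In this toy model the first run never hits, and the second run only hits if the cache is big enough to still hold what the first run loaded. A cache smaller than the working set gets no hits at all here, which is the extreme case of data being swapped out before it is reused.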
 
I thought Optane didn’t run with Ryzen? That it needed a 7th gen Intel? What voodoo is this?

Intel's cache software does not run on Ryzen or older Intel boards. However, you can use other software on Windows, and certainly on Linux, that provides similar (even better) features to Intel's cache software.
 
I'm looking into adding a 58GB 800p and PrimoCache to my system, acting as cache for both my SATA SSDs. Would you recommend it?
 
Sorry for the late reply - yes I would recommend it, but I think 32 gig would be enough for general purpose things and be less than half the price
 
So... use it?


Got no use for it. Probably go on Ebay or thrown out in a few years time. I have two NVME drives that are far faster and larger and I can't detect the difference in latency. Plus I don't use spinning rust anymore so...

It was bought for a now-abandoned experiment that had little long-term payoff.
 
Sorry for the late reply - yes I would recommend it, but I think 32 gig would be enough for general purpose things and be less than half the price

Would it make sense to make a small partition for swap file and then use the rest for read cache?
 
Please note that this kind of test runs into the issue of comparing apples to oranges.

The benchmark asks for uncached reads/writes in order to bypass the Windows cache, so in normal mode you get uncached reads/writes.
PrimoCache does not adhere to this and still caches the reads/writes.

So you are comparing cached reads/writes to uncached reads/writes, which naturally makes the difference appear a lot bigger than it would be in the real world (since real applications would use cached I/O as well).
This is not a good area for synthetic testing.

Following up with real-world applications would put a lot more weight behind the claims.
 
Rather than just saying this, why not suggest a test?
 
I could, but I was not running the testing here, and people tend to get testy when you impose on "their work", so it's a hard balance to strike.

But really, anything that does real-world I/O and tries to get data as fast as possible:

7-Zip compression in fast mode
srep data deduplication
game load times
Xcopy of different file sets
.par file creation
MD5/SHA-1 file creation

Some of these have strong CPU utilization as well, so you need to make sure you are not CPU bottlenecked.
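As one way to try the MD5/SHA-1 idea from the list above, here is a rough sketch of a timing harness. It is not a tool anyone in this thread used, and the function names are made up. It reads every file under a folder with ordinary buffered I/O (so the Windows file cache takes part, as it would for a real application), hashes it, and times several passes.

```python
import hashlib
import time
from pathlib import Path


def hash_tree(root, algorithm="sha1", chunk_size=1024 * 1024):
    """Read every file under root and return {relative path: hex digest}."""
    digests = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            hasher = hashlib.new(algorithm)
            with path.open("rb") as handle:
                # Read in chunks so large files do not need to fit in memory.
                for chunk in iter(lambda: handle.read(chunk_size), b""):
                    hasher.update(chunk)
            digests[str(path.relative_to(root))] = hasher.hexdigest()
    return digests


def time_runs(root, runs=3):
    """Time repeated hashing passes over root; returns seconds per pass."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        hash_tree(root)
        timings.append(time.perf_counter() - start)
    return timings
```

You would call something like time_runs on a game or data folder once with the cache on and once with it off, ideally after a reboot each time so that the first pass starts cold. As noted above, hashing uses a fair amount of CPU, so on fast drives the numbers may reflect the processor more than the storage.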
 