Real Experience with QLC, TLC, MLC?

My heavy-write SSD, a Crucial MX500, is at 9,500 power-on hours and down to 76% life after 13 months and 30.5TB of writes. It writes around the clock doing packet captures. It cost $200 for 1TB. I expect that within 2 years I can buy a similar consumer drive for ~$50 per 1TB.
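For the curious, here's a rough back-of-the-envelope extrapolation from those numbers (a quick Python sketch; the figures are just the ones quoted above, and the linear-wear assumption is mine):

```python
# Rough endurance projection from the SMART figures quoted above.
# Assumes wear scales linearly with host writes - a simplification, since
# write amplification and workload changes will shift the real numbers.

life_remaining = 0.76      # 76% life left per SMART
host_writes_tb = 30.5      # TB written so far
months_elapsed = 13

wear_used = 1.0 - life_remaining                     # 24% of rated endurance used
projected_total_tb = host_writes_tb / wear_used      # host writes until 0% life
projected_months = months_elapsed / wear_used        # time until then at this rate

print(f"Projected total writes: ~{projected_total_tb:.0f} TB")
print(f"Projected lifespan at this rate: ~{projected_months / 12:.1f} years")
# -> ~127 TB and ~4.5 years at the current write rate
```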
 
I've had the opposite experience. Drives of any brand using a SandForce controller were a death trap. Even Intel back then was more reliable than the other brands using them, but I still had 2 sudden and random failures from Intel with SF, plus 1 failure from Intel with a non-SF controller a few years ago. Had 2 of 2 failures on Crucial drives with Marvell controllers. And out of the few dozen Samsung drives I have used, only a single failure - an OCZ Summit drive (Samsung hardware). The other half dozen Summit drives were fine until they were retired, and I haven't had any failures among my 470 series, 830 series, 850 EVO, 860 EVO, or 970 EVO drives. So yeah, I've been using Samsung SSDs continually for 10 years now and have had only the single failure.

Damn, my experience is the opposite, rofl. None of my SF drives died, and it's been 5-6 years already. On the other hand, I've had Samsung and SanDisk (the Pro versions) die after a few years, or develop weird problems where I can't delete files and am forced to secure erase and hope for the best.

The issue with SF was the firmware; get the right firmware on it and you'll have no problems.
 
Damn, my experience is the opposite, rofl. None of my SF drives died, and it's been 5-6 years already. On the other hand, I've had Samsung and SanDisk (the Pro versions) die after a few years, or develop weird problems where I can't delete files and am forced to secure erase and hope for the best.

Same... I have a Vertex 2 that had to be RMA'd within warranty, and the replacement is still in service. I also have a pair of Intel 320s that had the 8MB bug - lost one install to that, but both are still in service.
 
I must admit that when I install an SSD for a customer or myself, I still leave a bit unpartitioned to boost the overprovisioning. I have no idea how long some of these machines will be 'in the wild'.

120GB drive: 1GB unpartitioned
250GB drive: 2GB unpartitioned
500GB drive: 4GB unpartitioned
 
I must admit that when I install an SSD for a customer or myself, I still leave a bit unpartitioned to boost the overprovisioning. I have no idea how long some of these machines will be 'in the wild'.

120GB drive: 1GB unpartitioned
250GB drive: 2GB unpartitioned
500GB drive: 4GB unpartitioned

Just curious: how do you know for sure that this unpartitioned space is/will be used for overprovisioning?
 
Just curious: how do you know for sure that this unpartitioned space is/will be used for overprovisioning?

I can't answer the question with proof, but this is also my understanding of how SSDs work: unused (unpartitioned) space is used by the controller to maintain data integrity through wear leveling.

Note that unlike spinners, SSDs don't have contiguous partition 'blocks'. They stripe data across channels among many other intelligent means of maintaining data integrity and drive performance over time and under load. Unpartitioned space isn't tied to any particular flash cells or blocks directly, and it gives the controller more room to 'maneuver'.
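To illustrate the logical-addressing point, here's a toy sketch (the structures and names here are invented for illustration only; a real FTL is vastly more complicated):

```python
# Toy illustration of logical addressing: the controller maps LBAs to
# physical pages and is free to put (and move) data wherever it likes.
# This is a cartoon of the idea, not how any real FTL is implemented.

flash = {page: None for page in range(8)}   # 8 physical pages, all erased
lba_to_page = {}                            # logical-to-physical mapping table

def host_write(lba, data):
    # The controller picks any erased page; the host has no say in which one.
    free_page = next(p for p, d in flash.items() if d is None)
    flash[free_page] = data
    old = lba_to_page.get(lba)
    if old is not None:
        flash[old] = None                   # old copy freed (a real drive GCs it later)
    lba_to_page[lba] = free_page

host_write(0, "boot sector")
host_write(1, "some file")
host_write(0, "boot sector v2")             # rewrite lands on a different physical page

print(lba_to_page)                          # {0: 2, 1: 1} - LBA 0 physically moved
```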
 
Just curious: how do you know for sure that this unpartitioned space is/will be used for overprovisioning?

If you haven't written to a block, the SSD knows it can be used for wear leveling.
 
Yeah it's just an old habit I have. Probably doesn't make any difference but you never know.
 
Yeah it's just an old habit I have. Probably doesn't make any difference but you never know.

Modern controllers engage in what is known as "dynamic overprovisioning," which means they can leverage any unused space as OP. There are global algorithms that manage this based on wear-leveling metadata (stored in SRAM, DRAM, and on the flash, along with the mapping/addressing information). Because flash is addressed logically, it doesn't matter what the OS thinks or sees; simply leaving space free is adequate. There are certainly exceptions under heavier use, but for normal/consumer usage there's basically no difference. However, partitioning away some space may be useful for your clients who might otherwise overfill the drive.
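If it helps, the arithmetic behind that works out roughly like this (a sketch with made-up round numbers, not a model of any specific controller):

```python
# Simplified view of dynamic OP: any flash not holding valid data
# (factory spare + TRIMmed free space) is available to the controller.
# The capacities below are made-up round numbers for illustration.

GiB = 2**30
GB = 10**9

raw_flash     = 512 * GiB    # physical NAND on a typical 500GB-class drive
user_capacity = 500 * GB     # what the OS sees
valid_data    = 300 * GB     # data actually stored; the rest is free/TRIMmed

spare = raw_flash - valid_data
print(f"Effective spare right now: {spare / GB:.0f} GB "
      f"({spare / valid_data * 100:.0f}% of valid data)")

factory_spare = raw_flash - user_capacity
print(f"Factory minimum if the drive were full: {factory_spare / GB:.0f} GB "
      f"({factory_spare / user_capacity * 100:.0f}%)")
```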
 
Modern controllers engage in what is known as "dynamic overprovisioning," which means they can leverage any unused space as OP. There are global algorithms that manage this based on wear-leveling metadata (stored in SRAM, DRAM, and on the flash, along with the mapping/addressing information). Because flash is addressed logically, it doesn't matter what the OS thinks or sees; simply leaving space free is adequate. There are certainly exceptions under heavier use, but for normal/consumer usage there's basically no difference. However, partitioning away some space may be useful for your clients who might otherwise overfill the drive.


Yeah, that last part is the reason I do it on customers' drives. Just in case they fill the drive up somehow (let me see - setting Windows backup to back up the entire SSD every week, a bug generating 80GB+ of event error logs, Office 365 creating masses of logs, etc. - yep, seen all of those and more), the whole thing doesn't grind to a halt.
 
Yeah, that last part is the reason I do it on customers' drives. Just in case they fill the drive up somehow (let me see - setting Windows backup to back up the entire SSD every week, a bug generating 80GB+ of event error logs, Office 365 creating masses of logs, etc. - yep, seen all of those and more), the whole thing doesn't grind to a halt.

It's a good policy. Drives will have some native/physical OP which varies depending on SKU. 480/500/512GB drives usually have 512GiB of physical flash (there are exceptions - the ADATA SU800, for example), which works out to roughly 15%/10%/7% OP respectively; an exceptional amount would be the SU800 as mentioned (20% OP - 576GiB of flash). Anything over the baseline 7% - 512GiB vs. 512GB - is marketed OP. You can see subtle differences between drives that share the same hardware, Phison E12-based drives for example. AnandTech even has a review of one that shows the consumer benefit is negligible.

That being said, drives with a large SLC cache (the SU800 again) tend to do worse when fuller because their steady-state performance is crap. Dynamic SLC requires conversion to/from TLC and interferes with dynamic OP. This is one reason the SU800 is OP'd so much out of the box; same with the SX8200NP. The SX8200 Pro doesn't do this, and suffers when fuller partly as a result (see AnandTech's article on that as well). DRAM-less drives also tend to ship with 15% OP out of the box (i.e. 120/240/480GB capacities) to mitigate the lack of DRAM.

So why am I getting so technical/specific here? Because the amount you leave unpartitioned should relate to the specific drive, usage type, and native/physical OP. I usually tell people 20% total OP, which is a considerable amount (a 480/500/512GB drive partitioned down to ~460GB), but depending on hardware and usage anywhere in the 15-25% range is good. That would be roughly 110-120GB, 220-240GB, and 440-480GB for the 128/256/512GiB flash classes. But it's easy to get GiB and GB mixed up with that.
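To make that math explicit, here's a quick sketch of the GiB-vs-GB conversion (assuming the common power-of-two-GiB raw flash case described above):

```python
# Partition size for a target *total* OP (native + unpartitioned), assuming
# the drive carries a power-of-two amount of raw flash in GiB as noted above.

GiB = 2**30
GB = 10**9

def partition_gb_for_op(raw_flash_gib, target_op):
    """Partition size (GB) giving `target_op` spare relative to user capacity."""
    return raw_flash_gib * GiB / (1 + target_op) / GB

for raw in (128, 256, 512):
    hi_op = partition_gb_for_op(raw, 0.25)   # 25% total OP
    lo_op = partition_gb_for_op(raw, 0.15)   # 15% total OP
    print(f"{raw} GiB raw flash -> partition {hi_op:.0f}-{lo_op:.0f} GB for 25-15% OP")

# 128 GiB -> ~110-120 GB, 256 GiB -> ~220-239 GB, 512 GiB -> ~440-478 GB,
# which matches the ranges above; ~460 GB on a 512 GiB drive is ~20% total OP.
```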
 
PLC will offer 25% more capacity than QLC with at most half the endurance, plenty for WORM and specific workloads...

I don't understand that 25% estimate; shouldn't capacity double with every additional cell level?
 
I don't understand that 25% estimate; shouldn't capacity double with every additional cell level?

Going from 4-bit to 5-bit is 25% more capacity. What doubles is the number of voltage states: 2^4 (16) vs. 2^5 (32). This is also why endurance drops to roughly one-third every time you add a bit. There's also physical density, which often confuses people: most consumer TLC today is 256Gb/die while the most common QLC is 1Tb/die, or four times as much. So QLC holds 33% more than TLC per cell (4-bit vs. 3-bit), twice the voltage states (16 vs. 8), has about one-third the endurance (1K vs. 3K P/E), and is four times as dense (1Tb vs. 256Gb).
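Putting those relationships side by side (a small sketch; the TLC/QLC P/E figures are the ones quoted above, the rest are common ballpark numbers, and the PLC figure is speculative):

```python
# Bits vs. states vs. capacity per cell type. States double with each extra
# bit, capacity only grows by one bit's worth, and endurance falls hard.
# P/E figures are rough ballpark numbers (TLC/QLC as quoted above, the rest
# are common rules of thumb; PLC is speculative).

nand = [
    ("SLC", 1, 100_000),
    ("MLC", 2,  10_000),
    ("TLC", 3,   3_000),
    ("QLC", 4,   1_000),
    ("PLC", 5,     500),   # "at most half" of QLC per the quote above
]

prev_bits = None
for name, bits, pe_cycles in nand:
    states = 2 ** bits
    gain = f"+{(bits / prev_bits - 1) * 100:.0f}% capacity" if prev_bits else "baseline"
    print(f"{name}: {bits} bits/cell, {states:2d} states, ~{pe_cycles} P/E, {gain}")
    prev_bits = bits
```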
 
So QLC holds 33% more than TLC per cell (4-bit vs. 3-bit), twice the voltage states (16 vs. 8)

Aren't the 'voltage states' the actual 'bits' themselves? So in true binary fashion, one more bit means double the total information since we go from 2^4 (16bits) to 2^5 (32bits), right?
 
Aren't the 'voltage states' the actual 'bits' themselves? So in true binary fashion, one more bit means double the total information since we go from 2^4 (16bits) to 2^5 (32bits), right?

More accurately, it's 4 bits (2^4 = 16 states) to 5 bits (2^5 = 32 states).

You are describing the digital domain we're used to thinking in: you add one more cell holding one more bit in parallel with the other bits, and that doubles the number of states. It doubles the states, not the bits.

So in this case (TLC to QLC), you only go from 3 bits (2^3 = 8 states) to 4 bits (2^4 = 16 states). We don't buy storage in terms of states; we buy it in terms of bits. So you are only getting 33% more storage (bits) for a doubling of voltage levels.

Here, trying to increase cell density, we're working backwards: we're trying to create all the outcomes/states of having that extra parallel bit, but as analog voltage levels in a single cell. That kind of works against us.

Digital domain:

Two bits can represent:

00
01
10
11

4 States (4 distinct voltage levels in NAND cell)

Add one more bit/cell in parallel and you get:

000
001
010
011
100
101
110
111

8 States (8 distinct voltage levels in NAND cell)

(and for QLC, 16 states; eventually 32 states for penta-level cells - so PLC will only be 25% more storage for another doubling of voltage levels)
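The same enumeration, generated programmatically (a trivial sketch, just to show the pattern continues):

```python
# States double with each extra bit, but the storage you actually buy (bits)
# only grows by one per step - the same pattern as the listings above.
from itertools import product

for bits in range(2, 6):                             # 2 bits up through 5 (PLC)
    states = ["".join(p) for p in product("01", repeat=bits)]
    print(f"{bits} bits -> {len(states):2d} states: {' '.join(states[:4])} ...")
```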

But they are trying to do all the states in a single cell. So for QLC you now need to differentiate 16 distinct voltage levels.

Say you have 3-volt SLC. All you have to distinguish is 0 volts from 3 volts. This is why it is so durable: the stored charge can drift by a significant amount (on the order of 1.5 volts, or a bit less) and the cell can still be read.

With QLC, 3V/16 ≈ 0.19V per level: the signal degradation has to stay below roughly 0.19 volts for the cell to remain readable, which is why these cells fail after relatively few cycles. It takes only a minor shift in voltage to make the cell unusable.
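Following the same rough arithmetic for each cell type (window divided by number of states, which is the same simplification used above, not how read thresholds are actually set):

```python
# Rough read margin per cell type, using the same simplification as above:
# a ~3V usable window divided by the number of voltage states.

window_volts = 3.0

for name, bits in (("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4), ("PLC", 5)):
    states = 2 ** bits
    margin_mv = window_volts / states * 1000
    print(f"{name}: {states:2d} states -> ~{margin_mv:.0f} mV per level")
# SLC ~1500 mV, QLC ~188 mV, PLC ~94 mV: each extra bit halves how much
# charge drift a cell can tolerate before a read goes wrong.
```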

I would avoid QLC like the plague.
 
To add on to what Snowdog said, NAND-based SSDs are also arranged in pages and blocks, where you can write at the page level but only erase at the block level. The sizes of these go up with cell levels: for example, Hynix's 96L/1Tb QLC has a 64KB page size and an 18MB block size, whereas older MLC drives were 4KB and 256KB respectively. This has become increasingly important because there's a larger reliance on "SLC cache" (native NAND in single-bit mode), and dynamic SLC cache converts to/from native flash with the added problem that SLC block erasures count as erases for all the underlying flash (e.g. four QLC blocks). Static and dynamic SLC wear differently (as on the Intel 660p), but current drives treat them equally (this will change in the future). Upcoming NVMe 1.4 drives will be better able to manage these things thanks to zoning, persistent memory regions, etc., which is why QLC and eventually PLC are even reasonable despite all these issues.

Sensing through voltage (whether charge trap or floating gate; Micron's upcoming generation moves to charge trap) does not alter the fundamental nature of binary digits (bits) - it is a lossless storage medium (some controllers do compress on the fly, which can lead to a <1.0 write amplification factor, but that is a different discussion). Writing/programming (especially at lower temperatures) damages the cell structure, and modern drives rely on error correction (usually LDPC - the move away from BCH came once ARM microcontrollers became powerful enough for it to be worthwhile) to read "fuzzy" values, i.e. the threshold states mentioned by Snowdog. There's ECC in all the data paths (e.g. DRAM) as well, and data is further re-read and re-written as necessary (static data refresh) to keep performance and endurance high.

In any case, this wraps back around to the SLC caching point: acting in single-bit mode basically reduces the threshold requirements, which makes cells easier to read and write (a simplification), but you clearly have 4-bit QLC becoming 1-bit SLC at four times the capacity cost. There are literally drives in existence that prove states do not equal capacity (the Phison E16/PCIe 4.0 drives, for example, have up to 360GB of SLC cache, which is only possible at 1:3 from ~1100GB of raw NAND).
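On the SLC-cache point at the end, the capacity cost is easy to sanity-check (a sketch using the approximate figures from the post; exact cache sizes vary by model and firmware):

```python
# Capacity cost of running native flash in single-bit (SLC) mode: each GB of
# SLC cache ties up bits-per-cell GB of raw NAND. The 360GB / ~1100GB figures
# are the approximate ones from the post (a Phison E16-class TLC drive).

def raw_consumed_by_slc(slc_cache_gb, bits_per_cell):
    return slc_cache_gb * bits_per_cell

slc_cache_gb = 360
print(f"{slc_cache_gb} GB of SLC cache on TLC ties up "
      f"~{raw_consumed_by_slc(slc_cache_gb, 3)} GB of raw NAND")
print(f"The same cache on QLC would tie up "
      f"~{raw_consumed_by_slc(slc_cache_gb, 4)} GB")
# -> ~1080 GB of the ~1100 GB of raw NAND mentioned above: more voltage
#    states per cell clearly does not translate 1:1 into more capacity.
```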
 
Regarding SSDs: QLC is better than TLC, which is better than MLC, which is better than SLC. Not better for you - better for the manufacturer. :LOL:
 
Regarding SSDs: QLC is better than TLC, which is better than MLC, which is better than SLC. Not better for you - better for the manufacturer. :LOL:
So what should a shlub :barefoot: like me do, if all I want is an NVMe drive of, say, 500GB or 1TB for my next build, which I will do early next year? This drive would be my Windows and data drive, including swap and hibernation. Which brands/models should I focus on, and which should I avoid like the plague?
 
So what should a shlub :barefoot: like me do, if all I want is an NVMe drive of, say, 500GB or 1TB for my next build, which I will do early next year? This drive would be my Windows and data drive, including swap and hibernation. Which brands/models should I focus on, and which should I avoid like the plague?

Here and here.
 
So what should a shlub :barefoot: like me do, if all I want is an NVMe drive of, say, 500GB or 1TB for my next build, which I will do early next year? This drive would be my Windows and data drive, including swap and hibernation. Which brands/models should I focus on, and which should I avoid like the plague?
My bet is on Samsung or Crucial. Stay with TLC, not QLC. Lower cost: MX500 or 970 EVO. More expensive: 970 Pro (MLC and better endurance). However, if you want to keep your data, make backups on hard drives (not SSDs).
And be aware it won't make a big difference in real-world use between an SSD on SATA and an SSD on NVMe/PCIe.
 
I'm quite surprised after all this time people are still wringing their hands over such tech. It's all pretty much of a muchness in terms of real world experience.
 
I'm quite surprised after all this time people are still wringing their hands over such tech. It's all pretty much of a muchness in terms of real world experience.

I just look at it as 'price per capacity per performance class'
 
I just look at it as 'price per capacity per performance class'


Yeah, at the end of the day, just doing a quick search for "best NVMe drives 2019" will give you a clear indication of the best 2-3 drives to choose from within a couple of minutes. The usual suspects will all be there, and then, as you say, go on price and size.
 