Testing pSLC cache size

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
Does anyone know the best way to test the pSLC cache size on an SSD? Tom's Hardware seems to be one of the few that reports on this in reviews, and they use IOmeter, but I can't figure out exactly how they do it. I didn't get a reply when I asked on their forums. I tried playing with different configurations for runs of IOmeter, but I can't get anything that gives me a result that seems to show a difference when cache is used up, or gives me multiple data points that could be graphed like Tom's does.

The big thing is I was hoping to see what happens when you have a single partition filling the drive versus smaller partitions, but I suspect that IOmeter will only be able to test it when performed against an unpartitioned drive.
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
Thanks. Ugh. I thought IOMeter itself would record data points in its log at intervals, given that you can select the interval to update the status display. I don't really understand what that app is supposed to do (at least in the GUI) if all it does is report a single result for a drive/volume.

I still can't figure out for sure what settings to use to "hammer the drive" though. I don't know what IO size would be best for it, or how long to run it. Without the custom application, I guess I'd have to sit in front of the computer watching the updates for however long it might take to write say 166GB on a 500GB drive, with the run time set much higher, and see if the speed suddenly drops? I suppose TechPowerUp and Tom's apps run on command line and can capture the text of the result updates to build a graph. And of course they probably wouldn't make it available for anybody else to use.

TheOverclockingPage is just using CrystalDiskMark and ATTO as well as IOMeter, but they're actually disabling the pSLC cache by modifying the drive with a controller-specific tool rather than running out the full capacity.
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
If you have access to a linux install then dd reports the write speed of a copy so something like...
Does it record the speed at set intervals, like every 10 seconds? Or every 512MB (which would be better since it would show me what capacity the speed dropped at directly)? Or would it do it at every "block size"?
 
It would do it on the completion of the command - which is one of the reasons this loop runs the commands repeatedly but to a different output file each time. Setting the block size ('bs') to something like 32M or 64M would give you reasonable fidelity.

You'd need to play about with it and check that the results are consistent. Unfortunately I don't have an idle Linux box to hand to properly bash it about.
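For reference, a minimal sketch of the kind of loop being described, assuming a Linux box with the test SSD mounted at /mnt/testdrive (the path, pass count, and sizes are placeholders, not a tested script):

#!/bin/bash
# Each pass writes one 64M block to a new file and logs dd's reported speed,
# giving roughly one data point per 64M written.
TARGET=/mnt/testdrive          # mount point of the SSD under test (assumption)
PASSES=2000                    # 2000 x 64 MiB = 125 GiB; raise it to fill more of the drive
for i in $(seq 1 "$PASSES"); do
    # oflag=direct bypasses the page cache so the drive, not RAM, is measured;
    # swap if=/dev/zero for /dev/urandom if the controller might compress zeroes
    dd if=/dev/zero of="$TARGET/fill_$i.bin" bs=64M count=1 oflag=direct 2>&1 \
        | tail -n 1 >> "$HOME/dd_speeds.log"
done

The last field of each logged line is the speed, so the log is effectively the graph data.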
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
That seems like it wouldn't do me any good. I need to see what the speed is after it writes, say, 20GB, 40GB, 60GB, up to 160-ish, where a 500GB drive's pSLC cache would be maxed out and it would start needing to fold to TLC while writing new data, and then see the speed as it writes to 200, 220, etc. until it reaches full drive usage. (Samsung drives have a fixed pSLC cache plus a dynamic portion, so they'd give a different report from others.) And it all has to be done in one stream, so that there's no chance for the drive to fold any data that was already written, and no deleting of data so that blocks become free.
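A rough single-stream variant of the dd idea above might look like this (an untested sketch; the mount point, chunk size, and total are assumptions to adjust): keep appending fixed-size chunks to one file, never deleting anything, and time each chunk so the log shows cumulative data written against that chunk's throughput.

#!/bin/bash
TARGET=/mnt/testdrive/stream.bin   # single file on the SSD under test (assumption)
CHUNK_MB=512                       # one data point per 512 MiB written
TOTAL_GB=450                       # stop a little short of filling a 500GB drive
chunks=$(( TOTAL_GB * 1024 / CHUNK_MB ))
for i in $(seq 1 "$chunks"); do
    start=$(date +%s.%N)
    # append to the same file every pass: one continuous stream, nothing freed or rewritten
    dd if=/dev/zero of="$TARGET" bs=1M count="$CHUNK_MB" \
       oflag=direct,append conv=notrunc status=none
    end=$(date +%s.%N)
    speed=$(echo "$CHUNK_MB / ($end - $start)" | bc -l)
    printf '%6d MiB written: %.0f MiB/s\n' "$(( i * CHUNK_MB ))" "$speed" >> "$HOME/stream_speeds.log"
done

The point where pSLC runs out should show up as a sudden drop in the per-chunk speed at the corresponding cumulative total.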
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
Source? Which models (some? many? all?) Not sure I recall this mentioned anywhere else!
Samsung calls it Intelligent TurboWrite and on some models allocates a fixed pSLC cache amount of like 4GB for 500GB drives, 6GB for 1TB, 10GB for 2TB, which never converts to TLC/QLC, then uses some of the rest of the space as dynamic pSLC cache. On other models, they use a larger amount, but never the full capacity. Pretty much everyone else just uses the full drive as dynamic cache, although one Tom's article mentioned others using the "hybrid" scheme like Samsung.

I think Samsung doesn't use the full space in order to prevent the need to fold to TLC/QLC at the same time as new data is being written; there is always some TLC available that hasn't been used as pSLC, but it means there is less pSLC to work with in the first place. But when you're talking about needing to write a full 100+ GB in a single blast, the times that users are going to run up against that are kind of limited. Most drives write pretty fast even while folding now, but I think Samsung's is slower writing direct to TLC/QLC than others so they use this method to avoid having both happen at the same time.

https://www.tomshardware.com/reviews/samsung-980-pro-m-2-nvme-ssd-review
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
You want it to be the internal block size of the device. Typically that's 4KB for SSDs, but it could be 512B. Check the manufacturer's specs.
Would that apply even if it's doing 512-byte emulation? I assume so, as writing 4KB would mean a single write action on an SSD rather than the rigmarole required to make 512-byte writes (probably no difference on a mechanical drive). I've been changing my main SSDs to 4K before use recently but the ones I use externally I've kept at 512e to ensure compatibility wherever they go. (Although I did recently learn that some enclosures/controllers abstract the sector size to 4K no matter what the actual drive uses, causing compatibility issues themselves. All mine seem to do 512-byte abstraction.)
 

continuum

Ars Legatus Legionis
96,306
Moderator
Would that apply even if it's doing 512-byte emulation? I assume so, as writing 4KB would mean a single write action on an SSD rather than the rigmarole required to make 512-byte writes
As you surmise, 4K would be the smallest block size even under emulation. If you are doing 512e I am actually surprised you have not taken a significant performance hit.

As far as Intelligent TurboWrite goes, I had forgotten about it -- I thought Samsung had killed that after the 840 series! (I see it's even in the current 990 EVO Plus (!) and 990 PRO, although they are now painfully light on details/criteria, especially for the 990 PRO...).
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
As you surmise, 4K would be the smallest block size even under emulation. If you are doing 512e I am actually surprised you have not taken a significant performance hit.
Every consumer drive I've ever seen uses 512e by default, to ensure compatibility with every device and OS. That worry about compatibility is what keeps everyone from moving beyond it, but with SSDs the performance hit is not all that huge, because the drive only actually reuses a block if there are no free ones available anyway (where it would have to read/erase/rewrite). Otherwise it just marks the original block as free for erasure and uses an empty one, just like if you'd modified 4K of data (the only additional delay is needing to read the whole original 4K block, but that's done internally, where performance is much faster than even the NVMe interface). HDDs take a bit more of a hit, I believe.

On the Tom's forums though, there are a couple of threads with people who just got SN770 SSDs that seem to have come with 4K logical sectors enabled, which seems to be causing some compatibility problems as expected.

I've pretty well lost trust in Samsung for SSDs anyway. Tom's acknowledged the lack of detail about "read caching" with the 990 Pro, but their testing seems to show it's the same limited amount rather than the entire drive (Samsung seems to like using weird amounts, like 240GB would be 7.5x3=22.5% of the total (binary) capacity). I'm not sure how they determined that 10GB is static, other than assuming so because it uses Intelligent TurboWrite 2.0.
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
tl;dr: Splitting an SSD into multiple partitions results in the pSLC cache size for each partition being split proportionally as well.

SO! This is interesting to me, and not what I expected. I wanted to test how pSLC is affected by partitioning, because the tests that reviews perform are only done on empty drives. I bought a new Orico E7400 500GB drive (476 binary) for my laptop, and before putting it in, I tested it in my desktop, an X570 board with the drive connected to a PCIe adapter in an x4 slot via the chipset (which seems to impose some limits, as it does not reach the max read speeds). I got the information from one of Tom's people on the IOMeter test they run (just run on the command line, with a switch to output an extra result file that records the speeds at intervals).

Testing the full, unformatted drive shows it uses the full space for pSLC, with 4800MBps up to 125GB, which is exactly right for a QLC 500GB drive. Then it drops to about 860 for a long while, even peaking at 900; that seems to be where it is folding while writing, and it's a pretty big drop. Eventually the folding seems to fall behind, as it drops down again to about 400, direct to QLC speed. Then after 3 minutes exactly, it jumps back up to the 800 range and stays there for the rest of the test, 30 minutes total. I wonder if a longer test would show it repeatedly going through these short periods where the folding seems unable to keep up. Still, pretty good for a cheap QLC drive I think, and great for my laptop usage. The speeds do have a lot of random very low points which then jump back up to normal (like 500MBps slower at times), but it's mostly within acceptable variance.

Then I tested with the drive formatted with one full partition. IOMeter fills the drive with a data file for this (I haven't investigated why it needs to), so essentially there are zero free blocks to be used as pSLC cache, and it writes around 860 through the entire drive. I would have expected it to hit 400, since there's no space to use as cache and no space to fold anything out, but maybe it's actually got some overprovisioning for this. (IOMeter doesn't report the speed while it is generating the test file, which WOULD show the pSLC speed and the drop due to folding, I'm sure.)

Then I shrank the partition to 250GB. This time, the pSLC cache ran out at only 62GB! That's half the cache size available, matching the partition percentage. The drive does NOT use all that unpartitioned space as a pSLC cache area. It only uses the space proportional to the size of the allocated partition. I would have expected it to just use translated addressing for the entire 250GB of unpartitioned space, use it as pSLC, then fold it over to the normally addressed QLC blocks after all the work was done. Perhaps this avoids the possibility of the space being in use when the user tries to allocate it and write to it immediately after having filled the first partition, but that seems like an unlikely scenario and would otherwise just result in it writing new data while folding. It also showed a similar drop to native-QLC speed then back to 800-ish as the unformatted drive did (but the full partition did not), except it happened immediately after the cache ran out and lasted 1 minute exactly.

Then I tried it with two partitions filling the drive. The two both performed the same as the single smaller partition, so having that second half allocated seems to have made no difference.

So for anyone who partitions their drives: you are in fact reducing the potential performance of your SSD within each partition, because each one has proportionally less pSLC cache space. This would really screw with Samsung drives, which have a limited amount of dynamic pSLC cache in the first place rather than using the whole drive, or with a smaller drive like 250 or 500GB. If you've got a 1TB Samsung drive and make two equal partitions, you go from having 240GB of cache to only 120GB for each partition. If you're the type that likes to make a small partition for the OS (say 120GB) and then a large partition for data and games, the Samsung drive would give you only 28.8GB of pSLC cache on the OS partition (which you might not notice since most of your large files would be going to the second partition, but it would depend on your usage, and Windows loves putting everything in Documents and Downloads and the like).
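If you want to back-of-the-envelope this for your own layout, assuming the split really is proportional on your particular drive (it may not be; see later posts), a throwaway helper like this reproduces the numbers above:

#!/bin/bash
# cache_for_partition <total_dynamic_cache_GB> <partition_GB> <drive_GB>  (hypothetical helper)
cache_for_partition() {
    echo "scale=1; $1 * $2 / $3" | bc
}
cache_for_partition 240 500 1000   # 1TB Samsung, two equal partitions -> 120.0 GB each
cache_for_partition 240 120 1000   # 120GB OS partition on the same drive -> 28.8 GB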
 

continuum

Ars Legatus Legionis
96,306
Moderator
I think you'd want to test that behavior with, say, a Samsung drive before making the claim that your Orico E7400 drive behavior is applicable. The fact that the Orico E7400 will use the entire NAND capacity as pSLC is already substantially different behavior from that of many other drives.

as it drops down again to about 400, direct to QLC speed.
Also, that's not the direct-to-QLC speed I would expect; that is typically less than half (even barely a third of) that.

Not your particular model, but should be representative.
https://www.techpowerup.com/review/orico-o7000-2-tb/

Write speed starts out at a solid 4.5 GB/s. These speeds are sustained until 464 GB have been written, which means the drive will fill 90% of its capacity in SLC mode first. Once the SLC cache is full, write speeds fall off a cliff and reach only 160 MB/s, which is very slow—comparable to a HDD. Filling the whole capacity completes at 167 MB/s on average, which is the worst result in our test group, even slower than most SATA drives.

So I wonder what exact behaviors you are seeing. Sounds like more testing is needed!
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
I think you'd want to test that behavior with, say, a Samsung drive before making the claim that your Orico E7400 drive behavior is applicable. The fact that the Orico E7400 will use the entire NAND capacity as pSLC is already substantially different behavior from that of many other drives.


Also that’s not direct to QLC speed that I would expect, that typically is less than half (even barely a third) that.

Not your particular model, but should be representative.
https://www.techpowerup.com/review/orico-o7000-2-tb/



So I wonder what exact behaviors you are seeing. Sounds like more testing is needed!
Perhaps that is high for QLC and it's just the average within those seconds between direct-to-QLC and the higher "pSLC plus folding". It does drop down to the 200MBps range fairly often, but 1-second intervals are all IOMeter offers. Maybe it's 100 for half a second then 800 for half a second as the folding jumps ahead. I don't think further testing will find anything more with that interval.

I do have a Samsung drive, but it's a fairly shitty OEM 256GB model. Very easy to saturate (1000MBps writes) so I'm not even sure how much slower it could get when writing direct to TLC without just giving me a virtual finger and refusing data, and the pSLC cache will be tiny. Tomorrow when I pull the Orico for the laptop maybe I'll put in the Samsung. Oh wait, the drive in the laptop is a Samsung PM991, which isn't completely awful, and it's only 512GB so there will be a larger cache. (There may only be a few seconds worth of cache even when writing to the full drive.)

I'd be willing to bet the behavior is still the same. Although they may not use the entire drive for pSLC, I think the logical allocation of it will still be the same, otherwise they'd risk giving one partition more of the pSLC while transfers are happening to both, even if they both need equal amounts, due to one having a higher transfer rate from the source maybe.

The e7400 is also supposedly just the O7000 with a copper heatspreader label on it.

I'm inordinately pleased with myself for this discovery. The miracle of mental illness and modern medicine working together!
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
I couldn't wait, so I started testing with the 256GB PM991 last night. Performance on this drive is AWFUL, but it has a 34GB pSLC cache, within which it writes at 900MBps or a bit more. After that it drops, peaking at about 180MBps, native TLC speed. After 20 minutes, it burst to 900MBps for a few seconds, then dropped and ran at about 225MBps for the rest of the test. It seems like the folding process can't really keep up very well on this drive, maybe doesn't even start until late, and by that point it had almost completely filled the drive at the lower speeds.

This was my sister's drive in her HP machine, which I replaced after realizing how bad the drive was (on top of the x2 slot in the machine). It's also running at 81C under testing! Health was at 95% after about 2 years of use, and dropped another percent during testing. Probably not a drive I'll use for anything due to that, even though it could just barely be a good USB drive if I just needed a small amount of space.

Testing with a single full partition started out in the 210 range, so it was seemingly managing to fold while writing this time, and stuck with that for the whole test. This is similar to the Orico drive, but with the minimal performance to work with, folding can barely improve the speed. There were no bursts above that; it never reached peak pSLC speeds, even when I ran the test a few times with long intervals of idle and after forcing TRIM (though obviously I don't know that the drive did garbage collection in reality).

Testing with a single partition using half the space did not look good. It started off at only 175MBps and stayed that way for most of the test, then had random bursts up to 500 to 900. Impossible to tell more precisely when and how much it was making use of pSLC with or without folding with only a 1 second interval log.

Oddly, when I added a second partition, its performance was consistently a bit higher, with much of it in the 200-210 range. But there were zero bursts above that, not even at the beginning; it never got up to the pSLC speeds. I ran this one a couple of times as well.

I really don't know what to make of this, so I'll see what happens with the PM9B1 in my laptop when I get a chance to swap it, which should have a larger cache to work with and more than double the speed.
 

teubbist

Ars Scholae Palatinae
952
Then I shrank the partition to 250GB. This time, the pSLC cache ran out at only 62GB! That's half the cache size available, matching the partition percentage. The drive does NOT use all that unpartitioned space as a pSLC cache area. It only uses the space proportional to the size of the allocated partition.
Did you actually trim/blkdiscard the unused partition space? Because it sounds like you shrunk the partition and let the new smaller partition be trim'd while leaving the rest of the space allocated (as far as the SSD was concerned).

There's also the issue with some controllers not liking large discards, but I think that's mostly historic.

Otherwise that SSD is showing very different behavior to pretty much every other model on the market when it comes to short stroking.
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
Did you actually trim/blkdiscard the unused partition space?
I did, and then let it sit for like an hour before shrinking and testing. Then I expanded it again and TRIM'd and let it sit overnight before running the test again with a single half partition and then the second partition.

As far as what others look like, I forgot to even run any other tests like AS SSD, which does such a short test that it can easily fit in the pSLC cache. I haven't found anywhere other than Tom's that really does a detailed test of the cache size and the performance involved so I assume you just mean that others don't show a slowdown, but how many reviewers ever test it that way? Where have you seen anything other than a few anecdotal reports?
 

teubbist

Ars Scholae Palatinae
952
Where have you seen anything other than a few anecdotal reports?
Internal testing data, so I guess it effectively falls under your anecdotal header.

The Orico E7400 is apparently a MaxioTech controller, and I suppose it's possible they organize the pSLC cache on a per-plane or grouped-block level, which would explain your partitioned results. For a DRAM-less controller that actually makes some sense, as it'll avoid potential slow-path FTL lookups.
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
Anyone actually interested in more tests? I'm doing them regardless, for myself, and ordered another DRAM-less drive (a Patriot P400 Lite, which uses yet another controller; I don't want to spend a lot of money on this, but it's worth a little to play with and learn, and one can always do with having a decent spare drive). I'm retesting the PM991 as well with longer times between tests, which is SO HARD to wait for. I tested the PM9B1 and it did a lot better, and not just because of the faster inherent write speeds, so I want to be sure it wasn't my methods that made the PM991 behave as it did. (Seeing as it was exactly half the cache with a partition half the size, that would be quite a coincidence if I timed it just right to cause that. But the PM9B1 did not show that behavior.) Both of these are HMB drives, no DRAM.

I wish I had an external PCIe slot that I could sit on top of the PC. Getting under the desk to pull the adapter out and swap drives, then put the adapter back in, is a pain. An M.2 extender on a cable would be nice, but I imagine a meter of that would affect performance quite a lot. (Low-cost USB enclosures don't have high enough speeds to really test reliably, and I only have a couple of 10Gbps ports anyway.) If I was doing this for money, I'd just get a mini or SFF PC to sit on the desk to swap drives easily, and use the primary M.2 for testing.
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
Yes I'd love to hoard SSDs with the other things I've hoarded. 🙃 I was thinking of looking at Agora for really cheap or free stuff but decent will probably mean at least a bit of money that I shouldn't spend.

It's tough because for the first time in a long time I'm excited about something (yay drugs), even though it's kinda useless information.
 

SportivoA

Ars Scholae Palatinae
727
I wish I had an external PCIe slot, that I could sit on top of the PC. Getting under the desk to pull the adapter out and swap drives then put the adapter back in is a pain. An M.2 extender on a cable would be nice but I imagine a meter of that would affect performance quite a lot.
If you're going to get into it a fair bit, maybe M.2 to U.2 cabled to M.2 might be worth looking into? Though hot-plug might not be that great. I'm guessing no Thunderbolt or OCuLink in your current port options (before buying the cables/adapter/case needed)?
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
Yeah, no high-speed ports like that. If I had a 20Gbps USB port, that would be a tolerable speed for an SSD in the sizes I'm using, but a PCIe adapter plus buying an enclosure to use it would be more than I want to spend. Likewise it looks like M.2 to U.2 cables plus a U.2 to M.2 adapter would be a bit expensive for just doing this for fun. After I test the Patriot drive I'll probably not need to put the adapter in for months and months, even for a brief period.
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
Well, I already don't bother to screw the adapter in with a bracket, and just leave the side of the case off when needed. The big nice thing is the tool-less mounting of the actual SSD in the caddy so swaps would be easier and faster. I wish it was possible to screw in a tool-less adapter to the existing standoff for an M.2 drive. (Technically the M.2 standard requires the standoff and screw for grounding, which is why there is copper surrounding the notch.)

If they made that with a cable to connect to a front panel bay, it would be perfect. I'm not sure what kind of length limits exist for an M.2 extension cable (which I have seen), especially with PCIe5. Maybe it could work in an SFF machine with a single bay close to the board.
 

SportivoA

Ars Scholae Palatinae
727
Cabled PCIe can actually be more reliable/lower loss than PCBs! For server front-panel bays, they'd probably add the cost of a PCIe retimer to get the signal integrity under control and increase the reach to the motherboard. Those servers also try to place the PCIe cable connector as close to the socket as possible to avoid the PCB signal loss. It's probably not light reading in the PCIe spec.
 

Lord Evermore

Ars Tribunus Militum
2,387
Subscriptor++
So I could post more details, though I'm still testing as my new drives only arrived today. But the conclusion I've come to is that some drives will split the cache and some won't, and it's not consistent even by manufacturer.

The Samsung PM991 split the cache: it would not allocate more than a partition's percentage of cache from that partition when writing to it, even if it had plenty of free space (weird, I know, but I tested it a few times). AND it splits the access to the static SLC cache. So if the drive has 30GB dynamic and 4GB static, and you make two equal partitions, each one can ONLY access 15GB+2GB from its own space, even if there's enough free space for more. It can then access up to 15GB (45 in TLC) from the other partition's free space, but it can't access the other 2GB of static cache. So if one partition is full, writes to it can only access up to 15GB of cache from the other partition and none from itself. The other partition likewise only has up to 15GB, as long as it has 45GB of real capacity free.

The PM9B1 just didn't split the cache at all, period. I also noticed that with both Samsung drives, even though they aren't using all the drive space as cache, they begin folding immediately after the pSLC cache runs out. So it drops to the lowest speed immediately, rather than writing at native speed THEN folding if it runs out of native flash. They only jump back up to native speed once folding has finished.

The Patriot P400 Lite 500GB has 80GB pSLC cache, and doesn't start folding until it has filled up the QLC. The Crucial BX500 480GB has 50GB cache and just trash performance once it's full. I suspect that some of that is the factory-overprovisioned space between 480GB and 500GB/512GB and the rest is dynamic, so this could be considered "hybrid" like the Samsungs.

So unless every model gets tested with multiple partitions, there's going to be no way to know whether the cache gets split. In some cases, where the full drive is used for cache, it might only affect rare users that do a LOT of sustained writing, but on models that have only a small amount of cache to start with, it will potentially show up more quickly if they keep either partition too full. Of course the larger the drive, the more cache there is, in every case. Without getting multiple models from different brands that all use the same controller (but aren't just rebadged), I don't know if this is behavior that the SSD maker is able to configure via firmware or perhaps if it's hard-coded in the design of the controllers.

I think that overprovisioning is probably the best way to ensure you never run out of free space that the drive can use for cache, and you can make it large enough to ensure there's a large cache always available. I just feel like that's more effective than allocating the whole drive and having to keep an eye on whether you've left enough space for the drive to use as cache; you just automatically have to treat the drive as smaller than it is. But, that requires extra work from the user to do the overprovisioning, so it's not going to be the most common method.

I'd like to test the overprovisioning with a drive that splits the cache, but I don't have one large enough to make it easy.

Yes, this is a short summary, in my world.