Over the last two weeks, we’ve written multiple stories about SSD manufacturers that have shipped drives whose real-world performance differs from what those drives debuted with. On Monday afternoon we sat down for a discussion with Crucial regarding its SSD policies in general and the P2 in particular. Here’s what we found out.
First, regarding the P2 specifically: The 1TB and 2TB versions of this drive have always been QLC drives with large SLC caches. They were never produced with TLC NAND like the smaller capacities were. Crucial does not seem to have communicated this distinction to the press, however. StorageReview’s December 2020 review specifically identifies the 2TB drive as “an M.2 form factor SSD (leveraging TLC NAND).”
But whether this distinction was clearly communicated or not, Crucial says they did not ship TLC versions of the P2 1TB or 2TB to market. I didn’t personally buy the “bad” version of the 2TB P2 drive; I mistakenly believed the performance of the TLC 500GB drive could be used to extrapolate the performance of the 2TB QLC drive. Generally speaking, in any given drive family, the larger drives are faster than the smaller variants. That might be the case with the P2 while the SLC cache is available, but it stops being true as soon as the P2 2TB exhausts that cache.
Second, Crucial did not intentionally ship one type of drive to reviewers and another type to customers. According to Crucial, it sampled reviewers with the same drives it was shipping to customers at the time it was shipping them. The company pointed out that P2 reviews have continued to be posted over a span of time, which is true. It has not attempted to hide the performance of the P2 family in any way. In fact, Crucial made a particular point of emphasizing that it stood behind the drive’s performance as published in its own specifications. The company has provided this explanation in the past when questioned about its habit of switching NAND configurations behind the scenes.
I have no reason to doubt that Crucial stands behind the drive specifications it publishes, an example of which is shown below:
Each of the question marks gives a little additional context for the listed statistic if you click on it. That’s a nice touch. A couple of sentences explaining the meaning of certain statistics to help buyers trying to wade through unfamiliar terminology is a great idea. But let me be clear: Whether Crucial stands behind its published specifications has never been in question.
A Familiar Problem With a Problematic New Twist
The question of whether a manufacturer’s official specifications accurately capture the performance of a given product isn’t new, and it’s not unique to Crucial or even the SSD market. In this case, the TLC-equipped P2 and the QLC-equipped P2 at the same capacity perform very differently when the SLC cache is exhausted. With the 1TB and 2TB drives, the problem is that the drive’s performance is sometimes better than the smaller variants (while within the SLC cache) but then can be much worse once that area of NAND is depleted.
So it’s not whether Crucial stands behind its published specifications. We believe it does. The problem is that Crucial’s published specifications do not capture some meaningful and important data on how drive performance can vary between different drives that are currently sold under the exact same SKU.
I emphasized to Crucial that reviewers and the technical press more generally need more information about these situations and that we also need review hardware that reflects all the configurations people are seeing in-market, especially if those configurations change over time. That may mean sampling more versions of the drive for testing as configurations evolve and it may mean more transparency around what sorts of performance variation customers can expect.
The company representatives I spoke with were receptive on this point. Although they could not promise anything specific during our conversation, I think there’s a reasonable chance for improved communication and a better long-term picture of how a given SSD’s performance does or does not change.
The best way for Crucial to resolve this problem is to launch separate SKUs whenever it intends to build a drive variant whose performance differs meaningfully in an important workload. The second-best way would be to publish additional specifications for its products and then stand behind those as well.
As a simplified example: If Crucial wants to vary its SSD configurations in a manner that impacts performance during very large file transfers, a fair file copy size to evaluate would be between 25 percent and 33 percent of the new drive’s capacity. A 500GB P2 might be tested with a set of files between 125GB – 165GB, while a 2TB drive’s performance would be tested with a 500GB – 660GB file set. We can safely assume that a reasonable percentage of people buying a new SSD today have an existing set of data they want to migrate to their new drive. If NAND, controller, or firmware changes will impact that process, an SSD’s specifications should include some promises around that type of performance as well. My suggestions of 25 – 33 percent are not set in stone, as it might make sense to grow the file set size by a smaller percentage of the drive size when dealing with large capacities, but it’s a workable idea.
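For anyone who wants to apply the same rule of thumb to other capacities, the arithmetic can be sketched in a few lines of Python. This is only an illustration of the 25 – 33 percent suggestion above, not a test methodology Crucial or anyone else has adopted. The function name `copy_test_range_gb` is made up for this example, and it assumes decimal marketing capacities (a 2TB drive is treated as 2,000GB).

```python
def copy_test_range_gb(capacity_gb, low_pct=25, high_pct=33):
    """Return the (smallest, largest) suggested file-set size in GB.

    Assumption: capacity_gb is the decimal marketing capacity,
    so a 2TB drive is passed in as 2000.
    """
    # Integer math keeps the results as whole gigabytes.
    low_gb = capacity_gb * low_pct // 100
    high_gb = capacity_gb * high_pct // 100
    return low_gb, high_gb


# The 500GB and 2TB figures match the ranges given in the text.
for capacity in (500, 1000, 2000):
    low, high = copy_test_range_gb(capacity)
    print(f"{capacity}GB drive: test with a {low}GB - {high}GB file set")
```

The percentages are parameters rather than constants because, as noted above, a smaller share of the drive may make more sense at very large capacities.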
It would be more straightforward to launch a “P2T” and a “P2Q”, but including file copy test results in existing specifications and adjusting future drive design to achieve parity in these metrics as well would also be a step towards resolving this problem.
None of the people I spoke to at Crucial were in a position to wave magic wands and instantly tell me what adjustments the company would be making, but the representatives promised that a discussion on these topics was occurring. I emphasized to Crucial that reviewers need to know the benchmarks and results we publish will still be accurate a few years down the line, and that we test more than just the manufacturer’s stated specifications. Our readers rely on us to deliver the additional nuance that manufacturers don’t capture in their spec sheets. Not communicating these changes erodes both readers’ trust in publications and publications’ trust in manufacturers. It’s a corrosive cycle that ultimately benefits no one.
Now that we know Crucial is just one of several manufacturers who have had issues with what I’ll politely call “performance variation,” it’s clear that there needs to be a wider conversation regarding what kind of variance is acceptable in a product, where consumers need to be able to expect consistency, and when and how all this information gets communicated.
There is currently an unacceptable level of performance variation in the consumer SSD market as a whole. It affects too many important workloads and use-cases to dismiss. It is clear that a QLC drive with a large SLC cache has very different performance characteristics under certain circumstances compared with a traditional MLC or TLC drive.
Crucial, Western Digital, and Samsung may not have set out to deliberately bait-and-switch customers, but the performance variations in some of these products are more than large enough to leave people feeling distinctly bait-and-switched. Performance metrics on the 500GB P2 Crucial sampled in 2020 didn’t map well at all to the performance of the 2TB P2 I bought in the spring of 2021. The behavior of Samsung’s 970 EVO Plus is materially different depending on which version of the drive you own.
Not much else is clear about the larger market right now. We do not currently have a full understanding of which products may have been affected from various manufacturers beyond the P2 (Crucial), SN550 (WD), and 970 EVO Plus (Samsung). Intel is currently the one manufacturer that has told us it does not and has never engaged in this practice.
The two companies we have had conversations with, WD and Crucial, have both indicated a desire to improve the current situation. We’ll see what comes of it. The status quo is unacceptably confusing for all concerned.