Keeping SSDs in TRIM: doing the math

Love Apple gear? Like math? TUAW's Doing the Math series examines the numbers and the science that lie behind the hardware.

One of the new features we first saw in the developer beta of Mac OS X Lion back in February is long overdue in this correspondent's humble opinion: Lion finally supports TRIM on solid-state drives.

TRIM (which, despite the capital letters, isn't an acronym) is a way to speed up SSD access by performing important housekeeping tasks in the background or on file deletes, rather than leaving them until the user is writing data to the drive. Since that February beta, TRIM has also appeared in 10.6.6 for new Macs with Apple-supplied SSDs only, and with third-party tools, it's now possible to get TRIM running on any SSD under 10.6.7.

This raises the question: what exactly is TRIM, and why does it matter? If you've been wondering about this seemingly arbitrary name, then I'm here with my best Science Hat on to remove all that wonder (as we scientists so often do) and replace it with cold hard fact.

To understand the need for TRIM, we'll first need to understand how solid-state drives work. A lump of flash memory, whether inside your iPod or your Mac's solid-state drive, is built from many billions of floating-gate transistors.

These have one key benefit over the transistors used in the normal sort of dynamic RAM that you have in your computers: they maintain their contents without power, which is why you can safely let your iPad go flat knowing your stuff will still be there when you charge it back up. In computer science jargon, we say that flash memory is non-volatile. However, these transistors also come with a key limitation compared to dynamic RAM or the magnetic storage inside traditional hard disks: when you want to change the data stored by a cell, you can't just overwrite what's there with the new value. You have to erase it back to a known electrical state first.

(An aside: you might suppose that a given cell stores a single binary digit, a 1 or a 0. This is the case for expensive Single Level Cell (SLC) flash memory. Most consumer products, however, use more cost-effective Multi Level Cell (MLC) flash. This stores two bits of data in each cell, halving manufacturing costs whilst sacrificing some reliability and performance.)

SSD structure and the read-write-erase-write cycle

Flash memory cells are grouped into pages, typically around 4 KB each, and pages are then arranged into blocks, which are usually 512 KB. Now here comes the big problem that TRIM ultimately tries to work around: for technical reasons, erase commands can only be carried out against an entire block at a time. Oh, and erase operations are much, much slower than either reading or writing data.

So what, you might think? Well, let's look at a simple example. Consider an imaginary SSD that has just 3 MB of memory; so it's made up of 6 blocks of 512 KB each. This SSD starts off blank.

First, we copy a 1 MB JPEG file to it, which takes up the first two blocks. Then we copy a second file, say, a 2 MB audio snippet; this takes up the last four blocks and fills the drive. Finally, we want to copy another 1 MB JPEG file, but the drive is full; so we tell OS X to erase the first JPEG file to make room for the new one. This means the SSD has to carry out block erase operations on the first two blocks to make the room it needs.
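
For anyone who wants to check the arithmetic in this example, here is a minimal Python sketch. It assumes the "typical" sizes quoted above (4 KB pages, 512 KB blocks) and the imaginary 3 MB drive; the function name blocks_needed is just something made up for illustration.

```python
KB = 1024
MB = 1024 * KB

PAGE_SIZE = 4 * KB      # typical page size quoted above
BLOCK_SIZE = 512 * KB   # typical block size quoted above
DRIVE_SIZE = 3 * MB     # our imaginary drive

pages_per_block = BLOCK_SIZE // PAGE_SIZE   # 128 pages in every block
blocks_on_drive = DRIVE_SIZE // BLOCK_SIZE  # 6 blocks in total


def blocks_needed(file_size):
    """Whole blocks needed to hold file_size bytes (ceiling division)."""
    return -(-file_size // BLOCK_SIZE)


jpeg_blocks = blocks_needed(1 * MB)    # 2
audio_blocks = blocks_needed(2 * MB)   # 4
print(jpeg_blocks + audio_blocks == blocks_on_drive)  # True: the drive is full
```

Note that even a file of a single byte claims a whole block under this whole-block placement, which matters for the next example.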

So far, so good, although that slow "erase" cycle is an unavoidable pain that will slow us down. But let's change the numbers around. Start over with a blank, 6-block, 3 MB drive and copy six 1 KB text files onto it. Suppose, for whatever reason, each file ends up in a different block -- so now, each block contains 511 KB of empty space and a single text file.

Now try to copy the 2 MB sound file back on. The SSD wants to put this in four whole blocks, but there are no blocks free -- they all have a single little text file in them. So the process becomes:

  1. read the text files from the first four blocks

  2. write them into the fifth block

  3. erase the first four blocks

  4. write the sound file into the first four blocks

Now, although this example is a little contrived, it does serve to demonstrate this important quirk in the way SSDs function. Namely, if the system wants to write to a block that is partially full already, a read-write-erase-write cycle is triggered. Compared to normal reads and writes, this is very slow.
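
The four steps above can be tallied up in a few lines of Python. This is only a sketch under stated assumptions: it counts operations rather than timing them, it assumes each 1 KB text file occupies a single 4 KB page, and ops_for_write is a made-up helper, not anything a real drive exposes.

```python
from collections import Counter

PAGES_PER_BLOCK = 128  # 512 KB block / 4 KB page


def ops_for_write(new_pages, live_pages_per_block):
    """Count flash operations needed to write new_pages into whole blocks
    that each already hold live_pages_per_block pages of live data."""
    blocks = -(-new_pages // PAGES_PER_BLOCK)  # ceiling division
    live = blocks * live_pages_per_block
    ops = Counter()
    if live:
        ops["page_reads"] += live      # 1. read the existing data
        ops["page_writes"] += live     # 2. write it into another block
        ops["block_erases"] += blocks  # 3. erase the target blocks
    ops["page_writes"] += new_pages    # 4. write the new data
    return ops


sound_file_pages = (2 * 1024) // 4  # a 2 MB file is 512 pages of 4 KB

clean = ops_for_write(sound_file_pages, live_pages_per_block=0)
dirty = ops_for_write(sound_file_pages, live_pages_per_block=1)
print(dict(clean))  # only the 512 page writes
print(dict(dirty))  # extra reads, extra writes and four block erases
```

How much wall-clock time those extra operations cost depends on the particular flash chips, which is why the sketch leaves timings out; the point is simply that the "dirty" write has to do work that the "clean" one doesn't.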

When a delete is not a delete

Now, you may be thinking, why doesn't the drive do the erase command in advance? To return to our second example above, it could have reshuffled the text files into a single block long before we told the drive to save a copy of the big audio file. (Indeed, it should never have scattered them across the blocks to start with, but that's something I needed to make the example work; please bear with me on that.)

If it was all done ahead of time then the user wouldn't have to wait for the read-write-erase-write cycle to complete; they could just save the file and get on with their lives. No problems, right?

Wrong. The problem is that "file deletion" in OS X doesn't actually delete any data from the drive. When you drag a file to the Trash and then empty it, the file remains unchanged on the disk. Instead, OS X simply edits the filesystem data to mark the file as gone. At some point in the indeterminate future, the contents of the file will be overwritten by new stuff coming in, but until that happens, the file is still there. If it didn't work this way, it would take 10 times longer to delete a 100 MB file than a 10 MB one, and 10 times longer again to delete a 1 GB one -- very inconvenient.
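
To make that concrete, here is a toy model in Python. It is emphatically not how HFS+ is implemented; ToyVolume and its catalog are hypothetical names, and the only point is that delete touches the bookkeeping and nothing else.

```python
class ToyVolume:
    """A toy filesystem: a list of blocks plus a catalog of who owns which."""

    def __init__(self, n_blocks):
        self.blocks = [b""] * n_blocks    # what is physically on the drive
        self.catalog = {}                 # filename -> list of block indices
        self.free = set(range(n_blocks))  # blocks the filesystem may reuse

    def write(self, name, chunks):
        used = []
        for chunk in chunks:
            i = min(self.free)      # lowest free block (sketch: assumes one exists)
            self.free.remove(i)
            self.blocks[i] = chunk  # replaces whatever stale data was there
            used.append(i)
        self.catalog[name] = used

    def delete(self, name):
        # Only the bookkeeping changes; the block contents are untouched.
        self.free.update(self.catalog.pop(name))


vol = ToyVolume(4)
vol.write("photo.jpg", [b"JPEG-1", b"JPEG-2"])
vol.delete("photo.jpg")
print(vol.blocks[:2])  # the "deleted" data is still sitting there

vol.write("note.txt", [b"hello"])
print(vol.blocks[:2])  # block 0 reused, block 1 still holds old JPEG data
```

The work done by delete here depends on how many blocks the catalog entry lists, not on how much data those blocks hold, which is why emptying the Trash is quick.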

Incidentally, this is how file undelete utilities work -- they search for these ghostly echoes of files, and if you're quick and get to it before any overwriting has taken place, you can often recover accidentally deleted files completely unharmed. It's also of interest to computer forensics technicians, who often need to recover deleted files for criminal investigations, and relatedly, it's how those "file shredder" utilities work, too -- they trash the echoes of the files, overwriting the previously-occupied disk blocks with zeroes (or, for especially sensitive data, overwriting them repeatedly with ones, zeroes or 'garbage data' to make sure every stray bit of the old data is gone). The technical term for this on-drive echo phenomenon is data remanence.

The upshot of this is that your drive's firmware never knows if a given chunk of data represents a deleted file or a still-in-use one because the operating system doesn't have the basic good manners to let it know. Except for some circumstances (see later), the drive can't see into the files that Mac OS X has asked it to store -- it just handles the 0s and 1s in pages and blocks, and it doesn't understand that a given block is holding half of the data that makes up a JPEG image. So the SSD is helpless; the only way it can tell that a file was deleted at some point in the past is when the OS tells it to overwrite it with new incoming data.

In practical terms, what you find with real-world SSDs is a sharp decline in write performance over time. The drive starts out blank, and nice and fast. Then you use it, and with daily use files are written and deleted and eventually end up scattered all over the disk. The drive's firmware does what it can to keep giving you empty blocks, but once the drive is full of current and deleted-but-not-gone data, it can't help but keep serving up half-used blocks for you to use. And so, more and more, every single time data is saved to the disk, a read-write-erase-write cycle happens. Compared to what you saw when the drive was factory fresh, your write performance is now halved or worse.

Enter TRIM

So how can we deal with this? One way you'd often see espoused by early SSD adopters in the Windows world was to completely wipe the drive before restoring a backup (or, for the overclocking heroes who are constantly breaking their Windows install with bleeding-edge drivers anyway, a fresh install). This does work, but we're Mac users, for crying out loud; we expect, nay demand, greater elegance than this "take off and nuke from orbit" approach.

TRIM is the more elegant method you are looking for. It's like a finishing school for your operating system that gives it the manners to tell a drive when a file is deleted. It's an extension to the SATA command set that, quite simply, sends the drive a courtesy message about inactive blocks. "Hey, see all these pages here? These can be erased now. I'm done with that data. Get on with it." Then the SSD can get on with the read-erase-write parts of the cycle in the background while the user does other things, and when they come to save a file they have plenty of empty blocks to choose from for a speedy write.
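
Here is a rough Python sketch of what that courtesy message buys the drive. Everything in it is a simplification: ToySSD, trim and background_cleanup are hypothetical names, and a real drive is told about ranges of logical addresses and works out the physical pages for itself.

```python
class ToySSD:
    """A toy drive that only knows which pages it believes are still live."""

    def __init__(self, n_blocks):
        self.live = [set() for _ in range(n_blocks)]  # live pages per block
        self.erased = [True] * n_blocks               # is the block blank?

    def write(self, block, page):
        self.live[block].add(page)
        self.erased[block] = False

    def trim(self, block, pages):
        """Hint from the OS: nobody needs the data in these pages any more."""
        self.live[block] -= set(pages)

    def background_cleanup(self):
        """Erase every used block with no live pages; return how many."""
        count = 0
        for b, live in enumerate(self.live):
            if not live and not self.erased[b]:
                self.erased[b] = True
                count += 1
        return count


ssd = ToySSD(n_blocks=2)
for page in range(4):
    ssd.write(block=0, page=page)  # the OS saves a four-page file

# The OS "deletes" the file but says nothing: the drive can't reclaim it.
print(ssd.background_cleanup())  # 0

# With TRIM, the OS tells the drive, which can erase ahead of time.
ssd.trim(block=0, pages=range(4))
print(ssd.background_cleanup())  # 1
```

Without the trim call, block 0 stays "in use" as far as the drive is concerned until the OS eventually overwrites it, and that is exactly when the user is waiting.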

Note that, as a minor downside, if you are using TRIM you can forget about file undeletion tools. TRIM works, in effect, a bit like a file shredder that's constantly running in the background -- so unless you are very fast, your deleted file will be permanently gone. This has caused some concern amongst computer forensics technicians.

Wear levelling

TRIM helps overcome another limitation of flash memory that I have, so far, glossed over: it wears out, and pretty quickly, too. Specifically, depending on the exact type of memory, a given cell might only be able to withstand about 10,000 write cycles before it simply stops storing data altogether. If that happens, the drive will mark the blocks as "bad," stop using them, and the overall size of your drive will shrink a little.

Clearly, that's less than ideal, so the drive's firmware is constantly fiddling in the background to move data around. That way, even if the user's operating system keeps writing to the same single file over and over again (perhaps a page file or your Safari history store), those writes are recorded onto different cells. This process works best when the firmware has as much free space to choose from as possible when incoming write requests are received, so that it can choose the least worn cells to handle the data. Without TRIM, the firmware thinks a lot more of the drive is in use than really is, so it has less scope to spread the wear around. Thus, not having TRIM support can lead to parts of the disk wearing out prematurely.
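
A small Python sketch shows why free space matters here. It assumes the simplest possible policy, "always pick the least worn free block", which is an illustration rather than what any particular firmware actually does; pick_block and simulate are made-up names.

```python
def pick_block(erase_counts, free_blocks):
    """Return the free block that has been erased the fewest times so far."""
    return min(free_blocks, key=lambda b: erase_counts[b])


def simulate(n_blocks, free_blocks, rewrites):
    """Rewrite one block's worth of data `rewrites` times; return wear per block."""
    erase_counts = [0] * n_blocks
    for _ in range(rewrites):
        b = pick_block(erase_counts, free_blocks)
        erase_counts[b] += 1  # each rewrite costs the chosen block one cycle
    return erase_counts


# With TRIM: the firmware knows all six blocks are fair game.
print(simulate(6, free_blocks=[0, 1, 2, 3, 4, 5], rewrites=600))
# [100, 100, 100, 100, 100, 100]

# Without TRIM: it believes four blocks still hold data it must keep.
print(simulate(6, free_blocks=[0, 1], rewrites=600))
# [300, 300, 0, 0, 0, 0]
```

Same number of writes, but in the second case two blocks soak up all of the wear and will hit their limit three times sooner.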

Conclusion

Given all of the above, you'd perhaps conclude that TRIM support is (whilst not absolutely vital) probably very important to people running SSDs. TRIM support was added to Linux in kernel version 2.6.33 in February 2010. OpenSolaris and FreeBSD got support in July 2010. Windows 7 and Windows Server 2008 R2 supported TRIM right out of the box, way back in October of 2009! And what of Mac OS X?

Ahem. As of right now, TRIM support is apparently enabled in developer betas of Lion. It also appears in 10.6.6 and onwards, but in both Lion and Snow Leopard it only works for Apple-supplied SSDs -- hence the screenshot at the top of this post, showing that TRIM is not enabled on the SSD I fitted myself in place of my MacBook Pro's optical drive. There is an unsupported method to make it work on any SSD, but no-one knows at the moment if Lion will globally enable TRIM support or not. I certainly hope so -- TRIM is no longer an exotic technology, and I can see few reasons for Apple to restrict support to their own drives except some rather unpalatable profiteering to protect their overpriced factory-fitted SSDs.

A final footnote: some expensive drives do erase-ahead operations entirely within their firmware, without requiring the OS to send them TRIM commands. How? By making the drive's firmware fully able to read the operating system's filesystem, which is rather akin to using a rocket-propelled grenade to crack a walnut. It's worth noting that not only does this make for more expensive drives (because they need more powerful processors to run this smarter firmware), but it usually only works for a small number of filesystems. If you bought a model that could only understand Windows' NTFS format, for example, and used it in a Mac with the HFS+ filesystem, then the firmware would be helpless and you'd be back where you started.