Read, Write, and Space Amplification: The Three Costs Every Key-Value Store Trades
Every storage engine pays for speed in extra reads, extra writes, or extra space. This article defines the three amplifications, explains the RUM conjecture (you can minimize at most two), shows how LSM-trees and B-trees spend differently, and gives you a way to read past any benchmark.
When a key-value store benchmark shows one engine "winning," it has almost always silently chosen which cost to hide. Every storage engine pays for its speed somewhere, and there are exactly three places to pay: extra reads, extra writes, or extra space. These are the three amplifications, and once you can name them you can read past any benchmark and predict how an engine will behave on your workload and, just as importantly, what it will cost to run. This is the vocabulary the LSM-tree versus B-tree debate actually runs on, pulled out and explained on its own.
The three numbers
Each amplification is a ratio of what the disk did to what you asked for.
Read amplification (RA) is bytes read from disk per byte your query actually needed. If answering a 100-byte GET forces the engine to read, say, three 4 KB index and data pages (12,288 bytes), RA is about 123. RA shows up directly as read latency and as IOPS you pay for.
Write amplification (WA) is bytes written to disk per byte your application wrote. If you write a 100-byte value and the engine eventually writes 1000 bytes to persist and reorganize it, WA is 10. WA shows up as write latency, as IOPS, and on SSDs as physical wear, because flash cells survive only a finite number of program/erase cycles. A WA of 10 means your drive wears roughly ten times faster than the logical write rate suggests, before counting any rewriting the SSD does internally.
Space amplification (SA) is bytes occupied on disk per byte of logical data. If 1 GB of live data sits in 2 GB of files, SA is 2. SA shows up straightforwardly as your storage bill and as how much data fits before you provision more.
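As a minimal sketch, the three ratios can be computed from six counters. The `IoCounters` type and its field names are hypothetical (real engines expose similar figures under their own names), and the read example assumes that "several" pages means three.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class IoCounters:
    """Hypothetical counters, not taken from any particular engine."""
    logical_bytes_read: int      # bytes the queries actually needed
    physical_bytes_read: int     # bytes read from disk to answer them
    logical_bytes_written: int   # bytes the application wrote
    physical_bytes_written: int  # bytes written to disk, including rewrites
    live_data_bytes: int         # logical size of the current data set
    bytes_on_disk: int           # size of the files that hold it


def amplification(c: IoCounters) -> Dict[str, float]:
    """Each amplification is physical work divided by logical work."""
    return {
        "read": c.physical_bytes_read / c.logical_bytes_read,
        "write": c.physical_bytes_written / c.logical_bytes_written,
        "space": c.bytes_on_disk / c.live_data_bytes,
    }


example = IoCounters(
    logical_bytes_read=100,
    physical_bytes_read=3 * 4096,       # assumed: three 4 KB pages for one GET
    logical_bytes_written=100,
    physical_bytes_written=1000,        # the WA example from the text
    live_data_bytes=1 * 1024**3,
    bytes_on_disk=2 * 1024**3,          # 1 GB of live data in 2 GB of files
)
print(amplification(example))
```

The write and space figures reproduce the examples above (10 and 2); the read figure depends entirely on how many pages the assumed lookup touches.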
The numbers are not academic. WA and RA are IOPS you rent and latency your users feel; SA is gigabytes you pay for every month. An engine with WA of 2 instead of 10 is, very roughly, doing one fifth the write IO for the same workload.
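To make the wear arithmetic concrete, here is a rough sketch of how write amplification shortens the time to reach a drive's rated endurance. The drive rating and the write rate are hypothetical, and the model ignores the SSD's own internal write amplification.

```python
def drive_lifetime_years(rated_tbw: float, logical_tb_per_day: float, write_amp: float) -> float:
    """Years until the rated terabytes-written figure is consumed.

    Simplification: only the storage engine's write amplification is counted.
    """
    physical_tb_per_day = logical_tb_per_day * write_amp
    return rated_tbw / physical_tb_per_day / 365


# Hypothetical drive rated for 600 TB written; application writes 0.1 TB per day.
for wa in (2, 10):
    years = drive_lifetime_years(rated_tbw=600, logical_tb_per_day=0.1, write_amp=wa)
    print(f"WA={wa}: about {years:.1f} years")
```

Whatever the absolute figures, the ratio between the two results is the ratio between the two write amplifications: the WA-of-2 engine lasts five times as long on the same drive.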
You cannot minimize all three: the RUM conjecture
The reason no engine wins everywhere has a name. The RUM conjecture (Read, Update, Memory) states that an access method can be optimized for at most two of read overhead, update overhead, and memory (space) overhead, and pays on the third. It is the storage-engine version of "fast, cheap, good: pick two." Picture the three overheads as the corners of a triangle: every real engine is a point inside it, and its designers chose which corner to sacrifice.
This is why "which is the fastest key-value store" is an incomplete question. Fastest at what, and at the expense of which other cost? An engine tuned to a flattering benchmark is usually one that hid the amplification the benchmark did not measure.
How the two families spend
The LSM-tree and the B-tree make opposite bets, and the three amplifications are the clearest way to see it.
The LSM-tree (RocksDB, Cassandra, ScyllaDB):
- Write amplification: high. This is its defining cost. Data is written once to the log, again when the MemTable flushes to an SSTable, and then rewritten repeatedly by compaction as it migrates down the levels. Classic leveled compaction can push WA into the double digits.
- Read amplification: moderate to high, but mitigated. A key may live in several SSTables across levels, so a read can touch many files. Bloom filters and block caches cut this down dramatically in practice, but the fundamental work is higher than a single tree descent.
- Space amplification: low. Compaction de-duplicates stale versions and SSTables are block-compressed, so live data packs tightly. This is the LSM-tree's win.
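The LSM-tree costs listed above can be sketched with a deliberately crude model. Both functions are illustrative assumptions rather than any engine's actual accounting: `rewrites_per_level` stands in for how many times compaction rewrites a byte at each level, and the read model assumes the key lives in exactly one sorted run that is checked last.

```python
from typing import Optional


def lsm_write_amp(levels: int, rewrites_per_level: float) -> float:
    """One write to the log, one for the MemTable flush, then compaction rewrites."""
    return 1 + 1 + levels * rewrites_per_level


def lsm_point_read_probes(sorted_runs: int, bloom_fpr: Optional[float]) -> float:
    """Expected number of runs read from disk to find one existing key.

    Without Bloom filters every run may have to be read. With them, the run
    holding the key is read once and each other run is read only on a
    false positive.
    """
    if bloom_fpr is None:
        return float(sorted_runs)
    return 1 + bloom_fpr * (sorted_runs - 1)


print(lsm_write_amp(levels=4, rewrites_per_level=3))         # assumed parameters
print(lsm_point_read_probes(sorted_runs=8, bloom_fpr=None))
print(lsm_point_read_probes(sorted_runs=8, bloom_fpr=0.01))
```

With these assumed parameters the write model lands in the double digits, and the Bloom filter brings the expected probes from eight down to just over one, which is the "high, but mitigated" shape described above.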
The B-tree (LMDB, bbolt, SQLite, BaseKV):
- Read amplification: low. One descent from root to leaf, the same path every time, with no ambiguity about where the current value lives. This is the B-tree's win and why it shines on point reads and short scans.
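- Write amplification: low to moderate. A changed key dirties its page, and that page is written once rather than being rewritten by later compactions. Because the unit of writing is a whole page, the byte-level ratio rises when values are much smaller than a page and updates are scattered. Copy-on-write designs like bbolt and LMDB write new page versions rather than overwriting in place, which adds write volume (the changed leaf and the pages above it are copied) but buys reads that do not block on writers and clean crash recovery.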
- Space amplification: moderate. Pages are often left partly full after splits and inserts, and without a compaction process there is nothing to reclaim that slack automatically, so a B-tree tends to occupy somewhat more than its logical data.
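A similar back-of-the-envelope sketch for the B-tree side: the length of the single descent and the space cost of partly full pages. It assumes every page, leaf or internal, holds the same number of entries and ignores page headers and internal pages when estimating space; the key count, fanout, and fill factor are hypothetical.

```python
def btree_height(num_keys: int, keys_per_page: int) -> int:
    """Pages visited on one root-to-leaf descent, assuming a uniform fanout."""
    height = 1
    capacity = keys_per_page
    while capacity < num_keys:
        capacity *= keys_per_page
        height += 1
    return height


def btree_space_amp(fill_factor: float) -> float:
    """Space amplification if pages are, on average, this fraction full."""
    return 1 / fill_factor


print(btree_height(num_keys=100_000_000, keys_per_page=100))  # assumed sizes
print(round(btree_space_amp(fill_factor=0.7), 2))             # assumed fill
```

Under these assumptions, a hundred million keys sit behind a four-page descent, and pages that average 70 percent full cost a little over 40 percent in extra space.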
Read those two lists side by side and the trade is obvious: the LSM-tree spends writes and reads to save space; the B-tree spends space to save reads. Neither is cheating. They are billing you in different currencies.
Reading a benchmark with the three numbers in hand
Next time you see a key-value benchmark, ask three questions before believing the winner.
First, what was the workload mix? A 95-percent-write ingestion benchmark is an LSM-tree's home turf and tells you nothing about how it serves point reads. A point-lookup benchmark flatters a B-tree. The "best" engine is just the one whose strong amplification matched the test.
Second, what was not measured? A throughput chart that omits space used may be hiding an engine that wins QPS while consuming far more disk. A write-throughput chart that omits the compaction debt is measuring the easy moment before the bill arrives. The amplification a benchmark leaves out is often the one its winner is worst at.
Third, what does the cost actually map to for you? If you run on a managed SSD with provisioned IOPS, write amplification is money and drive lifetime. If you store a large dataset, space amplification is your monthly bill. If users feel latency, read amplification is the number that matters. Pick the engine whose cheap amplification is the one your workload and your invoice are most sensitive to.
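The third question can be turned into a rough cost model. Everything numeric below is hypothetical: the two amplification profiles, the prices, and the workloads are made up to show how the ranking flips, and treating byte amplification as a multiplier on provisioned IOPS is itself a simplification.

```python
def monthly_cost(logical_gb, read_iops, write_iops, amp, price_per_gb, price_per_iops):
    """Rough monthly bill: amplified storage plus amplified provisioned IOPS."""
    storage = logical_gb * amp["space"] * price_per_gb
    io = (read_iops * amp["read"] + write_iops * amp["write"]) * price_per_iops
    return storage + io


# Hypothetical profiles and prices, chosen only to illustrate the trade.
LSM = {"read": 3.0, "write": 12.0, "space": 1.1}
BTREE = {"read": 1.5, "write": 3.0, "space": 1.5}
PRICES = dict(price_per_gb=0.10, price_per_iops=0.05)

workloads = {
    "storage-dominated": dict(logical_gb=10_000, read_iops=100, write_iops=100),
    "IO-dominated": dict(logical_gb=500, read_iops=5_000, write_iops=2_000),
}

for name, w in workloads.items():
    lsm = monthly_cost(amp=LSM, **PRICES, **w)
    btree = monthly_cost(amp=BTREE, **PRICES, **w)
    cheaper = "LSM profile" if lsm < btree else "B-tree profile"
    print(f"{name}: lsm={lsm:.2f} btree={btree:.2f} -> {cheaper}")
```

With these invented numbers the space-efficient profile is cheaper when storage dominates the bill and the IO-efficient profile is cheaper when IOPS dominate. Substitute amplifications you have measured and the prices on your own invoice before drawing any conclusion.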
Why this is BaseKV's whole argument
BaseKV stores data in a single B+tree file on disk, so it inherits the B-tree profile above: low read amplification, modest write amplification, and a space cost it accepts on purpose. That choice is not about beating in-memory Redis at peak QPS, and we do not claim BaseKV does. It is about workloads that are read-heavy, durable, and cost-sensitive, where low read amplification gives Redis-like read ergonomics and the storage sits on NVMe SSD at a fraction of RAM's price per gigabyte. The amplification you most want to be low for a cold, mostly-read dataset is read amplification, and that is precisely the one the B-tree wins. The full cost case is in key-value store vs Redis in 2026; this article is the why underneath the numbers there.
The takeaway is small and durable: there is no free storage engine. There is read overhead, write overhead, and space overhead; you can minimize at most two, and the right engine is the one whose unavoidable third cost is the one your workload cares about least.
Related: LSM-Tree vs B-Tree, Single-Threaded vs Multi-Threaded Key-Value Stores, What Is a Key-Value Database?, SQLite vs Key-Value Performance.