Log-Structured Key-Value Store
- An append-only log records every
SetandDeleteoperation as an immutable entry. - Looking up the current value of a key requires scanning the log for the most recent entry.
- A
Deleteentry acts as a tombstone, invalidating all earlier entries for the same key. - Compaction folds the log into a
Dictto produce a minimal equivalent log with no shadowed entries.
The Log-Structured Idea
- Most databases that need high write throughput use an append-only design
- Instead of updating a value in place, every change is written as a new record at the end of a file
- Reads scan the log to find the most recent write for a given key
- Start with the most recent entry and work backward to find a match
- Old records are periodically compacted to save storage
The Entry Type
- Every operation on the store is one of two things:
pub type Entry {
Set(key: String, value: String)
Delete(key: String)
}
Set(key, value)stores a value under a keyDelete(key)removes a key- Both variants carry
Stringfields with labels so they can be accessed by name (entry.key) as well as by pattern matching - The entire store state is
List(Entry) - This is an event log
- The log is never modified, only extended
- Functions that query the store take the log as an argument
Reading from the Log
- To look up a key, scan from the beginning and return the first matching entry:
- Start at the beginning because new entries are prepended
pub fn get(log: List(Entry), key: String) -> Option(String) {
get_scan(log, key)
}
fn get_scan(log: List(Entry), key: String) -> Option(String) {
case log {
[] -> None
[Set(k, v), ..] if k == key -> Some(v)
[Delete(k), ..] if k == key -> None
[_, ..rest] -> get_scan(rest, key)
}
}
getis a thin wrapper;get_scandoes the recursion[Set(k, v), ..rest] if k == key -> Some(v): the guard checks the key, and if it matches, returns the value immediately[Delete(k), ..rest] if k == key -> None: the key was deleted, soNoneis correct even if earlier entries set it-
[_, ..rest] -> get_scan(rest, key): skip this entry and keep scanning -
The return type is
Option(String)Nonemeans the key is not in the store (either never set or deleted)
- This is O(n) per read in the worst case
- That is acceptable for a demo
- In production, maintain an in-memory index so reads are O(1) — see the next section
Indexed Lookup
- Scanning the log on every read is O(n)
- Building a
Dictindex makes each lookup O(1):
pub fn build_index(log: List(Entry)) -> dict.Dict(String, String) {
list.fold(list.reverse(log), dict.new(), fn(acc, entry) {
case entry {
Set(k, v) -> dict.insert(acc, k, v)
Delete(k) -> dict.delete(acc, k)
}
})
}
pub fn get_indexed(
index: dict.Dict(String, String),
key: String,
) -> Option(String) {
case dict.get(index, key) {
Ok(v) -> Some(v)
Error(_) -> None
}
}
build_indexfolds the reversed log (oldest-first) into aDict(String, String), keeping only the most recent value for each live keyget_indexedwrapsdict.get: O(1) per call- Build the index once when the log is loaded; rebuild it after any write
Listing Live Keys
The keys function returns all keys that currently have a value:
pub fn keys(log: List(Entry)) -> List(String) {
log
|> list.fold(dict.new(), fn(acc, entry) {
case entry {
Set(k, _) -> dict.insert(acc, k, True)
Delete(k) -> dict.delete(acc, k)
}
})
|> dict.keys
}
- The fold threads a
Dict(String, Bool)through the log Setinserts the key;Deleteremoves itdict.keysextracts the surviving keys- This is O(n) in the log length and O(k) in the number of live keys
Compaction
- A long-running store accumulates many shadowed entries
- Compaction replaces the full log with a minimal equivalent one:
pub fn compact(log: List(Entry)) -> List(Entry) {
let live_dict =
list.fold(list.reverse(log), dict.new(), fn(acc, entry) {
case entry {
Set(k, v) -> dict.insert(acc, k, v)
Delete(k) -> dict.delete(acc, k)
}
})
dict.to_list(live_dict)
|> list.map(fn(item) { let #(k, v) = item Set(k, v) })
}
- The log is reversed so that the most recent entry for each key comes first when folding
dict.insertfor aSetrecords the latest value- If the key was already seen (i.e., there was a later entry) it is overwritten
- Which is fine because we want the latest value
dict.deletefor aDeleteremoves the key- After the fold, the dict contains only the live key-value pairs
dict.to_listconverts it back to(k, v)pairs wrapped inSet- After compaction,
Deleteentries disappear- They have already been applied by removing the key from the dict
Serialisation
- To persist the log across restarts, each entry is written as a text line:
fn compact_to_lines(log: List(Entry)) -> List(String) {
log
|> list.map(fn(entry) {
case entry {
Set(k, v) -> "SET|" <> k <> "|" <> v
Delete(k) -> "DEL|" <> k
}
})
}
fn parse_lines(lines: List(String)) -> List(Entry) {
list.map(lines, fn(line) {
case string.split(line, "|") {
["SET", k, v] -> Set(k, v)
["DEL", k] -> Delete(k)
_ -> Set("error", "bad line")
}
})
}
"SET|key|value"encodes aSetentry;|separates the fields"DEL|key"encodes aDeleteentryparse_linesreverses the process usingstring.split(line, "|")and pattern matching on the resulting list- In production code, writing to a file uses
simplifile.write_bitsorsimplifile.append
Testing
pub fn set_and_get_test() {
let log = [Set("x", "1")]
get(log, "x")
|> should.equal(Some("1"))
}
pub fn overwrite_test() {
// newest entry is prepended; Set("x","2") is the most recent write
let log = [Set("x", "2"), Set("x", "1")]
get(log, "x")
|> should.equal(Some("2"))
}
pub fn delete_test() {
// Delete is prepended after Set, so it is the most recent entry
let log = [Delete("x"), Set("x", "1")]
get(log, "x")
|> should.equal(None)
}
- All tests use an in-memory
List(Entry); no file I/O needed - Each log is newest-first: the most recently written entry is at the front of the list
overwrite_testverifies that the entry at the front of the list winsdelete_testconfirms that aDeleteat the front returnsNoneeven though an earlierSetexists later in the list
Check Understanding
A log contains ten entries, all setting key "x" to different values.
After calling compact, how many entries does the result contain for "x", and why?
One entry.
compact folds the reversed log into a Dict, so each key appears at most once.
The fold processes entries from oldest to newest;
dict.insert overwrites on each call, so the newest value wins.
All earlier entries for "x" are shadowed and discarded.
The log [Delete("x"), Set("x", "1")] is newest-first.
What does get(log, "x") return?
What would it return if the list were reversed to [Set("x", "1"), Delete("x")],
and what real-world operation sequence does each order represent?
For [Delete("x"), Set("x", "1")], get returns None:
get_scan matches Delete("x") at the head and stops immediately.
This represents: key "x" was set first, then deleted.
For [Set("x", "1"), Delete("x")], get returns Some("1"):
get_scan matches Set("x", "1") at the head and stops.
This represents: key "x" was deleted first, then set again.
The ordering of the list encodes the direction of time.
Exercises
Round-trip serialisation (15 minutes)
Implement to_lines(log: List(Entry)) -> List(String) and
from_lines(lines: List(String)) -> List(Entry).
Write tests confirming that log |> to_lines |> from_lines equals the
original log for a mix of Set and Delete entries.
Indexed get (15 minutes)
Add a function that builds a Dict(String, String) index from the log
and uses it for O(1) lookups.
Compare the interface to the scan-based get:
both return Option(String), so callers do not need to change.
Write-ahead log (20 minutes)
Extend the demo to write each new Entry to a text file using simplifile.
Add a load(path: String) -> Result(List(Entry), String) function
that reads and parses the file on startup.
Write a test that writes a few entries, loads them back, and confirms the result.