Building A Log-Structured Merge-Tree: Part 3
December 13, 2020
Last time we looked at flushing the memtable to disk and by producting a series of Static Sorted Table (SST) files that held all of the memtable’s key value pairs. I think that maybe that blog came out of order, but we’re going to roll with it. We’ll just have to do some hand-waving this time around to progress through this blog. We’ll introduce an SSTManager
to help implement the public interface of the LSMT, and we’ll fill it in next time. For now, we can understand the SSTManager
to be abstract and not dwell on the implementation.
Internal State
The lsmt
struct we will store the internal state for the running database. We’ll keep the user provided Options
, the active memtable, any memtables waiting to flushed to disk (inactive memtables), the SSTManager
which will manage our on disk SST files, a lock for flushing to disk (we’ll build out the plumbing for flushing to disk in a later blog), and a closed
flag.
We’ll go over how these are used in the public interface in this blog and get into more detail in subsequent blogs. For now know that we’re going to hand the user an instance of the lsmt
struct when they open the database.
Public Interface
The public interface for the LSMT is going to be pretty small and straightforward. Over time we’ll piece together the components behind the interface, but from here on out this will be a stable interface. We’ll need to keep this interface in mind and remember that everything under the hood is in service of these five functions.
Write and Delete
From previous posts we’ll recall that that only the memtable accepts writes. Since Write
and Delete
are the only non-read functions in the public interface it might be easiers to start here since the scope of the implementation is then limited to effecting the memtable.
The implementation for Write
is here. We do some checks up front, namely, we ensure that the database isn’t closed, that the key and value aren’t nil or of size 0, and that the key and value conform to user specified options. If the write passes those checks then we simply invoke the Write
method on the active memtable.
Delete
is very similar, except it only takes a key instead of the key and value pair. The implementation for Delete
is here. We’ll notice that Delete
actually performs a Write
to the active memtable and that it writes a Tombstone
value. This means that in our functions that perform reads we’ll need to treat Tombstone
values as deleted values. Since the LSMT works in levels we use Tombstone to track deletes through the levels. The special tombstone value can only be written during Delete
s.
Get
The Get implementation is going to move through the available layers until it can find a value for the key being searched. There’s some special case logic in here for the Tombstone
value. If, at any point, a Get
to an underlying Memtable
or the SSTManager
returns a Tombstone
value then we’ll return nil
to the user to signify that the key has no value.
Iterator
At a high level, the Memtable
and the SSTManager
both provide Iterator
s to their own data. However, users of the LSMT don’t necessary want or need to know about the implementation details here, to the user there is only one Iterator
type and it’s for the entire LSMT. Since this is the public interface’s Iterator
implementation we’ll have to hide all of the implementation details.
In order to provide one single iterator then we’ll need an abstraction that allows many Iterator
s to compose into one Iterator
instance. In service of this we’ll introduce the MergedIterator. The MergedIteraor
will compose iterators, but has a few properties that are desirable for our use case. First, it will only produce one value per key. That means if that two or more of the underlying Iterator
s have the same key, only one key/value pair for that key will be returned.
Second, is that it resolves the multiple key problem by looking at Iterator
s in the order they are passed in. In our case, that means that we’ll order the Memtable
before the SSTManager
such that if both of their Iterator
s have the same key we’ll take the Memtable
’s value for that key since it’s the most up to date.
Later on we’ll see that we might need to even compose MergedIterator
s into MergedIterator
s and we’ll see how powerful of an abstraction this can be. But, for now we’ll use a MergedIterator
in our public interface’s Iterator
implementation.
One thing to note in this implementation is that we have the inactiveMemtables
which we’ll order behind the activeMemtable
but in front of the SSTManager
.
Close
Close will cause the database to stop accepting writes and do the background work to stop the LSMT. For now we’ll just set the closed
struct value to true
to signal that the database has been closed.
Next Up
Next time we’ll start to look at an implementation of the SSTManager
for block based SST storage.