LevelDB Tuning

Download the LevelDbTuning.ipynb notebook

LevelDB parameter tuning using MLOS

What is LevelDB

LevelDB is a key-value store built using Log-Structured Merge-Trees (LSM trees). It supports read, write, delete, and range query (sorted iteration) operations.

Like most database systems, LevelDB comes with a set of parameters that can be tuned to the workload to get the best performance. Before describing the parameters, we briefly explain how LevelDB works. The source code, the architecture, and a simple example of how to use LevelDB can be found here.
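As a quick illustration of the basic operations, here is a minimal Python sketch using the third-party plyvel bindings (an assumption: install with pip install plyvel); LevelDB's native API is C++.

import plyvel

# Open (and create) a database directory.
db = plyvel.DB('/tmp/example-db', create_if_missing=True)
db.put(b'key1', b'value1')            # write
print(db.get(b'key1'))                # read -> b'value1'
db.delete(b'key1')                    # delete
# Range query: sorted iteration over a key range.
for key, value in db.iterator(start=b'a', stop=b'z'):
    print(key, value)
db.close()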

How LevelDB works

LevelDB Architecture Diagram

MemTable SSTable Diagrams

LevelDB stores data in 7 levels. The amount of data that each level above level 0 can hold is roughly $10^{\text{level}}$ MB, so level 1 can store around 10 MB of data, level 2 around 100 MB, and so on.

As shown in the diagram above, the main components of LevelDB are the MemTable, the SSTable files, and the log file. LevelDB is primarily optimized for writes.

The MemTable is an in-memory data structure to which incoming writes are added after they are appended to the log file. MemTables are typically implemented using skip lists or B+ trees (LevelDB uses a skip list). The parameter write_buffer_size (supplied at DB startup) controls the size of the MemTable and the log file.

Once the MemTable reaches write_buffer_size (default 4 MB), a new MemTable is created and the original MemTable is made immutable. The immutable MemTable is converted in the background into a new SSTable, which is added to level 0 of the LSM tree.

SSTable: a file in which key-value pairs are stored sorted by key. The size of an SSTable is controlled by the parameter max_file_size (default 2 MB).
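Both parameters can be supplied when the database is opened. As a hedged sketch, using the plyvel bindings from above (assuming a plyvel version that exposes max_file_size):

# Open a DB with tuned startup parameters (illustrative values).
db = plyvel.DB(
    '/tmp/tuned-db',
    create_if_missing=True,
    write_buffer_size=8 * 1024 * 1024,   # MemTable/log size before a flush
    max_file_size=4 * 1024 * 1024        # target SSTable file size
)
db.close()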

Once the number of SSTables at level 0 reaches a threshold controlled by the constant kL0_CompactionTrigger (default 4), these files are merged with the overlapping files in the next level. If no overlapping files are present in the next level, the files are merge-sorted and added to that level, with a new file created for every 2 MB of data by default.

For levels 1 through the maximum level, the compaction (merging) process is triggered when the level exceeds its capacity. A toy sketch of both triggers follows.
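The following Python sketch is not LevelDB code, just a model of the rules described above; the constant name mirrors LevelDB's source.

# Default constant, as described above.
K_L0_COMPACTION_TRIGGER = 4        # level-0 file count that triggers compaction

def level_capacity_bytes(level):
    # Levels >= 1 hold roughly 10^level MB of data.
    return (10 ** level) * 1024 * 1024

def needs_compaction(level, num_files, level_bytes):
    if level == 0:
        # Level 0 compacts on file *count*, since its files may overlap.
        return num_files >= K_L0_COMPACTION_TRIGGER
    # Higher levels compact on total *size* once the level fills up.
    return level_bytes > level_capacity_bytes(level)

print(needs_compaction(0, 4, 0))                  # True
print(needs_compaction(1, 6, 12 * 1024 * 1024))   # True: 12 MB > 10 MB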

A detailed explanation of how LevelDB works is presented here.

LevelDB parameter tuning using MLOS

In this lab we will tune some of the important startup-time parameters of LevelDB and observe how they affect performance. We will tune write_buffer_size and max_file_size to try to optimize the throughput and latency of LevelDB for sequential and random workloads.

LevelDB installation: instructions for Ubuntu 18.04

Follow the commands below to fetch and build LevelDB:

sudo apt update
sudo apt-get install cmake
git clone --recurse-submodules https://github.com/google/leveldb.git
cd leveldb
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Release .. && cmake --build .

Now, from the ~/leveldb/build directory, you should be able to execute ./db_bench, a microbenchmark that measures the performance of LevelDB under different workloads.

Please take a look at the db_bench.cc file in the ~/leveldb/benchmarks directory to get an idea of the possible input parameters and workloads.

An example command that performs 1M random writes with 100-byte values:

./db_bench --benchmarks=fillrandom --value_size=100 --num=1000000

The output of the command will look like this (your numbers may differ):

LevelDB:    version 1.22
Date:       Thu Oct  8 13:56:00 2020
CPU:        40 * Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
CPUCache:   25600 KB
Keys:       16 bytes each
Values:     100 bytes each (50 bytes after compression)
Entries:    1000000
RawSize:    110.6 MB (estimated)
FileSize:   62.9 MB (estimated)
WARNING: Snappy compression is not enabled
------------------------------------------------
Opening the DB now
In the collect stats thread
Total data written = 421.9 MB   
fillrandom :      31.731 micros/op;    3.5 MB/s

In the subsequent cells, we will use the Bayesian optimization Python libraries from MLOS to tune some startup-time parameters and find the values that yield the best throughput and latency.

You will have to install the MLOS Python library and switch to the corresponding conda environment before moving further. Follow the steps at https://microsoft.github.io/MLOS/documentation/01-Prerequisites/#python-quickstart to install it.

import subprocess
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.stats import t
from mlos.Optimizers.OptimizationProblem import OptimizationProblem, Objective
from mlos.Optimizers.BayesianOptimizer import BayesianOptimizer
from mlos.Spaces import SimpleHypergrid, ContinuousDimension, DiscreteDimension

from mlos.Optimizers.BayesianOptimizerConfigStore import bayesian_optimizer_config_store
from mlos.Optimizers.BayesianOptimizerFactory import BayesianOptimizerFactory
from mlos.Spaces import Point
from mlos.Logger import create_logger
import logging

# configure the optimizer, start from the default configuration
optimizer_config = bayesian_optimizer_config_store.default
# set the fraction of randomly sampled configuration to 10% of suggestions
optimizer_config.experiment_designer_config.fraction_random_suggestions = .1
# configure the random forest surrogate model
random_forest_config = optimizer_config.homogeneous_random_forest_regression_model_config
# refit the model after each observation
random_forest_config.decision_tree_regression_model_config.n_new_samples_before_refit = 1
# Use the best split in trees (not random as in extremely randomized trees)
random_forest_config.decision_tree_regression_model_config.splitter = 'best'
# right now we're sampling without replacement so we need to subsample
# to make the trees different when using the 'best' splitter
random_forest_config.samples_fraction_per_estimator = .9
# Use 10 trees in the random forest (usually more are better, 10 makes it run pretty quickly)
random_forest_config.n_estimators = 10
# Set multiplier for the confidence bound
optimizer_config.experiment_designer_config.confidence_bound_utility_function_config.alpha = 0.1
# You will have to change the min and max values based on the startup-time parameter that you want to explore;
# you can change this function or create a similar one for the Going Further questions.
def create_parameter_search_space(parameter_name, min_val, max_val):
    parameter_search_space = SimpleHypergrid(
        name='parameter_config',
        dimensions=[
            DiscreteDimension(name=parameter_name, min=min_val, max=max_val)
        ]
    )
    return parameter_search_space
# Optimization Problem
# You will have to set the min and max values based on the objective that you are using
def create_optimization_problem(parameter_search_space, objective_name, min_val, max_val, minimize=False):
    optimization_problem = OptimizationProblem(
        parameter_space=parameter_search_space,
        objective_space=SimpleHypergrid(name="objectives",
            dimensions=[ContinuousDimension(name=objective_name, min=min_val, max=max_val)]),
        objectives=[Objective(name=objective_name, minimize=minimize)]
    )
    return optimization_problem
# The max value here (1000) is used as an example
# Set the max and the min value appropriately based on the objective metric and the hardware type.
logger = create_logger('Optimizing LevelDB', logging_level=logging.WARN)
def initialize_optimizer():
    parameter_search_space = create_parameter_search_space(parameter_name="write_buffer_size", 
        min_val=1*1024*1024, max_val=128*1024*1024)
    optimization_problem = create_optimization_problem(parameter_search_space, objective_name="throughput",
        min_val=0, max_val=1000, minimize=False)
    optimizer_factory = BayesianOptimizerFactory(logger=logger)
    optimizer = optimizer_factory.create_local_optimizer(
        optimization_problem=optimization_problem,
        optimizer_config=optimizer_config)
    return optimizer
# Please change the leveldb_path to the build directory of your leveldb installation
leveldb_path = "$HOME/leveldb/build/"
# You can change the command to run a different kind of workload (take a look at db_bench.cc to see the possible workloads)
command = "db_bench"

# You might have to change the run_workload function to explore a combination of parameters simultaneously
def run_workload(workload, input_parameter, parameter_value):
    # Execute db_bench with the appropriate parameters; you can change this
    # if you want to specify other input parameters
    result = subprocess.check_output(leveldb_path + command + " --benchmarks=" + workload +
        " --" + str(input_parameter) + "=" + str(parameter_value), shell=True)
    # Parse the output returned by db_bench; its last line looks like
    # "fillrandom :      31.731 micros/op;    3.5 MB/s"
    stats = result.decode().split(":")[-1].split(";")
    latency, throughput = float(stats[0].strip().split(" ")[0]), float(stats[1].strip().split(" ")[0])
    return latency, throughput
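
# For example, a single hypothetical call to check the plumbing before running
# the optimizer (the 8 MB value below is illustrative):
latency, throughput = run_workload("fillrandom", "write_buffer_size", 8 * 1024 * 1024)
print(latency, "micros/op,", throughput, "MB/s")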

def run_optimizer():
    optimizer = initialize_optimizer()
    for i in range(10):
        new_config_values = optimizer.suggest()
        new_parameter_value = new_config_values["write_buffer_size"]
        latency, throughput = run_workload("fillrandom", "write_buffer_size", new_parameter_value)
        print("Parameter value: {0:.2f} MB, Objective value: {1:.2f} MB/s".format(
            float(new_parameter_value)/(1024*1024), float(throughput)))
        if i > 0:
            optimum_parameter, optimum_value = optimizer.optimum() 
            print("Optimal parameter: {0:.2f} MB, Optimal value: {1:.2f} MB/s".format(
                float(optimum_parameter["write_buffer_size"])/(1024*1024), optimum_value["throughput"]))
        objectives_df = pd.DataFrame({'throughput': [throughput]})
        features_df = new_config_values.to_dataframe()
        optimizer.register(features_df, objectives_df)

# run_optimizer calls initialize_optimizer on every invocation so that the optimizer
# does not remember observations from a previous run
run_optimizer()
Parameter value: 4.34 MB, Objective value: 13.20 MB/s
Parameter value: 63.73 MB, Objective value: 33.00 MB/s
Optimal parameter: 4.34 MB, Optimal value: 13.20 MB/s
Parameter value: 110.00 MB, Objective value: 31.60 MB/s
Optimal parameter: 63.73 MB, Optimal value: 33.00 MB/s
Parameter value: 40.79 MB, Objective value: 33.80 MB/s
Optimal parameter: 63.73 MB, Optimal value: 33.00 MB/s
Parameter value: 91.23 MB, Objective value: 32.50 MB/s
Optimal parameter: 40.79 MB, Optimal value: 33.80 MB/s
Parameter value: 45.60 MB, Objective value: 29.00 MB/s
Optimal parameter: 40.79 MB, Optimal value: 33.80 MB/s
Parameter value: 75.85 MB, Objective value: 29.60 MB/s
Optimal parameter: 40.79 MB, Optimal value: 33.80 MB/s
Parameter value: 105.25 MB, Objective value: 31.80 MB/s
Optimal parameter: 40.79 MB, Optimal value: 33.80 MB/s
Parameter value: 100.94 MB, Objective value: 32.30 MB/s
Optimal parameter: 40.79 MB, Optimal value: 33.80 MB/s
Parameter value: 95.30 MB, Objective value: 32.50 MB/s
Optimal parameter: 40.79 MB, Optimal value: 33.80 MB/s

Verification

Manually run the benchmark for various values of the parameter you are testing, plot the results, and verify that the optimum returned by the optimizer matches the one obtained manually. For example, if the input_parameter is write_buffer_size, you can sweep from 2 MB (2097152) up to 64 MB (67108864), trying 2 MB, 4 MB, 8 MB, 16 MB, 32 MB, and 64 MB, and find the inflection point, i.e., the point where throughput stops increasing (or latency stops decreasing). Check whether it matches the optimum returned by the optimizer; a sketch of such a sweep follows.
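A minimal sketch of the manual sweep, reusing run_workload and the matplotlib import from earlier (the parameter values are the ones suggested above):

# Sweep write_buffer_size and record the resulting throughput.
buffer_sizes_mb = [2, 4, 8, 16, 32, 64]
throughputs = []
for size_mb in buffer_sizes_mb:
    _, throughput = run_workload("fillrandom", "write_buffer_size", size_mb * 1024 * 1024)
    throughputs.append(throughput)

plt.plot(buffer_sizes_mb, throughputs, marker='o')
plt.xscale('log')
plt.xlabel('write_buffer_size (MB)')
plt.ylabel('throughput (MB/s)')
plt.title('Manual sweep of write_buffer_size')
plt.show()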

Going further

In the questions below, you can choose to optimize either throughput or latency (choose one). Plot graphs indicating how many iterations the optimizer takes to converge; a sketch for tracking convergence is shown right after this paragraph. Report the optimal parameter values obtained and the value of the objective metric at those parameter values.
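A hedged sketch for recording the best objective seen after each iteration, reusing the helper functions defined above (the iteration count of 30 is illustrative):

# Track the optimum after each iteration to plot convergence.
optima = []
optimizer = initialize_optimizer()
for i in range(30):
    config = optimizer.suggest()
    latency, throughput = run_workload("fillrandom", "write_buffer_size",
                                       config["write_buffer_size"])
    optimizer.register(config.to_dataframe(), pd.DataFrame({'throughput': [throughput]}))
    _, optimum_value = optimizer.optimum()
    optima.append(optimum_value["throughput"])

plt.plot(range(1, len(optima) + 1), optima)
plt.xlabel('iteration')
plt.ylabel('best throughput so far (MB/s)')
plt.show()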

  1. Choose 2 parameters from the leveldb/include/leveldb/options.h file (these can include write_buffer_size and max_file_size), tune them both manually and with the optimizer, and compare the results. Plot a graph of the optimal value vs. the iteration.

    Hint: These parameters can be passed in as startup time parameters, look at leveldb/benchmarks/db_bench.cc for the possible startup time parameters.

  2. Try tuning the performance metric using a combination of startup-time parameters. For example, optimize write_buffer_size and max_file_size together to obtain the best throughput.

    Hint: You will have to change the parameter search space to add a second parameter; look at the SmartCache example, and see the sketch after this list.

  3. Apart from the startup-time parameters, LevelDB has compile-time parameters (in leveldb/db/dbformat.h). Choose parameters from this file that you think would affect throughput or latency, make them startup-time parameters, and tune them using MLOS (some candidates: kL0_CompactionTrigger, kL0_SlowdownWritesTrigger, kL0_StopWritesTrigger, etc.). Look at where these parameters are used and what effect they can have on performance.

    Hint: To make these parameters configurable at startup, look at the Open() function in leveldb/benchmarks/db_bench.cc and note how startup-time parameters are passed via the Options structure.
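For question 2, here is a minimal sketch of a two-parameter search space using the same mlos classes imported above (the ranges are illustrative; the SmartCache example shows the full pattern):

# A search space with one DiscreteDimension per tuned parameter.
two_parameter_search_space = SimpleHypergrid(
    name='parameter_config',
    dimensions=[
        DiscreteDimension(name='write_buffer_size', min=1*1024*1024, max=128*1024*1024),
        DiscreteDimension(name='max_file_size', min=1*1024*1024, max=32*1024*1024)
    ]
)
# A suggested Point then carries both values:
# config = optimizer.suggest()
# config["write_buffer_size"], config["max_file_size"]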
