🦀 RedoxQL is an L-Store database written in Rust and Python 🚀
🥇 Fastest database speed in the class for milestone 2 (Winter 2025) ⚡
RedoxQL is our implementation of an L-Store database.
> **Important:** Read the Structure section. We use both Rust and Python, and they go in different places.
Create a virtual environment:

```shell
python3 -m venv venv
```

Source the virtual environment:

```shell
source venv/bin/activate
```

Install maturin:

```shell
pip install -r requirements.txt
```

Build the Rust code:

```shell
maturin build --release
```

Install the module (note: the version will change, so check the exact filename in `target/wheels/`):

```shell
pip install target/wheels/lstore* --force-reinstall
```

Run the database benchmark:

```shell
python3 __main__.py
```
You should see this for milestone 1:

```
(venv) ecs-165a-database (main) λ p __main__.py
Inserting 10k records took: 0.0077650810000000035
Updating 10k records took: 0.020893269
Selecting 10k records took: 0.016048745000000003
Aggregate 10k of 100 record batch took: 0.0039221569999999956
Deleting 10k records took: 0.002314741000000009
(venv) ecs-165a-database (main) λ
```
- Keanu - Secondary indexes, page.rs and all the page stuff, index.rs and all of the index stuff
- Lucas & Andrew - update
- Lucas - Merge, select_version, sum_version, matching
- Abdulrasol - Merge, BaseContainer, TailContainer, PageDirectory, insert into containers, RecordAddress and Record
- Jake - Persistence, RTable, RDatabase, RQuery, new RIndex, python bindings, inserts and reads for all props
```
.
├── benches
│   └── db_benchmark.rs
├── Cargo.lock
├── Cargo.toml
├── docs
│   └── old-system-diagrams.md
├── LICENSE
├── main_checking.py
├── __main__.py
├── Makefile
├── pyproject.toml
├── python
│   ├── benchmarks
│   │   ├── graph_scale.py
│   │   ├── scaling_tester.py
│   │   ├── simple_example.py
│   │   ├── simple_tester.py
│   │   └── speedtests.py
│   ├── lstore
│   │   ├── db.py
│   │   ├── __init__.py
│   │   ├── query.py
│   │   ├── transaction.py
│   │   └── transaction_worker.py
│   └── tests
│       ├── __init__.py
│       └── test_main.py
├── README.md
├── requirements.txt
└── src
    ├── container.rs
    ├── database.rs
    ├── index.rs
    ├── lib.rs
    ├── pagerange.rs
    ├── page.rs
    ├── query.rs
    ├── record.rs
    ├── system.rs
    └── table.rs
```
The lstore (`./lstore`) directory is where the Python code goes. This is what gets called by the tests in `__main__.py`. The lstore directory can be referred to as a Python module.
The src (`./src`) directory is where the Rust code goes. This gets called by the code in the lstore module.
- `system.rs` - all the functions and structs related to getting information from the system machine
- `database.rs` - the database struct and the related functions
In the demo project, we made an API that uses our database to store grades. Here we are sending a PUT and then doing a GET. After a simple fix made after we submitted, we were able to get everything working.
- `exam_tester_m3_part1.py` and `exam_tester_m3_part2.py`
- `m3_tester_part_1.py` and `m3_tester_part_2.py`
```shell
cargo test
```

Here is what the correct output should look like: you should see multiple tests passing. Rust unit tests are located in each Rust file and can be found in `./src`. The integration tests are located at `./tests` and are also run with `cargo test`.
```shell
pytest
```

Python tests are located in a separate directory called `tests`, located in `./python`.
Rust has a way of making docs from the source code. Run `cargo doc` and view the produced HTML page in your browser. Comments starting with `///` will be included in these docs.
You may need to install flamegraph with `cargo install flamegraph`.

```shell
cargo flamegraph --test update_tests
# Open the svg (it's nice to view in a browser)
firefox flamegraph.svg
```
Preview:
This will take a long time, but you can run benchmarks separately.

```shell
cargo bench
```

You can use `cargo bench` to see if your changes significantly affect performance. Small fluctuations can show up randomly even when the code hasn't changed, so try to run the bench as the only thing running on the system.
Maturin is what builds our Python package. Maturin has a `--release` flag that significantly improves speed: it runs Cargo in release mode, which applies all of the optimization tricks to speed up the executable as much as possible.
Data for both graphs can be found here
With compiler options:

```
(venv) redoxql (main) λ p __main__.py
Inserting 10k records took: 0.006803145
Updating 10k records took: 0.018702902999999996
Selecting 10k records took: 0.016315803000000004
Aggregate 10k of 100 record batch took: 0.005981531999999998
Deleting 10k records took: 0.002332115999999995
(venv) redoxql (main) λ time p m1_tester.py
Insert finished

real	0m0.117s
user	0m0.106s
sys	0m0.010s

(venv) redoxql (main) λ time p exam_tester_m1.py
Insert finished

real	0m0.282s
user	0m0.272s
sys	0m0.010s
(venv) redoxql (main) λ
```
Without compiler options:

```
(venv) redoxql (main) λ p __main__.py
Inserting 10k records took: 0.007401888000000002
Updating 10k records took: 0.018957811999999997
Selecting 10k records took: 0.015054888999999995
Aggregate 10k of 100 record batch took: 0.003300163000000002
Deleting 10k records took: 0.002181812999999991
(venv) redoxql (main) λ time p m1_tester.py
Insert finished

real	0m0.112s
user	0m0.108s
sys	0m0.004s

(venv) redoxql (main) λ time p exam_tester_m1.py
Insert finished

real	0m0.268s
user	0m0.254s
sys	0m0.014s
```
To use the logger, import the `debug`, `error`, or `info` macro from the `log` crate. Then you can add the macros to code like `debug!("Start database!");`. When you go to run the code, set the env var `RUST_LOG=debug`.

Docs: https://docs.rs/env_logger/latest/env_logger/
We started experimenting with PyPy, an alternative Python runtime that is supposedly faster. Because of Amdahl's law, we actually can't get all that much performance out of it: only the Python fraction of the runtime can improve, and most of our time is spent in Rust. We also found that the newest version of PyPy cannot use the newest version of PyO3, so future work is needed to get them to run together.

Future questions:
- How much time does the Python part take up?
- How do we measure the improvement from Python to PyPy?
- How do we downgrade PyPy to work with PyO3?
We opted to go with CPython because it's much faster in this use case. We show the results of some tests here.
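The Amdahl's law argument above can be made concrete with a few lines of arithmetic. This is a sketch (the 20% Python fraction and 5x PyPy speedup are made-up illustrative numbers, not measurements from this project):

```python
def amdahl_speedup(python_fraction: float, python_speedup: float) -> float:
    """Overall speedup when only the Python fraction of the runtime gets faster."""
    return 1.0 / ((1.0 - python_fraction) + python_fraction / python_speedup)

# If only 20% of total runtime is Python and PyPy made that part 5x faster,
# the whole benchmark improves by far less than 5x:
print(round(amdahl_speedup(0.20, 5.0), 3))  # → 1.19
```

So even a large per-interpreter win barely moves the total when the Rust side dominates, which is why switching runtimes bought us so little.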
We use py-spy to profile the Python side, and we got expected results: it shows that the main Python cost is just the update function when testM2.py is run. Here is how we ran it:

```shell
py-spy record -o profile.svg -- python3 testM2.py
```
We used to make a ReturnRecord object for every single row! We would also turn the result of rquery into a list, wrap each row in a ReturnRecord, and then make that back into a new list. ReturnRecord would also do list truncation each time it was initialized (once per row). We moved all of this logic into Rust and now simply return what the Rust function returns. This change improved performance by nearly 30%: the speed of testM2.py went from 1.30 seconds to 0.92 seconds. These results were consistent across 3 runs each, done in an interwoven fashion (with the change, then without the change, then with the change again, etc.) for a total of 6 runs. The change can be seen in PR#162.
```diff
-class ReturnRecord:
-    def __init__(self, columns: List[int]):
-        self.columns = columns[4:]
-
-    def __str__(self):
-        return f"Record({self.columns})"

-        return [
-            ReturnRecord(
-                list(
-                    self.rquery.select_version(
-                        search_key,
-                        search_key_index,
-                        projected_columns_index,
-                        relative_version,
-                    )
-                )
-            )
-        ]
+        return self.rquery.select_version(
+            search_key,
+            search_key_index,
+            projected_columns_index,
+            relative_version,
+        )
```
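The cost of that per-row wrapping is easy to reproduce in isolation. This micro-benchmark is a simplified sketch (the class and row shapes are stand-ins, not the project's actual code), but it shows why dropping the wrapper layer helps:

```python
import timeit

class ReturnRecord:
    """Simplified stand-in for the wrapper removed in PR#162."""
    def __init__(self, columns):
        # list truncation performed once per row, like the old code
        self.columns = columns[4:]

rows = [list(range(10)) for _ in range(10_000)]

# Old shape: copy each row, wrap it in an object, rebuild the list
wrapped = timeit.timeit(lambda: [ReturnRecord(list(r)) for r in rows], number=20)
# New shape: hand back what the Rust call already returned
direct = timeit.timeit(lambda: rows, number=20)

print(f"wrapped: {wrapped:.4f}s  direct: {direct:.4f}s")
```

The absolute numbers depend on the machine, but the wrapped version always pays for one object allocation and one list copy per row.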
This includes `insert` and `update`.
The Rust default HashMap is a great general-purpose HashMap implementation and is very fast, but FxHashMap is a decent bit faster. After changing the page directory to use FxHashMap in PR#186, the speed of many reads and writes improved by over 28%, and the overall speed of all the benchmarks improved by over 26%.
```diff
 use crate::container::{ReservedColumns, NUM_RESERVED_COLUMNS};
 use crate::index::RIndexHandle;
 use pyo3::prelude::*;
+use rustc_hash::FxHashMap;
 use serde::{Deserialize, Serialize};
 use std::collections::HashMap;
 use std::sync::{Arc, Mutex, RwLock};

+type RedoxQLHasher<K, V> = FxHashMap<K, V>;

 #[derive(Default, Clone, Serialize, Deserialize)]
 pub struct PageDirectoryMetadata {
-    pub directory: HashMap<i64, RecordMetadata>,
+    pub directory: RedoxQLHasher<i64, RecordMetadata>,
 }

 #[derive(Default, Clone)]
 pub struct PageDirectory {
-    pub directory: HashMap<i64, Record>,
+    pub directory: RedoxQLHasher<i64, Record>,
 }

 // -- snip --
```
You can set specific build flags that improve the optimization.

```toml
[profile.release]
lto = "fat"
codegen-units = 1
opt-level = 3
```

- `lto = "fat"` is the max setting for link-time optimization, meaning that when it links the binary together, it does the maximum amount of optimizations.
- `codegen-units = 1` means that rustc will not split up the source code into different parts to compile separately on different threads. Because of this, it's able to apply optimization tricks like inlining between libraries and our project source code.
- `opt-level = 3` is just setting the optimization level. The reason you might want it lower is compilation speed.
After reading bits from O'Reilly's High Performance Python, 3rd Edition, I found out about profiling with snakeviz.

```shell
pip install snakeviz
python3 -m cProfile -o profile.prof testM2.py
snakeviz profile.prof
```
Here is the output from testM2.py
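The same cProfile data can also be collected from inside a script and inspected with the stdlib `pstats` module, without leaving the terminal. A minimal sketch (`busy_work` is a hypothetical stand-in workload, not part of this project):

```python
import cProfile
import io
import pstats

def busy_work():
    """Stand-in workload; in practice you would profile testM2.py instead."""
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
busy_work()
profiler.disable()

# Print the five most expensive entries, sorted by cumulative time
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```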
Python gets compiled into bytecode, and we can see it using the `dis` library. This can be useful for spotting bytecode that might be unneeded.
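For example, disassembling a tiny function with the stdlib `dis` module prints one line per bytecode instruction (`add_one` is just an illustrative function):

```python
import dis

def add_one(x):
    return x + 1

# Show the bytecode CPython will actually execute for this function
dis.dis(add_one)
```

The exact opcodes vary between CPython versions, which is itself a reminder that this view is an implementation detail of the interpreter.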
- Improve hashing speed for page directory by over 26%
- ~70% speed improvement with lazy select (Not used because the API expects an iterable)
- Move select return record logic all to Rust to achieve 30% speed improvement