Building and Testing
Relevant source files
- .github/workflows/rust-tests.yml
- .gitignore
- Cargo.toml
- benches/storage_benchmark.rs
- src/main.rs
- src/utils/format_bytes.rs
- tests/concurrency_tests.rs
This page provides instructions for building SIMD R Drive from source, running the test suite, and executing performance benchmarks. It covers workspace configuration, feature flags, test types, and benchmark utilities. For information about CI/CD pipeline configuration and automated quality checks, see CI/CD Pipeline.
Prerequisites
Required Dependencies:
- Rust toolchain (stable channel)
- Cargo package manager (bundled with Rust)
Optional Dependencies:
- Python 3.10-3.13 (for Python bindings)
- Maturin (for building Python wheels)
The project uses Rust 2024 edition and requires no platform-specific dependencies beyond the standard Rust toolchain.
Sources: Cargo.toml:1-13
Workspace Structure
SIMD R Drive is organized as a Cargo workspace with multiple crates. Understanding the workspace layout is essential for targeted builds and tests.
Workspace Members:
```mermaid
graph TB
    subgraph "Workspace Root"
        Root["Cargo.toml\nWorkspace Configuration"]
    end
    subgraph "Core Crates"
        Core["simd-r-drive\n(Root Package)"]
        EntryHandle["simd-r-drive-entry-handle\n(Zero-Copy API)"]
        Extensions["extensions/\n(Utility Functions)"]
    end
    subgraph "Network Experiments"
        WSServer["experiments/simd-r-drive-ws-server\n(WebSocket Server)"]
        WSClient["experiments/simd-r-drive-ws-client\n(Rust Client)"]
        ServiceDef["experiments/simd-r-drive-muxio-service-definition\n(RPC Contract)"]
    end
    subgraph "Python Bindings (Excluded)"
        PyBindings["experiments/bindings/python\n(PyO3 Direct)"]
        PyWSClient["experiments/bindings/python-ws-client\n(WebSocket Client)"]
    end
    Root --> Core
    Root --> EntryHandle
    Root --> Extensions
    Root --> WSServer
    Root --> WSClient
    Root --> ServiceDef
    PyBindings -.->|excluded from workspace| Root
    PyWSClient -.->|excluded from workspace| Root
    Core --> EntryHandle
    WSServer --> ServiceDef
    WSClient --> ServiceDef
```
| Crate | Path | Purpose |
|---|---|---|
| simd-r-drive | . | Core storage engine |
| simd-r-drive-entry-handle | ./simd-r-drive-entry-handle | Zero-copy data access API |
| simd-r-drive-extensions | ./extensions | Utility functions and helpers |
| simd-r-drive-ws-server | ./experiments/simd-r-drive-ws-server | Axum-based WebSocket server |
| simd-r-drive-ws-client | ./experiments/simd-r-drive-ws-client | Native Rust WebSocket client |
| simd-r-drive-muxio-service-definition | ./experiments/simd-r-drive-muxio-service-definition | RPC service contract |
Excluded from Workspace:
- `experiments/bindings/python` - Python direct bindings (built separately with Maturin)
- `experiments/bindings/python-ws-client` - Python WebSocket client (built separately with Maturin)
Python bindings are excluded because they require a different build system (Maturin) and have incompatible dependency resolution requirements.
Sources: Cargo.toml:65-78
Building the Project
Basic Build Commands
- Build all workspace members
- Build with release optimizations
- Build a specific crate
- Build all targets (binaries, libraries, tests, benchmarks)
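The corresponding Cargo invocations are the standard ones (package names as listed in the workspace table above):

```shell
# Build all workspace members (debug profile)
cargo build --workspace

# Build with release optimizations
cargo build --workspace --release

# Build a specific crate by package name
cargo build -p simd-r-drive

# Build all targets: binaries, libraries, tests, and benchmarks
cargo build --workspace --all-targets
```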
Sources: .github/workflows/rust-tests.yml:54
Feature Flags
The simd-r-drive crate supports optional feature flags that enable additional functionality:

```mermaid
graph LR
    subgraph "Feature Flags"
        Default["default\n(empty set)"]
        Parallel["parallel\nEnables rayon"]
        ExposeAPI["expose-internal-api\nExposes internal symbols"]
        Arrow["arrow\nArrow integration"]
        AllFeatures["--all-features\nEnable everything"]
    end
    subgraph "Dependencies Enabled"
        Rayon["rayon = 1.10.0\nParallel iteration"]
        EntryArrow["simd-r-drive-entry-handle/arrow\nArrow conversions"]
    end
    Parallel --> Rayon
    Arrow --> EntryArrow
    AllFeatures --> Parallel
    AllFeatures --> ExposeAPI
    AllFeatures --> Arrow
```
Feature Flag Reference:
| Feature | Dependencies | Purpose |
|---|---|---|
| `default` | None | Standard storage engine only |
| `parallel` | `rayon = "1.10.0"` | Parallel iteration support for bulk operations |
| `expose-internal-api` | None | Exposes internal APIs for advanced use cases |
| `arrow` | `simd-r-drive-entry-handle/arrow` | Enables Apache Arrow integration |
Build Examples:
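For instance, using the flags from the reference table above:

```shell
# Minimal build without default features
cargo build --no-default-features

# Enable parallel iteration support
cargo build --features parallel

# Combine feature flags
cargo build --features parallel,expose-internal-api

# Enable every optional feature
cargo build --all-features
```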
Sources: Cargo.toml:49-55 Cargo.toml:30
CLI Binary
The workspace includes a CLI binary for direct interaction with the storage engine:
The CLI entry point is defined at src/main.rs:1-12 and delegates to command execution logic in the cli module.
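A typical way to invoke the binary during development (assuming the CLI exposes the usual help output; consult `--help` for the actual subcommands, which are not documented on this page):

```shell
# Run the CLI from the workspace root; arguments after `--` go to the binary
cargo run --release -- --help
```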
Sources: src/main.rs:1-12
Running Tests
Test Organization
Test Categories:
- Unit Tests - Embedded in source files, test individual functions and modules
- Integration Tests - Located in the `tests/` directory, test public API interactions
- Concurrency Tests - Multi-threaded scenarios validating thread safety
- Documentation Tests - Code examples in documentation comments
Sources: Cargo.toml:36-47
Running All Tests
Execute the complete test suite:
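This mirrors the CI invocation in .github/workflows/rust-tests.yml:

```shell
# Run unit, integration, concurrency, and documentation tests
cargo test --workspace --all-targets --verbose
```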
Sources: .github/workflows/rust-tests.yml:57
Concurrency Tests
Concurrency tests validate thread-safe operations and are located in tests/concurrency_tests.rs:1-230. These tests use `serial_test` to prevent parallel execution and `tokio` for the async runtime.
Test Scenarios:
```mermaid
graph TB
    subgraph "Concurrency Test Structure"
        TestAttr["#[tokio::test(flavor=multi_thread)]\n#[serial]"]
        TempDir["tempfile::tempdir()\nIsolated Storage"]
        DataStore["Arc<DataStore>\nShared Storage"]
        Tasks["tokio::spawn\nConcurrent Tasks"]
    end
    subgraph "Test Scenarios"
        StreamTest["concurrent_slow_streamed_write_test\nParallel stream writes"]
        WriteTest["concurrent_write_test\n16 threads × 10 writes"]
        InterleavedTest["interleaved_read_write_test\nRead/Write coordination"]
    end
    TestAttr --> TempDir
    TempDir --> DataStore
    DataStore --> Tasks
    Tasks --> StreamTest
    Tasks --> WriteTest
    Tasks --> InterleavedTest
```
1. Concurrent Slow Streamed Write Test (tests/concurrency_tests.rs:16-109):
- Simulates slow streaming writes from multiple threads
- Uses a `SlowReader` wrapper to introduce artificial latency
- Validates data integrity after concurrent stream completion
- Configuration: 1MB payloads with 100ms read delays
2. Concurrent Write Test (tests/concurrency_tests.rs:111-161):
- Spawns 16 threads, each performing 10 writes
- Tests high-contention write scenarios
- Validates all 160 writes are retrievable
- Uses 5ms delays to simulate realistic timing
3. Interleaved Read/Write Test (tests/concurrency_tests.rs:163-229):
- Tests read-after-write and write-after-read patterns
- Uses `tokio::sync::Notify` for coordination
- Validates proper synchronization between readers and writers
Run Concurrency Tests:
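The integration test target can be selected by its file name:

```shell
# Run only the concurrency integration tests
cargo test --test concurrency_tests

# Show test output as it runs
cargo test --test concurrency_tests -- --nocapture
```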
Key Dependencies:
| Dependency | Purpose | Reference |
|---|---|---|
| `serial_test = "3.2.0"` | Prevents parallel test execution | Cargo.toml:44 |
| `tokio` (with `rt-multi-thread`, `macros`) | Async runtime for concurrent tests | Cargo.toml:47 |
| `tempfile = "3.19.0"` | Temporary file creation for isolated tests | Cargo.toml:45 |
Sources: tests/concurrency_tests.rs:1-230 Cargo.toml:44-47
Running Benchmarks
The workspace includes micro-benchmarks using the Criterion framework to measure storage engine performance.

```mermaid
graph LR
    subgraph "Benchmark Configuration"
        CargoBench["cargo bench\nCriterion Runner"]
        StorageBench["storage_benchmark\nharness = false"]
        ContentionBench["contention_benchmark\nharness = false"]
    end
    subgraph "Storage Benchmark Operations"
        AppendOp["benchmark_append_entries\n1M writes in batches"]
        SeqRead["benchmark_sequential_reads\nZero-copy iteration"]
        RandRead["benchmark_random_reads\n1M random lookups"]
        BatchRead["benchmark_batch_reads\nVectorized reads"]
    end
    CargoBench --> StorageBench
    CargoBench --> ContentionBench
    StorageBench --> AppendOp
    StorageBench --> SeqRead
    StorageBench --> RandRead
    StorageBench --> BatchRead
```
Benchmark Targets
- storage_benchmark - Single-process micro-benchmarks (benches/storage_benchmark.rs:1-234)
- contention_benchmark - Multi-threaded contention scenarios (referenced in Cargo.toml:62-63)
Both set `harness = false` to disable Cargo's default benchmark harness in favor of custom timing logic.
Sources: Cargo.toml:57-63
Storage Benchmark
The storage benchmark (benches/storage_benchmark.rs:1-234) measures four critical operations:
Configuration Constants:
| Constant | Value | Purpose |
|---|---|---|
| `NUM_ENTRIES` | 1,000,000 | Total entries for write phase |
| `ENTRY_SIZE` | 8 bytes | Fixed payload size |
| `WRITE_BATCH_SIZE` | 1,024 | Entries per batch write |
| `READ_BATCH_SIZE` | 1,024 | Entries per batch read |
| `NUM_RANDOM_CHECKS` | 1,000,000 | Random read operations |
| `NUM_BATCH_CHECKS` | 1,000,000 | Batch read operations |
Benchmark Operations:
1. Append Entries (benches/storage_benchmark.rs:52-83):
- Writes 1M entries in batches using `batch_write`
- Measures write throughput (writes/second)
- Uses fixed 8-byte little-endian payloads
2. Sequential Reads (benches/storage_benchmark.rs:98-118):
- Iterates through all entries using zero-copy iteration
- Validates data integrity during iteration
- Measures sequential read throughput
3. Random Reads (benches/storage_benchmark.rs:124-149):
- Performs 1M random single-key lookups
- Uses `rand::rng().random_range()` for key selection
- Validates retrieved data matches expected values
4. Batch Reads (benches/storage_benchmark.rs:155-181):
- Performs vectorized reads using `batch_read`
- Processes reads in batches of 1,024 keys
- Measures amortized batch read performance
Run Benchmarks:
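The standard Criterion-style invocations apply:

```shell
# Run every benchmark target in the workspace
cargo bench --workspace

# Run a single benchmark target
cargo bench --bench storage_benchmark

# Compile benchmarks without executing them (as CI does)
cargo bench --workspace --no-run
```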
Benchmark Output Format:
The storage benchmark uses a custom `fmt_rate` function (benches/storage_benchmark.rs:220-233) to format throughput metrics with thousands separators and three decimal places:

```
Wrote 1,000,000 entries of 8 bytes in 2.345s (426,439.232 writes/s)
Sequentially read 1,000,000 entries in 1.234s (810,372.771 reads/s)
Randomly read 1,000,000 entries in 3.456s (289,351.852 reads/s)
Batch-read verified 1,000,000 entries in 0.987s (1,013,171.005 reads/s)
```
Sources: benches/storage_benchmark.rs:1-234 Cargo.toml:57-63
CI/CD Test Matrix
The GitHub Actions workflow runs tests across multiple platforms and feature combinations to ensure compatibility.

```mermaid
graph TB
    subgraph "CI Test Matrix"
        Trigger["Push to main\nor Pull Request"]
    end
    subgraph "Operating Systems"
        Ubuntu["ubuntu-latest"]
        MacOS["macos-latest"]
        Windows["windows-latest"]
    end
    subgraph "Feature Configurations"
        Default["Default (empty)"]
        NoDefault["--no-default-features"]
        Parallel["--features parallel"]
        ExposeAPI["--features expose-internal-api"]
        Combined["--features=parallel,expose-internal-api"]
        AllFeatures["--all-features"]
    end
    subgraph "Build Steps"
        Cache["Cache Cargo dependencies"]
        Build["cargo build --workspace --all-targets"]
        Test["cargo test --workspace --all-targets --verbose"]
        BenchCheck["cargo bench --workspace --no-run"]
    end
    Trigger --> Ubuntu
    Trigger --> MacOS
    Trigger --> Windows
    Ubuntu --> Default
    Ubuntu --> NoDefault
    Ubuntu --> Parallel
    Ubuntu --> ExposeAPI
    Ubuntu --> Combined
    Ubuntu --> AllFeatures
    Default --> Cache
    Cache --> Build
    Build --> Test
    Test --> BenchCheck
```
Test Matrix Configuration:
| OS | Feature Flags |
|---|---|
| Ubuntu, macOS, Windows | Default |
| Ubuntu, macOS, Windows | --no-default-features |
| Ubuntu, macOS, Windows | --features parallel |
| Ubuntu, macOS, Windows | --features expose-internal-api |
| Ubuntu, macOS, Windows | --features=parallel,expose-internal-api |
| Ubuntu, macOS, Windows | --all-features |
Total Test Combinations: 3 OS × 6 feature configs = 18 test jobs
CI Pipeline Steps:
- Cache Dependencies (.github/workflows/rust-tests.yml:40-51) - Caches the `~/.cargo` and `target/` directories keyed by the lock file hash
- Build (.github/workflows/rust-tests.yml:53-54) - Compiles all workspace targets with the specified feature flags
- Test (.github/workflows/rust-tests.yml:56-57) - Runs complete test suite with verbose output
- Check Benchmarks (.github/workflows/rust-tests.yml:60-61) - Validates benchmark compilation without execution
Caching Strategy:
The workflow uses cache keys based on:
- Operating system (`runner.os`)
- Lock file hash (`hashFiles('**/Cargo.lock')`)
- Feature flags (`matrix.flags`)
This ensures efficient dependency reuse while avoiding cross-contamination between different build configurations.
Sources: .github/workflows/rust-tests.yml:1-62
Development Dependencies
Test Dependencies
| Dependency | Version | Purpose |
|---|---|---|
| serial_test | 3.2.0 | Sequential test execution for concurrency tests |
| tempfile | 3.19.0 | Temporary file/directory creation |
| tokio | 1.45.1 | Async runtime with multi-thread support |
| rand | 0.9.0 | Random number generation for benchmarks |
| bincode | 1.3.3 | Serialization for test data |
| serde | 1.0.219 | Serialization framework |
| serde_json | 1.0.140 | JSON serialization for test data |
| futures | 0.3.31 | Async utilities |
| bytemuck | 1.23.2 | Zero-copy type conversions |
Sources: Cargo.toml:36-47
Benchmark Dependencies
| Dependency | Version | Purpose |
|---|---|---|
| criterion | 0.6.0 | Benchmark framework (workspace dependency) |
| thousands | 0.2.0 | Number formatting for benchmark output |
| rand | 0.9.0 | Random data generation |
Sources: Cargo.toml:39 Cargo.toml:46 Cargo.toml:98 Cargo.toml:108
Build Artifacts
Binary Output
Compiled binaries are located in:
- Debug builds: `target/debug/simd-r-drive`
- Release builds: `target/release/simd-r-drive`
Library Output
The workspace produces several library artifacts:
- `libsimd_r_drive.rlib` - Core storage engine
- `libsimd_r_drive_entry_handle.rlib` - Zero-copy API
- `libsimd_r_drive_extensions.rlib` - Utility functions
Ignored Files
The .gitignore configuration excludes build artifacts and data files:
- `**/target` - All Cargo build directories
- `*.bin` - Binary storage files created during testing
- `/data` - Development and experimentation data directory
- `.cargo/config.toml` - Local Cargo overrides
Sources: .gitignore:1-10
Best Practices
Pre-Commit Checklist:
- Run `cargo test --workspace --all-targets` to validate all tests pass
- Run `cargo bench --workspace --no-run` to ensure benchmarks compile
- Run `cargo build --all-features` to validate all feature combinations
- Check that no test data files (`.bin`, `/data`) are committed
Performance Validation:
- Use `cargo bench --bench storage_benchmark` to establish performance baselines
- Compare throughput metrics before and after optimization changes
- Monitor memory usage during concurrency tests
Feature Flag Testing:
- Always test with `--no-default-features` to ensure minimal builds work
- Test all feature combinations that will be published
- Verify that feature flags are properly gated with `#[cfg(feature = "...")]`
Sources: .github/workflows/rust-tests.yml:1-62 benches/storage_benchmark.rs:1-234