Building and Testing
Relevant source files
- .github/workflows/rust-tests.yml
- .gitignore
- Cargo.toml
- benches/storage_benchmark.rs
- src/main.rs
- src/utils/format_bytes.rs
- tests/concurrency_tests.rs
This page provides instructions for building SIMD R Drive from source, running the test suite, and executing performance benchmarks. It covers workspace configuration, feature flags, test types, and benchmark utilities. For information about CI/CD pipeline configuration and automated quality checks, see CI/CD Pipeline.
Prerequisites
Required Dependencies:
- Rust toolchain (stable channel)
- Cargo package manager (bundled with Rust)
Optional Dependencies:
- Python 3.10-3.13 (for Python bindings)
- Maturin (for building Python wheels)
The project uses Rust 2024 edition and requires no platform-specific dependencies beyond the standard Rust toolchain.
Sources: Cargo.toml:1-13
Workspace Structure
SIMD R Drive is organized as a Cargo workspace with multiple crates. Understanding the workspace layout is essential for targeted builds and tests.
Workspace Members:
```mermaid
graph TB
    subgraph "Workspace Root"
        Root["Cargo.toml\nWorkspace Configuration"]
    end
    subgraph "Core Crates"
        Core["simd-r-drive\n(Root Package)"]
        EntryHandle["simd-r-drive-entry-handle\n(Zero-Copy API)"]
        Extensions["extensions/\n(Utility Functions)"]
    end
    subgraph "Network Experiments"
        WSServer["experiments/simd-r-drive-ws-server\n(WebSocket Server)"]
        WSClient["experiments/simd-r-drive-ws-client\n(Rust Client)"]
        ServiceDef["experiments/simd-r-drive-muxio-service-definition\n(RPC Contract)"]
    end
    subgraph "Python Bindings (Excluded)"
        PyBindings["experiments/bindings/python\n(PyO3 Direct)"]
        PyWSClient["experiments/bindings/python-ws-client\n(WebSocket Client)"]
    end
    Root --> Core
    Root --> EntryHandle
    Root --> Extensions
    Root --> WSServer
    Root --> WSClient
    Root --> ServiceDef
    PyBindings -.->|excluded from workspace| Root
    PyWSClient -.->|excluded from workspace| Root
    Core --> EntryHandle
    WSServer --> ServiceDef
    WSClient --> ServiceDef
```
| Crate | Path | Purpose |
|---|---|---|
| simd-r-drive | . | Core storage engine |
| simd-r-drive-entry-handle | ./simd-r-drive-entry-handle | Zero-copy data access API |
| simd-r-drive-extensions | ./extensions | Utility functions and helpers |
| simd-r-drive-ws-server | ./experiments/simd-r-drive-ws-server | Axum-based WebSocket server |
| simd-r-drive-ws-client | ./experiments/simd-r-drive-ws-client | Native Rust WebSocket client |
| simd-r-drive-muxio-service-definition | ./experiments/simd-r-drive-muxio-service-definition | RPC service contract |
Excluded from Workspace:
- `experiments/bindings/python` - Python direct bindings (built separately with Maturin)
- `experiments/bindings/python-ws-client` - Python WebSocket client (built separately with Maturin)
Python bindings are excluded because they require a different build system (Maturin) and have incompatible dependency resolution requirements.
Sources: Cargo.toml:65-78
Building the Project
Basic Build Commands
- Build all workspace members
- Build with release optimizations
- Build a specific crate
- Build all targets (binaries, libraries, tests, benchmarks)
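The corresponding Cargo invocations are the standard ones (package names as listed in the workspace table above):

```shell
# Build all workspace members (debug profile)
cargo build --workspace

# Build with release optimizations
cargo build --workspace --release

# Build a specific crate by package name
cargo build -p simd-r-drive

# Build all targets: binaries, libraries, tests, and benchmarks
cargo build --workspace --all-targets
```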
Sources: .github/workflows/rust-tests.yml:54
Feature Flags
The simd-r-drive crate supports optional feature flags that enable additional functionality:

```mermaid
graph LR
    subgraph "Feature Flags"
        Default["default\n(empty set)"]
        Parallel["parallel\nEnables rayon"]
        ExposeAPI["expose-internal-api\nExposes internal symbols"]
        Arrow["arrow\nArrow integration"]
        AllFeatures["--all-features\nEnable everything"]
    end
    subgraph "Dependencies Enabled"
        Rayon["rayon = 1.10.0\nParallel iteration"]
        EntryArrow["simd-r-drive-entry-handle/arrow\nArrow conversions"]
    end
    Parallel --> Rayon
    Arrow --> EntryArrow
    AllFeatures --> Parallel
    AllFeatures --> ExposeAPI
    AllFeatures --> Arrow
```
Feature Flag Reference:
| Feature | Dependencies | Purpose |
|---|---|---|
| `default` | None | Standard storage engine only |
| `parallel` | `rayon = "1.10.0"` | Parallel iteration support for bulk operations |
| `expose-internal-api` | None | Exposes internal APIs for advanced use cases |
| `arrow` | `simd-r-drive-entry-handle/arrow` | Enables Apache Arrow integration |
Build Examples:
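For instance, using the flags from the reference table above:

```shell
# Minimal build without default features
cargo build --no-default-features

# Enable parallel iteration support
cargo build --features parallel

# Combine feature flags
cargo build --features parallel,expose-internal-api

# Enable every optional feature
cargo build --all-features
```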
Sources: Cargo.toml:49-55 Cargo.toml:30
CLI Binary
The workspace includes a CLI binary for direct interaction with the storage engine:
The CLI entry point is defined at src/main.rs:1-12 and delegates to command execution logic in the cli module.
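A typical way to invoke the binary during development (assuming the CLI exposes the usual help output; consult `--help` for the actual subcommands, which are not documented on this page):

```shell
# Run the CLI from the workspace root; arguments after `--` go to the binary
cargo run --release -- --help
```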
Sources: src/main.rs:1-12
Running Tests
Test Organization
Test Categories:
- Unit Tests - Embedded in source files, test individual functions and modules
- Integration Tests - Located in the `tests/` directory, test public API interactions
- Concurrency Tests - Multi-threaded scenarios validating thread safety
- Documentation Tests - Code examples in documentation comments
Sources: Cargo.toml:36-47
Running All Tests
Execute the complete test suite:
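This mirrors the CI invocation in .github/workflows/rust-tests.yml:

```shell
# Run unit, integration, concurrency, and documentation tests
cargo test --workspace --all-targets --verbose
```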
Sources: .github/workflows/rust-tests.yml:57
Concurrency Tests
Concurrency tests validate thread-safe operations and are located in tests/concurrency_tests.rs:1-230. These tests use `serial_test` to prevent parallel execution and `tokio` for the async runtime.
Test Scenarios:
```mermaid
graph TB
    subgraph "Concurrency Test Structure"
        TestAttr["#[tokio::test(flavor=multi_thread)]\n#[serial]"]
        TempDir["tempfile::tempdir()\nIsolated Storage"]
        DataStore["Arc<DataStore>\nShared Storage"]
        Tasks["tokio::spawn\nConcurrent Tasks"]
    end
    subgraph "Test Scenarios"
        StreamTest["concurrent_slow_streamed_write_test\nParallel stream writes"]
        WriteTest["concurrent_write_test\n16 threads × 10 writes"]
        InterleavedTest["interleaved_read_write_test\nRead/Write coordination"]
    end
    TestAttr --> TempDir
    TempDir --> DataStore
    DataStore --> Tasks
    Tasks --> StreamTest
    Tasks --> WriteTest
    Tasks --> InterleavedTest
```
1. Concurrent Slow Streamed Write Test (tests/concurrency_tests.rs:16-109):
- Simulates slow streaming writes from multiple threads
- Uses a `SlowReader` wrapper to introduce artificial latency
- Validates data integrity after concurrent stream completion
- Configuration: 1MB payloads with 100ms read delays
2. Concurrent Write Test (tests/concurrency_tests.rs:111-161):
- Spawns 16 threads, each performing 10 writes
- Tests high-contention write scenarios
- Validates all 160 writes are retrievable
- Uses 5ms delays to simulate realistic timing
3. Interleaved Read/Write Test (tests/concurrency_tests.rs:163-229):
- Tests read-after-write and write-after-read patterns
- Uses `tokio::sync::Notify` for coordination
- Validates proper synchronization between readers and writers
Run Concurrency Tests:
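The integration test target can be selected by its file name:

```shell
# Run only the concurrency integration tests
cargo test --test concurrency_tests

# Show test output as it runs
cargo test --test concurrency_tests -- --nocapture
```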
Key Dependencies:
| Dependency | Purpose | Reference |
|---|---|---|
| `serial_test = "3.2.0"` | Prevents parallel test execution | Cargo.toml:44 |
| `tokio` (with `rt-multi-thread`, `macros`) | Async runtime for concurrent tests | Cargo.toml:47 |
| `tempfile = "3.19.0"` | Temporary file creation for isolated tests | Cargo.toml:45 |
Sources: tests/concurrency_tests.rs:1-230 Cargo.toml:44-47
Running Benchmarks
The workspace includes micro-benchmarks using the Criterion framework to measure storage engine performance.

```mermaid
graph LR
    subgraph "Benchmark Configuration"
        CargoBench["cargo bench\nCriterion Runner"]
        StorageBench["storage_benchmark\nharness = false"]
        ContentionBench["contention_benchmark\nharness = false"]
    end
    subgraph "Storage Benchmark Operations"
        AppendOp["benchmark_append_entries\n1M writes in batches"]
        SeqRead["benchmark_sequential_reads\nZero-copy iteration"]
        RandRead["benchmark_random_reads\n1M random lookups"]
        BatchRead["benchmark_batch_reads\nVectorized reads"]
    end
    CargoBench --> StorageBench
    CargoBench --> ContentionBench
    StorageBench --> AppendOp
    StorageBench --> SeqRead
    StorageBench --> RandRead
    StorageBench --> BatchRead
```
Benchmark Targets
- storage_benchmark - Single-process micro-benchmarks (benches/storage_benchmark.rs:1-234)
- contention_benchmark - Multi-threaded contention scenarios (referenced in Cargo.toml:62-63)
Both set `harness = false` to disable Cargo's default benchmark harness in favor of custom timing logic.
Sources: Cargo.toml:57-63
Storage Benchmark
The storage benchmark (benches/storage_benchmark.rs:1-234) measures four critical operations:
Configuration Constants:
| Constant | Value | Purpose |
|---|---|---|
| `NUM_ENTRIES` | 1,000,000 | Total entries for write phase |
| `ENTRY_SIZE` | 8 bytes | Fixed payload size |
| `WRITE_BATCH_SIZE` | 1,024 | Entries per batch write |
| `READ_BATCH_SIZE` | 1,024 | Entries per batch read |
| `NUM_RANDOM_CHECKS` | 1,000,000 | Random read operations |
| `NUM_BATCH_CHECKS` | 1,000,000 | Batch read operations |
Benchmark Operations:
1. Append Entries (benches/storage_benchmark.rs:52-83):
- Writes 1M entries in batches using `batch_write`
- Measures write throughput (writes/second)
- Uses fixed 8-byte little-endian payloads
2. Sequential Reads (benches/storage_benchmark.rs:98-118):
- Iterates through all entries using zero-copy iteration
- Validates data integrity during iteration
- Measures sequential read throughput
3. Random Reads (benches/storage_benchmark.rs:124-149):
- Performs 1M random single-key lookups
- Uses `rand::rng().random_range()` for key selection
- Validates retrieved data matches expected values
4. Batch Reads (benches/storage_benchmark.rs:155-181):
- Performs vectorized reads using `batch_read`
- Processes reads in batches of 1,024 keys
- Measures amortized batch read performance
Run Benchmarks:
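The standard Criterion-style invocations apply:

```shell
# Run every benchmark target in the workspace
cargo bench --workspace

# Run a single benchmark target
cargo bench --bench storage_benchmark

# Compile benchmarks without executing them (as CI does)
cargo bench --workspace --no-run
```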
Benchmark Output Format:
The storage benchmark uses a custom `fmt_rate` function (benches/storage_benchmark.rs:220-233) to format throughput metrics with thousands separators and three decimal places:

```
Wrote 1,000,000 entries of 8 bytes in 2.345s (426,439.232 writes/s)
Sequentially read 1,000,000 entries in 1.234s (810,372.771 reads/s)
Randomly read 1,000,000 entries in 3.456s (289,351.852 reads/s)
Batch-read verified 1,000,000 entries in 0.987s (1,013,171.005 reads/s)
```
Sources: benches/storage_benchmark.rs:1-234 Cargo.toml:57-63
CI/CD Test Matrix
The GitHub Actions workflow runs tests across multiple platforms and feature combinations to ensure compatibility.

```mermaid
graph TB
    subgraph "CI Test Matrix"
        Trigger["Push to main\nor Pull Request"]
    end
    subgraph "Operating Systems"
        Ubuntu["ubuntu-latest"]
        MacOS["macos-latest"]
        Windows["windows-latest"]
    end
    subgraph "Feature Configurations"
        Default["Default (empty)"]
        NoDefault["--no-default-features"]
        Parallel["--features parallel"]
        ExposeAPI["--features expose-internal-api"]
        Combined["--features=parallel,expose-internal-api"]
        AllFeatures["--all-features"]
    end
    subgraph "Build Steps"
        Cache["Cache Cargo dependencies"]
        Build["cargo build --workspace --all-targets"]
        Test["cargo test --workspace --all-targets --verbose"]
        BenchCheck["cargo bench --workspace --no-run"]
    end
    Trigger --> Ubuntu
    Trigger --> MacOS
    Trigger --> Windows
    Ubuntu --> Default
    Ubuntu --> NoDefault
    Ubuntu --> Parallel
    Ubuntu --> ExposeAPI
    Ubuntu --> Combined
    Ubuntu --> AllFeatures
    Default --> Cache
    Cache --> Build
    Build --> Test
    Test --> BenchCheck
```
Test Matrix Configuration:
| OS | Feature Flags |
|---|---|
| Ubuntu, macOS, Windows | Default |
| Ubuntu, macOS, Windows | --no-default-features |
| Ubuntu, macOS, Windows | --features parallel |
| Ubuntu, macOS, Windows | --features expose-internal-api |
| Ubuntu, macOS, Windows | --features=parallel,expose-internal-api |
| Ubuntu, macOS, Windows | --all-features |
Total Test Combinations: 3 OS × 6 feature configs = 18 test jobs
CI Pipeline Steps:
- Cache Dependencies (.github/workflows/rust-tests.yml:40-51) - Caches the `~/.cargo` and `target/` directories keyed by the lock file hash
- Build (.github/workflows/rust-tests.yml:53-54) - Compiles all workspace targets with the specified feature flags
- Test (.github/workflows/rust-tests.yml:56-57) - Runs complete test suite with verbose output
- Check Benchmarks (.github/workflows/rust-tests.yml:60-61) - Validates benchmark compilation without execution
Caching Strategy:
The workflow uses cache keys based on:
- Operating system (`runner.os`)
- Lock file hash (`hashFiles('**/Cargo.lock')`)
- Feature flags (`matrix.flags`)
This ensures efficient dependency reuse while avoiding cross-contamination between different build configurations.
Sources: .github/workflows/rust-tests.yml:1-62
Development Dependencies
Test Dependencies
| Dependency | Version | Purpose |
|---|---|---|
| serial_test | 3.2.0 | Sequential test execution for concurrency tests |
| tempfile | 3.19.0 | Temporary file/directory creation |
| tokio | 1.45.1 | Async runtime with multi-thread support |
| rand | 0.9.0 | Random number generation for benchmarks |
| bincode | 1.3.3 | Serialization for test data |
| serde | 1.0.219 | Serialization framework |
| serde_json | 1.0.140 | JSON serialization for test data |
| futures | 0.3.31 | Async utilities |
| bytemuck | 1.23.2 | Zero-copy type conversions |
Sources: Cargo.toml:36-47
Benchmark Dependencies
| Dependency | Version | Purpose |
|---|---|---|
| criterion | 0.6.0 | Benchmark framework (workspace dependency) |
| thousands | 0.2.0 | Number formatting for benchmark output |
| rand | 0.9.0 | Random data generation |
Sources: Cargo.toml:39 Cargo.toml:46 Cargo.toml:98 Cargo.toml:108
Build Artifacts
Binary Output
Compiled binaries are located in:
- Debug builds: `target/debug/simd-r-drive`
- Release builds: `target/release/simd-r-drive`
Library Output
The workspace produces several library artifacts:
- `libsimd_r_drive.rlib` - Core storage engine
- `libsimd_r_drive_entry_handle.rlib` - Zero-copy API
- `libsimd_r_drive_extensions.rlib` - Utility functions
Ignored Files
The .gitignore configuration excludes build artifacts and data files:
- `**/target` - All Cargo build directories
- `*.bin` - Binary storage files created during testing
- `/data` - Development and experimentation data directory
- `.cargo/config.toml` - Local Cargo overrides
Sources: .gitignore:1-10
Best Practices
Pre-Commit Checklist:
- Run `cargo test --workspace --all-targets` to validate all tests pass
- Run `cargo bench --workspace --no-run` to ensure benchmarks compile
- Run `cargo build --all-features` to validate all feature combinations
- Check that no test data files (`.bin`, `/data`) are committed
Performance Validation:
- Use `cargo bench --bench storage_benchmark` to establish performance baselines
- Compare throughput metrics before and after optimization changes
- Monitor memory usage during concurrency tests
Feature Flag Testing:
- Always test with `--no-default-features` to ensure minimal builds work
- Test all feature combinations that will be published
- Verify that feature flags are properly gated with `#[cfg(feature = "...")]`
Sources: .github/workflows/rust-tests.yml:1-62 benches/storage_benchmark.rs:1-234