
Overview


Purpose and Scope

This document provides a high-level introduction to SIMD R Drive, describing its core purpose, architectural components, access methods, and key features. It serves as the entry point for understanding the system before diving into detailed subsystem documentation.

For details on the core storage engine internals, see Core Storage Engine. For network-based remote access, see Network Layer and RPC. For Python integration, see Python Integration. For performance optimization details, see Performance Optimizations.

Sources: README.md:1-42


What is SIMD R Drive?

SIMD R Drive is a high-performance, append-only, schema-less storage engine designed for zero-copy binary data access. It stores arbitrary binary payloads in a single-file container without imposing serialization formats, schemas, or data interpretation. All data is treated as raw bytes (&[u8]), providing maximum flexibility for applications that require high-speed storage and retrieval of binary data.

Core Characteristics

| Characteristic | Description |
|---|---|
| Storage Model | Append-only, single-file container |
| Data Format | Schema-less binary (&[u8]) |
| Access Pattern | Zero-copy memory-mapped reads |
| Alignment | 64-byte boundaries (configurable via PAYLOAD_ALIGNMENT) |
| Concurrency | Thread-safe reads and writes within a single process |
| Indexing | Hardware-accelerated XXH3_64 hash-based key lookup |
| Integrity | CRC32C checksums and validation chain |

The storage engine is optimized for workloads that benefit from SIMD operations, cache-line efficiency, and direct memory access. By enforcing 64-byte payload alignment, it enables efficient typed slice reinterpretation (e.g., &[u32], &[u64]) without copying.
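As a std-only illustration (not SIMD R Drive's actual API), the following sketch shows the kind of zero-copy typed-slice reinterpretation that 64-byte payload alignment makes possible: viewing an aligned byte buffer as &[u64] without copying.

```rust
/// Returns a `&[u64]` view of `bytes` without copying, or `None` if the
/// slice is not suitably aligned or its length is not a multiple of 8.
fn as_u64_slice(bytes: &[u8]) -> Option<&[u64]> {
    if bytes.as_ptr() as usize % std::mem::align_of::<u64>() != 0
        || bytes.len() % std::mem::size_of::<u64>() != 0
    {
        return None;
    }
    // SAFETY: alignment and length were checked above, and every bit
    // pattern is a valid u64.
    Some(unsafe {
        std::slice::from_raw_parts(bytes.as_ptr() as *const u64, bytes.len() / 8)
    })
}

fn main() {
    // A 64-byte-aligned buffer stands in for an aligned mmap payload.
    #[repr(align(64))]
    struct Aligned([u8; 16]);

    let mut raw = [0u8; 16];
    raw[..8].copy_from_slice(&2u64.to_ne_bytes());
    raw[8..].copy_from_slice(&3u64.to_ne_bytes());
    let buf = Aligned(raw);

    let words = as_u64_slice(&buf.0).expect("buffer is 64-byte aligned");
    assert_eq!(words, &[2u64, 3][..]);
}
```

Because 64 is a multiple of every common element alignment, a payload starting on a 64-byte boundary always passes the alignment check above.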

Sources: README.md:5-8 README.md:43-87 Cargo.toml:13


Core Architecture Components

The system consists of three primary layers: the storage engine core, network access layer, and language bindings. The following diagram maps high-level architectural concepts to specific code entities.

System Architecture with Code Entities

```mermaid
graph TB
    subgraph "User Interfaces"
        CLI["CLI Binary\nsimd-r-drive crate\nmain.rs"]
        PythonApp["Python Applications\nsimd_r_drive.DataStoreWsClient"]
        RustApp["Native Rust Clients\nDataStoreWsClient struct"]
    end

    subgraph "Network Layer"
        WSServer["WebSocket Server\nsimd-r-drive-ws-server\nAxum HTTP server"]
        RPCDef["Service Definition\nsimd-r-drive-muxio-service-definition\nDataStoreService trait"]
    end

    subgraph "Core Storage Engine"
        DataStore["DataStore struct\nsrc/data_store.rs"]
        Traits["DataStoreReader trait\nDataStoreWriter trait"]
        KeyIndexer["KeyIndexer struct\nsrc/key_indexer.rs"]
        EntryHandle["EntryHandle struct\nsimd-r-drive-entry-handle crate"]
    end

    subgraph "Storage Infrastructure"
        Mmap["Arc<Mmap>\nmemmap2 crate"]
        FileHandle["BufWriter<File>\nstd::fs::File"]
        AtomicOffset["AtomicU64\ntail_offset field"]
    end

    subgraph "Performance Layer"
        SIMDCopy["simd_copy function\nsrc/simd_utils.rs"]
        XXH3["xxhash-rust crate\nXXH3_64 algorithm"]
    end

    CLI --> DataStore
    PythonApp --> WSServer
    RustApp --> WSServer

    WSServer --> RPCDef
    RPCDef --> DataStore

    DataStore --> Traits
    DataStore --> KeyIndexer
    DataStore --> EntryHandle
    DataStore --> Mmap
    DataStore --> FileHandle
    DataStore --> AtomicOffset
    DataStore --> SIMDCopy

    KeyIndexer --> XXH3
    EntryHandle --> Mmap

    style DataStore fill:#f9f9f9,stroke:#333,stroke-width:3px
    style Traits fill:#f9f9f9,stroke:#333,stroke-width:2px
```

Diagram: System architecture showing code entity mappings

Component Descriptions

| Component | Code Entity | Purpose |
|---|---|---|
| DataStore | DataStore struct in src/data_store.rs | Main storage interface implementing read/write operations |
| DataStoreReader | DataStoreReader trait | Defines zero-copy read operations (read, exists, batch_read) |
| DataStoreWriter | DataStoreWriter trait | Defines synchronized write operations (write, delete, batch_write) |
| KeyIndexer | KeyIndexer struct in src/key_indexer.rs | Hash-based index mapping u64 hashes to (tag, offset) tuples |
| EntryHandle | EntryHandle struct in simd-r-drive-entry-handle/src/lib.rs | Zero-copy reference to memory-mapped payload data |
| Memory Mapping | Arc<Mmap> wrapped in Mutex | Shared memory-mapped file reference for zero-copy reads |
| File Handle | Arc<RwLock<BufWriter<File>>> | Synchronized buffered writer for append operations |
| Tail Offset | AtomicU64 field tail_offset | Atomic counter tracking the current end-of-file position |

Sources: Cargo.toml:66-73, src/data_store.rs (inferred from architecture diagrams), high-level diagrams


Access Methods

SIMD R Drive can be accessed through three primary interfaces, each optimized for different use cases.

Access Method Architecture

```mermaid
graph LR
    subgraph "Direct Access"
        DirectApp["Rust Application"]
        DirectDS["DataStore::open\nDataStoreReader\nDataStoreWriter"]
    end

    subgraph "CLI Access"
        CLIApp["Command Line"]
        CLIBin["simd-r-drive binary\nclap::Parser"]
    end

    subgraph "Remote Access"
        PyClient["Python Client\nDataStoreWsClient class"]
        RustClient["Rust Client\nDataStoreWsClient struct"]
        WSServer["WebSocket Server\nmuxio-tokio-rpc-server\nAxum router"]
        BackendDS["DataStore instance"]
    end

    DirectApp --> DirectDS
    CLIApp --> CLIBin
    CLIBin --> DirectDS

    PyClient --> WSServer
    RustClient --> WSServer
    WSServer --> BackendDS

    style DirectDS fill:#f9f9f9,stroke:#333,stroke-width:2px
    style WSServer fill:#f9f9f9,stroke:#333,stroke-width:2px
    style BackendDS fill:#f9f9f9,stroke:#333,stroke-width:2px
```

Diagram: Access methods with code entity mappings

Access Method Comparison

| Method | Use Case | Code Entry Point | Latency | Throughput |
|---|---|---|---|---|
| Direct Library | Embedded in Rust applications | DataStore::open() | Microseconds | Highest (zero-copy) |
| CLI | Command-line operations, scripting | simd-r-drive binary with clap | Milliseconds | Process-bound |
| WebSocket RPC | Remote access, language bindings | DataStoreWsClient (Rust/Python) | Network-dependent | RPC-serialization-bound |

Direct Library Access:

  • Applications link against the simd-r-drive crate directly
  • Call DataStore::open() to obtain a storage instance
  • Use DataStoreReader and DataStoreWriter traits for operations
  • Provides lowest latency and highest throughput

CLI Access:

  • The simd-r-drive binary provides a command-line interface
  • Built using clap for argument parsing
  • Useful for scripting, testing, and manual operations
  • Each invocation opens the storage, performs the operation, and closes it

Remote Access (WebSocket RPC):

  • simd-r-drive-ws-server provides network access via WebSocket
  • Uses the Muxio RPC framework with bitcode serialization
  • DataStoreWsClient available for both Rust and Python clients
  • Enables multi-language access and distributed architectures

For CLI details, see Repository Structure. For WebSocket server architecture, see WebSocket Server. For Python client usage, see Python WebSocket Client API.

Sources: Cargo.toml:23-26 README.md:9 README.md:262-266 High-level diagrams


Key Features

Zero-Copy Memory-Mapped Access

SIMD R Drive uses memmap2 to memory-map the storage file, allowing direct access to stored data without deserialization or copying. The EntryHandle struct provides a zero-copy view into the memory-mapped region, returning &[u8] slices that point directly into the mapped file.

This approach enables:

  • Sub-microsecond reads for indexed lookups
  • Minimal memory overhead for large entries
  • Efficient processing of datasets larger than available RAM

The memory-mapped file is wrapped in Arc<Mutex<Arc<Mmap>>> to ensure thread-safe access during concurrent reads and remap operations.
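The pattern can be sketched with std types only, using Arc<Vec<u8>> as a hypothetical stand-in for Arc<Mmap> (the real engine uses memmap2): readers lock just long enough to clone the inner Arc, then read lock-free, while a remap swaps the inner Arc so in-flight readers keep their old mapping alive.

```rust
use std::sync::{Arc, Mutex};

/// Minimal stand-in for the store's shared-mapping field. The real
/// field is Arc<Mutex<Arc<Mmap>>>; sharing this struct via Arc gives
/// the same shape.
struct Store {
    mmap: Mutex<Arc<Vec<u8>>>,
}

impl Store {
    /// Readers clone the inner Arc under the lock, then drop the lock
    /// and read without further synchronization.
    fn snapshot(&self) -> Arc<Vec<u8>> {
        self.mmap.lock().unwrap().clone()
    }

    /// A remap swaps in a new mapping; existing snapshots stay valid
    /// because they still hold the old Arc.
    fn remap(&self, new_map: Vec<u8>) {
        *self.mmap.lock().unwrap() = Arc::new(new_map);
    }
}

fn main() {
    let store = Store { mmap: Mutex::new(Arc::new(vec![1u8, 2, 3])) };
    let reader_view = store.snapshot();
    store.remap(vec![4u8, 5, 6]);
    assert_eq!(*reader_view, vec![1u8, 2, 3]); // old view still readable
    assert_eq!(*store.snapshot(), vec![4u8, 5, 6]);
}
```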

Sources: README.md:43-49 simd-r-drive-entry-handle/ (inferred)

Fixed 64-Byte Payload Alignment

Every non-tombstone payload begins on a 64-byte boundary (defined by PAYLOAD_ALIGNMENT constant). This alignment matches typical CPU cache line sizes and enables:

  • Cache-friendly access with reduced cache line splits
  • Full-speed SIMD operations (AVX2, AVX-512, NEON) without misalignment penalties
  • Zero-copy typed slices when payload length matches element size (e.g., &[u64])

Pre-padding bytes are inserted before payloads to maintain this alignment. Tombstones (deletion markers) do not require alignment.
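The pre-padding amount is simple round-up arithmetic. The sketch below models the rule described above (with PAYLOAD_ALIGNMENT = 64); it is not the engine's actual code.

```rust
/// Number of pre-padding bytes needed so a payload written at `offset`
/// starts on the next `alignment` boundary (round up to a multiple).
fn pre_pad(offset: u64, alignment: u64) -> u64 {
    debug_assert!(alignment.is_power_of_two());
    (alignment - (offset % alignment)) % alignment
}

fn main() {
    const PAYLOAD_ALIGNMENT: u64 = 64;
    assert_eq!(pre_pad(0, PAYLOAD_ALIGNMENT), 0);    // already aligned
    assert_eq!(pre_pad(1, PAYLOAD_ALIGNMENT), 63);   // pad up to offset 64
    assert_eq!(pre_pad(64, PAYLOAD_ALIGNMENT), 0);
    assert_eq!(pre_pad(100, PAYLOAD_ALIGNMENT), 28); // next boundary: 128
}
```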

Sources: README.md:51-59 simd-r-drive-entry-handle/src/constants.rs (inferred)

Single-File Storage Container

All data is stored in a single append-only file with the following characteristics:

| Aspect | Description |
|---|---|
| File Structure | Sequential entries: [pre-pad] [payload] [metadata] |
| Metadata Size | Fixed 20 bytes: key_hash (8) + prev_offset (8) + checksum (4) |
| Entry Chaining | Each metadata contains prev_offset pointing to previous entry's tail |
| Validation | CRC32C checksums and backward chain traversal |
| Recovery | Automatic truncation of incomplete writes on open |

The storage format is detailed in Entry Structure and Metadata.
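As a sketch of the fixed 20-byte metadata block from the table above, the following decodes key_hash (8) + prev_offset (8) + checksum (4) from raw bytes. The field order follows the table; little-endian encoding is an assumption here, not confirmed by this page.

```rust
use std::convert::TryInto;

#[derive(Debug, PartialEq)]
struct EntryMetadata {
    key_hash: u64,
    prev_offset: u64,
    checksum: u32,
}

const METADATA_SIZE: usize = 20;

/// Decode a 20-byte metadata block: key_hash (8) + prev_offset (8)
/// + checksum (4), assuming little-endian fields.
fn decode_metadata(raw: &[u8; METADATA_SIZE]) -> EntryMetadata {
    EntryMetadata {
        key_hash: u64::from_le_bytes(raw[0..8].try_into().unwrap()),
        prev_offset: u64::from_le_bytes(raw[8..16].try_into().unwrap()),
        checksum: u32::from_le_bytes(raw[16..20].try_into().unwrap()),
    }
}

fn main() {
    let mut raw = [0u8; METADATA_SIZE];
    raw[0..8].copy_from_slice(&0xABCDu64.to_le_bytes());
    raw[8..16].copy_from_slice(&64u64.to_le_bytes());
    raw[16..20].copy_from_slice(&0xDEADBEEFu32.to_le_bytes());

    let meta = decode_metadata(&raw);
    assert_eq!(meta.key_hash, 0xABCD);
    assert_eq!(meta.prev_offset, 64);  // backward chain link
    assert_eq!(meta.checksum, 0xDEADBEEF);
}
```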

Sources: README.md:62-147 README.md:104-150

Thread-Safe Concurrency

SIMD R Drive supports concurrent operations within a single process using:

| Mechanism | Code Entity | Purpose |
|---|---|---|
| Read Lock | RwLock (reads) | Allows multiple concurrent readers |
| Write Lock | RwLock (writes) | Ensures exclusive write access |
| Atomic Offset | AtomicU64 (tail_offset) | Tracks file end without locking |
| Index Lock | RwLock<HashMap> | Protects key index updates |
| Mmap Lock | Mutex<Arc<Mmap>> | Prevents concurrent remapping |

Concurrency Guarantees:

  • ✅ Multiple threads can read concurrently (zero-copy, lock-free)
  • ✅ Write operations are serialized via RwLock
  • ✅ Index updates are synchronized
  • ❌ Multiple processes require external file locking

For detailed concurrency model, see Concurrency and Thread Safety.

Sources: README.md:170-206

Hardware-Accelerated Indexing

The KeyIndexer uses the xxhash-rust crate with XXH3_64 algorithm, which provides hardware acceleration:

  • SSE2 on x86_64 (universally supported)
  • AVX2 on capable x86_64 CPUs (runtime detection)
  • NEON on aarch64 (default)

Key lookups are O(1) via HashMap, with benchmarks showing ~1 million random 8-byte lookups completing in under 1 second.
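The index shape can be sketched as a HashMap from a 64-bit key hash to a (tag, offset) tuple. Here std's DefaultHasher stands in for XXH3_64 (a hypothetical substitution purely so the sketch is self-contained; the real KeyIndexer uses the xxhash-rust crate).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Hash a raw key to u64. Stand-in for XXH3_64.
fn hash_key(key: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(key);
    h.finish()
}

fn main() {
    // u64 hash -> (tag, file offset), as described for KeyIndexer.
    let mut index: HashMap<u64, (u16, u64)> = HashMap::new();

    // Indexing an entry records where its payload lives in the file.
    index.insert(hash_key(b"user:42"), (0, 4096));

    // Lookup: hash the key, then a single O(1) map probe.
    assert_eq!(index.get(&hash_key(b"user:42")), Some(&(0, 4096)));
    assert!(index.get(&hash_key(b"missing")).is_none());
}
```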

Sources: README.md:158-168 Cargo.toml:34

SIMD Write Acceleration

The simd_copy function (in src/simd_utils.rs) accelerates memory copying during write operations:

  • x86_64 with AVX2: 32-byte SIMD chunks using _mm256_loadu_si256 / _mm256_storeu_si256
  • aarch64: 16-byte NEON chunks using vld1q_u8 / vst1q_u8
  • Fallback: Standard copy_from_slice when SIMD is unavailable

This optimization reduces CPU cycles during buffer staging before disk writes.
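The dispatch structure can be sketched as follows: a runtime feature check selects an AVX2 path that moves 32 bytes per iteration, with copy_from_slice as the fallback. This is modeled on the description above, not the crate's actual source (and omits the NEON path for brevity).

```rust
/// Copy `src` into `dst`, using AVX2 32-byte chunks when available.
fn simd_copy(dst: &mut [u8], src: &[u8]) {
    assert_eq!(dst.len(), src.len());
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just verified at runtime.
            unsafe { copy_avx2(dst, src) };
            return;
        }
    }
    // Fallback: plain memcpy when no SIMD path is available.
    dst.copy_from_slice(src);
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn copy_avx2(dst: &mut [u8], src: &[u8]) {
    use std::arch::x86_64::{__m256i, _mm256_loadu_si256, _mm256_storeu_si256};
    let mut i = 0;
    // Move 32 bytes per iteration with unaligned AVX2 loads/stores.
    while i + 32 <= src.len() {
        let v = _mm256_loadu_si256(src.as_ptr().add(i) as *const __m256i);
        _mm256_storeu_si256(dst.as_mut_ptr().add(i) as *mut __m256i, v);
        i += 32;
    }
    // Copy the tail that doesn't fill a full 32-byte chunk.
    dst[i..].copy_from_slice(&src[i..]);
}

fn main() {
    let src: Vec<u8> = (0u8..100).collect();
    let mut dst = vec![0u8; 100];
    simd_copy(&mut dst, &src);
    assert_eq!(dst, src);
}
```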

For SIMD implementation details, see SIMD Acceleration.

Sources: README.md:249-257 src/simd_utils.rs (inferred)


Write and Read Modes

Write Modes

| Mode | Method | Use Case | Flush Behavior |
|---|---|---|---|
| Single Entry | write(key, payload) | Individual writes | Immediate flush |
| Batch | batch_write(&[(key, payload)]) | Multiple entries | Single flush at end |
| Streaming | write_large_entry(key, Read) | Large payloads | Streaming with immediate flush |

Batch writes reduce disk I/O overhead by grouping multiple entries under a single write lock and flushing once.
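The flush-amortization idea can be sketched with std types, using an in-memory Vec<u8> as a hypothetical stand-in for the data file (the engine actually uses BufWriter<File> under a write lock): every entry in the batch is staged into one buffered writer, and flush happens once at the end.

```rust
use std::io::{BufWriter, Write};

/// Append a batch of (key, payload) entries through one buffered
/// writer, flushing once at the end instead of once per entry.
fn batch_write(sink: &mut impl Write, entries: &[(&[u8], &[u8])]) -> std::io::Result<()> {
    let mut w = BufWriter::new(sink);
    for (key, payload) in entries {
        // Stage each entry into the buffer; no flush per entry.
        w.write_all(key)?;
        w.write_all(payload)?;
    }
    // Single flush for the whole batch.
    w.flush()
}

fn main() -> std::io::Result<()> {
    let mut file = Vec::new(); // stand-in for the append-only file
    let entries: [(&[u8], &[u8]); 2] = [(b"k1", b"v1"), (b"k2", b"v2")];
    batch_write(&mut file, &entries)?;
    assert_eq!(file, b"k1v1k2v2");
    Ok(())
}
```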

Sources: README.md:208-223

Read Modes

| Mode | Method | Memory Behavior | Use Case |
|---|---|---|---|
| Direct | read(key) -> EntryHandle | Zero-copy mmap reference | Standard reads |
| Streaming | read_stream(key) -> impl Read | Buffered, non-zero-copy | Large entries |
| Parallel Iteration | par_iter_entries() (Rayon) | Parallel processing | Bulk analytics |

Direct reads return EntryHandle with zero-copy &[u8] access. Streaming reads process data incrementally through a buffer. Parallel iteration is available via the optional parallel feature.

For iteration details, see Parallel Iteration (via Rayon).

Sources: README.md:225-247


Repository Structure

The project is organized as a Cargo workspace with the following crates:

Diagram: Workspace structure with crate relationships

Crate Descriptions

| Crate | Path | Purpose |
|---|---|---|
| simd-r-drive | ./ | Core storage engine with DataStore, KeyIndexer, SIMD utilities |
| simd-r-drive-entry-handle | ./simd-r-drive-entry-handle/ | Zero-copy EntryHandle and metadata structures |
| simd-r-drive-extensions | ./extensions/ | Utility functions and helper modules |
| simd-r-drive-muxio-service-definition | ./experiments/simd-r-drive-muxio-service-definition/ | RPC service trait definitions using bitcode |
| simd-r-drive-ws-server | ./experiments/simd-r-drive-ws-server/ | Axum-based WebSocket RPC server |
| simd-r-drive-ws-client | ./experiments/simd-r-drive-ws-client/ | Native Rust WebSocket client |
| simd-r-drive-py | ./experiments/bindings/python/ | PyO3-based Python bindings for direct access |
| simd-r-drive-ws-client-py | ./experiments/bindings/python-ws-client/ | Python WebSocket client wrapper |

The workspace is defined in Cargo.toml:65-78, with version 0.15.5-alpha specified in Cargo.toml:3.

For detailed repository structure, see Repository Structure.

Sources: Cargo.toml:65-78 Cargo.toml:1-10 README.md:259-266


Performance Characteristics

SIMD R Drive is designed for high-performance workloads with the following characteristics:

Benchmark Context

| Metric | Typical Performance |
|---|---|
| Random Read (8-byte) | ~1M lookups in < 1 second |
| Sequential Write | Limited by disk I/O and flush frequency |
| Memory Overhead | Minimal (mmap-based, on-demand paging) |
| Index Lookup | O(1) via HashMap with XXH3_64 |

Optimization Strategies

  1. SIMD Copy Operations: The simd_copy function uses AVX2/NEON for bulk memory transfers during writes
  2. Hardware-Accelerated Hashing: XXH3_64 with SSE2/AVX2/NEON for fast key hashing
  3. Zero-Copy Reads: Memory-mapped access eliminates deserialization overhead
  4. Cache-Line Alignment: 64-byte boundaries reduce cache misses
  5. Batch Operations: Grouping writes reduces lock contention and flush overhead

For detailed performance optimization documentation, see Performance Optimizations.

Sources: README.md:158-168 README.md:249-257


Next Steps

This overview provides a foundation for understanding SIMD R Drive. For deeper exploration, follow the subsystem documents linked throughout the sections above.

Sources: All sections above