
Overview


Purpose and Scope

This document provides a high-level introduction to SIMD R Drive, describing its core purpose, architectural components, access methods, and key features. It serves as the entry point for understanding the system before diving into detailed subsystem documentation.

For details on the core storage engine internals, see Core Storage Engine. For network-based remote access, see Network Layer and RPC. For Python integration, see Python Integration. For performance optimization details, see Performance Optimizations.

Sources: README.md:1-42


What is SIMD R Drive?

SIMD R Drive is a high-performance, append-only, schema-less storage engine designed for zero-copy binary data access. It stores arbitrary binary payloads in a single-file container without imposing serialization formats, schemas, or data interpretation. All data is treated as raw bytes (&[u8]), providing maximum flexibility for applications that require high-speed storage and retrieval of binary data.

Core Characteristics

| Characteristic | Description |
|---|---|
| Storage Model | Append-only, single-file container |
| Data Format | Schema-less binary (&[u8]) |
| Access Pattern | Zero-copy memory-mapped reads |
| Alignment | 64-byte boundaries (configurable via PAYLOAD_ALIGNMENT) |
| Concurrency | Thread-safe reads and writes within a single process |
| Indexing | Hardware-accelerated XXH3_64 hash-based key lookup |
| Integrity | CRC32C checksums and validation chain |

The storage engine is optimized for workloads that benefit from SIMD operations, cache-line efficiency, and direct memory access. By enforcing 64-byte payload alignment, it enables efficient typed slice reinterpretation (e.g., &[u32], &[u64]) without copying.
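
As an illustration (not taken from the codebase), a payload that is 64-byte aligned and whose length is a multiple of the element size can be reinterpreted in place with `slice::align_to`; the hypothetical helper below returns `None` if either condition does not hold:

```rust
/// Hypothetical helper: view an aligned byte payload as &[u64] without copying.
fn as_u64_slice(bytes: &[u8]) -> Option<&[u64]> {
    // SAFETY: u64 has no invalid bit patterns; align_to only yields a non-empty
    // "body" where u64's alignment requirement is satisfied.
    let (head, body, tail) = unsafe { bytes.align_to::<u64>() };
    if head.is_empty() && tail.is_empty() {
        Some(body)
    } else {
        None
    }
}
```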

Sources: README.md:5-8 README.md:43-87 Cargo.toml:13


Core Architecture Components

The system consists of three primary layers: the storage engine core, network access layer, and language bindings. The following diagram maps high-level architectural concepts to specific code entities.

graph TB
    subgraph "User Interfaces"
        CLI["CLI Binary\nsimd-r-drive crate\nmain.rs"]
PythonApp["Python Applications\nsimd_r_drive.DataStoreWsClient"]
RustApp["Native Rust Clients\nDataStoreWsClient struct"]
end
    
    subgraph "Network Layer"
        WSServer["WebSocket Server\nsimd-r-drive-ws-server\nAxum HTTP server"]
RPCDef["Service Definition\nsimd-r-drive-muxio-service-definition\nDataStoreService trait"]
end
    
    subgraph "Core Storage Engine"
        DataStore["DataStore struct\nsrc/data_store.rs"]
Traits["DataStoreReader trait\nDataStoreWriter trait"]
KeyIndexer["KeyIndexer struct\nsrc/key_indexer.rs"]
EntryHandle["EntryHandle struct\nsimd-r-drive-entry-handle crate"]
end
    
    subgraph "Storage Infrastructure"
        Mmap["Arc<Mmap>\nmemmap2 crate"]
FileHandle["BufWriter<File>\nstd::fs::File"]
AtomicOffset["AtomicU64\ntail_offset field"]
end
    
    subgraph "Performance Layer"
        SIMDCopy["simd_copy function\nsrc/simd_utils.rs"]
XXH3["xxhash-rust crate\nXXH3_64 algorithm"]
end
    
 
   CLI --> DataStore
 
   PythonApp --> WSServer
 
   RustApp --> WSServer
    
 
   WSServer --> RPCDef
 
   RPCDef --> DataStore
    
 
   DataStore --> Traits
 
   DataStore --> KeyIndexer
 
   DataStore --> EntryHandle
 
   DataStore --> Mmap
 
   DataStore --> FileHandle
 
   DataStore --> AtomicOffset
 
   DataStore --> SIMDCopy
    
 
   KeyIndexer --> XXH3
 
   EntryHandle --> Mmap
    
    style DataStore fill:#f9f9f9,stroke:#333,stroke-width:3px
    style Traits fill:#f9f9f9,stroke:#333,stroke-width:2px

System Architecture with Code Entities

Diagram: System architecture showing code entity mappings

Component Descriptions

| Component | Code Entity | Purpose |
|---|---|---|
| DataStore | DataStore struct in src/data_store.rs | Main storage interface implementing read/write operations |
| DataStoreReader | DataStoreReader trait | Defines zero-copy read operations (read, exists, batch_read) |
| DataStoreWriter | DataStoreWriter trait | Defines synchronized write operations (write, delete, batch_write) |
| KeyIndexer | KeyIndexer struct in src/key_indexer.rs | Hash-based index mapping u64 hashes to (tag, offset) tuples |
| EntryHandle | EntryHandle struct in simd-r-drive-entry-handle/src/lib.rs | Zero-copy reference to memory-mapped payload data |
| Memory Mapping | Arc<Mmap> wrapped in Mutex | Shared memory-mapped file reference for zero-copy reads |
| File Handle | Arc<RwLock<BufWriter<File>>> | Synchronized buffered writer for append operations |
| Tail Offset | AtomicU64 field tail_offset | Atomic counter tracking the current end-of-file position |

Sources: Cargo.toml:66-73 src/data_store.rs (inferred from architecture diagrams), High-level diagrams provided


graph LR
    subgraph "Direct Access"
        DirectApp["Rust Application"]
DirectDS["DataStore::open\nDataStoreReader\nDataStoreWriter"]
end
    
    subgraph "CLI Access"
        CLIApp["Command Line"]
CLIBin["simd-r-drive binary\nclap::Parser"]
end
    
    subgraph "Remote Access"
        PyClient["Python Client\nDataStoreWsClient class"]
RustClient["Rust Client\nDataStoreWsClient struct"]
WSServer["WebSocket Server\nmuxio-tokio-rpc-server\nAxum router"]
BackendDS["DataStore instance"]
end
    
 
   DirectApp --> DirectDS
 
   CLIApp --> CLIBin
 
   CLIBin --> DirectDS
    
 
   PyClient --> WSServer
 
   RustClient --> WSServer
 
   WSServer --> BackendDS
    
    style DirectDS fill:#f9f9f9,stroke:#333,stroke-width:2px
    style WSServer fill:#f9f9f9,stroke:#333,stroke-width:2px
    style BackendDS fill:#f9f9f9,stroke:#333,stroke-width:2px

Access Methods

SIMD R Drive can be accessed through three primary interfaces, each optimized for different use cases.

Access Method Architecture

Diagram: Access methods with code entity mappings

Access Method Comparison

| Method | Use Case | Code Entry Point | Latency | Throughput |
|---|---|---|---|---|
| Direct Library | Embedded in Rust applications | DataStore::open() | Microseconds | Highest (zero-copy) |
| CLI | Command-line operations, scripting | simd-r-drive binary with clap | Milliseconds | Process-bound |
| WebSocket RPC | Remote access, language bindings | DataStoreWsClient (Rust/Python) | Network-dependent | RPC-serialization-bound |

Direct Library Access:

  • Applications link against the simd-r-drive crate directly
  • Call DataStore::open() to obtain a storage instance
  • Use DataStoreReader and DataStoreWriter traits for operations
  • Provides lowest latency and highest throughput
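
A minimal end-to-end sketch of the direct path is shown below; it assumes the method names listed in this document (open, write, read), and the exact signatures may differ from the published crate:

```rust
use std::path::Path;
use simd_r_drive::{DataStore, DataStoreReader, DataStoreWriter};

fn main() -> std::io::Result<()> {
    // Open (or create) the single-file container.
    let store = DataStore::open(Path::new("example.bin"))?;

    // Append one entry; a later write to the same key appends a new version.
    store.write(b"greeting", b"hello world")?;

    // Zero-copy read: the returned handle borrows the memory-mapped file.
    assert!(store.read(b"greeting")?.is_some());
    Ok(())
}
```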

CLI Access:

  • The simd-r-drive binary provides a command-line interface
  • Built using clap for argument parsing
  • Useful for scripting, testing, and manual operations
  • Each invocation opens the storage, performs the operation, and closes it

Remote Access (WebSocket RPC):

  • simd-r-drive-ws-server provides network access via WebSocket
  • Uses the Muxio RPC framework with bitcode serialization
  • DataStoreWsClient available for both Rust and Python clients
  • Enables multi-language access and distributed architectures

For CLI details, see Repository Structure. For WebSocket server architecture, see WebSocket Server. For Python client usage, see Python WebSocket Client API.

Sources: Cargo.toml:23-26 README.md:9 README.md:262-266 High-level diagrams


Key Features

Zero-Copy Memory-Mapped Access

SIMD R Drive uses memmap2 to memory-map the storage file, allowing direct access to stored data without deserialization or copying. The EntryHandle struct provides a zero-copy view into the memory-mapped region, returning &[u8] slices that point directly into the mapped file.

This approach enables:

  • Sub-microsecond reads for indexed lookups
  • Minimal memory overhead for large entries
  • Efficient processing of datasets larger than available RAM

The memory-mapped file is wrapped in Arc<Mutex<Arc<Mmap>>> to ensure thread-safe access during concurrent reads and remap operations.
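
As a sketch of the read side of this arrangement (field and function names are illustrative), a reader clones the inner `Arc<Mmap>` while briefly holding the `Mutex`, then drops the lock before touching any bytes:

```rust
use std::sync::{Arc, Mutex};
use memmap2::Mmap;

/// Illustrative helper: take a snapshot of the current mapping for lock-free reads.
fn mmap_snapshot(slot: &Mutex<Arc<Mmap>>) -> Arc<Mmap> {
    // Cloning the Arc is cheap; the old mapping stays alive for existing readers
    // even if the writer swaps in a new Mmap afterwards.
    slot.lock().unwrap().clone()
}
```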

Sources: README.md:43-49 simd-r-drive-entry-handle/ (inferred)

Fixed 64-Byte Payload Alignment

Every non-tombstone payload begins on a 64-byte boundary (defined by PAYLOAD_ALIGNMENT constant). This alignment matches typical CPU cache line sizes and enables:

  • Cache-friendly access with reduced cache line splits
  • Full-speed SIMD operations (AVX2, AVX-512, NEON) without misalignment penalties
  • Zero-copy typed slices when payload length matches element size (e.g., &[u64])

Pre-padding bytes are inserted before payloads to maintain this alignment. Tombstones (deletion markers) do not require alignment.

Sources: README.md:51-59 simd-r-drive-entry-handle/src/constants.rs (inferred)

Single-File Storage Container

All data is stored in a single append-only file with the following characteristics:

| Aspect | Description |
|---|---|
| File Structure | Sequential entries: [pre-pad] [payload] [metadata] |
| Metadata Size | Fixed 20 bytes: key_hash (8) + prev_offset (8) + checksum (4) |
| Entry Chaining | Each metadata contains prev_offset pointing to previous entry's tail |
| Validation | CRC32C checksums and backward chain traversal |
| Recovery | Automatic truncation of incomplete writes on open |
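
The fixed 20-byte metadata record can be pictured with the sketch below; the field order follows the table above, while the little-endian encoding is an assumption made for illustration only:

```rust
/// Illustrative model of the 20-byte metadata block described above.
struct EntryMetadata {
    key_hash: u64,    // XXH3_64 of the key
    prev_offset: u64, // absolute offset of the previous entry's tail
    checksum: u32,    // CRC32C of the payload
}

impl EntryMetadata {
    fn serialize(&self) -> [u8; 20] {
        let mut out = [0u8; 20];
        out[0..8].copy_from_slice(&self.key_hash.to_le_bytes());
        out[8..16].copy_from_slice(&self.prev_offset.to_le_bytes());
        out[16..20].copy_from_slice(&self.checksum.to_le_bytes());
        out
    }
}
```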

The storage format is detailed in Entry Structure and Metadata.

Sources: README.md:62-147 README.md:104-150

Thread-Safe Concurrency

SIMD R Drive supports concurrent operations within a single process using:

| Mechanism | Code Entity | Purpose |
|---|---|---|
| Read Lock | RwLock (reads) | Allows multiple concurrent readers |
| Write Lock | RwLock (writes) | Ensures exclusive write access |
| Atomic Offset | AtomicU64 (tail_offset) | Tracks file end without locking |
| Index Lock | RwLock<HashMap> | Protects key index updates |
| Mmap Lock | Mutex<Arc<Mmap>> | Prevents concurrent remapping |

Concurrency Guarantees:

  • ✅ Multiple threads can read concurrently (zero-copy, lock-free)
  • ✅ Write operations are serialized via RwLock
  • ✅ Index updates are synchronized
  • ❌ Multiple processes require external file locking
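
A sketch of single-process concurrent reads under these guarantees (API names follow this document; exact signatures are assumptions):

```rust
use std::sync::Arc;
use std::thread;
use simd_r_drive::{DataStore, DataStoreReader};

fn concurrent_reads(store: Arc<DataStore>) {
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let store = Arc::clone(&store);
            thread::spawn(move || {
                // Reads take no write lock and return zero-copy handles.
                let _ = store.read(format!("key-{i}").as_bytes());
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```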

For detailed concurrency model, see Concurrency and Thread Safety.

Sources: README.md:170-206

Hardware-Accelerated Indexing

The KeyIndexer uses the xxhash-rust crate with XXH3_64 algorithm, which provides hardware acceleration:

  • SSE2 on x86_64 (universally supported)
  • AVX2 on capable x86_64 CPUs (runtime detection)
  • NEON on aarch64 (default)

Key lookups are O(1) via HashMap, with benchmarks showing ~1 million random 8-byte lookups completing in under 1 second.
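
For reference, hashing a key with the `xxhash-rust` crate's XXH3 implementation looks like the following sketch (how the engine seeds or wraps this call is not shown here):

```rust
use xxhash_rust::xxh3::xxh3_64;

/// Hash a raw key exactly as a caller of the xxh3 module would.
fn key_hash(key: &[u8]) -> u64 {
    xxh3_64(key)
}
```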

Sources: README.md:158-168 Cargo.toml:34

SIMD Write Acceleration

The simd_copy function (in src/simd_utils.rs) accelerates memory copying during write operations:

  • x86_64 with AVX2 : 32-byte SIMD chunks using _mm256_loadu_si256 / _mm256_storeu_si256
  • aarch64 : 16-byte NEON chunks using vld1q_u8 / vst1q_u8
  • Fallback : Standard copy_from_slice when SIMD unavailable

This optimization reduces CPU cycles during buffer staging before disk writes.
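
A hedged sketch of this pattern on x86_64 is shown below: runtime AVX2 detection, 32-byte unaligned loads/stores, and a scalar tail. It illustrates the technique rather than reproducing src/simd_utils.rs:

```rust
#[cfg(target_arch = "x86_64")]
fn simd_copy(dst: &mut [u8], src: &[u8]) {
    assert_eq!(dst.len(), src.len());
    if is_x86_feature_detected!("avx2") {
        // SAFETY: AVX2 support was verified at runtime just above.
        unsafe { copy_avx2(dst, src) }
    } else {
        // Fallback: plain memcpy when SIMD is unavailable.
        dst.copy_from_slice(src);
    }
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn copy_avx2(dst: &mut [u8], src: &[u8]) {
    use std::arch::x86_64::{__m256i, _mm256_loadu_si256, _mm256_storeu_si256};
    let mut i = 0;
    // Move 32-byte chunks with unaligned AVX2 loads and stores.
    while i + 32 <= src.len() {
        unsafe {
            let v = _mm256_loadu_si256(src.as_ptr().add(i) as *const __m256i);
            _mm256_storeu_si256(dst.as_mut_ptr().add(i) as *mut __m256i, v);
        }
        i += 32;
    }
    // Copy the remaining tail bytes with the scalar path.
    dst[i..].copy_from_slice(&src[i..]);
}
```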

For SIMD implementation details, see SIMD Acceleration.

Sources: README.md:249-257 src/simd_utils.rs (inferred)


Write and Read Modes

Write Modes

| Mode | Method | Use Case | Flush Behavior |
|---|---|---|---|
| Single Entry | write(key, payload) | Individual writes | Immediate flush |
| Batch | batch_write(&[(key, payload)]) | Multiple entries | Single flush at end |
| Streaming | write_large_entry(key, Read) | Large payloads | Streaming with immediate flush |

Batch writes reduce disk I/O overhead by grouping multiple entries under a single write lock and flushing once.
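
Hypothetical usage following the batch signature in the table above (the exact key and payload types are assumptions):

```rust
use simd_r_drive::{DataStore, DataStoreWriter};

fn write_batch(store: &DataStore) -> std::io::Result<()> {
    // Group several entries so the engine takes the write lock and flushes only once.
    let entries: Vec<(&[u8], &[u8])> = vec![
        (b"sensor/1".as_slice(), b"reading-a".as_slice()),
        (b"sensor/2".as_slice(), b"reading-b".as_slice()),
    ];
    store.batch_write(&entries)?;
    Ok(())
}
```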

Sources: README.md:208-223

Read Modes

| Mode | Method | Memory Behavior | Use Case |
|---|---|---|---|
| Direct | read(key) -> EntryHandle | Zero-copy mmap reference | Standard reads |
| Streaming | read_stream(key) -> impl Read | Buffered, non-zero-copy | Large entries |
| Parallel Iteration | par_iter_entries() (Rayon) | Parallel processing | Bulk analytics |

Direct reads return EntryHandle with zero-copy &[u8] access. Streaming reads process data incrementally through a buffer. Parallel iteration is available via the optional parallel feature.
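
With the optional parallel feature, bulk scans can fan out across a Rayon pool; the sketch below only counts entries so it does not assume anything about EntryHandle's accessor methods:

```rust
use rayon::prelude::*;
use simd_r_drive::DataStore;

/// Count live entries using the Rayon-backed iterator (requires the `parallel` feature).
fn count_entries(store: &DataStore) -> usize {
    store.par_iter_entries().count()
}
```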

For iteration details, see Parallel Iteration (via Rayon).

Sources: README.md:225-247


Repository Structure

The project is organized as a Cargo workspace with the following crates:

Diagram: Workspace structure with crate relationships

Crate Descriptions

| Crate | Path | Purpose |
|---|---|---|
| simd-r-drive | ./ | Core storage engine with DataStore, KeyIndexer, SIMD utilities |
| simd-r-drive-entry-handle | ./simd-r-drive-entry-handle/ | Zero-copy EntryHandle and metadata structures |
| simd-r-drive-extensions | ./extensions/ | Utility functions and helper modules |
| simd-r-drive-muxio-service-definition | ./experiments/simd-r-drive-muxio-service-definition/ | RPC service trait definitions using bitcode |
| simd-r-drive-ws-server | ./experiments/simd-r-drive-ws-server/ | Axum-based WebSocket RPC server |
| simd-r-drive-ws-client | ./experiments/simd-r-drive-ws-client/ | Native Rust WebSocket client |
| simd-r-drive-py | ./experiments/bindings/python/ | PyO3-based Python bindings for direct access |
| simd-r-drive-ws-client-py | ./experiments/bindings/python-ws-client/ | Python WebSocket client wrapper |

The workspace is defined in Cargo.toml:65-78 with version 0.15.5-alpha specified in Cargo.toml:3

For detailed repository structure, see Repository Structure.

Sources: Cargo.toml:65-78 Cargo.toml:1-10 README.md:259-266


Performance Characteristics

SIMD R Drive is designed for high-performance workloads with the following characteristics:

Benchmark Context

| Metric | Typical Performance |
|---|---|
| Random Read (8-byte) | ~1M lookups in < 1 second |
| Sequential Write | Limited by disk I/O and flush frequency |
| Memory Overhead | Minimal (mmap-based, on-demand paging) |
| Index Lookup | O(1) via HashMap with XXH3_64 |

Optimization Strategies

  1. SIMD Copy Operations : The simd_copy function uses AVX2/NEON for bulk memory transfers during writes
  2. Hardware-Accelerated Hashing : XXH3_64 with SSE2/AVX2/NEON for fast key hashing
  3. Zero-Copy Reads : Memory-mapped access eliminates deserialization overhead
  4. Cache-Line Alignment : 64-byte boundaries reduce cache misses
  5. Batch Operations : Grouping writes reduces lock contention and flush overhead

For detailed performance optimization documentation, see Performance Optimizations.

Sources: README.md:158-168 README.md:249-257


Next Steps

This overview provides a foundation for understanding SIMD R Drive. For deeper exploration, follow the subsystem pages linked throughout this document.

Sources: All sections above


Core Storage Engine


The core storage engine is the foundational layer of SIMD R Drive, providing append-only, memory-mapped binary storage with zero-copy access. This document covers the high-level architecture and key components that implement the storage functionality.

For network-based access to the storage engine, see Network Layer and RPC. For Python integration, see Python Integration. For performance optimizations including SIMD acceleration, see Performance Optimizations.

Purpose and Scope

This page introduces the core storage engine's architecture, primary components, and design principles. Detailed information about specific subsystems is covered in the child pages of this section.

Architecture Overview

The storage engine implements an append-only binary store with in-memory indexing and memory-mapped file access. The architecture consists of four primary structures that work together to provide concurrent, high-performance storage operations.

Sources: src/storage_engine/data_store.rs:26-33 src/storage_engine.rs:1-25

graph TB
    subgraph "Public Traits"
        Reader["DataStoreReader\nread, exists, batch_read"]
Writer["DataStoreWriter\nwrite, delete, batch_write"]
end
    
    subgraph "DataStore Structure"
        DS["DataStore"]
FileHandle["Arc&lt;RwLock&lt;BufWriter&lt;File&gt;&gt;&gt;\nfile\nSequential write operations"]
MmapHandle["Arc&lt;Mutex&lt;Arc&lt;Mmap&gt;&gt;&gt;\nmmap\nMemory-mapped read access"]
TailOffset["AtomicU64\ntail_offset\nCurrent end position"]
KeyIndexer["Arc&lt;RwLock&lt;KeyIndexer&gt;&gt;\nkey_indexer\nHash to offset mapping"]
PathBuf["PathBuf\npath\nFile system location"]
end
    
    subgraph "Support Structures"
        EH["EntryHandle\nZero-copy data view"]
EM["EntryMetadata\nkey_hash, prev_offset, checksum"]
EI["EntryIterator\nSequential traversal"]
ES["EntryStream\nStreaming reads"]
end
    
    Reader -.implements.-> DS
    Writer -.implements.-> DS
    
 
   DS --> FileHandle
 
   DS --> MmapHandle
 
   DS --> TailOffset
 
   DS --> KeyIndexer
 
   DS --> PathBuf
    
 
   DS --> EH
 
   DS --> EI
 
   EH --> EM
 
   EH --> ES

Core Components

DataStore

The DataStore struct is the primary interface to the storage engine. It maintains four essential shared resources protected by different synchronization primitives:

| Field | Type | Purpose | Synchronization |
|---|---|---|---|
| file | Arc<RwLock<BufWriter<File>>> | Write operations to disk | RwLock for exclusive writes |
| mmap | Arc<Mutex<Arc<Mmap>>> | Memory-mapped view for reads | Mutex for atomic remapping |
| tail_offset | AtomicU64 | Current end-of-file position | Atomic operations |
| key_indexer | Arc<RwLock<KeyIndexer>> | Hash-based key lookup | RwLock for concurrent reads |
| path | PathBuf | Storage file location | Immutable after creation |

Sources: src/storage_engine/data_store.rs:26-33

Traits

The storage engine exposes two primary traits that define its public API:

Sources: src/lib.rs:21 src/storage_engine.rs:21

EntryHandle

EntryHandle provides a zero-copy view into stored data by maintaining a reference to the memory-mapped file and a byte range. It includes the associated EntryMetadata for accessing checksums and hash values without additional lookups.

Sources: src/storage_engine.rs:24 README.md:43-49

KeyIndexer

KeyIndexer maintains an in-memory hash map from 64-bit XXH3 key hashes to packed values containing a 16-bit collision detection tag and a 48-bit file offset. This structure enables O(1) key lookups while detecting hash collisions.
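
A sketch of the 16-bit tag plus 48-bit offset packing described here (the actual bit layout in key_indexer.rs may differ):

```rust
const OFFSET_BITS: u32 = 48;
const OFFSET_MASK: u64 = (1u64 << OFFSET_BITS) - 1;

/// Pack a collision-detection tag and a file offset into one u64 index slot.
fn pack(tag: u16, offset: u64) -> u64 {
    debug_assert!(offset <= OFFSET_MASK, "offset must fit in 48 bits");
    ((tag as u64) << OFFSET_BITS) | offset
}

/// Recover the (tag, offset) pair from a packed slot.
fn unpack(packed: u64) -> (u16, u64) {
    ((packed >> OFFSET_BITS) as u16, packed & OFFSET_MASK)
}
```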

Sources: src/storage_engine/data_store.rs:31 README.md:158-169

Component Interaction Flow

The following diagram illustrates how the core components interact during typical read and write operations, showing the actual method calls and data flow between structures.

Sources: src/storage_engine/data_store.rs:752-825 src/storage_engine/data_store.rs:1040-1049 src/storage_engine/data_store.rs:224-259

sequenceDiagram
    participant Client
    participant DS as "DataStore"
    participant FileHandle as "file: RwLock&lt;BufWriter&lt;File&gt;&gt;"
    participant TailOffset as "tail_offset: AtomicU64"
    participant MmapHandle as "mmap: Mutex&lt;Arc&lt;Mmap&gt;&gt;"
    participant KeyIndexer as "key_indexer: RwLock&lt;KeyIndexer&gt;"
    
    Note over Client,KeyIndexer: Write Operation
    Client->>DS: write(key, payload)
    DS->>DS: compute_hash(key)
    DS->>TailOffset: load(Ordering::Acquire)
    DS->>FileHandle: write().unwrap()
    DS->>FileHandle: write_all(pre_pad + payload + metadata)
    DS->>FileHandle: flush()
    DS->>DS: reindex()
    DS->>FileHandle: init_mmap()
    DS->>MmapHandle: lock().unwrap()
    DS->>MmapHandle: *guard = Arc::new(new_mmap)
    DS->>KeyIndexer: write().unwrap()
    DS->>KeyIndexer: insert(key_hash, offset)
    DS->>TailOffset: store(new_offset, Ordering::Release)
    DS-->>Client: Ok(offset)
    
    Note over Client,KeyIndexer: Read Operation
    Client->>DS: read(key)
    DS->>DS: compute_hash(key)
    DS->>KeyIndexer: read().unwrap()
    DS->>KeyIndexer: get_packed(key_hash)
    KeyIndexer-->>DS: Some(packed_value)
    DS->>DS: KeyIndexer::unpack(packed)
    DS->>MmapHandle: lock().unwrap()
    DS->>MmapHandle: clone Arc
    DS->>DS: read_entry_with_context()
    DS->>DS: EntryMetadata::deserialize()
    DS->>DS: create EntryHandle
    DS-->>Client: Ok(Some(EntryHandle))

Storage Initialization and Recovery

The DataStore::open() method performs a multi-stage initialization process that ensures data integrity even after crashes or incomplete writes.

flowchart TD
    Start["DataStore::open(path)"]
OpenFile["open_file_in_append_mode()\nOpenOptions::new()\n.read(true).write(true).create(true)"]
InitMmap["init_mmap()\nmemmap2::MmapOptions::new().map()"]
Recover["recover_valid_chain(mmap, file_len)\nBackward scan validating prev_offset chain"]
CheckTrunc{"final_len < file_len?"}
Truncate["Truncate file to final_len\nfile.set_len(final_len)\nReopen storage"]
BuildIndex["KeyIndexer::build(mmap, final_len)\nForward scan to populate index"]
CreateDS["Create DataStore instance\nwith Arc-wrapped fields"]
Start --> OpenFile
 
   OpenFile --> InitMmap
 
   InitMmap --> Recover
 
   Recover --> CheckTrunc
 
   CheckTrunc -->|Yes Corruption detected| Truncate
 
   CheckTrunc -->|No File valid| BuildIndex
 
   Truncate --> Start
 
   BuildIndex --> CreateDS

The recovery process validates the storage file by:

  1. Scanning backward from the end of file following prev_offset links in EntryMetadata
  2. Verifying chain integrity ensuring each offset points to a valid previous tail
  3. Truncating corruption if an invalid chain is detected
  4. Building the index by forward scanning the validated region

Sources: src/storage_engine/data_store.rs:84-117 src/storage_engine/data_store.rs:383-482

flowchart TD
    WriteCall["write(key, payload)"]
ComputeHash["compute_hash(key)\nXXH3_64 hash"]
AcquireWrite["file.write().unwrap()\nAcquire write lock"]
LoadTail["tail_offset.load(Ordering::Acquire)"]
CalcPad["prepad_len(prev_tail)\n(A - offset % A) & (A-1)"]
WritePad["write_all(&pad[..prepad])\nZero bytes for alignment"]
WritePayload["write_all(payload)\nActual data"]
WriteMetadata["write_all(metadata.serialize())\nkey_hash + prev_offset + checksum"]
Flush["flush()\nEnsure disk persistence"]
Reindex["reindex()\nUpdate mmap and index"]
UpdateTail["tail_offset.store(new_offset)\nOrdering::Release"]
WriteCall --> ComputeHash
 
   ComputeHash --> AcquireWrite
 
   AcquireWrite --> LoadTail
 
   LoadTail --> CalcPad
 
   CalcPad --> WritePad
 
   WritePad --> WritePayload
 
   WritePayload --> WriteMetadata
 
   WriteMetadata --> Flush
 
   Flush --> Reindex
 
   Reindex --> UpdateTail

Write Path Architecture

Write operations follow a specific sequence to maintain consistency and enable zero-copy reads. All payloads are aligned to 64-byte boundaries using pre-padding.

The reindex() method performs two critical operations after each write:

  1. Memory map update : Creates a new Mmap view of the file and atomically swaps it
  2. Index update : Inserts new key-hash to offset mappings into KeyIndexer
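
A minimal sketch of the remap step, assuming the field shapes shown in the diagrams (an Arc<Mutex<Arc<Mmap>>> slot around a file that is only ever appended to):

```rust
use std::fs::File;
use std::sync::{Arc, Mutex};
use memmap2::Mmap;

/// Remap the grown file and atomically publish the new view.
fn swap_mmap(slot: &Mutex<Arc<Mmap>>, file: &File) -> std::io::Result<()> {
    // SAFETY: the engine only appends, so remapping the larger file is sound here.
    let new_mmap = unsafe { Mmap::map(file)? };
    let mut guard = slot.lock().unwrap();
    // Readers still holding the previous Arc<Mmap> keep a valid view until they drop it.
    *guard = Arc::new(new_mmap);
    Ok(())
}
```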

Sources: src/storage_engine/data_store.rs:827-834 src/storage_engine/data_store.rs:758-825 src/storage_engine/data_store.rs:224-259

Read Path Architecture

Read operations use the hash index for O(1) lookup and return EntryHandle instances that provide zero-copy access to the memory-mapped data.

flowchart TD
    ReadCall["read(key)"]
ComputeHash["compute_hash(key)\nXXH3_64 hash"]
AcquireIndex["key_indexer.read().unwrap()\nAcquire read lock"]
GetMmap["get_mmap_arc()\nClone Arc&lt;Mmap&gt;"]
ReadContext["read_entry_with_context()"]
Lookup["key_indexer.get_packed(key_hash)"]
CheckFound{"Found?"}
Unpack["KeyIndexer::unpack(packed)\nExtract tag + offset"]
VerifyTag{"Tag matches?"}
ReadMetadata["EntryMetadata::deserialize()\nFrom mmap[offset..offset+20]"]
CalcRange["Derive entry range\nentry_start..entry_end"]
CheckTombstone{"Is tombstone?"}
CreateHandle["Create EntryHandle\nmmap_arc + range + metadata"]
ReturnNone["Return None"]
ReadCall --> ComputeHash
 
   ComputeHash --> AcquireIndex
 
   AcquireIndex --> GetMmap
 
   GetMmap --> ReadContext
 
   ReadContext --> Lookup
 
   Lookup --> CheckFound
 
   CheckFound -->|No| ReturnNone
 
   CheckFound -->|Yes| Unpack
 
   Unpack --> VerifyTag
 
   VerifyTag -->|No Collision| ReturnNone
 
   VerifyTag -->|Yes| ReadMetadata
 
   ReadMetadata --> CalcRange
 
   CalcRange --> CheckTombstone
 
   CheckTombstone -->|Yes| ReturnNone
 
   CheckTombstone -->|No| CreateHandle

The tag verification step at src/storage_engine/data_store.rs:513-521 prevents hash collision issues by comparing a 16-bit tag derived from the original key with the stored tag.

Sources: src/storage_engine/data_store.rs:1040-1049 src/storage_engine/data_store.rs:501-565

Batch Operations

Batch operations reduce lock acquisition overhead by processing multiple entries in a single synchronized operation.

flowchart TD
    BatchWrite["batch_write(entries)"]
HashAll["compute_hash_batch(keys)\nParallel XXH3_64"]
AcquireWrite["file.write().unwrap()"]
InitBuffer["buffer = Vec::new()"]
LoopStart{"For each entry"}
CalcPad["prepad_len(tail_offset)"]
BuildEntry["Build entry:\npre_pad + payload + metadata"]
AppendBuffer["buffer.extend_from_slice(entry)"]
UpdateTail["tail_offset += entry.len()"]
LoopEnd["Next entry"]
WriteAll["file.write_all(&buffer)"]
Flush["file.flush()"]
Reindex["reindex(key_hash_offsets)"]
BatchWrite --> HashAll
 
   HashAll --> AcquireWrite
 
   AcquireWrite --> InitBuffer
 
   InitBuffer --> LoopStart
 
   LoopStart -->|More entries| CalcPad
 
   CalcPad --> BuildEntry
 
   BuildEntry --> AppendBuffer
 
   AppendBuffer --> UpdateTail
 
   UpdateTail --> LoopEnd
 
   LoopEnd --> LoopStart
 
   LoopStart -->|Done| WriteAll
 
   WriteAll --> Flush
 
   Flush --> Reindex

Batch Write

The batch_write() method accumulates all entries in an in-memory buffer and acquires the write lock only once (see the flowchart above).

Sources: src/storage_engine/data_store.rs:838-843 src/storage_engine/data_store.rs:847-939

Batch Read

The batch_read() method acquires the index lock once and performs all lookups before releasing it.

Sources: src/storage_engine/data_store.rs:1105-1109 src/storage_engine/data_store.rs:1111-1158

Iteration Support

The storage engine provides multiple iteration mechanisms for different use cases:

| Iterator | Type | Synchronization | Use Case |
|---|---|---|---|
| EntryIterator | Sequential | Single lock acquisition | Forward/backward iteration |
| par_iter_entries() | Parallel (Rayon) | Lock-free after setup | High-throughput scanning |
| IntoIterator | Consuming | Single lock acquisition | Owned iteration |

The EntryIterator scans backward through the file using the prev_offset chain, tracking seen keys in a HashSet to return only the latest version of each key.
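
The deduplication step can be sketched independently of the on-disk format: iterating newest-to-oldest, an entry is yielded only if its key hash has not been seen yet (the types here are illustrative):

```rust
use std::collections::HashSet;

/// Keep only the newest version of each key when walking entries newest-first.
fn latest_only<I>(entries_newest_first: I) -> Vec<(u64, Vec<u8>)>
where
    I: Iterator<Item = (u64, Vec<u8>)>,
{
    let mut seen: HashSet<u64> = HashSet::new();
    entries_newest_first
        .filter(|(key_hash, _payload)| seen.insert(*key_hash)) // first hit = newest version
        .collect()
}
```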

Sources: src/storage_engine/entry_iterator.rs:8-128 src/storage_engine/data_store.rs:276-280 src/storage_engine/data_store.rs:296-361

Key Design Principles

Append-Only Model

All write operations append data to the end of the file. Updates and deletes create new entries rather than modifying existing data. This design ensures:

  • No seeking during writes : Sequential disk I/O for maximum throughput
  • Crash safety : Incomplete writes are detected and truncated during recovery
  • Zero-copy reads : Memory-mapped regions remain valid as data is never modified

Sources: README.md:98-103

Fixed Alignment

Every non-tombstone payload starts on a 64-byte boundary (configurable via PAYLOAD_ALIGNMENT). This alignment enables:

  • Cache-line efficiency : Payloads align with CPU cache lines
  • SIMD operations : Full-speed vectorized reads without crossing boundaries
  • Zero-copy typed views : Safe reinterpretation as typed slices

Sources: README.md:51-59 src/storage_engine/data_store.rs:670-673

Metadata-After-Payload

Each entry stores its metadata immediately after the payload, forming a backward-linked chain. This design:

  • Minimizes seeking : Metadata and payload are adjacent on disk
  • Enables recovery : The chain can be validated backward from any point
  • Supports nesting : Storage containers can be stored within other containers

Sources: README.md:139-148

Lock-Free Reads

Read operations do not acquire write locks and perform zero-copy access through memory-mapped views. The AtomicU64 tail offset ensures readers see a consistent view without blocking writes.
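
The publish/observe pattern around the tail offset can be sketched with plain atomics; Release on the writer side pairs with Acquire on the reader side so a reader never observes a tail beyond fully written data:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Writer: publish the new end-of-data position after the bytes are flushed.
fn publish_tail(tail_offset: &AtomicU64, new_tail: u64) {
    tail_offset.store(new_tail, Ordering::Release);
}

/// Reader: take a consistent snapshot of the valid data extent.
fn snapshot_tail(tail_offset: &AtomicU64) -> u64 {
    tail_offset.load(Ordering::Acquire)
}
```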

Sources: README.md:170-183 src/storage_engine/data_store.rs:1040-1049


Repository Structure


Purpose and Scope

This document provides a comprehensive overview of the SIMD R Drive repository structure, including its Cargo workspace organization, crate hierarchy, and inter-crate dependencies. It covers the core storage engine crates, experimental network components, and build configuration. For details about the storage architecture and API, see Core Storage Engine. For network communication details, see Network Layer and RPC. For Python integration specifics, see Python Integration.


Workspace Organization

The SIMD R Drive repository is organized as a Cargo workspace containing six published crates and two excluded Python binding crates. The workspace uses Cargo resolver version 2 and maintains unified versioning (0.15.5-alpha) across all workspace members.

Workspace Configuration:

| Configuration | Value |
|---|---|
| Resolver | Version 2 |
| Unified Version | 0.15.5-alpha |
| Rust Edition | 2024 |
| License | Apache-2.0 |
| Repository | https://github.com/jzombie/rust-simd-r-drive |

The workspace is defined in Cargo.toml:65-78 with members spanning core functionality, utilities, and experimental features.

Workspace Structure Diagram:

graph TB
    subgraph "Core Crates"
        Core["simd-r-drive\n(Root Package)\nsrc/"]
EntryHandle["simd-r-drive-entry-handle\nsimd-r-drive-entry-handle/"]
Extensions["simd-r-drive-extensions\nextensions/"]
end
    
    subgraph "Experimental Network Crates"
        MuxioDef["simd-r-drive-muxio-service-definition\nexperiments/simd-r-drive-muxio-service-definition/"]
WSClient["simd-r-drive-ws-client\nexperiments/simd-r-drive-ws-client/"]
WSServer["simd-r-drive-ws-server\nexperiments/simd-r-drive-ws-server/"]
end
    
    subgraph "Excluded Python Bindings"
        PyBinding["Python Direct Binding\nexperiments/bindings/python/\n(Excluded from workspace)"]
PyWSClient["Python WebSocket Client\nexperiments/bindings/python-ws-client/\n(Excluded from workspace)"]
end
    
 
   Core --> EntryHandle
 
   Extensions --> Core
 
   WSServer --> Core
 
   WSServer --> MuxioDef
 
   WSClient --> Core
 
   WSClient --> MuxioDef
 
   PyBinding -.-> Core
 
   PyWSClient -.-> WSClient
    
    style Core fill:#f9f9f9,stroke:#333,stroke-width:3px
    style EntryHandle fill:#f9f9f9,stroke:#333,stroke-width:2px
    style Extensions fill:#f9f9f9,stroke:#333,stroke-width:2px

Sources: Cargo.toml:65-78


Core Crates

simd-r-drive (Root Package)

The main storage engine crate providing append-only, schema-less binary storage with SIMD optimizations. Located at the repository root with source code in src/.

Key Components:

  • DataStore - Main storage interface implementing DataStoreReader and DataStoreWriter traits
  • KeyIndexer - Hash-based index for O(1) key lookups using XXH3
  • Storage format implementation with 64-byte payload alignment
  • SIMD-optimized memory operations (simd_copy)
  • Recovery and validation logic

Features:

  • default - Standard functionality
  • expose-internal-api - Exposes internal APIs for testing
  • parallel - Enables Rayon-based parallel iteration
  • arrow - Proxy feature enabling Apache Arrow support in entry-handle

CLI Binary: The crate includes a CLI application for direct storage operations without network overhead.

Sources: Cargo.toml:11-56 Cargo.lock:1795-1820


simd-r-drive-entry-handle

Zero-copy data access abstraction located in simd-r-drive-entry-handle/. Provides safe wrappers around memory-mapped file regions without copying payload data.

Key Components:

  • EntryHandle - Zero-copy view into memory-mapped payload data
  • EntryMetadata - Fixed 20-byte metadata structure
  • Arrow integration (optional) for columnar data access
  • CRC32C checksum validation

Dependencies:

  • memmap2 - Memory-mapped file support
  • crc32fast - Checksum validation
  • arrow (optional) - Apache Arrow integration

The crate is dependency-minimal to enable use as a standalone zero-copy accessor.

Sources: Cargo.toml:83 Cargo.lock:1823-1829


simd-r-drive-extensions

Utility functions and helper modules located in extensions/. Provides commonly-needed functionality for working with the storage engine.

Key Components:

  • align_or_copy - Utility for handling alignment requirements
  • Serialization helpers (bincode integration)
  • Testing utilities
  • File system utilities (walkdir integration)

Dependencies:

  • simd-r-drive - Core storage engine
  • bincode - Serialization support
  • serde - Serialization framework
  • tempfile - Temporary file handling
  • walkdir - Directory traversal

Sources: Cargo.toml:69 Cargo.lock:1832-1841


Experimental Network Crates

These crates enable remote access to the storage engine via WebSocket-based RPC. Located in experiments/.

simd-r-drive-muxio-service-definition

RPC service contract definition using the Muxio framework. Located in experiments/simd-r-drive-muxio-service-definition/.

Purpose:

  • Defines RPC service interface shared between client and server
  • Uses bitcode for efficient binary serialization
  • Provides type-safe method definitions

Dependencies:

  • bitcode - Binary serialization
  • muxio-rpc-service - RPC service framework

Sources: Cargo.toml:72 Cargo.lock:1844-1849


simd-r-drive-ws-client

Native Rust WebSocket client for remote storage access. Located in experiments/simd-r-drive-ws-client/.

Key Components:

  • BaseDataStoreWsClient - WebSocket client implementation
  • RPC method callers
  • Connection management
  • Async/await integration with Tokio

Dependencies:

  • simd-r-drive - Storage engine traits
  • simd-r-drive-muxio-service-definition - Shared RPC contract
  • muxio-tokio-rpc-client - WebSocket RPC client runtime
  • muxio-rpc-service-caller - RPC method invocation
  • tokio - Async runtime

Sources: Cargo.toml:71 Cargo.lock:1852-1863


simd-r-drive-ws-server

WebSocket server exposing the storage engine over RPC. Located in experiments/simd-r-drive-ws-server/.

Key Components:

  • Axum-based HTTP/WebSocket server
  • RPC endpoint routing
  • DataStore backend integration
  • Configuration via CLI arguments

Dependencies:

  • simd-r-drive - Storage engine backend
  • simd-r-drive-muxio-service-definition - Shared RPC contract
  • muxio-tokio-rpc-server - WebSocket RPC server runtime
  • clap - CLI argument parsing
  • tokio - Async runtime

Sources: Cargo.toml:70 Cargo.lock:1866-1878


Python Bindings (Excluded)

Two Python binding crates exist but are excluded from the main workspace due to their PyO3 and Maturin build requirements.

experiments/bindings/python

Direct Python bindings to the core storage engine using PyO3. Excluded at Cargo.toml:75

Purpose:

  • FFI bridge between Python and Rust
  • Direct local access to DataStore from Python
  • PyO3-based class and method exports

experiments/bindings/python-ws-client

Python WebSocket client wrapping the Rust client. Excluded at Cargo.toml:76

Purpose:

  • Remote storage access from Python
  • PyO3 wrapper around simd-r-drive-ws-client
  • Async/await bridge between Python and Tokio

For detailed documentation on these bindings, see Python Integration.

Sources: Cargo.toml:74-77


Dependency Graph

Inter-Crate Dependencies:

Sources: Cargo.lock:1795-1878 Cargo.toml:80-90


External Dependencies

Workspace-Level Shared Dependencies

The workspace defines common dependencies to ensure version consistency across all crates:

Storage & Performance:

| Dependency | Version | Purpose |
|---|---|---|
| memmap2 | 0.9.5 | Memory-mapped file I/O |
| xxhash-rust | 0.8.15 (xxh3) | Fast hashing with hardware acceleration |
| crc32fast | 1.4.2 | CRC32C checksums |
| arrow | 57.0.0 | Apache Arrow integration (optional) |

Async & Network:

| Dependency | Version | Purpose |
|---|---|---|
| tokio | 1.45.1 | Async runtime (experiments only) |
| muxio-rpc-service | 0.9.0-alpha | RPC framework |
| muxio-tokio-rpc-client | 0.9.0-alpha | RPC client runtime |
| muxio-tokio-rpc-server | 0.9.0-alpha | RPC server runtime |

Serialization:

| Dependency | Version | Purpose |
|---|---|---|
| bitcode | 0.6.6 | Binary serialization for RPC |
| bincode | 1.3.3 | Legacy serialization (extensions) |
| serde | 1.0.219 | Serialization framework |

CLI & Utilities:

| Dependency | Version | Purpose |
|---|---|---|
| clap | 4.5.40 | CLI argument parsing |
| tracing | 0.1.41 | Structured logging |
| tracing-subscriber | 0.3.20 | Log output configuration |

Sources: Cargo.toml:80-112


Build Configuration

Workspace Settings

All crates share common metadata defined at the workspace level (unified version, Rust edition, license, and repository URL; see the Workspace Configuration table above).

Sources: Cargo.toml:1-9


Benchmark Configuration

The root crate includes two benchmark harnesses:

storage_benchmark:

  • Harness: Criterion
  • Purpose: Measure write/read throughput and latency
  • Path: benches/storage_benchmark.rs

contention_benchmark:

  • Harness: Criterion
  • Purpose: Measure concurrent access performance
  • Path: benches/contention_benchmark.rs

Sources: Cargo.toml:57-63


Directory Layout

rust-simd-r-drive/
├── Cargo.toml                          # Workspace root
├── Cargo.lock                          # Dependency lock file
├── src/                                # simd-r-drive core source
│   ├── lib.rs
│   ├── data_store.rs
│   ├── key_indexer.rs
│   └── ...
├── simd-r-drive-entry-handle/          # Zero-copy handle crate
│   ├── Cargo.toml
│   └── src/
├── extensions/                         # Utilities crate
│   ├── Cargo.toml
│   └── src/
└── experiments/
    ├── simd-r-drive-ws-server/         # WebSocket server
    │   ├── Cargo.toml
    │   └── src/
    ├── simd-r-drive-ws-client/         # Rust WebSocket client
    │   ├── Cargo.toml
    │   └── src/
    ├── simd-r-drive-muxio-service-definition/  # RPC contract
    │   ├── Cargo.toml
    │   └── src/
    └── bindings/
        ├── python/                     # PyO3 direct binding (excluded)
        │   ├── Cargo.toml
        │   ├── pyproject.toml
        │   └── src/
        └── python-ws-client/           # PyO3 WS client (excluded)
            ├── Cargo.toml
            ├── pyproject.toml
            └── src/

Sources: Cargo.toml:65-78


Publishing Strategy

All workspace crates (except Python bindings) are configured for publishing to crates.io with publish = true. The Python bindings use Maturin for building Python wheels and are distributed separately via PyPI.

Crates.io Packages:

  • simd-r-drive - Core storage engine
  • simd-r-drive-entry-handle - Zero-copy handle
  • simd-r-drive-extensions - Utilities
  • simd-r-drive-ws-server - WebSocket server
  • simd-r-drive-ws-client - Rust WebSocket client
  • simd-r-drive-muxio-service-definition - RPC contract

PyPI Packages:

  • simd-r-drive (Python binding via Maturin)
  • simd-r-drive-ws-client-py (Python WebSocket client via Maturin)

For Python package build instructions, see Building Python Bindings.

Sources: Cargo.toml:1-9 Cargo.toml:21


Storage Architecture


Purpose and Scope

This document describes the core storage architecture of SIMD R Drive, covering the append-only design principles, single-file storage model, on-disk layout, validation chain structure, and recovery mechanisms. For details about the programmatic interface, see DataStore API. For specifics on entry structure and metadata format, see Entry Structure and Metadata. For memory mapping and zero-copy access patterns, see Memory Management and Zero-Copy Access.


Append-Only Design

SIMD R Drive implements a strict append-only storage model where data is written sequentially to the end of a single file. Entries are never modified or deleted in place; instead, updates and deletions are represented by appending new entries.

Key Characteristics

| Characteristic | Implementation | Benefit |
|---|---|---|
| Sequential Writes | All writes append to file end | Minimizes disk seeks, maximizes throughput |
| Immutable Entries | Existing entries never modified | Enables concurrent reads without locks |
| Version Chain | Multiple versions stored for same key | Supports time-travel and recovery |
| Latest-Wins Semantics | Index points to most recent version | Ensures consistency in queries |

The append-only model is enforced through the tail_offset field src/storage_engine/data_store.rs:30, which tracks the current end of valid data. Write operations acquire an exclusive lock, append data, and atomically update the tail offset src/storage_engine/data_store.rs:816-824

Write Flow

Sources: src/storage_engine/data_store.rs:752-825 src/storage_engine/data_store.rs:827-843


Single-File Storage Container

All data resides in a single binary file managed by the DataStore struct src/storage_engine/data_store.rs:27-33. This design simplifies deployment, backup, and replication while enabling efficient memory-mapped access.

graph TB
    subgraph "DataStore Structure"
        FileHandle["file:\nArc&lt;RwLock&lt;BufWriter&lt;File&gt;&gt;&gt;"]
MmapHandle["mmap:\nArc&lt;Mutex&lt;Arc&lt;Mmap&gt;&gt;&gt;"]
TailOffset["tail_offset:\nAtomicU64"]
Indexer["key_indexer:\nArc&lt;RwLock&lt;KeyIndexer&gt;&gt;"]
PathBuf["path:\nPathBuf"]
end
    
    subgraph "Write Operations"
        WriteOp["Write/Delete Operations"]
end
    
    subgraph "Read Operations"
        ReadOp["Read/Exists Operations"]
end
    
    subgraph "Physical Storage"
        DiskFile["Storage File on Disk"]
end
    
 
   WriteOp --> FileHandle
 
   FileHandle --> DiskFile
    
 
   ReadOp --> MmapHandle
 
   ReadOp --> Indexer
 
   MmapHandle --> DiskFile
    
 
   WriteOp -.->|updates after flush| TailOffset
 
   WriteOp -.->|remaps after flush| MmapHandle
 
   WriteOp -.->|updates after flush| Indexer
    
 
   TailOffset -.->|tracks valid data end| DiskFile

File Handles and Memory Mapping

Sources: src/storage_engine/data_store.rs:27-33 src/storage_engine/data_store.rs:84-117

File Opening and Initialization

The DataStore::open function src/storage_engine/data_store.rs:84-117 performs the following sequence:

  1. Open File : Opens the file in read/write mode, creating it if necessary src/storage_engine/data_store.rs:161-170
  2. Initial Mapping : Creates an initial memory-mapped view src/storage_engine/data_store.rs:172-174
  3. Chain Recovery : Validates the storage chain and determines the valid data extent src/storage_engine/data_store.rs:383-482
  4. Truncation (if needed) : Truncates corrupted data if recovery finds inconsistencies src/storage_engine/data_store.rs:91-104
  5. Index Building : Constructs the in-memory hash index by scanning valid entries src/storage_engine/data_store.rs:108

Sources: src/storage_engine/data_store.rs:84-117 src/storage_engine/data_store.rs:141-144


Storage Layout

Entries are stored with 64-byte alignment to optimize cache efficiency and SIMD operations. The alignment constant PAYLOAD_ALIGNMENT is defined as 64 in simd-r-drive-entry-handle/src/constants.rs

Non-Tombstone Entry Layout

The pre-padding calculation ensures the payload starts on a 64-byte boundary src/storage_engine/data_store.rs:669-673:

pad = (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)

Where prev_tail is the absolute file offset of the previous entry's metadata end.
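
The same rule as a small runnable sketch (PAYLOAD_ALIGNMENT is assumed to be a power of two, as it is for the 64-byte default):

```rust
const PAYLOAD_ALIGNMENT: u64 = 64;

/// Pre-padding needed so the next payload starts on an aligned boundary.
fn prepad_len(prev_tail: u64) -> u64 {
    (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)
}

fn main() {
    // A previous tail at offset 100 needs 28 padding bytes, so the payload begins at 128.
    assert_eq!(prepad_len(100), 28);
    // An already-aligned tail needs no padding.
    assert_eq!(prepad_len(128), 0);
}
```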

graph LR
    subgraph "Tombstone Entry"
        NullByte["NULL Byte\n(1 byte)\n0x00"]
KeyHash["Key Hash\n(8 bytes)\nXXH3_64"]
PrevOff["Prev Offset\n(8 bytes)\nPrevious Tail"]
Checksum["Checksum\n(4 bytes)\nCRC32C"]
end
    
 
   NullByte --> KeyHash
 
   KeyHash --> PrevOff
 
   PrevOff --> Checksum

Tombstone Entry Layout

Tombstones represent deleted keys and use a minimal layout without pre-padding:

The single NULL_BYTE src/storage_engine.rs:136 serves as both the tombstone marker and payload. This design minimizes space overhead for deletions src/storage_engine/data_store.rs:864-897

Layout Summary Table

| Component | Non-Tombstone Size | Tombstone Size | Purpose |
|---|---|---|---|
| Pre-Pad | 0-63 bytes | 0 bytes | Align payload to 64-byte boundary |
| Payload | Variable | 1 byte (0x00) | User data or deletion marker |
| Key Hash | 8 bytes | 8 bytes | XXH3_64 hash for index lookup |
| Prev Offset | 8 bytes | 8 bytes | Points to previous tail (chain link) |
| Checksum | 4 bytes | 4 bytes | CRC32C of payload for integrity |

Sources: README.md:112-137 src/storage_engine/data_store.rs:669-673 src/storage_engine/data_store.rs:864-897


Validation Chain

Each entry's metadata contains a prev_offset field that points to the absolute file offset of the previous entry's metadata end (the "previous tail"). This creates a backward-linked chain from the end of the file to offset 0.

graph RL
    EOF["End of File\n(tail_offset)"]
Meta3["Entry 3 Metadata\nprev_offset = T2"]
Entry3["Entry 3 Payload\n(aligned)"]
Meta2["Entry 2 Metadata\nprev_offset = T1"]
Entry2["Entry 2 Payload\n(aligned)"]
Meta1["Entry 1 Metadata\nprev_offset = 0"]
Entry1["Entry 1 Payload\n(aligned)"]
Zero["Offset 0\n(File Start)"]
EOF --> Meta3
 
   Meta3 -.->|prev_offset| Meta2
 
   Meta3 --> Entry3
    
 
   Meta2 -.->|prev_offset| Meta1
 
   Meta2 --> Entry2
    
 
   Meta1 -.->|prev_offset| Zero
 
   Meta1 --> Entry1
    
 
   Entry1 --> Zero

Chain Structure

Sources: README.md:139-148 src/storage_engine/data_store.rs:383-482

Chain Validation Properties

The validation chain ensures data integrity through several properties:

  1. Completeness : A valid chain must trace back to offset 0 without gaps src/storage_engine/data_store.rs:473
  2. Forward Progress : Each prev_offset must be less than the current metadata offset src/storage_engine/data_store.rs:465-468
  3. Bounds Checking : All offsets must be within valid file boundaries src/storage_engine/data_store.rs:435-438
  4. Size Consistency : Total chain size must not exceed file length src/storage_engine/data_store.rs:473

These properties are enforced during the recover_valid_chain function src/storage_engine/data_store.rs:383-482 which validates the chain structure on every file open.

Sources: src/storage_engine/data_store.rs:383-482


Recovery Mechanisms

SIMD R Drive implements automatic recovery to handle incomplete writes from crashes, power loss, or other interruptions. The recovery process occurs transparently during DataStore::open.

graph TB
    Start["Open Storage File"]
CheckSize{{"File Size ≥\nMETADATA_SIZE?"}}
EmptyFile["Return offset 0\n(Empty File)"]
InitCursor["cursor = file_length"]
ReadMeta["Read metadata at\ncursor - METADATA_SIZE"]
ExtractPrev["Extract prev_offset"]
CalcBounds["Calculate entry bounds\nusing prepad_len"]
ValidateBounds{{"Entry bounds valid?"}}
DecrementCursor["cursor -= 1"]
WalkBackward["Walk chain backward\nusing prev_offset"]
CheckChain{{"Chain reaches\noffset 0?"}}
CheckSize2{{"Total size ≤\nfile_length?"}}
FoundValid["Store valid offset\nbest_valid_offset = cursor"]
MoreToCheck{{"cursor ≥\nMETADATA_SIZE?"}}
ReturnBest["Return best_valid_offset"]
Truncate["Truncate file to\nbest_valid_offset"]
Reopen["Recursively reopen\nDataStore::open"]
BuildIndex["Build KeyIndexer\nfrom valid chain"]
Complete["Recovery Complete"]
Start --> CheckSize
 
   CheckSize -->|No| EmptyFile
 
   CheckSize -->|Yes| InitCursor
 
   EmptyFile --> Complete
    
 
   InitCursor --> ReadMeta
 
   ReadMeta --> ExtractPrev
 
   ExtractPrev --> CalcBounds
 
   CalcBounds --> ValidateBounds
    
 
   ValidateBounds -->|No| DecrementCursor
 
   ValidateBounds -->|Yes| WalkBackward
    
 
   WalkBackward --> CheckChain
 
   CheckChain -->|No| DecrementCursor
 
   CheckChain -->|Yes| CheckSize2
    
 
   CheckSize2 -->|No| DecrementCursor
 
   CheckSize2 -->|Yes| FoundValid
    
 
   FoundValid --> ReturnBest
    
 
   DecrementCursor --> MoreToCheck
 
   MoreToCheck -->|Yes| ReadMeta
 
   MoreToCheck -->|No| ReturnBest
    
 
   ReturnBest --> Truncate
 
   Truncate --> Reopen
 
   Reopen --> BuildIndex
 
   BuildIndex --> Complete

Recovery Algorithm

Sources: src/storage_engine/data_store.rs:383-482 src/storage_engine/data_store.rs:91-104

Recovery Steps Detailed

Step 1: Backward Scan

Starting from the end of file, the recovery algorithm scans backward looking for valid metadata entries. For each potential metadata location src/storage_engine/data_store.rs:390-394:

  • Reads 20 bytes of metadata
  • Deserializes into EntryMetadata structure
  • Extracts prev_offset (previous tail)
  • Calculates entry boundaries using alignment rules

Step 2: Chain Validation

For each candidate entry, the algorithm walks the entire chain backward src/storage_engine/data_store.rs:424-471:

back_cursor = prev_tail
while back_cursor != 0:
    - Read previous entry metadata
    - Validate bounds and size
    - Check forward progress (prev_prev_tail < prev_metadata_offset)
    - Accumulate total chain size
    - Update back_cursor to previous prev_offset

A chain is valid only if it reaches offset 0 and the total size doesn't exceed file length src/storage_engine/data_store.rs:473
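
A hedged Rust rendering of this walk is shown below; read_prev_offset stands in for the real metadata deserialization and is purely illustrative:

```rust
/// Walk the prev_offset chain from `back_cursor` toward zero, rejecting any link that
/// fails the bounds, forward-progress, or total-size checks described above.
fn chain_reaches_zero(
    read_prev_offset: impl Fn(u64) -> Option<u64>, // metadata tail -> its prev_offset
    mut back_cursor: u64,
    file_len: u64,
) -> bool {
    let mut total: u64 = 0;
    while back_cursor != 0 {
        let Some(prev) = read_prev_offset(back_cursor) else {
            return false; // unreadable or out-of-bounds metadata
        };
        if prev >= back_cursor {
            return false; // no forward progress
        }
        total += back_cursor - prev;
        if total > file_len {
            return false; // accumulated size exceeds the file
        }
        back_cursor = prev;
    }
    true
}
```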

Step 3: Truncation and Retry

If corruption is detected (file length > valid chain extent), the function:

  1. Logs a warning src/storage_engine/data_store.rs:92-97
  2. Drops existing file handles and memory map src/storage_engine/data_store.rs:98-99
  3. Reopens file with write access src/storage_engine/data_store.rs:100
  4. Truncates to valid offset src/storage_engine/data_store.rs:101
  5. Syncs to disk src/storage_engine/data_store.rs:102
  6. Recursively calls DataStore::open src/storage_engine/data_store.rs:103

Step 4: Index Rebuilding

After establishing the valid data extent, KeyIndexer::build scans forward through the file src/storage_engine/data_store.rs:108, populating the hash index with key → offset mappings. This forward scan uses the iterator pattern src/storage_engine/entry_iterator.rs:56-127 to process each entry sequentially.

Sources: src/storage_engine/data_store.rs:383-482 src/storage_engine/data_store.rs:91-104 src/storage_engine/entry_iterator.rs:21-54


graph TB
    subgraph "Storage Architecture (2.1)"
        AppendOnly["Append-Only Design"]
SingleFile["Single-File Container"]
Layout["Storage Layout\n(64-byte alignment)"]
Chain["Validation Chain"]
Recovery["Recovery Mechanism"]
end
    
    subgraph "DataStore API (2.2)"
        Write["write/batch_write"]
Read["read/batch_read"]
Delete["delete/batch_delete"]
end
    
    subgraph "Entry Structure (2.3)"
        EntryMetadata["EntryMetadata\n(key_hash, prev_offset, checksum)"]
PrePad["Pre-Padding Logic"]
Tombstone["Tombstone Format"]
end
    
    subgraph "Memory Management (2.4)"
        Mmap["Memory-Mapped File"]
EntryHandle["EntryHandle\n(Zero-Copy View)"]
Alignment["PAYLOAD_ALIGNMENT"]
end
    
    subgraph "Key Indexer (2.6)"
        KeyIndexer["KeyIndexer"]
HashLookup["Hash → Offset Mapping"]
end
    
 
   AppendOnly -.->|enforces| Write
 
   SingleFile -.->|provides| Mmap
 
   Layout -.->|defines| PrePad
 
   Layout -.->|uses| Alignment
 
   Chain -.->|uses| EntryMetadata
 
   Recovery -.->|validates| Chain
 
   Recovery -.->|rebuilds| KeyIndexer
    
 
   Write -.->|appends| Layout
 
   Read -.->|uses| HashLookup
 
   Read -.->|returns| EntryHandle
 
   Delete -.->|writes| Tombstone

Architecture Integration

The storage architecture integrates with other subsystems through well-defined boundaries:

Component Interactions

Sources: src/storage_engine/data_store.rs:1-1183 src/storage_engine.rs:1-25

Key Design Decisions

| Decision | Rationale | Trade-off |
|---|---|---|
| 64-byte alignment | Matches CPU cache lines, enables SIMD | Wastes 0-63 bytes per entry |
| Backward-linked chain | Efficient recovery from EOF | Cannot validate forward |
| Single NULL byte tombstones | Minimal deletion overhead | Special case in code |
| Metadata-last layout | Sequential write pattern | Recovery scans backward |
| Atomic tail_offset | Lock-free progress tracking | Additional synchronization primitive |

Sources: README.md:51-59 README.md:104-148 src/storage_engine/data_store.rs:669-673


Performance Characteristics

The storage architecture exhibits the following performance properties:

Write Performance

Read Performance

Recovery Performance

Sources: src/storage_engine/data_store.rs:752-939 src/storage_engine/data_store.rs:1027-1183 src/storage_engine/data_store.rs:383-482


Limitations and Considerations

Append-Only Growth

The file grows continuously with updates and deletions. Compaction is available but requires exclusive access src/storage_engine/data_store.rs:706-749

Alignment Overhead

Each non-tombstone entry may waste up to 63 bytes for alignment padding. For small payloads, this overhead can be significant relative to payload size.

Recovery Cost

Recovery time increases with file size, as chain validation requires traversing all entries. For very large files (exceeding RAM), this may cause extended startup times.

Single-Process Design

While thread-safe within a process, multiple processes cannot safely share the same file without external locking mechanisms README.md:185-206

Sources: README.md:51-59 README.md:185-206 src/storage_engine/data_store.rs:669-673 src/storage_engine/data_store.rs:706-749


DataStore API


This document describes the public API of the DataStore struct, which provides the primary interface for interacting with the SIMD R Drive storage engine. It covers the methods for opening storage files, reading and writing data, batch operations, streaming, iteration, and maintenance operations.

For details on the underlying storage architecture and design principles, see Storage Architecture. For information about entry structure and metadata format, see Entry Structure and Metadata. For details on concurrency mechanisms, see Concurrency and Thread Safety.

API Structure

The DataStore API is organized around two primary traits that separate read and write capabilities, along with additional utility methods provided directly on the DataStore struct.

Sources: src/storage_engine/data_store.rs:26-750 src/storage_engine/traits.rs src/storage_engine.rs:1-24

graph TB
    subgraph "Trait Definitions"
        DSReader["DataStoreReader\ntrait"]
DSWriter["DataStoreWriter\ntrait"]
end
    
    subgraph "Core Implementation"
        DS["DataStore\nstruct"]
end
    
    subgraph "Associated Types"
        EH["EntryHandle"]
EM["EntryMetadata"]
EI["EntryIterator"]
ES["EntryStream"]
end
    
 
   DSReader -->|implemented by| DS
 
   DSWriter -->|implemented by| DS
    
 
   DS -->|returns| EH
 
   DS -->|returns| EM
 
   DS -->|provides| EI
 
   EH -->|converts to| ES
    
 
   DSReader -->|defines read ops| ReadMethods["read()\nbatch_read()\nexists()\nread_metadata()\nread_last_entry()"]
DSWriter -->|defines write ops| WriteMethods["write()\nbatch_write()\nwrite_stream()\ndelete()\nrename()\ncopy()"]
DS -->|direct methods| DirectMethods["open()\niter_entries()\ncompact()\nget_path()"]

Core Types

| Type | Purpose | Defined In |
|---|---|---|
| DataStore | Main storage engine implementation | src/storage_engine/data_store.rs:27-33 |
| DataStoreReader | Trait defining read operations | Trait definition |
| DataStoreWriter | Trait defining write operations | Trait definition |
| EntryHandle | Zero-copy reference to stored data | simd_r_drive_entry_handle crate |
| EntryMetadata | Entry metadata structure (key hash, prev offset, checksum) | simd_r_drive_entry_handle crate |
| EntryIterator | Sequential iterator over entries | src/storage_engine/entry_iterator.rs:21-127 |
| EntryStream | Streaming reader for entry data | src/storage_engine/entry_stream.rs |

Sources: src/storage_engine/data_store.rs:27-33 src/storage_engine/entry_iterator.rs:8-25 src/storage_engine.rs:1-24

Opening Storage Files

The DataStore provides two methods for opening storage files, both returning Result<DataStore>.

Sources: src/storage_engine/data_store.rs:84-117 src/storage_engine/data_store.rs:141-144 src/storage_engine/data_store.rs:161-170

graph LR
    subgraph "Open Methods"
        Open["open(path)"]
OpenExisting["open_existing(path)"]
end
    
    subgraph "Process Flow"
        OpenFile["open_file_in_append_mode()"]
InitMmap["init_mmap()"]
RecoverChain["recover_valid_chain()"]
BuildIndex["KeyIndexer::build()"]
end
    
    subgraph "Result"
        DSInstance["DataStore instance"]
end
    
 
   Open -->|creates if missing| OpenFile
 
   OpenExisting -->|requires existing file| VerifyFile["verify_file_existence()"]
VerifyFile --> OpenFile
    
 
   OpenFile --> InitMmap
 
   InitMmap --> RecoverChain
 
   RecoverChain -->|may truncate| BuildIndex
 
   BuildIndex --> DSInstance

open

Opens or creates a storage file at the specified path. This method performs the following operations:

  1. Opens the file in read/write mode (creates if necessary)
  2. Memory-maps the file using mmap
  3. Runs recover_valid_chain() to validate data integrity
  4. Truncates the file if corruption is detected
  5. Builds the in-memory key index using KeyIndexer::build()

Parameters:

  • path: Path to the storage file

Returns:

  • Ok(DataStore): Successfully opened storage instance
  • Err(std::io::Error): File operation failure

Sources: src/storage_engine/data_store.rs:84-117

open_existing

Opens an existing storage file. Unlike open(), this method returns an error if the file does not exist. It calls verify_file_existence() before delegating to open().

Parameters:

  • path: Path to the existing storage file

Returns:

  • Ok(DataStore): Successfully opened storage instance
  • Err(std::io::Error): File does not exist or cannot be opened

Sources: src/storage_engine/data_store.rs:141-144

Type Conversion Constructors

Allows creating a DataStore directly from a PathBuf:
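
The following is a hedged sketch of what that conversion looks like from the caller's side (the actual trait impl lives in data_store.rs):

```rust
use std::path::PathBuf;
use simd_r_drive::DataStore;

fn open_from_pathbuf(path: PathBuf) -> DataStore {
    // Calls DataStore::open() internally; panics if the file cannot be opened.
    DataStore::from(path)
}
```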

This conversion calls DataStore::open() internally and panics if the file cannot be opened.

Sources: src/storage_engine/data_store.rs:53-64

Read Operations

Read operations are defined by the DataStoreReader trait and return EntryHandle instances that provide zero-copy access to stored data.

Sources: src/storage_engine/data_store.rs:1027-1182

graph TB
    subgraph "Read API Methods"
        Read["read(key)"]
ReadHash["read_with_key_hash(hash)"]
BatchRead["batch_read(keys)"]
BatchReadHash["batch_read_hashed_keys(hashes)"]
ReadLast["read_last_entry()"]
ReadMeta["read_metadata(key)"]
Exists["exists(key)"]
ExistsHash["exists_with_key_hash(hash)"]
end
    
    subgraph "Internal Flow"
        ComputeHash["compute_hash()"]
GetIndexLock["key_indexer.read()"]
GetMmap["get_mmap_arc()"]
ReadContext["read_entry_with_context()"]
end
    
    subgraph "Return Types"
        OptHandle["Option<EntryHandle>"]
VecOptHandle["Vec<Option<EntryHandle>>"]
OptMeta["Option<EntryMetadata>"]
Bool["bool"]
end
    
 
   Read --> ComputeHash
 
   ComputeHash --> ReadHash
 
   ReadHash --> GetIndexLock
 
   GetIndexLock --> GetMmap
 
   GetMmap --> ReadContext
 
   ReadContext --> OptHandle
    
 
   BatchRead --> BatchReadHash
 
   BatchReadHash --> ReadContext
 
   ReadContext --> VecOptHandle
    
 
   ReadMeta --> Read
 
   Read --> OptMeta
    
 
   Exists --> Read
 
   Read --> Bool

read

Reads a single entry by key. Computes the key hash using compute_hash(), acquires a read lock on the key index, and calls read_entry_with_context() to retrieve the entry.

Parameters:

  • key: Raw key bytes

Returns:

  • Ok(Some(EntryHandle)): Entry found and valid
  • Ok(None): Entry not found or is a tombstone
  • Err(std::io::Error): Lock acquisition failure

Tag Verification: When the original key is provided, the method verifies the stored tag matches KeyIndexer::tag_from_key(key) to detect hash collisions.

Sources: src/storage_engine/data_store.rs:1040-1049

read_with_key_hash

Reads an entry using a pre-computed hash. This method skips the hashing step and tag verification, useful when the hash is already known.

Parameters:

  • prehashed_key: Pre-computed XXH3 hash of the key

Returns:

  • Ok(Some(EntryHandle)): Entry found
  • Ok(None): Entry not found
  • Err(std::io::Error): Lock acquisition failure

Note: Tag verification is not performed when using pre-computed hashes.

Sources: src/storage_engine/data_store.rs:1051-1059

batch_read

Reads multiple entries in a single operation. Computes all hashes using compute_hash_batch(), then delegates to batch_read_hashed_keys().

Parameters:

  • keys: Slice of key references

Returns:

  • Ok(Vec<Option<EntryHandle>>): Vector of results, same length as input
  • Err(std::io::Error): Lock acquisition failure

Efficiency: Acquires the index read lock once for all lookups, reducing lock contention compared to multiple read() calls.

Sources: src/storage_engine/data_store.rs:1105-1109

batch_read_hashed_keys

Reads multiple entries using pre-computed hashes. Optionally accepts original keys for tag verification.

Parameters:

  • prehashed_keys: Slice of pre-computed key hashes
  • non_hashed_keys: Optional slice of original keys for tag verification (must match length of prehashed_keys)

Returns:

  • Ok(Vec<Option<EntryHandle>>): Vector of results
  • Err(std::io::Error): Lock failure or length mismatch

Sources: src/storage_engine/data_store.rs:1111-1158

read_last_entry

Reads the most recently written entry (at tail_offset - METADATA_SIZE). Does not verify the key or check the index; directly reads from the tail position.

Returns:

  • Ok(Some(EntryHandle)): Last entry retrieved
  • Ok(None): Storage is empty or tail is invalid
  • Err(std::io::Error): Not applicable (infallible after initial checks)

Sources: src/storage_engine/data_store.rs:1061-1103

read_metadata

Reads only the metadata for a key without accessing the payload data. Calls read() and extracts the metadata field.

Returns:

  • Ok(Some(EntryMetadata)): Metadata retrieved
  • Ok(None): Key not found
  • Err(std::io::Error): Read failure

Sources: src/storage_engine/data_store.rs:1160-1162

exists

Checks if a key exists in the storage.

Returns:

  • Ok(true): Key exists and is not a tombstone
  • Ok(false): Key not found or is deleted
  • Err(std::io::Error): Read failure

Sources: src/storage_engine/data_store.rs:1030-1032

exists_with_key_hash

Checks key existence using a pre-computed hash.

Sources: src/storage_engine/data_store.rs:1034-1038

graph TB
    subgraph "Write API Methods"
        Write["write(key, payload)"]
WriteHash["write_with_key_hash(hash, payload)"]
BatchWrite["batch_write(entries)"]
BatchWriteHash["batch_write_with_key_hashes(entries)"]
WriteStream["write_stream(key, reader)"]
WriteStreamHash["write_stream_with_key_hash(hash, reader)"]
end
    
    subgraph "Core Write Flow"
        AcquireLock["file.write()"]
GetTail["tail_offset.load()"]
CalcPrepad["prepad_len(tail)"]
WritePrepad["write zeros for alignment"]
WritePayload["write payload data"]
WriteMeta["write EntryMetadata"]
Flush["file.flush()"]
Reindex["reindex()"]
end
    
    subgraph "Post-Write"
        RemapMmap["init_mmap()"]
UpdateIndex["key_indexer.insert()"]
UpdateTail["tail_offset.store()"]
end
    
 
   Write --> WriteHash
 
   WriteHash --> BatchWriteHash
 
   WriteStream --> WriteStreamHash
    
 
   BatchWriteHash --> AcquireLock
 
   WriteStreamHash --> AcquireLock
    
 
   AcquireLock --> GetTail
 
   GetTail --> CalcPrepad
 
   CalcPrepad --> WritePrepad
 
   WritePrepad --> WritePayload
 
   WritePayload --> WriteMeta
 
   WriteMeta --> Flush
 
   Flush --> Reindex
    
 
   Reindex --> RemapMmap
 
   RemapMmap --> UpdateIndex
 
   UpdateIndex --> UpdateTail

Write Operations

Write operations are defined by the DataStoreWriter trait. All writes are append-only and atomic with respect to the file lock.

Sources: src/storage_engine/data_store.rs:752-1025

write

Writes a single key-value pair. Computes the key hash and delegates to write_with_key_hash().

Parameters:

  • key: Key bytes
  • payload: Payload bytes (cannot be empty or all NULL bytes)

Returns:

  • Ok(tail_offset): New tail offset after write
  • Err(std::io::Error): Write failure

Sources: src/storage_engine/data_store.rs:827-830

write_with_key_hash

Writes a payload using a pre-computed key hash. Delegates to batch_write_with_key_hashes() with a single entry.

Sources: src/storage_engine/data_store.rs:832-834

batch_write

Writes multiple key-value pairs in a single atomic operation. All entries are buffered and written together, then flushed once.

Parameters:

  • entries: Slice of (key, payload) tuples

Returns:

  • Ok(tail_offset): New tail offset after batch write
  • Err(std::io::Error): Write failure (entire batch is rejected)

Efficiency: Reduces I/O overhead and lock contention compared to multiple write() calls.

Sources: src/storage_engine/data_store.rs:838-843

batch_write_with_key_hashes

Core batch write implementation. Processes each entry, calculating alignment padding and writing pre-pad, payload, and metadata. All data is buffered before flushing to disk.

Parameters:

  • prehashed_keys: Vector of (hash, payload) tuples
  • allow_null_bytes: Whether to allow single NULL byte payloads (used for tombstones)

Returns:

  • Ok(tail_offset): New tail offset
  • Err(std::io::Error): Write failure

Write Process:

  1. Acquire exclusive write lock on file
  2. Load current tail_offset
  3. For each entry:
    • Calculate prepad_len(tail_offset) for 64-byte alignment
    • Buffer pre-pad zeros (if needed)
    • Buffer payload (using simd_copy())
    • Buffer metadata (key_hash, prev_offset, checksum)
    • Update tail_offset
  4. Write entire buffer to file
  5. Flush to disk
  6. Call reindex() to update mmap and index

Sources: src/storage_engine/data_store.rs:847-939

write_stream

Writes a payload from a streaming source without loading it entirely into memory. Useful for large entries.

Parameters:

  • key: Key bytes
  • reader: Any type implementing Read trait

Returns:

  • Ok(tail_offset): New tail offset after write
  • Err(std::io::Error): Write or read failure

Sources: src/storage_engine/data_store.rs:753-756

write_stream_with_key_hash

Core streaming write implementation. Reads from the source in chunks of WRITE_STREAM_BUFFER_SIZE bytes, computing the checksum incrementally.

Write Process:

  1. Acquire write lock
  2. Write pre-pad bytes for alignment
  3. Read chunks from reader using WRITE_STREAM_BUFFER_SIZE buffer
  4. Write each chunk immediately to file
  5. Update checksum incrementally
  6. Write metadata after all payload bytes
  7. Flush and reindex

Validation:

  • Rejects empty payloads
  • Rejects payloads containing only NULL bytes

Sources: src/storage_engine/data_store.rs:758-825

Delete Operations

Delete operations write tombstones (single NULL byte entries) to the storage. They are implemented using the write infrastructure with allow_null_bytes = true.

delete

Deletes a single key by writing a tombstone. Delegates to batch_delete().

Returns:

  • Ok(tail_offset): New tail offset
  • Err(std::io::Error): Write failure
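A hedged sketch of the behavior described above, assuming the DataStoreReader and DataStoreWriter traits are in scope (import paths assumed):

```rust
use simd_r_drive::{DataStore, DataStoreReader, DataStoreWriter}; // paths assumed

fn delete_example(store: &DataStore) -> std::io::Result<()> {
    // delete() appends a 1-byte tombstone; subsequent reads return None.
    store.delete(b"user:1")?;
    assert!(store.read(b"user:1")?.is_none());
    Ok(())
}
```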

Sources: src/storage_engine/data_store.rs:986-988

batch_delete

Deletes multiple keys in a single operation. Computes hashes and delegates to batch_delete_key_hashes().

Sources: src/storage_engine/data_store.rs:990-993

batch_delete_key_hashes

Core batch delete implementation. First checks which keys actually exist to avoid writing unnecessary tombstones, then writes tombstones for existing keys only.

Process:

  1. Acquire read lock on index
  2. Filter keys to only those that exist
  3. Release read lock
  4. If no keys exist, return immediately without I/O
  5. Create delete operations: (hash, &NULL_BYTE)
  6. Call batch_write_with_key_hashes() with allow_null_bytes = true

Optimization: Tombstones are written without pre-pad bytes since they are always exactly 1 byte.

Sources: src/storage_engine/data_store.rs:995-1024

Key Management Operations

These operations manipulate entries by key, providing higher-level functionality built on read and write operations.

rename

Renames a key by copying its data to a new key and deleting the old key.

Parameters:

  • old_key: Current key
  • new_key: New key (must be different from old key)

Returns:

  • Ok(tail_offset): New tail offset after rename
  • Err(std::io::Error): Key not found, or keys are identical

Process:

  1. Read entry at old_key
  2. Convert to EntryStream
  3. Write stream to new_key
  4. Delete old_key

Sources: src/storage_engine/data_store.rs:941-958

copy

Copies an entry from this storage to a different storage instance.

Parameters:

  • key: Key to copy
  • target: Target DataStore (must have a different path)

Returns:

  • Ok(target_offset): Offset in target storage where entry was written
  • Err(std::io::Error): Key not found, same storage path, or write failure

Process:

  1. Read entry handle for key
  2. Call copy_handle() to write to target
  3. Uses EntryStream for streaming copy

Sources: src/storage_engine/data_store.rs:960-979

transfer

Transfers an entry to another storage by copying then deleting.

Parameters:

  • key: Key to transfer
  • target: Target DataStore

Returns:

  • Ok(tail_offset): New tail offset after deletion
  • Err(std::io::Error): Copy or delete failure

Process:

  1. Copy entry to target
  2. Delete entry from source

Sources: src/storage_engine/data_store.rs:981-984

graph TB
    subgraph "Iteration Methods"
        IntoIter["into_iter()"]
        IterEntries["iter_entries()"]
        ParIterEntries["par_iter_entries()"]
    end
    
    subgraph "Iterator Types"
        EI["EntryIterator"]
        PI["ParallelIterator<EntryHandle>"]
    end
    
    subgraph "Implementation Details"
        GetMmap["get_mmap_arc()"]
        GetTail["tail_offset.load()"]
        CollectOffsets["collect packed offsets"]
        FilterMap["filter_map entries"]
    end
    
    IntoIter -->|consumes DataStore| IterEntries
    IterEntries --> GetMmap
    GetMmap --> GetTail
    GetTail --> EI
    
    ParIterEntries -->|requires parallel feature| CollectOffsets
    CollectOffsets --> FilterMap
    FilterMap --> PI

Iteration

The DataStore provides methods for iterating over all valid entries. Iteration yields EntryHandle instances for each unique key, providing zero-copy access.

Sources: src/storage_engine/data_store.rs:269-361 src/storage_engine/entry_iterator.rs:8-127

iter_entries

Creates a sequential iterator over all valid entries. The iterator:

  • Starts at tail_offset and moves backward
  • Skips tombstones (deleted entries)
  • Ensures unique keys (only returns latest version)
  • Uses zero-copy access via EntryHandle

Returns:

  • EntryIterator: Sequential iterator over EntryHandle

Example:
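A minimal sketch, assuming an already-opened store (import path assumed):

```rust
fn list_entries(store: &simd_r_drive::DataStore) {
    // Each EntryHandle is a zero-copy view of the newest version of a key;
    // tombstoned (deleted) keys are skipped.
    for entry in store.iter_entries() {
        println!("entry of {} bytes", entry.as_slice().len());
    }
}
```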

Sources: src/storage_engine/data_store.rs:276-280

IntoIterator Implementation

Allows consuming a DataStore to produce an iterator:
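A short sketch of the consuming form; ownership of the DataStore moves into the iterator (import path assumed):

```rust
fn sum_payload_sizes(store: simd_r_drive::DataStore) -> usize {
    // The IntoIterator impl delegates to iter_entries() and takes the store by value.
    store.into_iter().map(|entry| entry.as_slice().len()).sum()
}
```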

This delegates to iter_entries() and consumes the DataStore instance.

Sources: src/storage_engine/data_store.rs:44-51

par_iter_entries (Parallel Feature)

Creates a parallel iterator using Rayon for multi-threaded processing. This method:

  1. Acquires a read lock on the index
  2. Collects all packed offset values (fast operation)
  3. Releases the lock immediately
  4. Returns a ParallelIterator that processes entries across threads

Returns:

  • ParallelIterator<Item = EntryHandle>: Parallel iterator over entries

Performance: Suitable for bulk operations on multi-core systems where each entry can be processed independently.

Sources: src/storage_engine/data_store.rs:296-361

Maintenance Operations

These methods provide utilities for monitoring and optimizing storage.

compact

Compacts the storage by creating a new file containing only the latest version of each key. This operation:

  1. Creates temporary file at {path}.bk
  2. Iterates through current entries
  3. Copies latest version of each key to temporary file
  4. Swaps temporary file with original file

Requirements:

  • Requires &mut self (exclusive mutable reference)
  • Should only be called when no other threads access the storage

Warning: While &mut self prevents concurrent mutations, it does not prevent other threads from holding shared references (&DataStore) if wrapped in Arc<DataStore>. External synchronization may be required.

Returns:

  • Ok(()): Compaction successful
  • Err(std::io::Error): I/O failure
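A hedged sketch of a maintenance pass, assuming exclusive ownership of the DataStore (import path assumed):

```rust
use std::path::Path;
use simd_r_drive::DataStore; // import path assumed

fn compact_store() -> std::io::Result<()> {
    // compact() takes &mut self; no other thread should hold a reference.
    let mut store = DataStore::open(Path::new("example.bin"))?;
    store.compact()?;
    Ok(())
}
```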

Sources: src/storage_engine/data_store.rs:706-749

estimate_compaction_savings

Calculates potential space savings from compaction by comparing total file size to the size needed for unique entries only.

Returns:

  • Number of bytes that would be saved by compaction

Implementation: Iterates through entries, tracking seen keys, and sums the file size of unique entries.

Sources: src/storage_engine/data_store.rs:605-616

get_path

Returns the file path of the storage.

Sources: src/storage_engine/data_store.rs:265-267

len

Returns the number of unique keys currently stored (excluding tombstones).

Returns:

  • Ok(count): Number of keys in index
  • Err(std::io::Error): Lock acquisition failure

Sources: src/storage_engine/data_store.rs:1164-1171

is_empty

Checks if the storage contains any entries.

Sources: src/storage_engine/data_store.rs:1173-1177

file_size

Returns the total size of the storage file in bytes.

Sources: src/storage_engine/data_store.rs:1179-1181

Internal Methods

The following methods are internal implementation details but are important for understanding the API's behavior.

read_entry_with_context

Core read logic used by all read methods. Performs:

  1. Index lookup via key_indexer_guard.get_packed(&key_hash)
  2. Unpacking of tag and offset via KeyIndexer::unpack()
  3. Tag verification (if non_hashed_key is provided)
  4. Metadata deserialization
  5. Entry bounds calculation (handling pre-pad and tombstones)
  6. Construction of EntryHandle

Tag Verification: If non_hashed_key is provided and tag does not match KeyIndexer::tag_from_key(), returns None and logs a warning about potential hash collision.

Sources: src/storage_engine/data_store.rs:502-565

reindex

Updates the memory map and key index after write operations. This method:

  1. Creates new mmap via init_mmap()
  2. Acquires locks on mmap and key_indexer
  3. Inserts or removes key mappings
  4. Updates atomic tail_offset

Hash Collision Handling: If key_indexer_guard.insert() returns an error (collision detected), the entire operation fails to prevent inconsistent state.

Sources: src/storage_engine/data_store.rs:224-259

prepad_len

Calculates the number of padding bytes needed to align offset to PAYLOAD_ALIGNMENT (64 bytes). Formula:

prepad = (PAYLOAD_ALIGNMENT - (offset % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)

This ensures all non-tombstone payloads start on 64-byte boundaries.
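The formula can be sketched as a standalone helper; the constant value matches the documentation, while the function below is illustrative rather than the crate's internal implementation.

```rust
const PAYLOAD_ALIGNMENT: u64 = 64;

fn prepad_len(offset: u64) -> u64 {
    (PAYLOAD_ALIGNMENT - (offset % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)
}

// prepad_len(0)   == 0   (already aligned)
// prepad_len(64)  == 0   (already aligned)
// prepad_len(65)  == 63  (next boundary is 128)
// prepad_len(100) == 28  (next boundary is 128)
```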

Sources: src/storage_engine/data_store.rs:670-673

API Usage Patterns

Basic Read-Write Pattern
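A minimal sketch, assuming the DataStoreReader and DataStoreWriter traits are in scope and that open() accepts a Path (import paths and exact signatures assumed):

```rust
use std::path::Path;
use simd_r_drive::{DataStore, DataStoreReader, DataStoreWriter}; // paths assumed

fn basic_pattern() -> std::io::Result<()> {
    // Open (or create) the single-file store.
    let store = DataStore::open(Path::new("example.bin"))?;

    // Append a key/value pair; the new tail offset is returned.
    store.write(b"user:1", b"alice")?;

    // Zero-copy read: the EntryHandle borrows directly from the mmap.
    if let Some(entry) = store.read(b"user:1")? {
        assert_eq!(entry.as_slice(), b"alice");
    }
    Ok(())
}
```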

Batch Operations Pattern
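A hedged sketch; the parameter types of batch_write and batch_read are paraphrased from the descriptions earlier on this page (import paths assumed):

```rust
use simd_r_drive::{DataStore, DataStoreReader, DataStoreWriter}; // paths assumed

fn batch_pattern(store: &DataStore) -> std::io::Result<()> {
    let k1: &[u8] = b"config:a";
    let k2: &[u8] = b"config:b";

    // One write lock and one flush for the whole batch.
    store.batch_write(&[(k1, b"1".as_slice()), (k2, b"2".as_slice())])?;

    // One index read lock for all lookups; results preserve input order.
    let results = store.batch_read(&[k1, k2])?;
    assert!(results.iter().all(|entry| entry.is_some()));
    Ok(())
}
```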

Streaming Pattern
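A hedged sketch using an in-memory Cursor as the Read source; whether write_stream takes the reader by value or by mutable reference is not shown on this page, so a mutable reference is used here (import paths assumed):

```rust
use std::io::Cursor;
use simd_r_drive::{DataStore, DataStoreWriter}; // paths assumed

fn streaming_pattern(store: &DataStore) -> std::io::Result<()> {
    // The payload is consumed in WRITE_STREAM_BUFFER_SIZE chunks, so it never
    // has to fit in memory all at once (a Vec is used here only for brevity).
    let mut payload_reader = Cursor::new(vec![0xABu8; 8 * 1024 * 1024]);
    store.write_stream(b"blob:1", &mut payload_reader)?;
    Ok(())
}
```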

Iteration Pattern
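A short sketch of a sequential pass over all live entries (import path assumed):

```rust
use simd_r_drive::DataStore; // path assumed

fn iteration_pattern(store: &DataStore) {
    // Sequential pass: newest version per key, deleted keys skipped.
    let (mut count, mut bytes) = (0usize, 0usize);
    for entry in store.iter_entries() {
        count += 1;
        bytes += entry.as_slice().len();
    }
    println!("{count} live entries, {bytes} payload bytes");
}
```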

Sources: src/lib.rs:20-115 README.md:208-246


Entry Structure and Metadata

Relevant source files

Purpose and Scope

This document details the on-disk binary layout of entries in the SIMD R Drive storage engine. It covers the structure of aligned entries, tombstones, metadata fields, and the alignment strategy that enables zero-copy access.

For information about how entries are read and accessed in memory, see Memory Management and Zero-Copy Access. For details on the validation chain and recovery mechanisms, see Storage Architecture.


On-Disk Entry Layout Overview

Every entry written to the storage file consists of three components:

  1. Pre-Pad Bytes (optional, 0-63 bytes) - Zero bytes inserted to ensure the payload starts at a 64-byte boundary
  2. Payload - Variable-length binary data
  3. Metadata - Fixed 20-byte structure containing key hash, previous offset, and checksum

The exception is tombstones (deletion markers), which use a minimal 1-byte payload with no pre-padding.

Sources: README.md:104-137 simd-r-drive-entry-handle/src/entry_metadata.rs:9-37


Aligned Entry Structure

Entry Layout Table

| Offset Range | Field | Size (Bytes) | Description |
|---|---|---|---|
| P .. P+pad | Pre-Pad (optional) | pad | Zero bytes to align payload start |
| P+pad .. N | Payload | N-(P+pad) | Variable-length data |
| N .. N+8 | Key Hash | 8 | 64-bit XXH3 key hash |
| N+8 .. N+16 | Prev Offset | 8 | Absolute offset of previous tail |
| N+16 .. N+20 | Checksum | 4 | CRC32C of payload |

Where:

  • pad = (A - (prev_tail % A)) & (A - 1), with A = PAYLOAD_ALIGNMENT (64 bytes)
  • The next entry starts at offset N + 20

Aligned Entry Structure Diagram

Sources: README.md:112-137 simd-r-drive-entry-handle/src/entry_metadata.rs:11-23


Tombstone Structure

Tombstones are special deletion markers that do not require payload alignment. They consist of a single zero byte followed by the standard 20-byte metadata structure.

Tombstone Layout Table

| Offset Range | Field | Size (Bytes) | Description |
|---|---|---|---|
| T .. T+1 | Payload | 1 | Single byte 0x00 |
| T+1 .. T+21 | Metadata | 20 | Key hash, prev offset, CRC32C |

Tombstone Structure Diagram

Sources: README.md:126-131 simd-r-drive-entry-handle/src/entry_metadata.rs:25-30


EntryMetadata Structure

The EntryMetadata struct represents the fixed 20-byte metadata block that follows every payload. It is defined in #[repr(C)] layout to ensure consistent binary representation.

graph TB
    subgraph EntryMetadataStruct["EntryMetadata struct"]
field1["key_hash: u64\n8 bytes\nXXH3_64 hash"]
field2["prev_offset: u64\n8 bytes\nbackward chain link"]
field3["checksum: [u8; 4]\n4 bytes\nCRC32C payload checksum"]
end
    
 
   field1 --> field2
 
   field2 --> field3
    
    note4["Serialized at offset N\nfollowing payload"]
note5["Total: METADATA_SIZE = 20"]
field1 -.-> note4
 
   field3 -.-> note5

Metadata Fields

Field Descriptions

key_hash: u64 (8 bytes, offset N .. N+8)

  • 64-bit XXH3 hash of the key
  • Used by KeyIndexer for O(1) lookups
  • Combined with a tag for collision detection
  • Hardware-accelerated via SSE2/AVX2/NEON

prev_offset: u64 (8 bytes, offset N+8 .. N+16)

  • Absolute file offset of the previous entry for this key
  • Forms a backward-linked chain for version history
  • Set to 0 for the first entry of a key
  • Used during chain validation and recovery

checksum: [u8; 4] (4 bytes, offset N+16 .. N+20)

  • CRC32C checksum of the payload
  • Provides fast integrity verification
  • Not cryptographically secure
  • Used during recovery to detect corruption

Serialization and Deserialization

The EntryMetadata struct provides methods for converting to/from bytes:

  • serialize() -> [u8; 20] - Converts metadata to byte array using little-endian encoding
  • deserialize(data: &[u8]) -> Self - Reconstructs metadata from a byte slice
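A hedged round-trip sketch; field visibility, the struct-literal construction, and the import path are assumptions made for illustration:

```rust
use simd_r_drive_entry_handle::EntryMetadata; // import path assumed

fn metadata_round_trip() {
    let meta = EntryMetadata {
        key_hash: 0x1234_5678_9ABC_DEF0,     // XXH3_64 hash of the key
        prev_offset: 0,                      // first entry in this key's chain
        checksum: [0xAA, 0xBB, 0xCC, 0xDD],  // CRC32C of the payload
    };

    let bytes: [u8; 20] = meta.serialize();           // little-endian encoding
    let decoded = EntryMetadata::deserialize(&bytes);  // rebuilds the struct
    assert_eq!(decoded.key_hash, meta.key_hash);
}
```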

Sources: simd-r-drive-entry-handle/src/entry_metadata.rs:44-113 README.md:114-120


Pre-Padding and Alignment Strategy

Alignment Purpose

All non-tombstone payloads start at a 64-byte aligned address. This alignment ensures:

  • Cache-line efficiency - Matches typical CPU cache line size
  • SIMD optimization - Enables full-speed AVX2/AVX-512/NEON operations
  • Zero-copy typed views - Allows safe reinterpretation as typed slices (&[u16], &[u32], etc.)
graph TD
    Start["Calculate padding needed"]
GetPrevTail["prev_tail = last written offset"]
CalcPad["pad = (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT))\n& (PAYLOAD_ALIGNMENT - 1)"]
CheckPad{"pad > 0?"}
WritePad["Write pad zero bytes"]
WritePayload["Write payload at aligned offset"]
Start --> GetPrevTail
 
   GetPrevTail --> CalcPad
 
   CalcPad --> CheckPad
 
   CheckPad -->|Yes| WritePad
 
   CheckPad -->|No| WritePayload
 
   WritePad --> WritePayload

The alignment is configured via PAYLOAD_ALIGNMENT constant (64 bytes as of version 0.15.0).

Pre-Padding Calculation

The formula pad = (A - (prev_tail % A)) & (A - 1) where A = PAYLOAD_ALIGNMENT ensures:

  • If prev_tail is already aligned, pad = 0
  • Otherwise, pad equals the bytes needed to reach the next aligned boundary
  • Maximum padding is A - 1 bytes (63 bytes for 64-byte alignment)

Constants

The alignment is defined in simd-r-drive-entry-handle/src/constants.rs:1-20:

| Constant | Value | Description |
|---|---|---|
| PAYLOAD_ALIGN_LOG2 | 6 | Log₂ of alignment (2⁶ = 64) |
| PAYLOAD_ALIGNMENT | 64 | Actual alignment boundary in bytes |
| METADATA_SIZE | 20 | Fixed size of metadata block |

Sources: README.md:51-59 simd-r-drive-entry-handle/src/entry_metadata.rs:22-23 CHANGELOG.md:25-51


Backward Chain Formation

Chain Structure

Each entry's prev_offset field creates a backward-linked chain that tracks the version history for a given key. This chain is essential for:

  • Recovery and validation on file open
  • Detecting incomplete writes
  • Rebuilding the index

Chain Properties

  • Most recent entry is at the end of the file (highest offset)
  • Chain traversal moves backward from tail toward offset 0
  • First entry for a key has prev_offset = 0
  • Valid chain can be walked all the way back to byte 0 without gaps
  • Broken chain indicates corruption or incomplete write

Usage in Recovery

During file open, the system:

  1. Scans backward from EOF reading metadata
  2. Follows prev_offset links to validate chain continuity
  3. Verifies checksums at each step
  4. Truncates file if corruption is detected
  5. Scans forward to rebuild the index

Sources: README.md:139-147 simd-r-drive-entry-handle/src/entry_metadata.rs:41-43


Entry Type Comparison

Aligned Entry vs. Tombstone

| Aspect | Aligned Entry (Non-Tombstone) | Tombstone (Deletion Marker) |
|---|---|---|
| Pre-padding | 0-63 bytes (alignment dependent) | None |
| Payload size | Variable (user-defined) | Fixed 1 byte (0x00) |
| Payload alignment | 64-byte boundary | No alignment requirement |
| Metadata size | 20 bytes | 20 bytes |
| Total minimum size | 21 bytes (1-byte payload + metadata) | 21 bytes (1-byte + metadata) |
| Total maximum overhead | 83 bytes (63-byte pad + 20 metadata) | 21 bytes |
| Zero-copy capable | Yes (aligned payload) | No (tombstone flag only) |

When Tombstones Are Used

Tombstones mark key deletions while maintaining chain integrity. They:

  • Preserve the backward chain via prev_offset
  • Use minimal space (no alignment overhead)
  • Are detected during reads and filtered out
  • Enable recovery to skip deleted entries

Sources: README.md:112-137 simd-r-drive-entry-handle/src/entry_metadata.rs:9-37


Metadata Serialization Format

Binary Layout in File

Constants for Range Indexing

The simd-r-drive-entry-handle/src/constants.rs:1-20 file defines range constants for metadata field access:

  • KEY_HASH_RANGE = 0..8
  • PREV_OFFSET_RANGE = 8..16
  • CHECKSUM_RANGE = 16..20
  • METADATA_SIZE = 20

These ranges are used in EntryMetadata::serialize() and deserialize() methods.

Sources: simd-r-drive-entry-handle/src/entry_metadata.rs:62-112


Alignment Evolution and Migration

Version History

v0.14.0-alpha and earlier: Used 16-byte alignment (PAYLOAD_ALIGNMENT = 16)

v0.15.0-alpha onwards: Changed to 64-byte alignment (PAYLOAD_ALIGNMENT = 64)

This change was made to:

  • Ensure full cache-line alignment
  • Support AVX-512 and future SIMD extensions
  • Improve zero-copy performance across modern hardware

Migration Considerations

Storage files created with different alignment values are not compatible:

  • v0.14.x readers cannot correctly parse v0.15.x stores
  • v0.15.x readers may misinterpret v0.14.x padding

To migrate between versions:

  1. Read all entries using the old version binary
  2. Write entries to a new store using the new version binary
  3. Replace the old file after verification

In multi-service environments, deploy reader upgrades before writer upgrades to avoid mixed-version issues.

Sources: CHANGELOG.md:25-82 README.md:51-59


Debug Assertions for Alignment

Runtime Validation

The codebase includes debug-only alignment assertions that validate both pointer and offset alignment:

debug_assert_aligned(ptr: *const u8, align: usize) - Validates pointer alignment

  • Active in debug and test builds
  • Zero cost in release/bench builds
  • Ensures buffer base address is properly aligned

debug_assert_aligned_offset(off: u64) - Validates file offset alignment

These assertions help catch alignment issues during development without imposing runtime overhead in production.

Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-88 CHANGELOG.md:33-41


Summary

The SIMD R Drive entry structure uses a carefully designed binary layout that balances efficiency, integrity, and flexibility:

  • Fixed 64-byte alignment ensures cache-friendly, SIMD-optimized access
  • 20-byte fixed metadata provides fast integrity checks and chain traversal
  • Variable pre-padding maintains alignment without complex calculations
  • Minimal tombstones mark deletions efficiently
  • Backward-linked chain enables robust recovery and validation

This design enables zero-copy reads, high write throughput, and automatic crash recovery while maintaining a simple, append-only storage model.


Memory Management and Zero-Copy Access

Relevant source files

This document explains how SIMD R Drive implements zero-copy reads through memory-mapped files, manages payload alignment for optimal performance, and provides safe concurrent access to stored data. It covers the EntryHandle abstraction, the Arc<Mmap> sharing strategy, alignment requirements enforced by PAYLOAD_ALIGNMENT, and utilities for working with aligned binary data.

For details on the on-disk entry format and metadata structure, see Entry Structure and Metadata. For information on how concurrent reads and writes are coordinated, see Concurrency and Thread Safety. For performance characteristics of the alignment strategy, see Payload Alignment and Cache Efficiency.


Memory-Mapped File Architecture

SIMD R Drive uses memory-mapped files (memmap2::Mmap) to enable zero-copy reads. The storage file is mapped into the process's virtual address space, allowing direct access to payload bytes without copying them into separate buffers. This approach provides several benefits:

  • Zero-copy reads : Data is accessed directly from the mapped memory region
  • Larger-than-RAM datasets : The OS handles paging, so only accessed portions consume physical memory
  • Efficient random access : Any offset in the file can be accessed with pointer arithmetic
  • Shared memory : Multiple threads can read from the same mapped region concurrently

The memory-mapped file is wrapped in Arc<Mutex<Arc<Mmap>>> within the DataStore structure:

DataStore.mmap_arc: Arc<Mutex<Arc<Mmap>>>

The outer Arc allows the DataStore itself to be cloned and shared. The Mutex protects remapping operations (which occur after writes). The inner Arc<Mmap> is what gets cloned and handed out to readers, ensuring that even if a remap occurs, existing readers retain a valid view of the previous mapping.

Sources: README.md:43-49 README.md:174-180 src/storage_engine/entry_iterator.rs:22-23


Mmap Lifecycle and Remapping

Diagram: Memory-mapped file lifecycle across write operations

When a write operation extends the file, the following sequence occurs:

  1. The file is extended and flushed to disk
  2. The Mutex<Arc<Mmap>> is locked
  3. A new Mmap is created from the extended file
  4. The old Arc<Mmap> is replaced with a new one
  5. Existing EntryHandle instances continue using their old Arc<Mmap> reference until dropped

This design ensures that readers never see an invalid memory mapping, even as writes extend the file.

Sources: README.md:174-180 src/storage_engine/data_store.rs:46-48 (implied from architecture)


EntryHandle: Zero-Copy Data Access

The EntryHandle struct provides a zero-copy view into a specific payload within the memory-mapped file. It consists of three fields:

EntryHandle {
    mmap_arc: Arc<Mmap>,
    range: Range<usize>,
    metadata: EntryMetadata,
}

Each EntryHandle holds:

  • mmap_arc : A reference-counted pointer to the memory-mapped file
  • range : The byte range (start..end) identifying the payload location within the mapped file
  • metadata : The entry's EntryMetadata (key hash, previous offset, checksum)

EntryHandle Methods

Diagram: EntryHandle methods and their memory semantics

| Method | Return Type | Memory Semantics | Use Case |
|---|---|---|---|
| as_slice() | &[u8] | Zero-copy borrow | Direct byte access |
| into_vec() | Vec<u8> | Owned copy | When ownership needed |
| as_arrow_buffer() | arrow::buffer::Buffer | Zero-copy Buffer | Arrow integration (feature-gated) |
| into_arrow_buffer() | arrow::buffer::Buffer | Consumes EntryHandle | Arrow integration (feature-gated) |
| metadata() | &EntryMetadata | Borrow | Access key hash, checksum |
| get_mmap_arc() | &Arc<Mmap> | Borrow | Low-level mmap access |
| get_range() | &Range<usize> | Borrow | Get payload boundaries |

The as_slice() method is the primary zero-copy interface:
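A minimal sketch, assuming store: &DataStore and the reader trait in scope (import paths assumed):

```rust
use simd_r_drive::{DataStore, DataStoreReader}; // paths assumed

fn read_zero_copy(store: &DataStore) -> std::io::Result<()> {
    if let Some(entry) = store.read(b"sensor:42")? {
        let bytes: &[u8] = entry.as_slice(); // borrows the mmap; no allocation
        assert!(!bytes.is_empty());
    }
    Ok(())
}
```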

It returns a slice directly into the memory-mapped region with no allocation or copying.

Sources: simd-r-drive-entry-handle/src/entry_handle.rs:22-50 (method implementations)


Alignment Requirements

All non-tombstone payloads in SIMD R Drive begin on a fixed 64-byte aligned boundary. This alignment is defined by the PAYLOAD_ALIGNMENT constant:

PAYLOAD_ALIGN_LOG2 = 6
PAYLOAD_ALIGNMENT = 1 << 6 = 64 bytes

The 64-byte alignment matches typical CPU cache line sizes and ensures that SIMD operations (AVX, AVX-512, SVE) can operate at full speed without crossing cache line boundaries.

Sources: simd-r-drive-entry-handle/src/constants.rs:13-18 README.md:51-59


Alignment Calculation

The pre-padding required to align a payload is calculated using:

pad = (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)

Where prev_tail is the file offset immediately after the previous entry's metadata. This formula ensures:

  • Payloads always start at multiples of PAYLOAD_ALIGNMENT
  • Pre-padding ranges from 0 to PAYLOAD_ALIGNMENT - 1 bytes
  • The calculation works for any power-of-two alignment

Diagram: Pre-padding calculation for 64-byte alignment

The storage engine writes zero bytes for the pre-padding region, then writes the payload starting at the aligned boundary.

Sources: README.md:112-124 src/storage_engine/entry_iterator.rs:50-53


Alignment Validation

The crate provides debug-only assertions to verify alignment invariants:

debug_assert_aligned(ptr: *const u8, align: usize)
debug_assert_aligned_offset(offset: u64)

These assertions:

  • Are compiled to no-ops in release builds (zero runtime cost)
  • Execute in debug and test builds to catch alignment violations
  • Check both pointer alignment (debug_assert_aligned) and offset alignment (debug_assert_aligned_offset)

Diagram: Debug-only alignment assertion behavior

graph TB
    subgraph "Alignment Validation"
        DebugMode["Debug/Test Build"]
ReleaseMode["Release Build"]
Call["debug_assert_aligned(ptr, align)"]
CheckPowerOf2["Assert align.is_power_of_two()"]
CheckAligned["Assert (ptr as usize & (align-1)) == 0"]
NoOp["No-op (optimized away)"]
Call --> DebugMode
 
       Call --> ReleaseMode
        
 
       DebugMode --> CheckPowerOf2
 
       CheckPowerOf2 --> CheckAligned
        
 
       ReleaseMode --> NoOp
    end

The Arrow integration feature uses these assertions when creating arrow::buffer::Buffer instances from EntryHandle.

Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-88 simd-r-drive-entry-handle/src/entry_handle.rs:52-85 (Arrow integration)


Typed Slice Access via align_or_copy

The align_or_copy utility function enables efficient conversion of byte slices into typed slices (e.g., &[f32], &[u32]) with automatic fallback when alignment requirements are not met.

align_or_copy<T, const N: usize>(
    bytes: &[u8],
    from_le_bytes: fn([u8; N]) -> T
) -> Cow<'_, [T]>

Sources: src/utils/align_or_copy.rs:1-73


Zero-Copy vs. Fallback Behavior

Diagram: align_or_copy decision flow

The function attempts zero-copy reinterpretation using slice::align_to::<T>():

| Condition | Result | Allocation |
|---|---|---|
| Memory aligned for T AND length is a multiple of size_of::<T>() | Cow::Borrowed | None |
| Misaligned OR length not a multiple | Cow::Owned | Allocates Vec<T> |

On the zero-copy path, the returned Cow::Borrowed aliases the original bytes directly; on the fallback path, each element is decoded with the provided from_le_bytes function into a newly allocated Vec<T> (Cow::Owned).

Sources: src/utils/align_or_copy.rs:44-73


Usage Example

Due to the 64-byte payload alignment, EntryHandle payloads are often well-aligned for SIMD types:
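A hedged sketch; the module path for align_or_copy and the EntryHandle re-export are assumed from the source listing above:

```rust
use std::borrow::Cow;
use simd_r_drive::utils::align_or_copy; // module path assumed
use simd_r_drive::EntryHandle;          // re-export path assumed

fn as_f32s(entry: &EntryHandle) -> Cow<'_, [f32]> {
    // Zero-copy reinterpretation when the payload is aligned for f32 and its
    // length is a multiple of 4 bytes; otherwise decodes into an owned Vec<f32>.
    align_or_copy::<f32, 4>(entry.as_slice(), f32::from_le_bytes)
}
```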

Since PAYLOAD_ALIGNMENT = 64 is a multiple of align_of::<f32>() = 4, align_of::<f64>() = 8, and align_of::<u128>() = 16, zero-copy access is typically possible for these types when the payload length is an exact multiple of their size.

Sources: src/utils/align_or_copy.rs:37-43 README.md:51-59


graph TB
    subgraph "DataStore Structure"
        DS["DataStore"]
MmapMutex["Arc&lt;Mutex&lt;Arc&lt;Mmap&gt;&gt;&gt;"]
end
    
    subgraph "Reader Thread 1"
        R1["read(key1)"]
EH1["EntryHandle\n(Arc&lt;Mmap&gt; clone)"]
Slice1["as_slice() → &[u8]"]
end
    
    subgraph "Reader Thread 2"
        R2["read(key2)"]
EH2["EntryHandle\n(Arc&lt;Mmap&gt; clone)"]
Slice2["as_slice() → &[u8]"]
end
    
    subgraph "Reader Thread N"
        RN["read(keyN)"]
EHN["EntryHandle\n(Arc&lt;Mmap&gt; clone)"]
SliceN["as_slice() → &[u8]"]
end
    
    subgraph "Memory-Mapped File"
        Mmap["memmap2::Mmap\n(shared memory region)"]
end
    
 
   DS --> MmapMutex
    MmapMutex -.Arc::clone.-> R1
    MmapMutex -.Arc::clone.-> R2
    MmapMutex -.Arc::clone.-> RN
    
 
   R1 --> EH1
 
   R2 --> EH2
 
   RN --> EHN
    
 
   EH1 --> Slice1
 
   EH2 --> Slice2
 
   EHN --> SliceN
    
    Slice1 -.references.-> Mmap
    Slice2 -.references.-> Mmap
    SliceN -.references.-> Mmap

Memory Sharing and Concurrency

The Arc<Mmap> design enables efficient memory sharing across threads while maintaining safety:

Diagram: Concurrent zero-copy reads via Arc sharing

Concurrency characteristics:

  • No read locks required : Once an EntryHandle holds an Arc<Mmap>, it can access data without further synchronization
  • Write safety : Writers acquire RwLock<File> to serialize writes, and Mutex<Arc<Mmap>> to safely remap after writes
  • Memory efficiency : All readers share the same physical pages, regardless of thread count
  • Graceful remapping : Old Arc<Mmap> instances remain valid until all references are dropped

This design allows SIMD R Drive to support thousands of concurrent readers with minimal overhead.

Sources: README.md:172-206 src/storage_engine/entry_iterator.rs:22-23


Datasets Larger Than RAM

Memory mapping enables efficient access to storage files that exceed available physical memory:

  • OS-managed paging : The operating system handles paging, loading only accessed regions into physical memory
  • Transparent access : Application code uses the same EntryHandle API regardless of file size
  • Efficient random access : Jumping between distant file offsets does not require explicit seeking
  • Memory pressure handling : Unused pages are evicted by the OS when memory is needed elsewhere

For example, a storage file containing 100 GB of data on a system with 16 GB of RAM can be accessed efficiently. Only the portions actively being read will occupy physical memory, with the OS managing page faults transparently.

This design makes SIMD R Drive suitable for:

  • Large-scale data analytics workloads
  • Multi-gigabyte append-only logs
  • Embedded databases on resource-constrained systems
  • Archive storage with infrequent random access patterns

Sources: README.md:43-49 README.md:147-148


Concurrency and Thread Safety

Relevant source files

Purpose and Scope

This document describes the concurrency model and thread safety mechanisms used by the SIMD R Drive storage engine. It covers the synchronization primitives, locking strategies, and guarantees provided for concurrent access in single-process, multi-threaded environments. For information about the core storage architecture and data structures, see Storage Architecture. For details on the DataStore API methods, see DataStore API.

Sources: README.md:170-207 src/storage_engine/data_store.rs:1-33

Synchronization Architecture

The DataStore structure employs a combination of synchronization primitives to enable safe concurrent access while maintaining high performance for reads and writes.

graph TB
    subgraph DataStore["DataStore Structure"]
File["file\nArc&lt;RwLock&lt;BufWriter&lt;File&gt;&gt;&gt;\nWrite synchronization"]
Mmap["mmap\nArc&lt;Mutex&lt;Arc&lt;Mmap&gt;&gt;&gt;\nMemory-map coordination"]
TailOffset["tail_offset\nAtomicU64\nSequential write ordering"]
KeyIndexer["key_indexer\nArc&lt;RwLock&lt;KeyIndexer&gt;&gt;\nConcurrent index access"]
Path["path\nPathBuf\nFile location"]
end
    
 
   ReadOps["Read Operations"] --> KeyIndexer
 
   ReadOps --> Mmap
    
 
   WriteOps["Write Operations"] --> File
 
   WriteOps --> TailOffset
    WriteOps -.reindex.-> Mmap
    WriteOps -.reindex.-> KeyIndexer
    WriteOps -.reindex.-> TailOffset
    
 
   ParallelIter["par_iter_entries"] --> KeyIndexer
 
   ParallelIter --> Mmap

DataStore Synchronization Primitives

Synchronization Primitives by Field:

| Field | Type | Purpose | Concurrency Model |
|---|---|---|---|
| file | Arc<RwLock<BufWriter<File>>> | Write operations | Exclusive write lock required |
| mmap | Arc<Mutex<Arc<Mmap>>> | Memory-mapped view | Locked during remap operations |
| tail_offset | AtomicU64 | Next write position | Lock-free atomic updates |
| key_indexer | Arc<RwLock<KeyIndexer>> | Hash-to-offset index | Read-write lock for lookups/updates |
| path | PathBuf | Storage file path | Immutable after creation |

Sources: src/storage_engine/data_store.rs:27-33 README.md:172-182

Read Path Concurrency

Read operations in SIMD R Drive are designed to be lock-free and zero-copy, enabling high throughput for concurrent readers.

sequenceDiagram
    participant Thread1 as "Reader Thread 1"
    participant Thread2 as "Reader Thread 2"
    participant KeyIndexer as "key_indexer\nRwLock&lt;KeyIndexer&gt;"
    participant Mmap as "mmap\nMutex&lt;Arc&lt;Mmap&gt;&gt;"
    participant File as "Memory-Mapped File"
    
    Thread1->>KeyIndexer: read_lock()
    Thread2->>KeyIndexer: read_lock()
    Note over Thread1,Thread2: Multiple readers can acquire lock simultaneously
    
    KeyIndexer-->>Thread1: lookup hash → (tag, offset)
    KeyIndexer-->>Thread2: lookup hash → (tag, offset)
    
    Thread1->>Mmap: get_mmap_arc()
    Thread2->>Mmap: get_mmap_arc()
    Note over Mmap: Clone Arc, release Mutex immediately
    
    Mmap-->>Thread1: Arc&lt;Mmap&gt;
    Mmap-->>Thread2: Arc&lt;Mmap&gt;
    
    Thread1->>File: Direct memory access [offset..offset+len]
    Thread2->>File: Direct memory access [offset..offset+len]
    Note over Thread1,Thread2: Zero-copy reads, no locking required

Lock-Free Read Architecture

Read Method Implementation

The read_entry_with_context method (lines 501-565) centralizes the read logic:

  1. Acquire Read Lock: Obtains a read lock on key_indexer via the provided RwLockReadGuard
  2. Index Lookup: Retrieves the packed (tag, offset) value for the key hash
  3. Tag Verification: Validates the tag to detect hash collisions (lines 513-521)
  4. Memory Access: Reads metadata and payload directly from the memory-mapped region
  5. Tombstone Check: Filters out deleted entries (single NULL byte)
  6. Zero-Copy Handle: Returns an EntryHandle referencing the memory-mapped data

Key Performance Characteristics:

  • No Write Lock: Reads never block on the file write lock
  • Concurrent Readers: Multiple threads can read simultaneously via RwLock read lock
  • No Data Copy: The EntryHandle references the memory-mapped region directly
  • Immediate Lock Release: The get_mmap_arc method (lines 658-663) clones the Arc<Mmap> and immediately releases the Mutex

Sources: src/storage_engine/data_store.rs:501-565 src/storage_engine/data_store.rs:658-663 README.md:174-175

Write Path Synchronization

Write operations require exclusive access to the storage file and coordinate with the memory-mapped view and index.

sequenceDiagram
    participant Writer as "Writer Thread"
    participant FileRwLock as "file\nRwLock"
    participant BufWriter as "BufWriter&lt;File&gt;"
    participant TailOffset as "tail_offset\nAtomicU64"
    participant Reindex as "reindex()"
    participant MmapMutex as "mmap\nMutex"
    participant KeyIndexerRwLock as "key_indexer\nRwLock"
    
    Writer->>FileRwLock: write_lock()
    Note over FileRwLock: Exclusive lock acquired\nOnly one writer at a time
    
    Writer->>TailOffset: load(Ordering::Acquire)
    TailOffset-->>Writer: current tail offset
    
    Writer->>BufWriter: write pre-pad + payload + metadata
    Writer->>BufWriter: flush()
    
    Writer->>Reindex: reindex(&file, key_hash_offsets, tail_offset)
    
    Reindex->>BufWriter: init_mmap(&file)
    Note over Reindex: Create new memory-mapped view
    
    Reindex->>MmapMutex: lock()
    Reindex->>MmapMutex: Replace Arc&lt;Mmap&gt;
    Reindex->>MmapMutex: unlock()
    
    Reindex->>KeyIndexerRwLock: write_lock()
    Reindex->>KeyIndexerRwLock: insert(key_hash → offset)
    Reindex->>KeyIndexerRwLock: unlock()
    
    Reindex->>TailOffset: store(new_tail, Ordering::Release)
    
    Reindex-->>Writer: Ok(())
    Writer->>FileRwLock: unlock()

Write Operation Flow

Write Methods and Locking

Single Write (write):

  • Delegates to batch_write_with_key_hashes with a single entry (lines 827-834)

Batch Write (batch_write_with_key_hashes):

  • Acquires write lock on file (lines 852-855)
  • Loads tail_offset atomically (line 858)
  • Writes all entries to buffer (lines 863-922)
  • Writes buffer to file in single operation (line 935)
  • Flushes to disk (line 936)
  • Calls reindex to update mmap and index (lines 938-943)
  • Write lock automatically released on file_guard drop

Streaming Write (write_stream_with_key_hash):

  • Acquires write lock on file (lines 759-762)
  • Streams payload to file via buffered reads (lines 778-790)
  • Flushes after streaming (line 814)
  • Calls reindex to update mmap and index (lines 818-823)

Sources: src/storage_engine/data_store.rs:752-843 src/storage_engine/data_store.rs:847-943

Memory Mapping Coordination

The reindex method coordinates the critical transition from write completion to read visibility by updating the memory-mapped view.

Reindex Operation

The reindex method (lines 224-259) performs four critical operations:

Step-by-Step Process:

  1. Create New Memory Map (line 231):

    • Calls init_mmap(&write_guard) to create fresh Mmap from the file
    • File must be flushed before this point to ensure visibility
  2. Update Shared Mmap (lines 232, 255):

    • Acquires Mutex on self.mmap
    • Replaces the Arc<Mmap> with the new mapping
    • Existing readers retain old Arc<Mmap> references until they drop
  3. Update Index (lines 233-253):

    • Acquires write lock on self.key_indexer
    • Inserts new (key_hash → offset) mappings
    • Removes tombstone entries if present
  4. Update Tail Offset (line 256):

    • Stores new tail offset with Ordering::Release
    • Ensures all prior writes are visible before offset update

Critical Ordering: The write lock on file is held throughout the entire reindex operation, preventing concurrent writes from starting until the index and mmap are fully updated.

Sources: src/storage_engine/data_store.rs:224-259 src/storage_engine/data_store.rs:172-174

Index Concurrency Model

The key_indexer uses RwLock<KeyIndexer> to enable concurrent read access while providing exclusive write access during updates.

Index Access Patterns

| Operation | Lock Type | Held During | Purpose |
|---|---|---|---|
| read | RwLock::read() | Index lookup only (lines 507-508) | Multiple threads can look up concurrently |
| batch_read | RwLock::read() | All lookups in batch | Amortizes lock acquisition overhead |
| par_iter_entries | RwLock::read() | Collecting offsets only (lines 300-302) | Minimizes lock hold time, drops before parallel iteration |
| write / batch_write | RwLock::write() | Index updates in reindex (lines 233-236) | Exclusive access for inserting new entries |
| delete (tombstone) | RwLock::write() | Removing from index (line 243) | Exclusive access for deletion |

Parallel Iteration Strategy

The par_iter_entries method (lines 296-361, feature-gated by parallel) demonstrates efficient lock usage:
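A hedged usage sketch (requires the parallel feature; rayon's ParallelIterator trait must be in scope, and the import path for DataStore is assumed):

```rust
use rayon::iter::ParallelIterator;
use simd_r_drive::DataStore; // path assumed

fn parallel_total_bytes(store: &DataStore) -> usize {
    // Offsets are snapshotted under a brief index read lock, then each entry
    // is materialized and processed across the rayon thread pool.
    store
        .par_iter_entries()
        .map(|entry| entry.as_slice().len())
        .sum()
}
```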

This approach:

  • Minimizes the duration the read lock is held
  • Avoids holding locks during expensive parallel processing
  • Allows concurrent writes to proceed while parallel iteration is in progress

Sources: src/storage_engine/data_store.rs:296-361

Atomic Tail Offset

The tail_offset field uses AtomicU64 with acquire-release ordering to coordinate write sequencing without locks.

Atomic Operations Pattern

Load (Acquire):

  • Used at start of write operations (lines 763, 858)
  • Ensures all prior writes are visible before loading the offset
  • Provides the base offset for the new write

Store (Release):

  • Used at end of reindex after all updates (line 256)
  • Ensures all writes (file, mmap, index) are visible before storing
  • Prevents reordering of writes after the offset update
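The load/store pairing described above can be sketched in isolation; the offset arithmetic here is illustrative only:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

fn append_record(tail_offset: &AtomicU64, entry_len: u64) {
    // Writer: observe the current tail before appending (Acquire pairs with
    // the Release store performed by the previous writer).
    let base = tail_offset.load(Ordering::Acquire);

    // ... write pre-pad + payload + metadata at `base`, flush, remap, reindex ...

    // Publish the new tail only after all associated data is visible.
    tail_offset.store(base + entry_len, Ordering::Release);
}
```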

Sequential Consistency: Even though multiple threads may acquire the write lock sequentially, the atomic tail offset ensures that:

  1. Each write observes the correct starting position
  2. The tail offset update happens after all associated data is committed
  3. Readers see a consistent view of the storage

Sources: src/storage_engine/data_store.rs30 src/storage_engine/data_store.rs256 src/storage_engine/data_store.rs763 src/storage_engine/data_store.rs858

Multi-Process Considerations

SIMD R Drive's concurrency mechanisms are designed for single-process, multi-threaded environments. Multi-process access requires external coordination.

Thread Safety Matrix

| Environment | Reads | Writes | Index Updates | Storage Safety |
|---|---|---|---|---|
| Single Process, Single Thread | ✅ Safe | ✅ Safe | ✅ Safe | ✅ Safe |
| Single Process, Multi-Threaded | ✅ Safe (lock-free, zero-copy) | ✅ Safe (RwLock<File>) | ✅ Safe (RwLock<KeyIndexer>) | ✅ Safe (Mutex<Arc<Mmap>>) |
| Multiple Processes, Shared File | ⚠️ Unsafe (no cross-process coordination) | ❌ Unsafe (no external locking) | ❌ Unsafe (separate memory spaces) | ❌ Unsafe (race conditions) |

Why Multi-Process is Unsafe

Separate Memory Spaces:

  • Each process has its own Arc<Mmap> instance
  • Index changes in one process are not visible to others
  • The tail_offset atomic is process-local

No Cross-Process Synchronization:

  • RwLock and Mutex only synchronize within a single process
  • File writes from multiple processes can interleave arbitrarily
  • Memory mapping updates are not coordinated

Risk of Corruption:

  • Two processes can both acquire file write locks independently
  • Both may attempt to write at the same tail_offset
  • The storage file's append-only chain can be broken

External Locking Requirements

For multi-process access, applications must implement external file locking:

  • Advisory Locks: Use flock (Unix) or LockFile (Windows)
  • Mandatory Locks: File system-level exclusive access
  • Distributed Locks: For networked file systems

The core library intentionally does not provide cross-process locking to avoid platform-specific dependencies and maintain flexibility.

Sources: README.md:185-206 src/storage_engine/data_store.rs:27-33

Concurrency Testing

The test suite validates concurrent access patterns through multi-threaded integration tests.

Test Coverage

Concurrent Write Test (concurrent_write_test):

  • Spawns 16 threads writing 10 entries each (lines 111-161)
  • Uses Arc<DataStore> to share storage instance
  • Verifies all 160 entries are correctly written and retrievable
  • Demonstrates write lock serialization

Interleaved Read-Write Test (interleaved_read_write_test):

  • Two threads alternating reads and writes on same key (lines 163-229)
  • Uses tokio::sync::Notify for coordination
  • Validates that writes are immediately visible to readers
  • Confirms index updates are atomic

Concurrent Streamed Write Test (concurrent_slow_streamed_write_test):

  • Multiple threads streaming 1MB payloads concurrently (lines 14-109)
  • Simulates slow readers with artificial delays
  • Verifies write lock prevents interleaved streaming
  • Validates payload integrity after concurrent writes

Sources: tests/concurrency_tests.rs:14-229

graph TD
 
   A["1. file: RwLock (Write)"] --> B["2. mmap: Mutex"]
B --> C["3. key_indexer: RwLock (Write)"]
C --> D["4. tail_offset: AtomicU64"]
style A fill:#f9f9f9
    style B fill:#f9f9f9
    style C fill:#f9f9f9
    style D fill:#f9f9f9

Locking Order and Deadlock Prevention

To prevent deadlocks, SIMD R Drive follows a consistent lock acquisition order:

Lock Hierarchy

Acquisition Order:

  1. File Write Lock: Acquired first in all write operations
  2. Mmap Mutex: Locked during reindex to update memory mapping
  3. Key Indexer Write Lock: Locked during reindex to update index
  4. Tail Offset Atomic: Updated last with Release ordering

Lock Release: Locks are released in reverse order automatically through RAII (Rust's drop semantics).

Deadlock Prevention: This fixed order ensures no circular wait conditions. Read operations only acquire the key indexer read lock, which cannot deadlock with other readers.

Sources: src/storage_engine/data_store.rs:224-259 src/storage_engine/data_store.rs:752-825 src/storage_engine/data_store.rs:847-943

Performance Characteristics

Concurrent Read Performance

  • Zero-Copy: No memory allocation or copying for payload access
  • Lock-Free After Index Lookup: Once Arc<Mmap> is cloned, no locks held
  • Scalable: Read throughput increases linearly with thread count
  • Cache-Friendly: 64-byte alignment optimizes cache line access

Write Serialization

  • Sequential Writes: All writes are serialized by the RwLock<File>
  • Minimal Lock Hold Time: Lock held only during actual I/O and index update
  • Batch Amortization: batch_write amortizes lock overhead across multiple entries
  • No Write Starvation: The fair RwLock implementation prevents readers from starving writers

Benchmark Results

From benches/storage_benchmark.rs typical performance on a single-threaded benchmark:

  • Write Throughput: ~1M entries (8 bytes each) written per batch operation
  • Sequential Read: Iterates all entries via zero-copy handles
  • Random Read: ~1M random lookups complete in under 1 second
  • Batch Read: Vectorized lookups provide additional speedup

For multi-threaded workloads, the concurrency tests demonstrate that read throughput scales with core count while write throughput is limited by the sequential append-only design.

Sources: benches/storage_benchmark.rs:1-234 tests/concurrency_tests.rs:1-230


Key Indexing and Hashing

Relevant source files

Purpose and Scope

This page documents the key indexing system and hashing mechanisms used in SIMD R Drive's storage engine. It covers the KeyIndexer data structure, the XXH3 hashing algorithm, tag-based collision detection, and hardware acceleration features.

For information about how the index is accessed in concurrent operations, see Concurrency and Thread Safety. For details on how metadata is stored alongside payloads, see Entry Structure and Metadata.


Overview

The SIMD R Drive storage engine maintains an in-memory index that maps key hashes to file offsets, enabling O(1) lookup performance for stored entries. This index is critical for avoiding full file scans when retrieving data.

The indexing system consists of three main components:

  1. KeyIndexer : A concurrent hash map that stores packed values containing both a collision-detection tag and a file offset
  2. XXH3_64 hashing : A fast, hardware-accelerated hashing algorithm that generates 64-bit hashes from arbitrary keys
  3. Tag-based verification : A secondary collision detection mechanism that validates lookups to prevent hash collision errors

Sources: src/storage_engine/data_store.rs:1-33 README.md:158-168


KeyIndexer Structure

The KeyIndexer is the core data structure managing the key-to-offset mapping. It is wrapped in thread-safe containers to support concurrent access.

graph TB
    subgraph "DataStore"
        KeyIndexerField["key_indexer\nArc&lt;RwLock&lt;KeyIndexer&gt;&gt;"]
end
    
    subgraph "KeyIndexer Internals"
        HashMap["HashMap&lt;u64, u64&gt;\nwith Xxh3BuildHasher"]
PackedValue["Packed u64 Value\n(16-bit tag / 48-bit offset)"]
end
    
    subgraph "Operations"
        Insert["insert(key_hash, offset)\n→ Result&lt;()&gt;"]
Lookup["get_packed(&key_hash)\n→ Option&lt;u64&gt;"]
Unpack["unpack(packed)\n→ (tag, offset)"]
TagFromKey["tag_from_key(key)\n→ u16"]
Build["build(mmap, tail_offset)\n→ KeyIndexer"]
end
    
 
   KeyIndexerField --> HashMap
 
   HashMap --> PackedValue
    
 
   Insert -.-> HashMap
 
   Lookup -.-> HashMap
 
   Unpack -.-> PackedValue
 
   TagFromKey -.-> PackedValue
 
   Build --> KeyIndexerField

Packed Value Format

The KeyIndexer stores a compact 64-bit packed value for each hash. This value encodes two pieces of information:

| Bits | Field | Description |
|---|---|---|
| 63-48 | Tag (16-bit) | Collision detection tag derived from key bytes |
| 47-0 | Offset (48-bit) | Absolute file offset to entry metadata |

This packing allows the index to store both the offset and a verification tag without requiring additional memory allocations.
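An illustrative sketch of the packing scheme; only the 16-bit/48-bit split is taken from the table above, while the mask constant and function bodies are assumptions:

```rust
const OFFSET_MASK: u64 = (1 << 48) - 1;

// Pack a 16-bit collision tag into the high bits and a 48-bit offset into the low bits.
fn pack(tag: u16, offset: u64) -> u64 {
    debug_assert!(offset <= OFFSET_MASK);
    ((tag as u64) << 48) | (offset & OFFSET_MASK)
}

// Recover the (tag, offset) pair from a packed index value.
fn unpack(packed: u64) -> (u16, u64) {
    ((packed >> 48) as u16, packed & OFFSET_MASK)
}
```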

Sources: src/storage_engine/data_store.rs31 src/storage_engine/data_store.rs:509-511


Hashing Algorithm: XXH3_64

SIMD R Drive uses the XXH3_64 hashing algorithm from the xxhash-rust crate. XXH3 is optimized for speed and provides hardware acceleration on supported platforms.

Hardware Acceleration

The XXH3_64 implementation automatically detects and utilizes CPU-specific SIMD instructions:

graph LR
    KeyBytes["Key Bytes\n&[u8]"]
subgraph "compute_hash(key)"
        XXH3["xxhash-rust\nXXH3_64"]
end
    
    subgraph "Hardware Detection"
        X86Check["x86_64 CPU?"]
ARMCheck["aarch64 CPU?"]
end
    
    subgraph "SIMD Paths"
        SSE2["SSE2 Instructions\n(baseline x86_64)"]
AVX2["AVX2 Instructions\n(if available)"]
NEON["NEON Instructions\n(aarch64 default)"]
end
    
 
   KeyBytes --> XXH3
 
   XXH3 --> X86Check
 
   XXH3 --> ARMCheck
    
 
   X86Check -->|Yes| SSE2
 
   X86Check -->|Yes + feature| AVX2
 
   ARMCheck -->|Yes| NEON
    
 
   SSE2 --> Hash64["u64 Hash"]
AVX2 --> Hash64
 
   NEON --> Hash64

| Platform | Baseline Instructions | Optional Extensions | Performance Boost |
|----------|-----------------------|---------------------|-------------------|
| x86_64   | SSE2 (always enabled) | AVX2                | Significant       |
| aarch64  | NEON (default)        | None required       | Built-in          |
| Fallback | Scalar operations     | None                | Baseline          |

Sources: README.md:160-165 Cargo.lock:1814-1820 src/storage_engine/data_store.rs:2-4

Hash Computation Functions

The digest module provides batch and single-key hashing functions:

| Function           | Signature                      | Use Case                  |
|--------------------|--------------------------------|---------------------------|
| compute_hash       | fn(key: &[u8]) -> u64          | Single key hashing        |
| compute_hash_batch | fn(keys: &[&[u8]]) -> Vec<u64> | Parallel batch hashing    |
| compute_checksum   | fn(payload: &[u8]) -> [u8; 4]  | CRC32C payload validation |

Sources: src/storage_engine/data_store.rs:2-4 src/storage_engine/data_store.rs754 src/storage_engine/data_store.rs:839-842
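
A minimal sketch of single-key and batch hashing on top of the xxhash-rust crate (with its `xxh3` feature enabled). The function names follow the table above; the bodies are illustrative rather than the project's actual digest module:

```rust
// Sketch of single-key and batch hashing using xxhash-rust's XXH3_64.
use xxhash_rust::xxh3::xxh3_64;

fn compute_hash(key: &[u8]) -> u64 {
    xxh3_64(key)
}

fn compute_hash_batch(keys: &[&[u8]]) -> Vec<u64> {
    keys.iter().map(|k| xxh3_64(k)).collect()
}

fn main() {
    let single = compute_hash(b"user:42");
    let keys: Vec<&[u8]> = vec![b"alpha".as_slice(), b"beta".as_slice()];
    let batch = compute_hash_batch(&keys);
    println!("{single:#x} {batch:?}");
}
```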


Tag-Based Collision Detection

While XXH3_64 produces high-quality 64-bit hashes, the system implements an additional layer of collision detection using 16-bit tags derived from the original key bytes.

sequenceDiagram
    participant Client
    participant DataStore
    participant KeyIndexer
    participant Mmap
    
    Note over Client,Mmap: Write Operation
    Client->>DataStore: write(key, payload)
    DataStore->>DataStore: key_hash = compute_hash(key)
    DataStore->>KeyIndexer: tag = tag_from_key(key)
    DataStore->>Mmap: write payload + metadata
    DataStore->>KeyIndexer: insert(key_hash, pack(tag, offset))
    
    Note over Client,Mmap: Read Operation with Tag Verification
    Client->>DataStore: read(key)
    DataStore->>DataStore: key_hash = compute_hash(key)
    DataStore->>KeyIndexer: packed = get_packed(key_hash)
    KeyIndexer-->>DataStore: Some(packed_value)
    DataStore->>DataStore: (stored_tag, offset) = unpack(packed)
    DataStore->>DataStore: expected_tag = tag_from_key(key)
    
    alt Tag Mismatch
        DataStore->>DataStore: warn!("Tag mismatch")
        DataStore-->>Client: None (collision detected)
    else Tag Match
        DataStore->>Mmap: read entry at offset
        Mmap-->>DataStore: EntryHandle
        DataStore-->>Client: Some(EntryHandle)
    end

How Tags Work

Tag Verification Process

When reading an entry, the system performs the following verification:

  1. Compute the 64-bit hash of the requested key
  2. Look up the hash in the KeyIndexer to get the packed value
  3. Unpack the stored tag and offset from the packed value
  4. Compute the expected tag from the original key bytes
  5. Compare the stored tag with the expected tag
  6. If tags match, proceed with reading; if not, return None (collision detected)

This two-level verification (hash + tag) dramatically reduces the probability of false positives from hash collisions.

Sources: src/storage_engine/data_store.rs:502-522 src/storage_engine/data_store.rs:513-521
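
A conceptual sketch of the six verification steps above, with a plain HashMap standing in for KeyIndexer. The tag derivation shown here is illustrative; the real derivation lives in src/key_indexer.rs:

```rust
// Sketch of read-time tag verification over a hash -> packed-value map.
use std::collections::HashMap;
use xxhash_rust::xxh3::{xxh3_64, xxh3_64_with_seed};

const OFFSET_BITS: u32 = 48;
const OFFSET_MASK: u64 = (1u64 << OFFSET_BITS) - 1;

fn tag_from_key(key: &[u8]) -> u16 {
    // Illustrative: derive 16 bits from a seeded second hash so the tag is
    // not simply a slice of the primary 64-bit hash.
    (xxh3_64_with_seed(key, 0x5EED) & 0xFFFF) as u16
}

fn lookup(index: &HashMap<u64, u64>, key: &[u8]) -> Option<u64> {
    let key_hash = xxh3_64(key); // step 1: hash the requested key
    let packed = *index.get(&key_hash)?; // step 2: look up the packed value
    let (stored_tag, offset) = ((packed >> OFFSET_BITS) as u16, packed & OFFSET_MASK); // step 3
    let expected_tag = tag_from_key(key); // step 4: recompute the tag
    if stored_tag != expected_tag {
        // steps 5-6: tags differ, so a different key produced this hash
        return None;
    }
    Some(offset)
}

fn main() {
    let mut index = HashMap::new();
    let key = b"user:42".as_slice();
    index.insert(xxh3_64(key), ((tag_from_key(key) as u64) << OFFSET_BITS) | 1024);
    assert_eq!(lookup(&index, key), Some(1024));
}
```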

Collision Detection in Batch Operations

For batch read operations, the system can perform the same tag verification when the original keys are provided alongside their hashes.

Sources: src/storage_engine/data_store.rs:1105-1158


Index Building and Maintenance

Index Construction on Open

When a DataStore is opened, the KeyIndexer is constructed by scanning the validated storage file forward from offset 0 to the tail offset:

graph TB
    OpenFile["DataStore::open(path)"]
RecoverChain["recover_valid_chain()"]
BuildIndex["KeyIndexer::build(mmap, final_len)"]
subgraph "Index Build Process"
        ScanForward["Scan from 0 to tail_offset"]
ReadMetadata["Read EntryMetadata"]
ExtractHash["Extract key_hash"]
ComputeTag["Compute tag from prev entries"]
PackValue["pack(tag, metadata_offset)"]
InsertMap["Insert into HashMap"]
end
    
 
   OpenFile --> RecoverChain
 
   RecoverChain --> BuildIndex
 
   BuildIndex --> ScanForward
 
   ScanForward --> ReadMetadata
 
   ReadMetadata --> ExtractHash
 
   ExtractHash --> ComputeTag
 
   ComputeTag --> PackValue
 
   PackValue --> InsertMap
 
   InsertMap --> ScanForward
    
 
   InsertMap --> Initialized["KeyIndexer Ready"]

The index is built only once during the open() operation. Subsequent writes update the index incrementally.

Sources: src/storage_engine/data_store.rs:84-116 src/storage_engine/data_store.rs:106-108

graph LR
    WriteOp["Write Operation\n(single or batch)"]
FlushFile["Flush to disk"]
Reindex["reindex()"]
subgraph "Reindex Steps"
        RemapFile["Create new mmap"]
LockIndex["Acquire RwLock write"]
UpdateEntries["Update key_hash → offset"]
HandleDeletes["Remove deleted keys"]
UpdateTail["Store tail_offset (Atomic)"]
end
    
 
   WriteOp --> FlushFile
 
   FlushFile --> Reindex
 
   Reindex --> RemapFile
 
   RemapFile --> LockIndex
 
   LockIndex --> UpdateEntries
 
   UpdateEntries --> HandleDeletes
 
   HandleDeletes --> UpdateTail

Index Updates During Writes

After each write operation, the reindex() method updates the in-memory index with new key mappings (the reindex steps are shown in the diagram above).

Critical : The file must be flushed before remapping to ensure newly written data is visible in the new memory-mapped view.

Sources: src/storage_engine/data_store.rs:224-259 src/storage_engine/data_store.rs:814-824


Concurrent Access and Locking

The KeyIndexer is protected by an Arc<RwLock<KeyIndexer>> wrapper, enabling multiple concurrent readers while ensuring exclusive access for writers.

| Operation     | Lock Type  | Concurrency                           |
|---------------|------------|---------------------------------------|
| read()        | Read lock  | Multiple threads can read in parallel |
| batch_read()  | Read lock  | Multiple threads can read in parallel |
| write()       | Write lock | Single writer, blocks all readers     |
| batch_write() | Write lock | Single writer, blocks all readers     |
| delete()      | Write lock | Single writer, blocks all readers     |

Sources: src/storage_engine/data_store.rs31 src/storage_engine/data_store.rs:1042-1049 src/storage_engine/data_store.rs:852-856
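
A sketch of the Arc<RwLock<...>> pattern described above, with a plain HashMap standing in for KeyIndexer:

```rust
// Readers share the lock; writers take it exclusively.
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

fn main() {
    let index: Arc<RwLock<HashMap<u64, u64>>> = Arc::new(RwLock::new(HashMap::new()));

    // Writer path: exclusive lock while inserting a key_hash -> packed mapping.
    {
        let mut guard = index.write().expect("lock poisoned");
        guard.insert(0xDEAD_BEEF, 42);
    }

    // Reader path: any number of threads may hold the read lock concurrently.
    let reader = Arc::clone(&index);
    let handle = std::thread::spawn(move || {
        let guard = reader.read().expect("lock poisoned");
        guard.get(&0xDEAD_BEEF).copied()
    });
    assert_eq!(handle.join().unwrap(), Some(42));
}
```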


Performance Characteristics

Lookup Performance

The indexing system provides O(1) average-case lookup performance. Benchmarks demonstrate:

  • 1 million random seeks (8-byte entries): typically < 1 second
  • Hash computation overhead : negligible due to hardware acceleration
  • Tag verification overhead : minimal (single comparison)

Sources: README.md:166-167

Memory Overhead

Each index entry consumes:

  • 8 bytes for the key hash (map key)
  • 8 bytes for the packed value (map value)
  • Total : 16 bytes per unique key in memory

For a dataset with 1 million unique keys, the index occupies approximately 16 MB of RAM.

graph LR
    subgraph "Scalar Baseline"
        ScalarOps["Byte-by-byte\nprocessing"]
ScalarTime["~100% time"]
end
    
    subgraph "SSE2 (x86_64)"
        SSE2Ops["16-byte chunks\nparallel"]
SSE2Time["~40-60% time"]
end
    
    subgraph "AVX2 (x86_64)"
        AVX2Ops["32-byte chunks\nparallel"]
AVX2Time["~30-50% time"]
end
    
    subgraph "NEON (aarch64)"
        NEONOps["16-byte chunks\nparallel"]
NEONTime["~40-60% time"]
end
    
 
   ScalarOps --> ScalarTime
 
   SSE2Ops --> SSE2Time
 
   AVX2Ops --> AVX2Time
 
   NEONOps --> NEONTime

Hardware Acceleration Impact

The use of SIMD instructions in XXH3_64 significantly improves hash computation speed; the relative timings are sketched in the comparison above.

Performance gains vary by workload but are most significant for batch operations that process many keys.

Sources: README.md:160-165 src/storage_engine/data_store.rs:839-842


Error Handling and Collision Management

Hash Collision Detection

The dual-layer verification (64-bit hash + 16-bit tag) provides strong collision resistance:

  • Probability of hash collision : ~1 in 2^64 for XXH3_64
  • Probability of tag collision given hash collision : ~1 in 2^16
  • Combined probability : ~1 in 2^80

Write-Time Collision Handling

If a collision is detected during a write operation (same hash but different tag), the batch write operation fails entirely.

This prevents inconsistent index states and ensures data integrity.

Sources: src/storage_engine/data_store.rs:245-251

Read-Time Collision Handling

If a tag mismatch is detected during a read, the system logs a warning and returns None, treating the lookup as a miss.

Sources: src/storage_engine/data_store.rs:513-521


Summary

The key indexing and hashing system in SIMD R Drive provides:

  1. Fast lookups : O(1) hash-based access to entries
  2. Hardware acceleration : Automatic SIMD optimization on SSE2, AVX2, and NEON platforms
  3. Collision resistance : Dual-layer verification with 64-bit hashes and 16-bit tags
  4. Thread safety : Concurrent reads with exclusive writes via RwLock
  5. Low memory overhead : 16 bytes per unique key

This design enables efficient storage operations even for datasets with millions of entries, while maintaining data integrity through robust collision detection.

Sources: src/storage_engine/data_store.rs:1-1183 README.md:158-168


Network Layer and RPC

Relevant source files

Purpose and Scope

The network layer provides remote access to the SIMD R Drive storage engine over WebSocket connections using the Muxio RPC framework. This document covers the core RPC infrastructure, service definitions, and native Rust client implementation.

For WebSocket server deployment and configuration details, see WebSocket Server. For Muxio RPC protocol internals, see Muxio RPC Framework. For native Rust client usage, see Native Rust Client. For Python client integration, see Python Integration.

Architecture Overview

The network layer follows a symmetric client-server architecture built on three core crates that work together to enable remote DataStore access.

graph TB
    subgraph "Client Applications"
        RustApp["Rust Application"]
PyApp["Python Application\n(see section 4)"]
end
    
    subgraph "simd-r-drive-ws-client"
        WSClient["DataStoreWsClient"]
ClientCaller["RPC Caller\nmuxio-rpc-service-caller"]
ClientRuntime["muxio-tokio-rpc-client"]
end
    
    subgraph "Network Transport"
        WS["WebSocket\ntokio-tungstenite\nBinary Frames"]
end
    
    subgraph "simd-r-drive-ws-server"
        WSServer["Axum Server\nHTTP/WS Upgrade"]
ServerRuntime["muxio-tokio-rpc-server"]
ServerEndpoint["RPC Endpoint\nmuxio-rpc-service-endpoint"]
end
    
    subgraph "simd-r-drive-muxio-service-definition"
        ServiceDef["DataStoreService\nRPC Interface"]
Bitcode["bitcode serialization"]
end
    
    subgraph "Core Storage"
        DataStore["DataStore"]
end
    
 
   RustApp --> WSClient
 
   PyApp -.-> WSClient
 
   WSClient --> ClientCaller
 
   ClientCaller --> ClientRuntime
 
   ClientRuntime --> ServiceDef
 
   ClientRuntime --> WS
    
 
   WS --> WSServer
 
   WSServer --> ServerRuntime
 
   ServerRuntime --> ServiceDef
 
   ServerRuntime --> ServerEndpoint
 
   ServerEndpoint --> DataStore
    
 
   ServiceDef --> Bitcode

Component Architecture

Sources: experiments/simd-r-drive-ws-server/Cargo.toml:1-23 experiments/simd-r-drive-ws-client/Cargo.toml:1-22 experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-17

| Crate                                  | Role                                           | Key Dependencies                                  |
|----------------------------------------|------------------------------------------------|---------------------------------------------------|
| simd-r-drive-muxio-service-definition  | Shared RPC contract defining service interface | bitcode, muxio-rpc-service                        |
| simd-r-drive-ws-server                 | WebSocket server hosting DataStore             | muxio-tokio-rpc-server, axum, tokio               |
| simd-r-drive-ws-client                 | Native Rust client for remote access           | muxio-tokio-rpc-client, muxio-rpc-service-caller  |

Sources: experiments/simd-r-drive-ws-server/Cargo.toml:13-22 experiments/simd-r-drive-ws-client/Cargo.toml:14-21 experiments/simd-r-drive-muxio-service-definition/Cargo.toml:13-16

Service Definition and Contract

The simd-r-drive-muxio-service-definition crate defines the RPC interface as a shared contract between clients and servers. This crate must be used by both sides to ensure protocol compatibility.

DataStoreService Interface

The service definition declares available RPC methods that map to DataStore operations. The Muxio framework generates type-safe client and server stubs from this definition.

Typical Service Methods:

  • read(key) - Retrieve payload by key
  • write(key, payload) - Store payload
  • delete(key) - Remove entry
  • exists(key) - Check key existence
  • batch_read(keys) - Bulk retrieval
  • batch_write(entries) - Bulk insertion
  • stream_* - Streaming operations

Sources: experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-17

Bitcode Serialization

All RPC messages use bitcode for binary serialization, providing compact representation and fast encoding/decoding.

Advantages:

  • Zero-copy deserialization where possible
  • Smaller message sizes than JSON or bincode
  • Strong type safety via Rust's type system
  • Efficient handling of large payloads

Sources: Cargo.lock:392-413 experiments/simd-r-drive-muxio-service-definition/Cargo.toml14
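
A minimal sketch of round-tripping a hypothetical request type with bitcode (requires the crate's `derive` feature). The struct is illustrative, not an actual message type from the service definition:

```rust
// Encode a request to a compact binary buffer and decode it back.
use bitcode::{Decode, Encode};

#[derive(Encode, Decode, Debug, PartialEq)]
struct WriteRequest {
    key: Vec<u8>,
    payload: Vec<u8>,
}

fn main() {
    let request = WriteRequest {
        key: b"user:42".to_vec(),
        payload: vec![1, 2, 3],
    };
    let bytes: Vec<u8> = bitcode::encode(&request);
    let decoded: WriteRequest = bitcode::decode(&bytes).expect("decode failed");
    assert_eq!(decoded, request);
}
```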

Communication Protocol

Request-Response Flow

Sources: experiments/simd-r-drive-ws-client/Cargo.toml:14-21 experiments/simd-r-drive-ws-server/Cargo.toml:13-22

WebSocket Transport Layer

The network layer uses tokio-tungstenite for WebSocket communication over a Tokio async runtime.

Connection Properties:

  • Binary WebSocket frames (not text)
  • Persistent bidirectional connection
  • Automatic ping/pong for keepalive
  • Multiplexed requests via Muxio protocol

Protocol Stack:

┌─────────────────────────────────┐
│  DataStore Operations           │
├─────────────────────────────────┤
│  Muxio RPC (Service Methods)    │
├─────────────────────────────────┤
│  Bitcode Serialization           │
├─────────────────────────────────┤
│  WebSocket Binary Frames         │
├─────────────────────────────────┤
│  HTTP/1.1 Upgrade               │
├─────────────────────────────────┤
│  TCP (tokio runtime)            │
└─────────────────────────────────┘

Sources: Cargo.lock:305-339 (axum), Cargo.lock:1302-1317 (muxio-tokio-rpc-client), Cargo.lock:1320-1335 (muxio-tokio-rpc-server)

Muxio RPC Framework

The Muxio framework provides the core RPC abstraction layer, handling request routing, response matching, and connection multiplexing.

graph LR
    subgraph "Client Side"
        SC["muxio-rpc-service-caller\nMethod Invocation"]
TC["muxio-tokio-rpc-client\nTransport Handler"]
end
    
    subgraph "Shared"
        MRS["muxio-rpc-service\nCore Abstractions"]
SD["Service Definition"]
end
    
    subgraph "Server Side"
        TS["muxio-tokio-rpc-server\nTransport Handler"]
SE["muxio-rpc-service-endpoint\nMethod Routing"]
end
    
 
   SC --> MRS
 
   SC --> TC
 
   TC --> SD
    
 
   TS --> MRS
 
   TS --> SE
 
   SE --> SD
    
 
   MRS --> SD

Framework Components

Sources: Cargo.lock:1250-1271 (muxio), Cargo.lock:1261-1271 (muxio-rpc-service), Cargo.lock:1274-1284 (muxio-rpc-service-caller), Cargo.lock:1287-1299 (muxio-rpc-service-endpoint)

| Component                  | Purpose                    | Key Features                                       |
|----------------------------|----------------------------|----------------------------------------------------|
| muxio                      | Core multiplexing protocol | Request/response correlation, concurrent requests  |
| muxio-rpc-service          | RPC service abstractions   | Service trait definitions, async method dispatch   |
| muxio-rpc-service-caller   | Client-side invocation     | Type-safe method calls, future-based API           |
| muxio-rpc-service-endpoint | Server-side routing        | Request demultiplexing, method dispatch            |
| muxio-tokio-rpc-client     | Tokio client runtime       | WebSocket connection management                    |
| muxio-tokio-rpc-server     | Tokio server runtime       | Axum integration, connection handling              |

Sources: Cargo.lock:1250-1335

Request Multiplexing

The Muxio protocol assigns unique request IDs to enable multiple concurrent RPC calls over a single WebSocket connection.

Multiplexing Benefits:

  • Single persistent connection reduces overhead
  • Concurrent requests improve throughput
  • Out-of-order responses supported
  • Connection pooling not required

Sources: Cargo.lock:1250-1258

Client-Server Integration

Server-Side Implementation

The server crate (simd-r-drive-ws-server) integrates Axum for HTTP/WebSocket handling with the Muxio RPC server runtime.

Server Architecture:

  1. Axum HTTP server accepts connections
  2. WebSocket upgrade on /ws endpoint
  3. muxio-tokio-rpc-server handles RPC protocol
  4. muxio-rpc-service-endpoint routes to DataStore methods
  5. Responses serialized with bitcode and returned

Sources: experiments/simd-r-drive-ws-server/Cargo.toml:13-22

Client-Side Implementation

The client crate (simd-r-drive-ws-client) provides DataStoreWsClient for remote DataStore access.

Client Architecture:

  1. Connect to WebSocket endpoint
  2. muxio-tokio-rpc-client manages connection
  3. muxio-rpc-service-caller provides method invocation API
  4. Requests serialized with bitcode
  5. Async methods return Future<Result<T>>

Sources: experiments/simd-r-drive-ws-client/Cargo.toml:13-21

Dependency Graph

Sources: experiments/simd-r-drive-ws-server/Cargo.toml:13-22 experiments/simd-r-drive-ws-client/Cargo.toml:13-21 experiments/simd-r-drive-muxio-service-definition/Cargo.toml:13-16

Performance Characteristics

Serialization Overhead

Bitcode provides efficient binary serialization with minimal overhead:

  • Small message payloads for keys and metadata
  • Zero-copy where possible for large payloads
  • Faster than JSON or traditional formats

Sources: Cargo.lock:392-413

Network Considerations

The WebSocket-based architecture has specific performance characteristics:

| Aspect              | Impact                             |
|---------------------|------------------------------------|
| Connection overhead | One-time WebSocket handshake       |
| Request latency     | Network RTT + serialization        |
| Throughput          | Limited by network bandwidth       |
| Concurrent requests | Multiplexed over single connection |
| Keepalive           | Automatic ping/pong frames         |

Trade-offs:

  • Remote access enables distributed deployments
  • Network latency higher than direct DataStore access
  • Serialization/deserialization adds CPU overhead
  • Suitable for applications with network-tolerant latency requirements

Sources: Cargo.lock:305-339 (axum), Cargo.lock:1302-1335 (muxio tokio rpc)

Async Runtime Integration

All network components run on the Tokio async runtime, providing non-blocking I/O and efficient concurrency.

Tokio Integration:

  • WebSocket connections handled asynchronously
  • Multiple clients served concurrently
  • Non-blocking DataStore operations
  • Future-based API throughout

Sources: experiments/simd-r-drive-ws-server/Cargo.toml18 experiments/simd-r-drive-ws-client/Cargo.toml19


WebSocket Server

Relevant source files

Purpose and Scope

This document covers the simd-r-drive-ws-server WebSocket server implementation, which provides remote access to the SIMD R Drive storage engine over WebSocket connections using the Muxio RPC framework. The server accepts connections from both native Rust clients (see Native Rust Client) and Python applications (see Python WebSocket Client API), routing RPC requests to the underlying DataStore.

For information about the RPC protocol and serialization format, see Muxio RPC Framework. For details on the core storage operations being exposed, see DataStore API.


Architecture Overview

The WebSocket server is built on the Axum web framework and Tokio async runtime, providing a high-performance, concurrent RPC endpoint for storage operations. The server acts as a thin network wrapper around the DataStore, translating WebSocket messages into storage API calls.

graph TB
    CLI["Server CLI\n(clap parser)"]
Axum["Axum Web Framework\nHTTP/WebSocket handling"]
MuxioServer["muxio-tokio-rpc-server\nRPC server runtime"]
Endpoint["muxio-rpc-service-endpoint\nRequest routing"]
ServiceDef["simd-r-drive-muxio-service-definition\nRPC contract"]
DataStore["simd-r-drive::DataStore\nStorage engine"]
Tokio["Tokio Runtime\nAsync executor"]
Tungstenite["tokio-tungstenite\nWebSocket protocol"]
CLI --> Axum
 
   Axum --> MuxioServer
 
   MuxioServer --> Endpoint
 
   MuxioServer --> Tungstenite
 
   Endpoint --> ServiceDef
 
   Endpoint --> DataStore
    
 
   Axum -.-> Tokio
 
   MuxioServer -.-> Tokio
 
   Tungstenite -.-> Tokio

Component Stack

Sources: experiments/simd-r-drive-ws-server/Cargo.toml:1-23 Cargo.lock:305-339 Cargo.lock:1320-1335


Crate Structure

The server is located in the experiments workspace and has a minimal dependency footprint focused on networking and RPC:

| Dependency                             | Purpose              | Version Source           |
|----------------------------------------|----------------------|--------------------------|
| simd-r-drive                           | Core storage engine  | workspace                |
| simd-r-drive-muxio-service-definition  | RPC service contract | workspace                |
| muxio-tokio-rpc-server                 | RPC server runtime   | workspace (0.9.0-alpha)  |
| muxio-rpc-service                      | RPC abstractions     | workspace (0.9.0-alpha)  |
| axum                                   | Web framework        | 0.8.4                    |
| tokio                                  | Async runtime        | workspace                |
| clap                                   | CLI argument parsing | workspace                |
| tracing / tracing-subscriber           | Structured logging   | workspace                |

Sources: experiments/simd-r-drive-ws-server/Cargo.toml:13-22 Cargo.lock:305-339 Cargo.lock:1320-1335


Server Initialization Flow

The server follows a standard initialization pattern: parse CLI arguments, configure logging, initialize the DataStore, and start the WebSocket server.

Sources: experiments/simd-r-drive-ws-server/Cargo.toml:13-22

sequenceDiagram
    participant Main as "main()"
    participant CLI as "clap::Parser"
    participant Tracing as "tracing_subscriber"
    participant DS as "DataStore::open()"
    participant RPC as "muxio_tokio_rpc_server"
    participant Axum as "axum::serve()"
    
    Main->>CLI: Parse CLI arguments
    CLI-->>Main: ServerArgs{port, path, log_level}
    
    Main->>Tracing: init() with env_filter
    Note over Tracing: Configure RUST_LOG levels
    
    Main->>DS: open(data_file_path)
    DS-->>Main: DataStore instance
    
    Main->>RPC: Create endpoint with DataStore
    RPC-->>Main: ServiceEndpoint
    
    Main->>Axum: bind() and serve()
    Note over Axum: Listen on 0.0.0.0:port
    Axum->>Axum: Accept WebSocket connections

Request Processing Pipeline

When a client connects and sends RPC requests, the server processes them through multiple layers before reaching the storage engine.

Sources: Cargo.lock:305-339 Cargo.lock:1287-1299 Cargo.lock:1320-1335

graph LR
    Client["WebSocket Client"]
WS["tokio-tungstenite\nWebSocket frame"]
Axum["Axum Router\nRoute: /ws"]
Server["muxio-tokio-rpc-server\nMessage decode"]
Router["muxio-rpc-service-endpoint\nMethod dispatch"]
Handler["Service Handler\n(read/write/delete)"]
DS["DataStore\nStorage operation"]
Client -->|Binary frame| WS
 
   WS --> Axum
 
   Axum --> Server
 
   Server -->|Bitcode decode| Router
 
   Router --> Handler
 
   Handler --> DS
 
   DS -.->|Result| Handler
 
   Handler -.-> Router
 
   Router -.->|Bitcode encode| Server
 
   Server -.-> WS
 
   WS -.->|Binary frame| Client

RPC Service Definition Integration

The server uses the simd-r-drive-muxio-service-definition crate to define the RPC interface contract. This crate acts as a shared dependency between the server and all clients, ensuring type-safe communication.

graph TB
    subgraph "simd-r-drive-muxio-service-definition"
        Service["Service Trait\nRPC method definitions"]
Types["Request/Response Types\nBitcode serializable"]
Bitcode["bitcode crate\nBinary serialization"]
end
    
    subgraph "Server Side"
        Endpoint["muxio-rpc-service-endpoint\nImplements Service Trait"]
Handler["Request Handlers\nCall DataStore methods"]
end
    
    subgraph "Client Side"
        Caller["muxio-rpc-service-caller\nInvokes Service methods"]
ClientImpl["ws-client implementation"]
end
    
 
   Service --> Endpoint
 
   Service --> Caller
 
   Types --> Bitcode
 
   Endpoint --> Handler
 
   Caller --> ClientImpl

Service Definition Structure

Sources: experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-17 experiments/simd-r-drive-ws-server/Cargo.toml:14-17

The service definition uses the bitcode crate for efficient binary serialization, providing compact message sizes and high throughput compared to JSON-based protocols.

Sources: experiments/simd-r-drive-muxio-service-definition/Cargo.toml:14-15 Cargo.lock:392-402


graph TD
    Server["simd-r-drive-ws-server"]
Server --> Axum["axum 0.8.4\nWeb framework"]
Server --> MuxioServer["muxio-tokio-rpc-server\nRPC runtime"]
Server --> ServiceDef["simd-r-drive-muxio-service-definition\nRPC contract"]
Server --> DataStore["simd-r-drive\nStorage engine"]
Server --> Tokio["tokio\nAsync runtime"]
Server --> Clap["clap\nCLI parsing"]
Server --> Tracing["tracing + tracing-subscriber\nLogging"]
Axum --> Hyper["hyper\nHTTP implementation"]
Axum --> TokioTung["tokio-tungstenite\nWebSocket protocol"]
MuxioServer --> Muxio["muxio\nCore RPC abstractions"]
MuxioServer --> Endpoint["muxio-rpc-service-endpoint\nRouting"]
ServiceDef --> Bitcode["bitcode\nSerialization"]
ServiceDef --> MuxioService["muxio-rpc-service\nService traits"]
Hyper --> Tokio
 
   TokioTung --> Tokio
 
   Endpoint --> Tokio

Dependency Graph

The server's dependency structure shows clear separation between web framework, RPC layer, and storage:

Sources: experiments/simd-r-drive-ws-server/Cargo.toml:13-22 Cargo.lock:305-339 Cargo.lock:1320-1335


Configuration and CLI

The server is configured via command-line arguments using the clap crate's derive API. The server binary accepts the following parameters:

| Argument    | Type   | Default  | Description                                      |
|-------------|--------|----------|--------------------------------------------------|
| --port      | u16    | 8080     | Port number to bind                              |
| --path      | String | Required | Path to DataStore file                           |
| --log-level | String | "info"   | Tracing log level (error/warn/info/debug/trace)  |

The server supports the RUST_LOG environment variable through tracing-subscriber with env-filter for fine-grained logging control.

Sources: experiments/simd-r-drive-ws-server/Cargo.toml:19-22 experiments/simd-r-drive-ws-server/Cargo.toml21
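
An illustrative clap derive definition matching the arguments in the table above; the server's actual argument struct may differ:

```rust
// Sketch of a clap-based CLI mirroring the documented --port/--path/--log-level flags.
use clap::Parser;

#[derive(Parser, Debug)]
#[command(name = "simd-r-drive-ws-server")]
struct ServerArgs {
    /// Port number to bind
    #[arg(long, default_value_t = 8080)]
    port: u16,

    /// Path to the DataStore file
    #[arg(long)]
    path: String,

    /// Tracing log level (error/warn/info/debug/trace)
    #[arg(long, default_value = "info")]
    log_level: String,
}

fn main() {
    let args = ServerArgs::parse();
    println!("binding 0.0.0.0:{} for {}", args.port, args.path);
}
```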


graph TB
    subgraph "Tokio Runtime"
        Scheduler["Work-Stealing Scheduler\nThread pool"]
end
    
    subgraph "Connection Handlers"
        Conn1["WebSocket Task 1\nRPC message loop"]
Conn2["WebSocket Task 2\nRPC message loop"]
ConnN["WebSocket Task N\nRPC message loop"]
end
    
    subgraph "Shared State"
        DS["Arc&lt;DataStore&gt;\nThread-safe storage"]
Endpoint["Arc&lt;ServiceEndpoint&gt;\nRequest router"]
end
    
 
   Scheduler --> Conn1
 
   Scheduler --> Conn2
 
   Scheduler --> ConnN
    
 
   Conn1 --> Endpoint
 
   Conn2 --> Endpoint
 
   ConnN --> Endpoint
    
 
   Endpoint --> DS
    
    style DS fill:#f9f9f9
    style Endpoint fill:#f9f9f9

Concurrency Model

The server leverages Tokio's work-stealing scheduler to handle multiple concurrent WebSocket connections efficiently:

Each WebSocket connection runs as an independent async task, with the DataStore wrapped in Arc for shared access. The DataStore's internal locking (see Concurrency and Thread Safety) ensures safe concurrent reads and serialized writes.

Sources: experiments/simd-r-drive-ws-server/Cargo.toml18 Cargo.lock:305-339


Transport Protocol

The server uses binary WebSocket frames over TCP, with the following characteristics:

| Property       | Value                | Notes                        |
|----------------|----------------------|------------------------------|
| Protocol       | WebSocket (RFC 6455) | Via tokio-tungstenite        |
| Message Format | Binary frames        | Not text frames              |
| Serialization  | Bitcode              | Compact binary format        |
| Framing        | Message-per-frame    | Each RPC call is one frame   |
| Multiplexing   | Muxio protocol       | Request/response correlation |

The binary format and bitcode serialization provide significantly better performance than text-based protocols like JSON over WebSocket.

Sources: Cargo.lock:305-339 experiments/simd-r-drive-muxio-service-definition/Cargo.toml14


Error Handling

The server implements error handling at multiple layers:

  1. WebSocket Layer : Connection errors, protocol violations handled by tokio-tungstenite
  2. RPC Layer : Serialization errors, invalid method calls handled by muxio-tokio-rpc-server
  3. Storage Layer : I/O errors, validation failures propagated from DataStore
  4. Tracing : All errors logged with structured context via tracing spans

Errors are serialized back to clients as RPC error responses, preserving error context across the network boundary.

Sources: experiments/simd-r-drive-ws-server/Cargo.toml:19-20 Cargo.lock:1320-1335


Performance Characteristics

The server is designed for high-throughput operation with minimal overhead:

  • Zero-Copy Reads : DataStore's EntryHandle (see Memory Management and Zero-Copy Access) allows serving read responses without copying payload data
  • Async I/O : Tokio's epoll/kqueue-based I/O enables efficient handling of thousands of concurrent connections
  • Binary Protocol : Bitcode serialization reduces CPU overhead and bandwidth usage compared to text formats
  • Write Batching : Clients can use batch operations (see Write and Read Modes) to amortize RPC overhead

Sources: experiments/simd-r-drive-ws-server/Cargo.toml14 experiments/simd-r-drive-muxio-service-definition/Cargo.toml14


Deployment Considerations

When deploying the WebSocket server, consider:

  1. Single DataStore Instance : The server opens one DataStore file. Multiple servers require separate files or external coordination
  2. Port Binding : Default bind address is 0.0.0.0 (all interfaces). Use firewall rules or reverse proxy for access control
  3. No TLS : The server does not implement TLS. Use a reverse proxy (nginx, HAProxy) for encrypted connections
  4. Resource Limits : Memory usage scales with DataStore size (memory-mapped file) plus per-connection buffers
  5. Graceful Shutdown : Tokio runtime handles SIGTERM/SIGINT for clean connection closure

Sources: experiments/simd-r-drive-ws-server/Cargo.toml:1-23


Experimental Status

As indicated by its location in the experiments/ workspace directory, this server is currently experimental and subject to breaking changes. The API surface and configuration options may evolve as the Muxio RPC framework stabilizes.

Sources: experiments/simd-r-drive-ws-server/Cargo.toml2 experiments/simd-r-drive-ws-server/Cargo.toml11


Muxio RPC Framework

Relevant source files

Purpose and Scope

This document describes the Muxio RPC (Remote Procedure Call) framework as implemented in SIMD R Drive for remote storage access over WebSocket connections. The framework provides a type-safe, multiplexed communication protocol using bitcode serialization for efficient binary data transfer.

For information about the WebSocket server implementation, see WebSocket Server. For the native Rust client implementation, see Native Rust Client. For Python client integration, see Python WebSocket Client API.

Sources: experiments/simd-r-drive-ws-server/Cargo.toml:1-23 experiments/simd-r-drive-ws-client/Cargo.toml:1-22 experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-17


Architecture Overview

The Muxio RPC framework consists of multiple layers that work together to provide remote procedure calls over WebSocket connections:

Muxio RPC Framework Layer Architecture

graph TB
    subgraph "Client Application Layer"
        App["Application Code"]
end
    
    subgraph "Client RPC Stack"
        Caller["muxio-rpc-service-caller\nMethod Invocation"]
ClientRuntime["muxio-tokio-rpc-client\nWebSocket Client Runtime"]
end
    
    subgraph "Shared Contract"
        ServiceDef["simd-r-drive-muxio-service-definition\nService Interface Contract\nMethod Signatures"]
Bitcode["bitcode\nBinary Serialization"]
end
    
    subgraph "Server RPC Stack"
        ServerRuntime["muxio-tokio-rpc-server\nWebSocket Server Runtime"]
Endpoint["muxio-rpc-service-endpoint\nRequest Router"]
end
    
    subgraph "Server Application Layer"
        Impl["DataStore Implementation"]
end
    
    subgraph "Core Framework"
        Core["muxio-rpc-service\nBase RPC Traits & Types"]
end
    
 
   App --> Caller
 
   Caller --> ClientRuntime
 
   ClientRuntime --> ServiceDef
 
   ClientRuntime --> Bitcode
 
   ClientRuntime --> Core
    
 
   ServiceDef --> Bitcode
 
   ServiceDef --> Core
    
 
   ServerRuntime --> ServiceDef
 
   ServerRuntime --> Bitcode
 
   ServerRuntime --> Core
 
   ServerRuntime --> Endpoint
 
   Endpoint --> Impl
    
    style ServiceDef fill:#f9f9f9,stroke:#333,stroke-width:2px

The framework is organized into distinct layers:

| Layer              | Crates                                             | Responsibility                                           |
|--------------------|----------------------------------------------------|----------------------------------------------------------|
| Core Framework     | muxio-rpc-service                                  | Base traits, types, and RPC protocol definitions         |
| Service Definition | simd-r-drive-muxio-service-definition              | Shared interface contract between client and server      |
| Serialization      | bitcode                                            | Efficient binary encoding/decoding of messages           |
| Client Runtime     | muxio-tokio-rpc-client, muxio-rpc-service-caller   | WebSocket client, method invocation, request management  |
| Server Runtime     | muxio-tokio-rpc-server, muxio-rpc-service-endpoint | WebSocket server, request routing, response handling     |

Sources: Cargo.lock:1250-1336 experiments/simd-r-drive-ws-server/Cargo.toml:14-17 experiments/simd-r-drive-ws-client/Cargo.toml:14-21


Core Framework Components

muxio-rpc-service

The muxio-rpc-service crate provides the foundational abstractions for the RPC system:

Core RPC Framework Types

graph LR
    subgraph "muxio-rpc-service Core Types"
        RpcService["RpcService Trait\nService Interface"]
Request["Request Type\nmethod_id + payload"]
Response["Response Type\nresult or error"]
Error["RpcError\nError Handling"]
end
    
 
   RpcService -->|defines| Request
 
   RpcService -->|defines| Response
 
   Response -->|contains| Error

Key components in muxio-rpc-service:

| Component  | Type   | Purpose                                             |
|------------|--------|-----------------------------------------------------|
| RpcService | Trait  | Defines the service interface with method dispatch  |
| Request    | Struct | Contains method ID and serialized payload           |
| Response   | Enum   | Success with payload or error variant               |
| method_id  | Hash   | XXH3 hash of method signature for routing           |

Sources: Cargo.lock:1261-1272


Service Definition Layer

simd-r-drive-muxio-service-definition

The service definition crate serves as the shared contract between clients and servers. It defines the exact interface of the DataStore RPC service:

DataStore RPC Service Methods

graph TB
    subgraph "Service Definition Structure"
        ServiceDef["DataStoreService\nService Definition"]
Methods["Method Definitions"]
Types["Request/Response Types"]
Write["write\n(key, payload) → offset"]
Read["read\n(key) → Option<bytes>"]
Delete["delete\n(key) → bool"]
Exists["exists\n(key) → bool"]
BatchWrite["batch_write\n(entries) → Vec<offset>"]
BatchRead["batch_read\n(keys) → Vec<Option<bytes>>"]
end
    
 
   ServiceDef --> Methods
 
   ServiceDef --> Types
    
 
   Methods --> Write
 
   Methods --> Read
 
   Methods --> Delete
 
   Methods --> Exists
 
   Methods --> BatchWrite
 
   Methods --> BatchRead
    
 
   Types -->|uses| Bitcode["bitcode Serialization"]

The service definition establishes a type-safe contract for all operations:

| Method      | Request Type            | Response Type                       | Description                         |
|-------------|-------------------------|-------------------------------------|-------------------------------------|
| write       | (Vec<u8>, Vec<u8>)      | Result<u64, Error>                  | Write key-value pair, return offset |
| read        | Vec<u8>                 | Result<Option<Vec<u8>>, Error>      | Read value by key                   |
| delete      | Vec<u8>                 | Result<bool, Error>                 | Delete entry, return success        |
| exists      | Vec<u8>                 | Result<bool, Error>                 | Check key existence                 |
| batch_write | Vec<(Vec<u8>, Vec<u8>)> | Result<Vec<u64>, Error>             | Write multiple entries              |
| batch_read  | Vec<Vec<u8>>            | Result<Vec<Option<Vec<u8>>>, Error> | Read multiple entries               |

flowchart LR
    Signature["Method Signature\n'write(key,payload)'"]
Hash["XXH3 Hash"]
MethodID["method_id: u64"]
Signature -->|hash| Hash
 
   Hash --> MethodID
 
   MethodID -->|used for routing| Router["Request Router"]

Method ID Generation

Each method is identified by an XXH3 hash of its signature, enabling fast routing without string comparisons:

Method Identification Flow

Sources: experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-17 Cargo.lock:1261-1272
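
A sketch of hash-based method identification: a u64 method_id derived from a method signature string via XXH3. The exact signature format used by muxio is an assumption here:

```rust
// Derive a stable numeric method ID from a signature string.
use xxhash_rust::xxh3::xxh3_64;

fn method_id(signature: &str) -> u64 {
    xxh3_64(signature.as_bytes())
}

fn main() {
    let write_id = method_id("write(key,payload)");
    let read_id = method_id("read(key)");
    assert_ne!(write_id, read_id);
    println!("write -> {write_id:#018x}, read -> {read_id:#018x}");
}
```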


graph LR
    subgraph "Bitcode Serialization Pipeline"
        RustType["Rust Type\n(Request/Response)"]
Encode["bitcode::encode"]
Binary["Binary Payload\nCompact Format"]
Decode["bitcode::decode"]
RustType2["Rust Type\n(Reconstructed)"]
end
    
 
   RustType -->|serialize| Encode
 
   Encode --> Binary
 
   Binary -->|deserialize| Decode
 
   Decode --> RustType2

Bitcode Serialization

The framework uses the bitcode crate for efficient binary serialization with the following characteristics:

Serialization Features

Bitcode Encoding/Decoding Pipeline

| Feature                   | Description                                    | Benefit                              |
|---------------------------|------------------------------------------------|--------------------------------------|
| Zero-copy deserialization | References data without copying when possible  | Minimal overhead for large payloads  |
| Compact encoding          | Space-efficient binary format                  | Reduced network bandwidth            |
| Type safety               | Compile-time type checking                     | Prevents serialization errors        |
| Performance               | Faster than JSON/MessagePack                   | Lower CPU overhead                   |

Integration with RPC

The serialization is integrated at multiple points:

  1. Request serialization : Client encodes method arguments to binary
  2. Wire transfer : Binary payload sent over WebSocket
  3. Response serialization : Server encodes return values to binary
  4. Deserialization : Both sides decode binary to Rust types

Sources: Cargo.lock:392-414 experiments/simd-r-drive-muxio-service-definition/Cargo.toml14


Client-Side Components

muxio-rpc-service-caller

The caller component provides the client-side method invocation interface:

Client Method Invocation Flow

Key responsibilities:

  • Method call marshalling
  • Request ID generation
  • Response awaiting and matching
  • Error propagation

Sources: Cargo.lock:1274-1285 experiments/simd-r-drive-ws-client/Cargo.toml18

muxio-tokio-rpc-client

The client runtime handles WebSocket communication and multiplexing:

Client Runtime Request/Response Multiplexing

The client runtime provides:

| Feature               | Implementation               | Purpose                    |
|-----------------------|------------------------------|----------------------------|
| Connection management | Single WebSocket connection  | Persistent connection      |
| Request multiplexing  | Multiple concurrent requests | High throughput            |
| Response routing      | Map of pending requests      | Match responses to callers |
| Async/await support   | Tokio-based futures          | Non-blocking I/O           |

Sources: Cargo.lock:1302-1318 experiments/simd-r-drive-ws-client/Cargo.toml16


flowchart TB
    subgraph "Server Request Processing"
        Receive["Receive Request\nWebSocket"]
Deserialize["Deserialize\nbitcode::decode"]
ExtractID["Extract method_id"]
Route["Route to Handler"]
Execute["Execute Method"]
Serialize["Serialize Response\nbitcode::encode"]
Send["Send Response\nWebSocket"]
end
    
 
   Receive --> Deserialize
 
   Deserialize --> ExtractID
 
   ExtractID --> Route
 
   Route --> Execute
 
   Execute --> Serialize
 
   Serialize --> Send

Server-Side Components

muxio-rpc-service-endpoint

The endpoint component routes incoming RPC requests to service implementations:

Server Request Processing Pipeline

Endpoint responsibilities:

| Responsibility      | Description                            |
|---------------------|----------------------------------------|
| Request routing     | Maps method_id to handler function     |
| Execution           | Invokes the appropriate service method |
| Error handling      | Catches and serializes errors          |
| Response formatting | Wraps results in response envelope     |

Sources: Cargo.lock:1287-1300 experiments/simd-r-drive-ws-server/Cargo.toml17

muxio-tokio-rpc-server

The server runtime manages WebSocket connections and request dispatching:

Server Runtime Connection Management

Key features:

| Feature               | Implementation                | Purpose                    |
|-----------------------|-------------------------------|----------------------------|
| Multi-client support  | One task pair per connection  | Concurrent clients         |
| Backpressure handling | Bounded channels              | Prevent memory exhaustion  |
| Graceful shutdown     | Signal-based termination      | Clean connection closure   |
| Error isolation       | Per-connection error handling | Fault tolerance            |

Sources: Cargo.lock:1320-1336 experiments/simd-r-drive-ws-server/Cargo.toml16


Request/Response Flow

Complete RPC Call Sequence

End-to-End RPC Call Flow

Message Format

The wire protocol uses a structured binary format:

| Field       | Type | Size     | Description                      |
|-------------|------|----------|----------------------------------|
| request_id  | u64  | 8 bytes  | Unique request identifier        |
| method_id   | u64  | 8 bytes  | XXH3 hash of method signature    |
| payload_len | u32  | 4 bytes  | Length of payload                |
| payload     | [u8] | Variable | Bitcode-encoded arguments/result |

Sources: All component sources above
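
For illustration, a sketch of encoding the fields in the table above into one binary frame. Field order follows the table; the little-endian byte order is an assumption, and this is not muxio's actual wire code:

```rust
// Length-prefixed frame layout: request_id | method_id | payload_len | payload.
struct Frame {
    request_id: u64,
    method_id: u64,
    payload: Vec<u8>,
}

impl Frame {
    fn to_bytes(&self) -> Vec<u8> {
        let mut buf = Vec::with_capacity(8 + 8 + 4 + self.payload.len());
        buf.extend_from_slice(&self.request_id.to_le_bytes());
        buf.extend_from_slice(&self.method_id.to_le_bytes());
        buf.extend_from_slice(&(self.payload.len() as u32).to_le_bytes());
        buf.extend_from_slice(&self.payload);
        buf
    }
}

fn main() {
    let frame = Frame { request_id: 1, method_id: 0xABCD, payload: vec![0xFF; 3] };
    assert_eq!(frame.to_bytes().len(), 8 + 8 + 4 + 3);
}
```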


Error Handling

The framework provides comprehensive error handling across the RPC boundary:

RPC Error Classification and Propagation

Error Categories

| Category             | Origin                            | Handling                                                   |
|----------------------|-----------------------------------|------------------------------------------------------------|
| Serialization errors | Bitcode encoding/decoding failure | Logged and returned as RpcError                            |
| Network errors       | WebSocket connection issues       | Automatic reconnect or error propagation                   |
| Application errors   | DataStore operation failures      | Serialized and returned to client                          |
| Timeout errors       | Request took too long             | Client-side timeout with error result                      |

Error Recovery

The framework implements several recovery strategies:

  • Connection loss : Client automatically attempts reconnection
  • Request timeout : Client cancels pending request after configured duration
  • Serialization failure : Error logged and generic error returned
  • Invalid method ID : Server returns "method not found" error

Sources: Cargo.lock:1261-1336


Performance Characteristics

The Muxio RPC framework is optimized for high-performance remote storage access:

| Metric                    | Characteristic                   | Impact                            |
|---------------------------|----------------------------------|-----------------------------------|
| Serialization overhead    | ~50-100 ns for typical payloads  | Minimal CPU impact                |
| Request multiplexing      | Thousands of concurrent requests | High throughput                   |
| Binary protocol           | Compact wire format              | Reduced bandwidth usage           |
| Zero-copy deserialization | Direct memory references         | Lower latency for large payloads  |

The use of bitcode serialization and WebSocket binary frames minimizes overhead compared to text-based protocols like JSON over HTTP. The multiplexed architecture allows clients to issue multiple concurrent requests without blocking, essential for high-performance batch operations.

Sources: Cargo.lock:392-414 Cargo.lock:1250-1336


Native Rust Client

Relevant source files

Purpose and Scope

The simd-r-drive-ws-client crate provides a native Rust client library for remote access to the SIMD R Drive storage engine via WebSocket connections. This client enables Rust applications to interact with a remote DataStore instance through the Muxio RPC framework with bitcode serialization.

This document covers the native Rust client implementation. For the WebSocket server that this client connects to, see WebSocket Server. For the Muxio RPC protocol details, see Muxio RPC Framework. For the Python bindings that wrap this client, see Python WebSocket Client API.

Sources: experiments/simd-r-drive-ws-client/Cargo.toml:1-22

Architecture Overview

The native Rust client is structured as a thin wrapper around the Muxio RPC client infrastructure, providing type-safe access to remote DataStore operations.

Key Components:

graph TB
    subgraph "Application Layer"
        UserApp["User Application\nRust Code"]
end
    
    subgraph "simd-r-drive-ws-client Crate"
        ClientAPI["Client API\nDataStoreReader/Writer Traits"]
WsClient["WebSocket Client\nConnection Management"]
end
    
    subgraph "RPC Infrastructure"
        ServiceCaller["muxio-rpc-service-caller\nMethod Invocation"]
TokioRpcClient["muxio-tokio-rpc-client\nTransport Layer"]
end
    
    subgraph "Shared Contract"
        ServiceDef["simd-r-drive-muxio-service-definition\nRPC Interface"]
Bitcode["bitcode\nSerialization"]
end
    
    subgraph "Network"
        WsConnection["WebSocket Connection\ntokio-tungstenite"]
end
    
 
   UserApp --> ClientAPI
 
   ClientAPI --> WsClient
 
   WsClient --> ServiceCaller
 
   ServiceCaller --> TokioRpcClient
 
   ServiceCaller --> ServiceDef
 
   TokioRpcClient --> Bitcode
 
   TokioRpcClient --> WsConnection
    
    style ClientAPI fill:#f9f9f9,stroke:#333,stroke-width:2px
    style ServiceDef fill:#f9f9f9,stroke:#333,stroke-width:2px

| Component          | Crate                                 | Purpose                                        |
|--------------------|---------------------------------------|------------------------------------------------|
| Client API         | simd-r-drive-ws-client                | Public interface implementing DataStore traits |
| Service Caller     | muxio-rpc-service-caller              | RPC method invocation and request routing      |
| RPC Client         | muxio-tokio-rpc-client                | WebSocket transport and message handling       |
| Service Definition | simd-r-drive-muxio-service-definition | Shared RPC contract and type definitions       |
| Async Runtime      | tokio                                 | Asynchronous I/O and task execution            |

Sources: experiments/simd-r-drive-ws-client/Cargo.toml:13-21 Cargo.lock:1302-1318

Client API Structure

The client implements the same DataStoreReader and DataStoreWriter traits as the local DataStore, enabling transparent remote access with minimal API differences.

Core Traits:

graph LR
    subgraph "Trait Implementations"
        Reader["DataStoreReader\nread()\nexists()\nbatch_read()"]
Writer["DataStoreWriter\nwrite()\ndelete()\nbatch_write()"]
end
    
    subgraph "Client Implementation"
        WsClient["WebSocket Client\nAsync Methods"]
ConnState["Connection State\nURL, Options"]
end
    
    subgraph "RPC Layer"
        Serializer["Request Serialization\nbitcode"]
Caller["Service Caller\nCall Routing"]
Deserializer["Response Deserialization\nbitcode"]
end
    
 
   Reader --> WsClient
 
   Writer --> WsClient
 
   WsClient --> ConnState
 
   WsClient --> Serializer
 
   Serializer --> Caller
 
   Caller --> Deserializer
    
    style Reader fill:#f9f9f9,stroke:#333,stroke-width:2px
    style Writer fill:#f9f9f9,stroke:#333,stroke-width:2px
  • DataStoreReader : Read-only operations (read, exists, batch_read, iteration)
  • DataStoreWriter : Write operations (write, delete, batch_write)
  • async-trait : All methods are asynchronous, requiring a Tokio runtime

Sources: experiments/simd-r-drive-ws-client/Cargo.toml:14-21 experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-16

Connection Management

The client manages persistent WebSocket connections to the remote server with automatic reconnection and error handling.

Connection Lifecycle:

sequenceDiagram
    participant App as "Application"
    participant Client as "WebSocket Client"
    participant Transport as "muxio-tokio-rpc-client"
    participant Server as "Remote Server"
    
    Note over App,Server: Connection Establishment
    App->>Client: connect(url)
    Client->>Transport: create WebSocket connection
    Transport->>Server: WebSocket handshake
    Server-->>Transport: connection established
    Transport-->>Client: client ready
    Client-->>App: connected client
    
    Note over App,Server: Normal Operation
    App->>Client: read(key)
    Client->>Transport: serialize request
    Transport->>Server: send via WebSocket
    Server-->>Transport: response data
    Transport-->>Client: deserialize response
    Client-->>App: return result
    
    Note over App,Server: Error Handling
    Server-->>Transport: connection lost
    Transport-->>Client: connection error
    Client->>Transport: reconnection attempt
    Transport->>Server: reconnect
  1. Initialization : Client connects to server URL with connection options
  2. Authentication : Optional authentication via Muxio RPC mechanisms
  3. Active State : Client maintains persistent WebSocket connection
  4. Error Recovery : Automatic reconnection on transient failures
  5. Shutdown : Graceful connection termination

Sources: Cargo.lock:1302-1318 experiments/simd-r-drive-ws-client/Cargo.toml:16-19

graph TB
    subgraph "Client Side"
        Method["Client Method Call\nread/write/delete"]
ReqBuilder["Request Builder\nCreate RPC Request"]
Serializer["bitcode Serialization\nBinary Encoding"]
Sender["WebSocket Send\nBinary Frame"]
end
    
    subgraph "Network"
        WsFrame["WebSocket Frame\nBinary Message"]
end
    
    subgraph "Server Side"
        Receiver["WebSocket Receive\nBinary Frame"]
Deserializer["bitcode Deserialization\nBinary Decoding"]
Handler["Request Handler\nExecute DataStore Operation"]
Response["Response Builder\nCreate RPC Response"]
end
    
 
   Method --> ReqBuilder
 
   ReqBuilder --> Serializer
 
   Serializer --> Sender
 
   Sender --> WsFrame
 
   WsFrame --> Receiver
 
   Receiver --> Deserializer
 
   Deserializer --> Handler
 
   Handler --> Response
 
   Response --> Serializer
    
    style Method fill:#f9f9f9,stroke:#333,stroke-width:2px
    style Handler fill:#f9f9f9,stroke:#333,stroke-width:2px

Request-Response Flow

All client operations follow a standardized request-response pattern through the Muxio RPC framework.

Request Structure:

| Field      | Type    | Description                                       |
|------------|---------|---------------------------------------------------|
| Method ID  | u64     | XXH3 hash of method name from service definition  |
| Payload    | Vec<u8> | Bitcode-serialized request parameters             |
| Request ID | u64     | Unique identifier for request-response matching   |

Response Structure:

| Field      | Type    | Description                                |
|------------|---------|--------------------------------------------|
| Request ID | u64     | Matches original request ID                |
| Status     | enum    | Success, Error, or specific error codes    |
| Payload    | Vec<u8> | Bitcode-serialized response data or error  |

Sources: experiments/simd-r-drive-muxio-service-definition/Cargo.toml:14-15 Cargo.lock:392-402

Async Runtime Requirements

The client requires a Tokio async runtime for all operations. The async-trait crate enables async methods in trait implementations.

Runtime Configuration:

graph TB
    subgraph "Application"
        Main["#[tokio::main]\nasync fn main()"]
UserCode["User Code\nawait client.read()"]
end
    
    subgraph "Client"
        AsyncMethods["async-trait Methods\nDataStoreReader/Writer"]
TokioTasks["Tokio Tasks\nNetwork I/O"]
end
    
    subgraph "Tokio Runtime"
        Executor["Task Executor\nWork Stealing Scheduler"]
Reactor["I/O Reactor\nepoll/kqueue/IOCP"]
end
    
 
   Main --> UserCode
 
   UserCode --> AsyncMethods
 
   AsyncMethods --> TokioTasks
 
   TokioTasks --> Executor
 
   TokioTasks --> Reactor
    
    style AsyncMethods fill:#f9f9f9,stroke:#333,stroke-width:2px
    style Executor fill:#f9f9f9,stroke:#333,stroke-width:2px
  • Multi-threaded Runtime : Default for concurrent operations
  • Current-thread Runtime : Available for single-threaded use cases
  • Feature Flags : Requires tokio with rt-multi-thread and net features

Sources: experiments/simd-r-drive-ws-client/Cargo.toml:19-21 Cargo.lock:279-287

Error Handling

The client propagates errors from multiple layers of the stack, providing detailed error information for debugging and recovery.

Error Types:

| Error Category       | Source                 | Description                                                |
|----------------------|------------------------|------------------------------------------------------------|
| Connection Errors    | muxio-tokio-rpc-client | WebSocket connection failures, timeouts                    |
| Serialization Errors | bitcode                | Invalid data encoding/decoding                             |
| RPC Errors           | muxio-rpc-service      | Service method errors, invalid requests                    |
| DataStore Errors     | Remote DataStore       | Storage operation failures (key not found, write errors)   |

Error Propagation Flow:

Sources: experiments/simd-r-drive-ws-client/Cargo.toml:17-18 Cargo.lock:1261-1271

Usage Patterns

Basic Connection and Operations

The client follows standard Rust async patterns for initialization and operation:
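
A hedged sketch of that call pattern, assuming a DataStoreWsClient::connect constructor and async read/write methods as shown in the connection-lifecycle diagram above; the crate's exact signatures may differ:

```rust
// Hedged sketch only: connect/write/read follow the sequence diagram above,
// but the exact simd-r-drive-ws-client signatures are assumptions.
use simd_r_drive_ws_client::DataStoreWsClient;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect to a running simd-r-drive-ws-server instance (default port 8080).
    let client = DataStoreWsClient::connect("ws://127.0.0.1:8080/ws").await?;

    // Write a key/value pair, then read it back.
    client.write(b"greeting", b"hello").await?;
    if let Some(value) = client.read(b"greeting").await? {
        println!("read back {} bytes", value.len());
    }
    Ok(())
}
```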

Concurrent Operations

The client supports concurrent operations through standard Tokio concurrency primitives:
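
A hedged sketch of concurrent reads multiplexed over one connection using tokio::join!, with the same assumed method names as the previous example:

```rust
// Hedged sketch: two reads issued concurrently over a single connection.
use simd_r_drive_ws_client::DataStoreWsClient;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = DataStoreWsClient::connect("ws://127.0.0.1:8080/ws").await?;

    // Both requests share the same WebSocket connection and resolve independently.
    let (a, b) = tokio::join!(client.read(b"key-a"), client.read(b"key-b"));
    println!(
        "key-a: {:?} bytes, key-b: {:?} bytes",
        a?.map(|v| v.len()),
        b?.map(|v| v.len())
    );
    Ok(())
}
```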

Sources: experiments/simd-r-drive-ws-client/Cargo.toml:19-21

graph TB
    subgraph "Service Definition"
        Methods["Method Definitions\nread, write, delete, etc."]
Types["Request/Response Types\nbitcode derive macros"]
MethodHash["Method ID Hashing\nXXH3 of method names"]
end
    
    subgraph "Client Usage"
        ClientImpl["Client Implementation\nUses defined methods"]
TypeSafety["Type Safety\nCompile-time checking"]
end
    
    subgraph "Server Usage"
        ServerImpl["Server Implementation\nHandles defined methods"]
Routing["Request Routing\nHash-based dispatch"]
end
    
 
   Methods --> ClientImpl
 
   Methods --> ServerImpl
 
   Types --> ClientImpl
 
   Types --> ServerImpl
 
   MethodHash --> Routing
 
   ClientImpl --> TypeSafety
    
    style Methods fill:#f9f9f9,stroke:#333,stroke-width:2px
    style Types fill:#f9f9f9,stroke:#333,stroke-width:2px

Integration with Service Definition

The client relies on the shared service definition crate for type-safe RPC communication.

Shared Contract Benefits:

  • Type Safety : Compile-time verification of request/response types
  • Version Compatibility : Client and server must use compatible service definitions
  • Method Resolution : XXH3 hash-based method identification
  • Serialization Schema : Consistent bitcode encoding across client and server

Sources: experiments/simd-r-drive-ws-client/Cargo.toml15 experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-16

Performance Considerations

The native Rust client provides several performance advantages over alternative approaches:

Performance Characteristics:

| Aspect        | Implementation             | Benefit                                         |
|---------------|----------------------------|-------------------------------------------------|
| Serialization | bitcode binary encoding    | Minimal overhead, faster than JSON/MessagePack  |
| Connection    | Persistent WebSocket       | Avoids HTTP handshake overhead                  |
| Async I/O     | Tokio zero-copy operations | Efficient memory usage                          |
| Type Safety   | Compile-time generics      | Zero runtime type checking cost                 |
| Multiplexing  | Muxio request pipelining   | Multiple concurrent requests per connection     |

Memory Efficiency:

  • Zero-copy where possible through bitcode and WebSocket frames
  • Efficient buffer reuse in Tokio's I/O layer
  • Minimal allocation overhead compared to HTTP-based protocols

Throughput:

  • Supports request pipelining for high-throughput workloads
  • Concurrent operations through Tokio's work-stealing scheduler
  • Batch operations reduce round-trip overhead

Sources: Cargo.lock:392-402 Cargo.lock:1302-1318

Comparison with Direct Access

The WebSocket client provides remote access with different tradeoffs compared to direct DataStore usage:

| Feature         | Direct DataStore        | WebSocket Client                 |
|-----------------|-------------------------|----------------------------------|
| Access Pattern  | Local file I/O          | Network I/O over WebSocket       |
| Zero-Copy Reads | Yes (via mmap)          | No (serialized over network)     |
| Latency         | Microseconds            | Milliseconds (network dependent) |
| Concurrency     | Multi-process safe      | Network-limited                  |
| Deployment      | Single machine          | Distributed architecture         |
| Security        | File system permissions | Network authentication           |

When to Use the Client:

  • Remote access to centralized storage
  • Microservice architectures requiring shared state
  • Language interoperability (via Python bindings)
  • Isolation of storage from compute workloads

When to Use Direct Access:

  • Single-machine deployments
  • Latency-critical applications
  • Maximum throughput requirements
  • Zero-copy read performance needs

Sources: experiments/simd-r-drive-ws-client/Cargo.toml14

Logging and Debugging

The client uses the tracing crate for structured logging and diagnostics.

Logging Levels:

  • TRACE : Detailed RPC message contents and serialization
  • DEBUG : Connection state changes, request/response flow
  • INFO : Connection establishment, disconnection events
  • WARN : Recoverable errors, retry attempts
  • ERROR : Unrecoverable errors, connection failures

Diagnostic Information:

  • Request/response timing
  • Serialized message sizes
  • Connection state transitions
  • Error context and stack traces

Sources: experiments/simd-r-drive-ws-client/Cargo.toml20 Cargo.lock:279-287


Python Integration

Relevant source files

This page documents the Python bindings for SIMD R Drive, which provide a high-level Python interface to the storage engine via WebSocket RPC. The bindings are implemented in Rust using PyO3 and packaged as Python wheels using Maturin.

For details on the Python WebSocket Client API specifically, see Python WebSocket Client API. For information on building and distributing the Python package, see Building Python Bindings. For testing infrastructure, see Integration Testing.


Architecture Overview

The Python integration uses a multi-layer architecture that bridges Python code to the native Rust WebSocket client. The bindings are not pure Python—they are Rust code compiled to native extensions that Python can import.

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/init.py:1-14 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:1-63

graph TB
    subgraph "Python User Code"
        UserScript["User Application\n*.py files"]
Imports["from simd_r_drive_ws_client import DataStoreWsClient"]
end
    
    subgraph "Python Package Layer"
        InitPy["__init__.py\nPackage Exports"]
DataStoreWsClientPy["data_store_ws_client.py\nDataStoreWsClient class"]
TypeStubs["data_store_ws_client.pyi\nType Annotations"]
end
    
    subgraph "PyO3 Binding Layer"
        RustModule["simd_r_drive_ws_client_py\nRust Binary Module\n.so / .pyd"]
BaseClass["BaseDataStoreWsClient\n#[pyclass]"]
NamespaceHasherClass["NamespaceHasher\n#[pyclass]"]
end
    
    subgraph "Native Rust Implementation"
        WsClient["simd-r-drive-ws-client\nNative WebSocket Client"]
MuxioRPC["muxio-tokio-rpc-client\nRPC Client Runtime"]
end
    
 
   UserScript --> Imports
 
   Imports --> InitPy
 
   InitPy --> DataStoreWsClientPy
 
   DataStoreWsClientPy --> BaseClass
    DataStoreWsClientPy -.type hints.-> TypeStubs
    
 
   BaseClass --> WsClient
 
   NamespaceHasherClass --> WsClient
    
 
   RustModule --> BaseClass
 
   RustModule --> NamespaceHasherClass
    
 
   WsClient --> MuxioRPC

The architecture consists of four distinct layers:

| Layer                 | Technology                        | Purpose                                   |
|-----------------------|-----------------------------------|-------------------------------------------|
| Python User Code      | Pure Python                       | Application-level logic using the client  |
| Python Package        | Pure Python wrapper               | Convenience methods and type annotations  |
| PyO3 Bindings         | Rust compiled to native extension | FFI bridge exposing Rust functionality    |
| Native Implementation | Rust (simd-r-drive-ws-client)     | WebSocket RPC client implementation       |

Python API Surface

The Python package exposes two primary classes that users interact with: DataStoreWsClient and NamespaceHasher. The API is defined through a combination of Rust PyO3 bindings and Python wrapper code.

graph TB
    subgraph "Python Space"
        DSWsClient["DataStoreWsClient\nexperiments/.../data_store_ws_client.py"]
NSHasher["NamespaceHasher\nRust #[pyclass]"]
end
    
    subgraph "Rust PyO3 Bindings"
        BaseClient["BaseDataStoreWsClient\nRust #[pyclass]\nsrc/lib.rs"]
NSHasherRust["NamespaceHasher\nRust Implementation\nsrc/lib.rs"]
end
    
    subgraph "Method Categories"
        WriteOps["write()\nbatch_write()\ndelete()"]
ReadOps["read()\nbatch_read()\nexists()"]
MetaOps["__len__()\n__contains__()\nis_empty()\nfile_size()"]
PyOnlyOps["batch_read_structured()\nPure Python Logic"]
end
    
 
   DSWsClient -->|inherits| BaseClient
    NSHasher -.exposed as.-> NSHasherRust
    
 
   BaseClient --> WriteOps
 
   BaseClient --> ReadOps
 
   BaseClient --> MetaOps
 
   DSWsClient --> PyOnlyOps

Class Hierarchy and Implementation

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:11-63 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:8-219

Core Operations

The DataStoreWsClient provides the following operation categories:

| Operation Type | Methods | Implementation Location |
|---|---|---|
| Write Operations | write(), batch_write(), delete() | Rust (BaseDataStoreWsClient) |
| Read Operations | read(), batch_read(), exists() | Rust (BaseDataStoreWsClient) |
| Metadata Operations | __len__(), __contains__(), is_empty(), file_size() | Rust (BaseDataStoreWsClient) |
| Structured Reads | batch_read_structured() | Python (DataStoreWsClient) |

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:27-168

Python-Rust Method Mapping

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:12-62 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:27-129

The batch_read_structured() method is implemented entirely in Python as a convenience wrapper. It traverses dictionaries or lists of dictionaries to extract a flat list of keys, calls the fast Rust batch_read() method, and then rebuilds the original structure with the fetched values.
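A usage sketch (the key names and payloads are illustrative, not taken from the repository):

```python
from simd_r_drive_ws_client import DataStoreWsClient

client = DataStoreWsClient("127.0.0.1", 34129)

# Values in the request structure are datastore keys (bytes)...
request = {"profile": b"user:42:profile", "settings": b"user:42:settings"}

# ...and the result mirrors the structure, with each key replaced by its payload.
result = client.batch_read_structured(request)
```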


PyO3 Binding Architecture

PyO3 provides the Foreign Function Interface (FFI) that allows Python to call Rust code. The bindings use PyO3 macros to expose Rust structs and methods as Python classes and functions.

graph TB
    subgraph "Rust Source"
        StructDef["#[pyclass]\npub struct BaseDataStoreWsClient"]
MethodsDef["#[pymethods]\nimpl BaseDataStoreWsClient"]
NSStruct["#[pyclass]\npub struct NamespaceHasher"]
NSMethods["#[pymethods]\nimpl NamespaceHasher"]
end
    
    subgraph "PyO3 Macro Expansion"
        PyClassMacro["PyClass trait implementation\nType conversion\nReference counting"]
PyMethodsMacro["Method wrappers\nArgument extraction\nReturn value conversion"]
end
    
    subgraph "Python Module"
        PythonClass["class BaseDataStoreWsClient:\n def write(...)\n def read(...)"]
PythonNS["class NamespaceHasher:\n def __init__(...)\n def namespace(...)"]
end
    
 
   StructDef --> PyClassMacro
 
   MethodsDef --> PyMethodsMacro
 
   NSStruct --> PyClassMacro
 
   NSMethods --> PyMethodsMacro
    
 
   PyClassMacro --> PythonClass
 
   PyMethodsMacro --> PythonClass
 
   PyClassMacro --> PythonNS
 
   PyMethodsMacro --> PythonNS

PyO3 Class Definitions

Sources: experiments/bindings/python-ws-client/Cargo.lock:832-846 experiments/bindings/python-ws-client/Cargo.lock:1096-1108

Async Runtime Bridge

The Python bindings use pyo3-async-runtimes to bridge Python's async/await with Rust's Tokio runtime. This allows Python code to use standard async/await syntax while the underlying operations are handled by Tokio.

Sources: experiments/bindings/python-ws-client/Cargo.lock:849-860

graph TB
    subgraph "Python Async"
        PyAsyncCall["await client.write(key, data)"]
PyEventLoop["asyncio.run()
or uvloop"]
end
    
    subgraph "pyo3-async-runtimes Bridge"
        Bridge["pyo3_async_runtimes::tokio\nFuture conversion"]
Runtime["LocalSet spawning\nBlock on future"]
end
    
    subgraph "Rust Tokio"
        TokioFuture["async fn write()\nTokio Future"]
TokioRuntime["Tokio Runtime\nThread Pool"]
end
    
 
   PyAsyncCall --> PyEventLoop
 
   PyEventLoop --> Bridge
 
   Bridge --> Runtime
 
   Runtime --> TokioFuture
 
   TokioFuture --> TokioRuntime

The pyo3-async-runtimes crate handles the complexity of converting between Python's async protocol and Rust's Tokio futures, ensuring that async operations work seamlessly across the FFI boundary.


Build and Distribution System

The Python bindings are built using Maturin, which compiles the Rust code and packages it into Python wheels. The build system is configured through pyproject.toml and Cargo.toml.

graph LR
    subgraph "Configuration Files"
        PyProject["pyproject.toml\n[build-system]\nrequires = ['maturin>=1.5']"]
CargoToml["Cargo.toml\n[lib]\ncrate-type = ['cdylib']"]
end
    
    subgraph "Maturin Build Process"
        RustCompile["rustc compilation\n--crate-type=cdylib\nTarget: cpython extension"]
LinkPyO3["Link PyO3 runtime\nPython ABI"]
CreateWheel["Package .so/.pyd\nAdd metadata\nCreate .whl"]
end
    
    subgraph "Distribution Artifacts"
        Wheel["simd_r_drive_ws_client-*.whl\nPlatform-specific binary"]
PyPI["PyPI Registry\npip install simd-r-drive-ws-client"]
end
    
 
   PyProject --> RustCompile
 
   CargoToml --> RustCompile
 
   RustCompile --> LinkPyO3
 
   LinkPyO3 --> CreateWheel
 
   CreateWheel --> Wheel
 
   Wheel --> PyPI

Maturin Build Pipeline

Sources: experiments/bindings/python-ws-client/pyproject.toml:29-35

Build System Configuration

The build system is configured through several key sections in pyproject.toml:

| Configuration | Location | Purpose |
|---|---|---|
| [build-system] | pyproject.toml:29-31 | Specifies Maturin as build backend |
| [tool.maturin] | pyproject.toml:33-35 | Maturin-specific settings (bindings, Python version) |
| [project] | pyproject.toml:1-27 | Package metadata for PyPI |
| [dependency-groups] | pyproject.toml:37-46 | Development dependencies |

Sources: experiments/bindings/python-ws-client/pyproject.toml:1-47

Supported Python Versions and Platforms

The package supports Python 3.10 through 3.13 on multiple platforms:

Sources: experiments/bindings/python-ws-client/pyproject.toml:7-27 experiments/bindings/python-ws-client/README.md:18-23


graph TB
    subgraph "Runtime Dependencies"
        PyO3Runtime["PyO3 Runtime\nEmbedded in .whl"]
RustDeps["Rust Dependencies\nsimd-r-drive-ws-client\ntokio, muxio-*"]
end
    
    subgraph "Development Dependencies"
        Maturin["maturin>=1.8.7\nBuild backend"]
MyPy["mypy>=1.16.1\nType checking"]
Pytest["pytest>=8.4.1\nTesting framework"]
NumPy["numpy>=2.2.6\nTesting utilities"]
Other["puccinialin, pytest-benchmark, pytest-order"]
end
    
    subgraph "Lock Files"
        UvLock["uv.lock\nPython dependency tree"]
CargoLock["Cargo.lock\nRust dependency tree"]
end
    
 
   Maturin --> RustDeps
 
   RustDeps --> CargoLock
    
 
   MyPy --> UvLock
 
   Pytest --> UvLock
 
   NumPy --> UvLock
 
   Other --> UvLock

Dependency Management

The Python bindings use uv for fast dependency resolution and management. Dependencies are split into runtime (minimal) and development dependencies.

Dependency Structure

Sources: experiments/bindings/python-ws-client/pyproject.toml:37-46 experiments/bindings/python-ws-client/Cargo.lock:1-1380 experiments/bindings/python-ws-client/uv.lock:1-299

The runtime has minimal Python dependencies (essentially none beyond the Python interpreter), as all dependencies are statically compiled into the binary wheel. Development dependencies include testing, type checking, and build tools.


Type Stubs and IDE Support

The package includes comprehensive type stubs (.pyi files) that provide full type information for IDEs and type checkers like MyPy.

graph TB
    subgraph "Type Stub File"
        StubImports["from typing import Optional, Union, Dict, Any, List\nfrom .simd_r_drive_ws_client import BaseDataStoreWsClient"]
ClientStub["@final\nclass DataStoreWsClient(BaseDataStoreWsClient):\n def __init__(self, host: str, port: int) -> None\n def write(self, key: bytes, data: bytes) -> None\n ..."]
NSStub["@final\nclass NamespaceHasher:\n def __init__(self, prefix: bytes) -> None\n def namespace(self, key: bytes) -> bytes"]
end
    
    subgraph "IDE Features"
        Autocomplete["Auto-completion\nMethod signatures"]
TypeCheck["Type checking\nmypy validation"]
Docstrings["Inline documentation\nMethod descriptions"]
end
    
 
   StubImports --> ClientStub
 
   StubImports --> NSStub
    
 
   ClientStub --> Autocomplete
 
   ClientStub --> TypeCheck
 
   ClientStub --> Docstrings
    
 
   NSStub --> Autocomplete
 
   NSStub --> TypeCheck
 
   NSStub --> Docstrings

Type Stub Structure

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219

The type stubs include:

  • Full method signatures with type annotations
  • Comprehensive docstrings explaining each method
  • Generic type support for structured operations
  • @final decorators to prevent subclassing

graph LR
    subgraph "Namespace Creation"
        Prefix["prefix = b'users'"]
PrefixHash["XXH3(prefix)\n→ 8 bytes"]
end
    
    subgraph "Key Hashing"
        Key["key = b'user123'"]
KeyHash["XXH3(key)\n→ 8 bytes"]
end
    
    subgraph "Namespaced Key"
        Combined["prefix_hash // key_hash\n16 bytes total"]
end
    
 
   Prefix --> PrefixHash
 
   Key --> KeyHash
 
   PrefixHash --> Combined
 
   KeyHash --> Combined

NamespaceHasher Utility

The NamespaceHasher class provides deterministic key namespacing using XXH3 hashing. It ensures keys are scoped to specific namespaces, preventing collisions across logical domains.

Namespacing Mechanism

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:170-219

Usage Pattern

The NamespaceHasher is typically used as follows:

| Step | Code | Result |
|---|---|---|
| 1. Create hasher | hasher = NamespaceHasher(b"users") | Hasher scoped to "users" namespace |
| 2. Generate key | key = hasher.namespace(b"user123") | 16-byte namespaced key |
| 3. Store data | client.write(key, data) | Data stored under namespaced key |
| 4. Read data | client.read(key) | Data retrieved from namespaced key |

This pattern ensures that keys like b"settings" in the b"users" namespace don't collide with b"settings" in the b"system" namespace.
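The same steps expressed as code (key and payload values are illustrative):

```python
from simd_r_drive_ws_client import DataStoreWsClient, NamespaceHasher

client = DataStoreWsClient("127.0.0.1", 34129)

hasher = NamespaceHasher(b"users")      # 1. hasher scoped to the "users" namespace
key = hasher.namespace(b"user123")      # 2. 16-byte namespaced key
client.write(key, b"profile bytes")     # 3. store under the namespaced key
data = client.read(key)                 # 4. read it back
```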

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:182-218


graph TB
    subgraph "Test Script: integration_test.sh"
        Setup["1. Navigate to experiments/\n2. Build server if needed"]
StartServer["3. cargo run --package simd-r-drive-ws-server\nBackground process\nPID captured"]
SetupPython["4. uv venv\n5. uv pip install pytest maturin\n6. uv pip install -e . --group dev"]
ExtractTests["7. extract_readme_tests.py\nExtract code blocks from README.md"]
RunPytest["8. pytest -v -s\nTEST_SERVER_HOST=$SERVER_HOST\nTEST_SERVER_PORT=$SERVER_PORT"]
Cleanup["9. kill -9 $SERVER_PID\n10. rm /tmp/simd-r-drive-pytest-storage.bin"]
end
    
 
   Setup --> StartServer
 
   StartServer --> SetupPython
 
   SetupPython --> ExtractTests
 
   ExtractTests --> RunPytest
 
   RunPytest --> Cleanup

Integration Test Infrastructure

The Python bindings include a comprehensive integration test suite that validates the entire stack from Python user code down to the WebSocket server.

Test Workflow

Sources: experiments/bindings/python-ws-client/integration_test.sh:1-91

Test Categories

The test infrastructure includes multiple test sources:

| Test Source | Purpose | Generated By |
|---|---|---|
| tests/test_readme_blocks.py | Validates README examples | extract_readme_tests.py |
| Other test files | Unit and integration tests | Manual test authoring |
| Pytest fixtures | Setup/teardown infrastructure | Pytest framework |

Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:1-46

graph LR
    subgraph "Input"
        README["README.md\n```python code blocks"]
end
    
    subgraph "Extraction Process"
        Regex["Regex: r'```python\\n(.*?)```'"]
Parse["Extract all Python blocks"]
Wrap["Wrap each block in\ndef test_readme_block_N():\n ..."]
end
    
    subgraph "Output"
        TestFile["tests/test_readme_blocks.py\ntest_readme_block_0()\ntest_readme_block_1()\n..."]
end
    
 
   README --> Regex
 
   Regex --> Parse
 
   Parse --> Wrap
 
   Wrap --> TestFile

README Test Extraction

The extract_readme_tests.py script automatically converts Python code blocks from the README into pytest test functions:

Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:14-45

This ensures that all code examples in the README are automatically tested, preventing documentation drift from the actual API behavior.


graph TB
    subgraph "Internal Modules"
        RustBinary["simd_r_drive_ws_client_py.so/.pyd\nBinary compiled module"]
RustSymbols["BaseDataStoreWsClient\nNamespaceHasher\nsetup_logging\ntest_rust_logging"]
PythonWrapper["data_store_ws_client.py\nDataStoreWsClient"]
end
    
    subgraph "Package __init__.py"
        ImportRust["from .simd_r_drive_ws_client import\n setup_logging, test_rust_logging"]
ImportPython["from .data_store_ws_client import\n DataStoreWsClient, NamespaceHasher"]
AllList["__all__ = [\n 'DataStoreWsClient',\n 'NamespaceHasher',\n 'setup_logging',\n 'test_rust_logging'\n]"]
end
    
    subgraph "Public API"
        UserCode["from simd_r_drive_ws_client import DataStoreWsClient"]
end
    
 
   RustBinary --> RustSymbols
 
   RustSymbols --> ImportRust
 
   PythonWrapper --> ImportPython
    
 
   ImportRust --> AllList
 
   ImportPython --> AllList
 
   AllList --> UserCode

Package Exports and Public API

The package's public API is defined through the __init__.py file, which controls what symbols are available when users import the package.

Export Structure

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-14

The __all__ list explicitly defines the public API surface, preventing internal implementation details from being accidentally imported by users. This follows Python best practices for package design.


Python WebSocket Client API

Relevant source files

Purpose and Scope

This document describes the Python WebSocket client API for remote access to SIMD R Drive storage. The API provides idiomatic Python interfaces backed by high-performance Rust implementations via PyO3 bindings. This page covers the DataStoreWsClient class, NamespaceHasher utility, and their usage patterns.

For information about building and installing the Python bindings, see Building Python Bindings. For details about the underlying native Rust WebSocket client, see Native Rust Client. For server-side configuration and deployment, see WebSocket Server.

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-14 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219

Architecture Overview

The Python WebSocket client uses a multi-layer architecture that bridges Python's async/await with Rust's Tokio runtime while maintaining idiomatic Python APIs.

graph TB
    UserCode["Python User Code\nimport simd_r_drive_ws_client"]
DataStoreWsClient["DataStoreWsClient\nPython wrapper class"]
BaseDataStoreWsClient["BaseDataStoreWsClient\nPyO3 #[pyclass]"]
PyO3FFI["PyO3 FFI Layer\npyo3-async-runtimes"]
RustClient["simd-r-drive-ws-client\nNative Rust implementation"]
MuxioRPC["muxio-tokio-rpc-client\nWebSocket + RPC"]
Server["simd-r-drive-ws-server\nRemote DataStore"]
UserCode --> DataStoreWsClient
 
   DataStoreWsClient --> BaseDataStoreWsClient
 
   BaseDataStoreWsClient --> PyO3FFI
 
   PyO3FFI --> RustClient
 
   RustClient --> MuxioRPC
 
   MuxioRPC --> Server

Python Integration Stack

Sources: experiments/bindings/python-ws-client/Cargo.lock:1096-1108 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:1-10

Class Hierarchy

The BaseDataStoreWsClient class is implemented in Rust and exposes core storage operations through PyO3. The DataStoreWsClient Python class extends it with additional convenience methods implemented in pure Python.

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-10 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:11-62

DataStoreWsClient Class

The DataStoreWsClient class provides the primary interface for interacting with a remote SIMD R Drive storage engine over WebSocket connections.

Connection Initialization

| Constructor | Description |
|---|---|
| __init__(host: str, port: int) | Establishes WebSocket connection to the specified server |

The constructor creates a WebSocket connection to the remote storage server. Connection establishment is synchronous and will raise an exception if the server is unreachable.

Example:
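A minimal sketch; the host and port values are illustrative (they match the defaults used by the integration tests):

```python
from simd_r_drive_ws_client import DataStoreWsClient

# Connects synchronously; raises if the server at host:port is unreachable.
client = DataStoreWsClient("127.0.0.1", 34129)
```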

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:17-25

Write Operations

| Method | Parameters | Description |
|---|---|---|
| write(key, data) | key: bytes, data: bytes | Appends single key/value pair |
| batch_write(items) | items: list[tuple[bytes, bytes]] | Writes multiple pairs in one operation |

Write operations are append-only and atomic. If a key already exists, writing to it creates a new version while the old data remains on disk (marked as superseded via the index).

Example:
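A minimal sketch assuming a reachable server; keys and payloads are illustrative:

```python
from simd_r_drive_ws_client import DataStoreWsClient

client = DataStoreWsClient("127.0.0.1", 34129)

# Append a single key/value pair.
client.write(b"config", b"v1")

# Write several pairs in one call.
client.batch_write([(b"alpha", b"1"), (b"beta", b"2")])

# Writing an existing key appends a new version; the old bytes stay on disk.
client.write(b"config", b"v2")
```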

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:27-51

Read Operations

| Method | Parameters | Return Type | Copy Behavior |
|---|---|---|---|
| read(key) | key: bytes | Optional[bytes] | Performs memory copy |
| batch_read(keys) | keys: list[bytes] | list[Optional[bytes]] | Performs memory copy |
| batch_read_structured(data) | data: dict or list[dict] | Same structure with values | Python-side wrapper |

The read and batch_read methods perform memory copies when returning data. For zero-copy access patterns, the native Rust client provides read_entry methods that return memory-mapped views.

The batch_read_structured method is a Python convenience wrapper that:

  1. Accepts dictionaries or lists of dictionaries where values are datastore keys
  2. Flattens the structure into a single key list
  3. Calls batch_read for efficient parallel fetching
  4. Reconstructs the original structure with fetched values

Example:
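A minimal sketch; key names are illustrative and assume the keys were written earlier:

```python
from simd_r_drive_ws_client import DataStoreWsClient

client = DataStoreWsClient("127.0.0.1", 34129)

value = client.read(b"config")                      # bytes, or None if absent
values = client.batch_read([b"alpha", b"missing"])  # one Optional[bytes] per key

# Structured read: the dict shape is preserved; key bytes are replaced by payloads.
doc = client.batch_read_structured({"a": b"alpha", "b": b"beta"})
```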

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:79-129 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:12-62

Deletion and Existence Checks

| Method | Parameters | Return Type | Description |
|---|---|---|---|
| delete(key) | key: bytes | None | Marks key as deleted (tombstone) |
| exists(key) | key: bytes | bool | Checks if key is active |
| __contains__(key) | key: bytes | bool | Python in operator support |

Deletion is logical, not physical. The delete method appends a tombstone entry to the storage file. The physical data remains on disk but is no longer accessible through reads.

Example:
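A minimal sketch with illustrative keys and payloads:

```python
from simd_r_drive_ws_client import DataStoreWsClient

client = DataStoreWsClient("127.0.0.1", 34129)

client.write(b"temp", b"scratch")
assert client.exists(b"temp")
assert b"temp" in client              # __contains__

client.delete(b"temp")                # appends a tombstone entry
assert client.read(b"temp") is None
assert b"temp" not in client
```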

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:53-77 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:131-141

Utility Methods

| Method | Return Type | Description |
|---|---|---|
| __len__() | int | Returns count of active entries |
| is_empty() | bool | Checks if store has any active keys |
| file_size() | int | Returns physical file size on disk |

Example:
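A minimal sketch, assuming a connected client:

```python
from simd_r_drive_ws_client import DataStoreWsClient

client = DataStoreWsClient("127.0.0.1", 34129)

count = len(client)          # number of active (non-deleted) entries
empty = client.is_empty()    # True if no active keys remain
size = client.file_size()    # physical size of the backing file, in bytes
```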

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:143-168

NamespaceHasher Utility

The NamespaceHasher class provides deterministic key namespacing using XXH3 hashing to prevent key collisions across logical domains.

graph LR
    Input1["Namespace prefix\ne.g., b'users'"]
Input2["Key\ne.g., b'user123'"]
Hash1["XXH3 hash\n8 bytes"]
Hash2["XXH3 hash\n8 bytes"]
Output["Namespaced key\n16 bytes total"]
Input1 -->|hash once at init| Hash1
 
   Input2 -->|hash per call| Hash2
 
   Hash1 --> Output
 
   Hash2 --> Output

Architecture

Usage Pattern

| Method | Parameters | Return Type | Description |
|---|---|---|---|
| __init__(prefix) | prefix: bytes | N/A | Initializes hasher with namespace |
| namespace(key) | key: bytes | bytes | Returns 16-byte namespaced key |

The output key structure is:

  • Bytes 0-7: XXH3 hash of namespace prefix
  • Bytes 8-15: XXH3 hash of input key

This design ensures:

  • Deterministic key generation (same input → same output)
  • Collision isolation between namespaces
  • Fixed-length keys regardless of input size

Example:
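A minimal sketch; the namespace prefixes and keys are illustrative:

```python
from simd_r_drive_ws_client import DataStoreWsClient, NamespaceHasher

client = DataStoreWsClient("127.0.0.1", 34129)

users = NamespaceHasher(b"users")
system = NamespaceHasher(b"system")

# Same logical key, different namespaces -> different 16-byte storage keys.
assert users.namespace(b"settings") != system.namespace(b"settings")

client.write(users.namespace(b"settings"), b"per-user settings blob")
```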

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:170-219

graph TB
    Stub["data_store_ws_client.pyi\nType definitions"]
Impl["data_store_ws_client.py\nImplementation"]
Base["simd_r_drive_ws_client\nCompiled Rust module"]
Stub -.->|describes| Impl
 
   Impl -->|imports from| Base
 
   Stub -.->|describes| Base

Type Stubs and IDE Support

The package provides complete type stubs for IDE integration and static type checking.

Type Stub Structure

The type stubs (experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219) provide:

  • Full method signatures with type annotations
  • Return type information (Optional[bytes], list[Optional[bytes]], etc.)
  • Docstrings for IDE hover documentation
  • @final decorators indicating classes cannot be subclassed

Type Checking Example
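A short sketch that mypy can check against the bundled stubs (key and connection values are illustrative):

```python
from typing import Optional

from simd_r_drive_ws_client import DataStoreWsClient

client = DataStoreWsClient("127.0.0.1", 34129)

value: Optional[bytes] = client.read(b"config")  # matches the stub's return type
# client.read("config")  # mypy error: expected bytes, got str
```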

Python Version Support:

The package targets Python 3.10-3.13 as specified in experiments/bindings/python-ws-client/pyproject.toml7 and experiments/bindings/python-ws-client/pyproject.toml:21-24

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219 experiments/bindings/python-ws-client/pyproject.toml7 experiments/bindings/python-ws-client/pyproject.toml:19-27

graph TB
    PythonMain["Python Main Thread\nSynchronous API calls"]
PyO3["PyO3 Bridge\npyo3-async-runtimes"]
TokioRT["Tokio Runtime\nAsync event loop"]
WSClient["WebSocket Client\ntokio-tungstenite"]
PythonMain -->|sync call| PyO3
 
   PyO3 -->|spawn + block_on| TokioRT
 
   TokioRT --> WSClient
 
   WSClient -.->|result| TokioRT
 
   TokioRT -.->|return| PyO3
 
   PyO3 -.->|return| PythonMain

Async Runtime Bridging

The client uses pyo3-async-runtimes to bridge Python's async/await with Rust's Tokio runtime. This allows the underlying Rust WebSocket client to use native async I/O while exposing synchronous APIs to Python.

Runtime Architecture

The pyo3-async-runtimes crate (experiments/bindings/python-ws-client/Cargo.lock:849-860) provides:

  • Runtime spawning: Manages Tokio runtime lifecycle
  • Future blocking: Converts Rust async operations to Python-blocking calls
  • Thread safety: Ensures proper synchronization between Python GIL and Rust runtime

This design allows Python code to use simple synchronous APIs while benefiting from Rust's high-performance async networking under the hood.

Sources: experiments/bindings/python-ws-client/Cargo.lock:849-860 experiments/bindings/python-ws-client/Cargo.lock:1096-1108

API Summary

Complete Method Reference

| Category | Method | Parameters | Return | Description |
|---|---|---|---|---|
| Connection | __init__ | host: str, port: int | N/A | Establish WebSocket connection |
| Write | write | key: bytes, data: bytes | None | Write single entry |
| Write | batch_write | items: list[tuple[bytes, bytes]] | None | Write multiple entries |
| Read | read | key: bytes | Optional[bytes] | Read single entry (copies) |
| Read | batch_read | keys: list[bytes] | list[Optional[bytes]] | Read multiple entries |
| Read | batch_read_structured | data: dict or list[dict] | Same structure | Read with structure preservation |
| Delete | delete | key: bytes | None | Mark key as deleted |
| Query | exists | key: bytes | bool | Check key existence |
| Query | __contains__ | key: bytes | bool | Python in operator |
| Info | __len__ | N/A | int | Active entry count |
| Info | is_empty | N/A | bool | Check if empty |
| Info | file_size | N/A | int | Physical file size |

NamespaceHasher Reference

| Method | Parameters | Return | Description |
|---|---|---|---|
| __init__ | prefix: bytes | N/A | Initialize namespace |
| namespace | key: bytes | bytes | Generate 16-byte namespaced key |

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219


Building Python Bindings

Relevant source files

Purpose and Scope

This page documents the build system and tooling required to compile the Python bindings for SIMD R Drive. It covers PyO3 integration, the Maturin build backend, installation procedures, dependency management with uv, and supported Python versions. For information about using the Python client API, see Python WebSocket Client API. For integration testing procedures, see Integration Testing.


Overview of the Build Architecture

The Python bindings are implemented in Rust using PyO3, which provides Foreign Function Interface (FFI) between Rust and Python. The build process transforms Rust code into platform-specific Python wheels using Maturin, a specialized build backend designed for Rust-Python projects.

graph TB
    subgraph "Source Code"
        RustSrc["Rust Implementation\nBaseDataStoreWsClient"]
PyO3Attrs["PyO3 Attributes\n#[pyclass], #[pymethods]"]
CargoToml["Cargo.toml\ncrate-type = cdylib\npyo3 dependency"]
end
    
    subgraph "Build Configuration"
        PyProject["pyproject.toml\nbuild-backend = maturin\nbindings = pyo3"]
Manifest["Package Metadata\nversion, requires-python\nclassifiers"]
end
    
    subgraph "Build Process"
        Maturin["maturin build\nor\nmaturin develop"]
RustCompiler["rustc + cargo\nCompile to .so/.dylib/.dll"]
FFILayer["PyO3 FFI Bridge\nAuto-generated C bindings"]
end
    
    subgraph "Output Artifacts"
        Wheel["Python Wheel\n.whl package"]
SharedLib["Native Library\nsimd_r_drive_ws_client.so"]
PyStub["Type Stubs\ndata_store_ws_client.pyi"]
end
    
 
   RustSrc --> Maturin
 
   PyO3Attrs --> Maturin
 
   CargoToml --> Maturin
 
   PyProject --> Maturin
 
   Manifest --> Maturin
    
 
   Maturin --> RustCompiler
 
   RustCompiler --> FFILayer
 
   FFILayer --> SharedLib
 
   SharedLib --> Wheel
 
   PyStub --> Wheel
    
 
   Wheel -->|pip install| UserEnv["User Python Environment"]

Build Pipeline Architecture

Sources: experiments/bindings/python-ws-client/pyproject.toml:1-46 experiments/bindings/python-ws-client/README.md:26-38


PyO3 Integration

PyO3 is the Rust library that enables writing Python native modules in Rust. It provides macros and traits for exposing Rust structs and functions to Python.

PyO3 Configuration

The PyO3 bindings type is declared in the Maturin configuration:

| Configuration File | Setting | Value |
|---|---|---|
| pyproject.toml | tool.maturin.bindings | "pyo3" |
| pyproject.toml | tool.maturin.requires-python | ">=3.10" |
| pyproject.toml | build-system.requires | ["maturin>=1.5"] |
| pyproject.toml | build-system.build-backend | "maturin" |

Sources: experiments/bindings/python-ws-client/pyproject.toml:29-35

Python Module Structure

The binding layer exposes Rust types as Python classes:
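For example, the names the package re-exports can be imported directly (a sketch based on the package's __all__ list):

```python
# Names re-exported by the package's __init__.py.
from simd_r_drive_ws_client import (
    DataStoreWsClient,
    NamespaceHasher,
    setup_logging,
    test_rust_logging,
)
```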

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-13 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:1-63 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219


Maturin Build System

Maturin is a build tool for creating Python wheels from Rust projects. It handles cross-compilation, ABI compatibility, and packaging.

Build Commands

| Command | Purpose | When to Use |
|---|---|---|
| maturin build | Compile a distributable wheel | CI/CD pipelines, distribution |
| maturin build --release | Optimized build | Performance-critical builds |
| maturin develop | Development install | Local iteration, testing |
| maturin develop --release | Fast local testing | Performance validation |

Maturin Configuration Details

Sources: experiments/bindings/python-ws-client/pyproject.toml:29-35 experiments/bindings/python-ws-client/README.md:31-38


Installation Methods

Method 1: Install from PyPI (Production)

Installing from PyPI with pip install simd-r-drive-ws-client downloads a pre-built wheel; no Rust toolchain is required.

Sources: experiments/bindings/python-ws-client/README.md:26-29

Method 2: Build from Source (Development)

Requires Rust toolchain and Maturin:

The -m flag specifies the manifest path when building from outside the package directory.

Sources: experiments/bindings/python-ws-client/README.md:31-36

Method 3: Development Installation with uv

Using uv for dependency management:

Running uv pip install -e . --group dev installs the package in editable mode together with the development dependency group.

Sources: experiments/bindings/python-ws-client/integration_test.sh:70-77

Installation Flow Comparison

Sources: experiments/bindings/python-ws-client/README.md:25-38 experiments/bindings/python-ws-client/integration_test.sh:70-77


Dependency Management with uv

The project uses uv, a fast Python package installer and resolver, for development dependencies. Unlike traditional requirements.txt or setup.py, dependencies are declared in pyproject.toml using PEP 735 dependency groups.

Dependency Group Structure

| Group | Purpose | Dependencies |
|---|---|---|
| dev | Development tools | maturin, mypy, numpy, pytest, pytest-benchmark, pytest-order, puccinialin |

Sources: experiments/bindings/python-ws-client/pyproject.toml:37-46

uv Lockfile

The uv.lock file pins exact versions and hashes for reproducible builds. This lockfile includes:

  • Direct dependencies (e.g., maturin>=1.8.7)
  • Transitive dependencies (e.g., certifi, httpcore)
  • Platform-specific wheels
  • Python version markers

Sources: experiments/bindings/python-ws-client/uv.lock:1-1036

uv Workflow in CI/Testing

The integration test script demonstrates the uv workflow:

Sources: experiments/bindings/python-ws-client/integration_test.sh:62-87


Supported Python Versions and Platforms

Python Version Matrix

The bindings support Python 3.10 through 3.13 (CPython only):

| Python Version | Support Status | Notes |
|---|---|---|
| 3.10 | ✓ Supported | Minimum version |
| 3.11 | ✓ Supported | Full support |
| 3.12 | ✓ Supported | Full support |
| 3.13 | ✓ Supported | Latest tested |
| PyPy | ✗ Not supported | CPython-only due to PyO3 limitations |

Sources: experiments/bindings/python-ws-client/pyproject.toml:7-24 experiments/bindings/python-ws-client/README.md:17-24

Platform Support

| Platform | Architecture | Status | Notes |
|---|---|---|---|
| Linux | x86_64 | ✓ Supported | Primary platform |
| Linux | aarch64 | ✓ Supported | ARM64 support |
| macOS | x86_64 | ✓ Supported | Intel Macs |
| macOS | arm64 | ✓ Supported | Apple Silicon |
| Windows | x86_64 | ✓ Supported | 64-bit only |

Sources: experiments/bindings/python-ws-client/pyproject.toml:25-27 experiments/bindings/python-ws-client/uv.lock:117-129

Version Constraints in pyproject.toml

The project metadata declares supported versions for PyPI classifiers:

Programming Language :: Python :: 3.10
Programming Language :: Python :: 3.11
Programming Language :: Python :: 3.12
Programming Language :: Python :: 3.13

Sources: experiments/bindings/python-ws-client/pyproject.toml:21-24


Build Process Deep Dive

File Transformation Pipeline

Sources: experiments/bindings/python-ws-client/pyproject.toml:1-46

Build Artifacts Location

After maturin develop, the installed package structure looks like:

site-packages/
├── simd_r_drive_ws_client/
│   ├── __init__.py                       # Python wrapper
│   ├── data_store_ws_client.py           # Extended API
│   ├── data_store_ws_client.pyi          # Type stubs
│   ├── simd_r_drive_ws_client.so         # Native library (Linux)
│   └── simd_r_drive_ws_client.cpython-*.so
└── simd_r_drive_ws_client-0.11.1.dist-info/
    ├── METADATA                           # Package metadata
    ├── RECORD                             # File integrity
    └── WHEEL                              # Wheel format version

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-13 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:1-63 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219


CI/CD Integration

The CI build recipe referenced in the README demonstrates automated wheel building. The GitHub Actions workflow typically:

  1. Sets up multiple Python versions (3.10-3.13)
  2. Installs Rust toolchain
  3. Installs Maturin
  4. Builds wheels for all platforms using maturin build --release
  5. Uploads wheels as artifacts

For full details, see the CI/CD Pipeline documentation.

Sources: experiments/bindings/python-ws-client/README.md38


Troubleshooting Common Build Issues

| Issue | Cause | Solution |
|---|---|---|
| maturin: command not found | Maturin not installed | pip install maturin or uv pip install maturin |
| rustc: command not found | Rust toolchain missing | Install from https://rustup.rs |
| error: Python version mismatch | Wrong Python version active | Ensure Python ≥3.10 in path |
| ImportError: cannot import name | Stale build | rm -rf target/ then rebuild |
| ModuleNotFoundError: No module named 'simd_r_drive_ws_client' | Not installed | Run maturin develop or pip install -e . |

Sources: experiments/bindings/python-ws-client/integration_test.sh:62-68


Integration Testing

Relevant source files

Purpose and Scope

This document describes the integration testing infrastructure for the Python WebSocket client bindings. The testing system validates end-to-end functionality by orchestrating a complete test environment: starting a SIMD R Drive WebSocket server, extracting test cases from documentation, and executing them against the live server.

For information about the Python client API itself, see Python WebSocket Client API. For build system details, see Building Python Bindings.


Integration Test Architecture

The integration testing system consists of three primary components: a shell orchestrator that manages the test lifecycle, a Python script that extracts executable examples from documentation, and the pytest framework that executes the tests.

Sources: experiments/bindings/python-ws-client/integration_test.sh:1-90 experiments/bindings/python-ws-client/extract_readme_tests.py:1-45

graph TB
    subgraph "Test Orchestration"
        Script["integration_test.sh\nBash Orchestrator"]
Cleanup["cleanup()\nTrap Handler"]
end
    
    subgraph "Server Management"
        CargoRun["cargo run\n--package simd-r-drive-ws-server"]
Server["simd-r-drive-ws-server\nProcess (PID tracked)"]
Storage["/tmp/simd-r-drive-pytest-storage.bin"]
end
    
    subgraph "Test Generation"
        ExtractScript["extract_readme_tests.py"]
ReadmeMd["README.md\nPython code blocks"]
TestFile["tests/test_readme_blocks.py\nGenerated pytest functions"]
end
    
    subgraph "Test Execution"
        UvRun["uv run pytest"]
Pytest["pytest framework"]
TestCases["test_readme_block_0()\ntest_readme_block_1()\n..."]
end
    
    subgraph "Client Under Test"
        Client["DataStoreWsClient"]
WsConnection["WebSocket Connection\n127.0.0.1:34129"]
end
    
 
   Script --> CargoRun
 
   CargoRun --> Server
 
   Server --> Storage
 
   Script --> ExtractScript
 
   ExtractScript --> ReadmeMd
 
   ReadmeMd --> TestFile
 
   Script --> UvRun
 
   UvRun --> Pytest
 
   Pytest --> TestCases
 
   TestCases --> Client
 
   Client --> WsConnection
 
   WsConnection --> Server
 
   Script --> Cleanup
 
   Cleanup -.->|kill -9 -$SERVER_PID| Server
 
   Cleanup -.->|rm -f| Storage

Test Workflow Orchestration

The integration_test.sh script manages the complete test lifecycle through a series of coordinated steps. The script implements robust cleanup handling using shell traps to ensure resources are released even if tests fail.

sequenceDiagram
    participant Script as integration_test.sh
    participant Trap as cleanup() trap
    participant Cargo as cargo run
    participant Server as simd-r-drive-ws-server
    participant UV as uv (Python)
    participant Extract as extract_readme_tests.py
    participant Pytest as pytest
    
    Script->>Script: set -e (fail on error)
    Script->>Trap: trap cleanup EXIT
    Script->>Script: cd experiments/
    Script->>Cargo: cargo run --package simd-r-drive-ws-server
    Cargo->>Server: start background process
    Server-->>Script: return PID
    Script->>Script: SERVER_PID=$!
    Script->>UV: uv venv
    Script->>UV: uv pip install pytest maturin
    Script->>UV: uv pip install -e . --group dev
    Script->>Extract: uv run extract_readme_tests.py
    Extract-->>Script: generate test_readme_blocks.py
    Script->>Script: export TEST_SERVER_HOST/PORT
    Script->>Pytest: uv run pytest -v -s
    Pytest->>Server: test connections
    Server-->>Pytest: responses
    Pytest-->>Script: test results
    Script->>Trap: EXIT signal
    Trap->>Server: kill -9 -$SERVER_PID
    Trap->>Script: rm -f /tmp/simd-r-drive-pytest-storage.bin

Workflow Sequence

Sources: experiments/bindings/python-ws-client/integration_test.sh:1-90

Configuration Variables

The script defines several configuration variables at the top for easy modification:

| Variable | Default Value | Purpose |
|---|---|---|
| EXPERIMENTS_DIR_REL_PATH | ../../ | Relative path to experiments root |
| SERVER_PACKAGE_NAME | simd-r-drive-ws-server | Cargo package name |
| STORAGE_FILE | /tmp/simd-r-drive-pytest-storage.bin | Temporary storage file path |
| SERVER_HOST | 127.0.0.1 | Server bind address |
| SERVER_PORT | 34129 | Server listen port |
| SERVER_PID | (runtime) | Process ID for cleanup |

Sources: experiments/bindings/python-ws-client/integration_test.sh:8-15

Cleanup Mechanism

The cleanup function ensures proper resource release regardless of test outcome. It uses process group termination to kill the server and all child processes:

graph LR
    Exit["Script Exit\n(success or failure)"]
Trap["trap cleanup EXIT"]
Cleanup["cleanup()
function"]
KillServer["kill -9 -$SERVER_PID"]
RemoveFile["rm -f $STORAGE_FILE"]
Exit --> Trap
 
   Trap --> Cleanup
 
   Cleanup --> KillServer
 
   Cleanup --> RemoveFile

The kill -9 "-$SERVER_PID" syntax terminates the entire process group (note the minus sign prefix), ensuring the server and any spawned threads are fully stopped. The 2>/dev/null || true pattern prevents errors if the process already exited.

Sources: experiments/bindings/python-ws-client/integration_test.sh:17-30


graph TB
    ReadmeMd["README.md"]
ExtractScript["extract_readme_tests.py"]
subgraph "Extraction Functions"
        ExtractBlocks["extract_python_blocks()\nregex: ```python...```"]
StripNonAscii["strip_non_ascii()\nRemove non-ASCII chars"]
WrapTest["wrap_as_test_fn()\nGenerate pytest function"]
end
    
    subgraph "Generated Output"
        TestFile["tests/test_readme_blocks.py"]
TestFn0["def test_readme_block_0():\n # extracted code"]
TestFn1["def test_readme_block_1():\n # extracted code"]
TestFnN["def test_readme_block_N():\n # extracted code"]
end
    
 
   ReadmeMd --> ExtractScript
 
   ExtractScript --> ExtractBlocks
 
   ExtractBlocks --> StripNonAscii
 
   StripNonAscii --> WrapTest
 
   WrapTest --> TestFile
 
   TestFile --> TestFn0
 
   TestFile --> TestFn1
 
   TestFile --> TestFnN

README Test Extraction

The extract_readme_tests.py script automates the conversion of documentation examples into executable test cases. This approach ensures that code examples in the README remain accurate and functional.

Extraction Process

Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:1-45

Extraction Implementation

The extraction process follows three steps:

  1. Pattern Matching: Uses regex to find all ```python fenced code blocks

    • Pattern: r"```python\n(.*?)```" with re.DOTALL flag
    • Returns list of code block strings
  2. Text Sanitization: Removes non-ASCII characters to prevent encoding issues

    • Uses encode("ascii", errors="ignore").decode("ascii")
    • Ensures clean Python code for pytest execution
  3. Test Function Generation: Wraps each code block in a pytest-compatible function

    • Function naming: test_readme_block_{idx}()
    • Indentation: 4 spaces for function body
    • Empty blocks generate a pass statement

Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:21-34

Example Transformation

Given a README code block:
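For illustration, suppose the README contained a block like this (hypothetical contents):

```python
from simd_r_drive_ws_client import DataStoreWsClient

client = DataStoreWsClient("127.0.0.1", 34129)
client.write(b"greeting", b"hello")
assert client.read(b"greeting") == b"hello"
```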

The script generates:
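A sketch of the corresponding output, following the naming and indentation rules described above:

```python
# tests/test_readme_blocks.py (auto-generated)
def test_readme_block_0():
    from simd_r_drive_ws_client import DataStoreWsClient

    client = DataStoreWsClient("127.0.0.1", 34129)
    client.write(b"greeting", b"hello")
    assert client.read(b"greeting") == b"hello"
```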

Sources: experiments/bindings/python-ws-client/README.md:42-51 experiments/bindings/python-ws-client/extract_readme_tests.py:30-34


Test Environment Setup

The integration test uses uv as the Python package manager for fast, reliable dependency management. The environment setup occurs after server startup but before test execution.

Environment Setup Sequence

| Step | Command | Purpose |
|---|---|---|
| 1 | uv venv | Create isolated virtual environment |
| 2 | uv pip install pytest maturin | Install core test dependencies |
| 3 | uv pip install -e . --group dev | Install package in editable mode with dev dependencies |
| 4 | uv run extract_readme_tests.py | Generate test cases from README |
| 5 | export TEST_SERVER_HOST/PORT | Set environment variables for tests |
| 6 | uv run pytest -v -s | Execute test suite |

Sources: experiments/bindings/python-ws-client/integration_test.sh:62-87

Development Dependencies

The project defines development dependencies in pyproject.toml under the [dependency-groups] section:

| Dependency | Version Constraint | Purpose |
|---|---|---|
| maturin | >=1.8.7 | Build Rust-Python bindings |
| mypy | >=1.16.1 | Static type checking |
| numpy | >=2.2.6 | Array operations (test fixtures) |
| puccinialin | >=0.1.5 | Test utilities |
| pytest | >=8.4.1 | Test framework |
| pytest-benchmark | >=5.1.0 | Performance benchmarking |
| pytest-order | >=1.3.0 | Test execution ordering |

Sources: experiments/bindings/python-ws-client/pyproject.toml:37-46


graph LR
    Script["integration_test.sh"]
SetM["set -m\n(enable job control)"]
CargoRun["cargo run --package simd-r-drive-ws-server\n-- /tmp/storage.bin --host 127.0.0.1 --port 34129 &"]
CapturePid["SERVER_PID=$!"]
UnsetM["set +m\n(disable job control)"]
Script --> SetM
 
   SetM --> CargoRun
 
   CargoRun --> CapturePid
 
   CapturePid --> UnsetM

Server Lifecycle Management

The integration test script manages the server as a background process with careful PID tracking and cleanup handling.

Server Startup

The server starts with the following configuration:

  • Package: simd-r-drive-ws-server (built via cargo)
  • Storage File: /tmp/simd-r-drive-pytest-storage.bin
  • Host: 127.0.0.1 (localhost only)
  • Port: 34129 (test-specific port)
  • Mode: background process (& suffix)

Temporarily enabling job control with set -m before the command and disabling it with set +m afterwards ensures the shell correctly captures the background process PID via $!.

Sources: experiments/bindings/python-ws-client/integration_test.sh:47-56

Server Shutdown

The cleanup trap ensures the server terminates cleanly:

  1. Check PID Exists: [[ ! -z "$SERVER_PID" ]]
  2. Kill Process Group: kill -9 "-$SERVER_PID"
    • The - prefix kills the entire process group
    • Signal -9 (SIGKILL) ensures immediate termination
  3. Ignore Errors: 2>/dev/null || true
    • Prevents script failure if process already exited
  4. Remove Storage: rm -f "$STORAGE_FILE"
    • Cleans up temporary test data

Sources: experiments/bindings/python-ws-client/integration_test.sh:19-29


Test Execution Environment

The test suite runs with specific environment variables that configure the client connection parameters. These variables bridge the shell orchestration layer with the Python test code.

Environment Variable Injection

The script exports server configuration to the test environment:

Python tests can access these via os.environ:
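A minimal sketch; the variable names match those exported by the script, and the fallback values mirror its defaults:

```python
import os

host = os.environ.get("TEST_SERVER_HOST", "127.0.0.1")
port = int(os.environ.get("TEST_SERVER_PORT", "34129"))
```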

Sources: experiments/bindings/python-ws-client/integration_test.sh:83-85

Pytest Execution Flags

The test suite runs with specific pytest flags:

  • -v: Verbose output (shows individual test names)
  • -s: No output capture (displays print statements immediately)

This configuration aids debugging by providing real-time test output.

Sources: experiments/bindings/python-ws-client/integration_test.sh87


graph TB
    Main["main()
entry point"]
ReadFile["README.read_text()"]
Extract["extract_python_blocks()"]
Generate["List comprehension:\n[wrap_as_test_fn(code, i)\nfor i, code in enumerate(blocks)]"]
WriteFile["TEST_FILE.write_text()"]
Main --> ReadFile
 
   ReadFile --> Extract
 
   Extract --> Generate
 
   Generate --> WriteFile
    
    subgraph "File Structure"
        Header["# Auto-generated from README.md\nimport pytest"]
TestFunctions["test_readme_block_0()\ntest_readme_block_1()\n..."]
end
    
 
   WriteFile --> Header
 
   WriteFile --> TestFunctions

Test Case Structure

Generated test cases follow a consistent structure based on the README extraction pattern. Each test becomes an isolated function within the generated test file.

Test File Generation

Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:36-42

Test Isolation

Each extracted code block becomes an independent test function:

  • Naming Convention: test_readme_block_{idx}() where idx is zero-based
  • Isolation: Each function has its own scope
  • Independence: Tests can run in any order (no shared state)
  • Clarity: Test number maps directly to README code block order

This pattern enables pytest to discover and execute each documentation example as a separate test case, providing clear pass/fail feedback for each code snippet.

Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:30-34


Dependencies and Platform Requirements

The integration testing system requires specific tools and versions to function correctly.

Required Tools

| Tool | Purpose | Detection Method |
|---|---|---|
| uv | Python package management | command -v uv &> /dev/null |
| cargo | Rust build system | Implicit (via cargo run) |
| bash | Shell scripting | Shebang: #!/bin/bash |
| pytest | Test execution | Installed via uv pip install |

Sources: experiments/bindings/python-ws-client/integration_test.sh:62-68 experiments/bindings/python-ws-client/integration_test.sh1

Python Version Support

The client supports Python 3.10 through 3.13 (CPython only):

| Python Version | Support Status | Notes |
|---|---|---|
| 3.10 | ✓ Supported | Minimum version |
| 3.11 | ✓ Supported | Full compatibility |
| 3.12 | ✓ Supported | Full compatibility |
| 3.13 | ✓ Supported | Latest tested |

Note: NumPy version constraints differ by Python version. The project uses numpy>=2.2.6 for Python 3.10 compatibility, as NumPy 2.3.0+ dropped Python 3.10 support.

Sources: experiments/bindings/python-ws-client/pyproject.toml7 experiments/bindings/python-ws-client/pyproject.toml:21-24 experiments/bindings/python-ws-client/pyproject.toml41

Operating System Support

The integration tests target POSIX-compliant systems:

  • Linux: Primary development platform (Ubuntu, Debian, etc.)
  • macOS: Full support (Darwin)
  • Windows: Supported but uses different process management

The shell script relies on Bash syntax ([[ ]], $!, trap), so it runs on these platforms wherever a Bash-compatible shell is available.

Sources: experiments/bindings/python-ws-client/integration_test.sh1 experiments/bindings/python-ws-client/pyproject.toml:25-26


Running the Integration Tests

Execution Command

From the repository root:

Or from the Python client directory:

Expected Output Sequence

  1. Script initialization and configuration
  2. Server compilation (if needed) and startup
  3. Python environment creation
  4. Dependency installation
  5. README test extraction
  6. Pytest test execution with detailed output
  7. Automatic cleanup on exit

Troubleshooting

Common failure scenarios:

| Issue | Symptom | Solution |
|---|---|---|
| uv not found | Script exits at line 66 | Install uv: curl -LsSf https://astral.sh/uv/install.sh \| sh |
| Port already in use | Server fails to start | Change SERVER_PORT variable or kill existing process |
| Storage file locked | Permission denied | Ensure /tmp is writable and no processes hold locks |
| Cargo build fails | Compilation errors | Run cargo clean and retry |

Sources: experiments/bindings/python-ws-client/integration_test.sh:62-68


Performance Optimizations

Relevant source files

Purpose and Scope

This document provides a comprehensive guide to the performance optimization strategies employed throughout SIMD R Drive. It covers three primary optimization domains: SIMD-accelerated operations, memory alignment strategies, and access pattern optimizations.

For architectural details about the storage engine itself, see Core Storage Engine. For information about network-level optimizations in the RPC layer, see Network Layer and RPC. For details on the extension utilities, see Extensions and Utilities.


Overview of Optimization Strategies

SIMD R Drive achieves high performance through a multi-layered approach combining hardware-level optimizations with careful software design. The system leverages CPU-specific SIMD instructions, enforces strict memory alignment, and provides multiple access modes tailored to different workload patterns.

Core Performance Principles

graph TB
    subgraph "Hardware Level"
        SIMD["SIMD Instructions\nAVX2/NEON"]
Cache["CPU Cache Lines\n64-byte aligned"]
HWHash["Hardware Hashing\nSSE2/AVX2/Neon"]
end
    
    subgraph "Software Level"
        SIMDCopy["simd_copy\nWrite Optimization"]
Alignment["PAYLOAD_ALIGNMENT\nZero-Copy Reads"]
AccessModes["Access Modes\nSingle/Batch/Stream"]
end
    
    subgraph "Storage Layer"
        MMap["Memory-Mapped File\nmmap"]
Sequential["Sequential Writes\nAppend-Only"]
Index["XXH3 Index\nO(1)
Lookups"]
end
    
 
   SIMD --> SIMDCopy
 
   Cache --> Alignment
 
   HWHash --> Index
    
 
   SIMDCopy --> Sequential
 
   Alignment --> MMap
 
   AccessModes --> Sequential
 
   AccessModes --> MMap
    
 
   Sequential --> Index

Sources: README.md:249-257 README.md:52-59


Performance Metrics Summary

| Optimization Area | Technique | Benefit | Trade-off |
|---|---|---|---|
| Write Path | simd_copy with AVX2/NEON | 2-4x faster bulk copies | Requires feature detection |
| Read Path | Memory-mapped zero-copy | No allocation overhead | File size impacts virtual memory |
| Alignment | 64-byte payload boundaries | Full SIMD/cache efficiency | 0-63 bytes padding per entry |
| Indexing | XXH3_64 with hardware acceleration | Sub-microsecond key lookups | Memory overhead for index |
| Batch Writes | Single lock, multiple entries | Reduced lock contention | Delayed durability guarantees |
| Parallel Iteration | Rayon multi-threaded scan | Linear speedup with cores | Higher CPU utilization |

Sources: README.md:249-257 README.md:159-167


SIMD Write Acceleration

The simd_copy Function

SIMD R Drive uses a specialized memory copy function called simd_copy to optimize data transfer during write operations. This function detects available CPU features at runtime and selects the most efficient implementation.

Platform-Specific Implementations

graph TB
    WriteOp["Write Operation"]
SIMDCopy["simd_copy\nRuntime Dispatch"]
X86Check{"x86_64 with\nAVX2?"}
ARMCheck{"aarch64?"}
AVX2["AVX2 Implementation\n_mm256_loadu_si256\n_mm256_storeu_si256\n32-byte chunks"]
NEON["NEON Implementation\nvld1q_u8\nvst1q_u8\n16-byte chunks"]
Fallback["Scalar Fallback\ncopy_from_slice\nStandard library"]
WriteOp --> SIMDCopy
 
   SIMDCopy --> X86Check
 
   X86Check -->|Yes| AVX2
 
   X86Check -->|No| ARMCheck
 
   ARMCheck -->|Yes| NEON
 
   ARMCheck -->|No| Fallback

Sources: README.md:249-257 High-Level Diagram 6

Implementation Architecture

The simd_copy function follows this decision tree:

  1. x86_64 with AVX2: Uses 256-bit wide vector operations via _mm256_loadu_si256 and _mm256_storeu_si256, processing 32 bytes per instruction
  2. aarch64: Uses NEON intrinsics via vld1q_u8 and vst1q_u8, processing 16 bytes per instruction
  3. Fallback: Uses standard library copy_from_slice for platforms without SIMD support

SIMD Copy Performance Characteristics

| Platform | Instruction Set | Chunk Size | Throughput Gain | Detection Method |
|---|---|---|---|---|
| x86_64 | AVX2 | 32 bytes | ~3-4x vs scalar | Runtime feature detection |
| x86_64 | SSE2 | 16 bytes | ~2x vs scalar | Universal on x86_64 |
| aarch64 | NEON | 16 bytes | ~2-3x vs scalar | Default on ARM64 |
| Other | Scalar | 1-8 bytes | Baseline | Always available |

Sources: README.md:249-257 High-Level Diagram 6

Integration with Write Path

Sources: README.md:249-257 High-Level Diagram 5

Runtime Feature Detection

The system performs CPU feature detection once at startup:

  • x86_64: Uses the is_x86_feature_detected!("avx2") macro to check AVX2 availability
  • aarch64: NEON is always available, no detection needed
  • Other platforms: Immediately fall back to scalar implementation

This detection overhead is paid once, with subsequent calls dispatching directly to the selected implementation via function pointers or conditional compilation.

Sources: README.md:249-257 High-Level Diagram 6


Payload Alignment and Cache Efficiency

64-Byte Alignment Strategy

Starting from version 0.15.0-alpha, all non-tombstone payloads begin on 64-byte aligned boundaries. This alignment matches typical CPU cache line sizes and ensures optimal SIMD operation performance.

Alignment Architecture

graph TB
    subgraph "Constants Configuration"
        AlignLog["PAYLOAD_ALIGN_LOG2 = 6"]
AlignConst["PAYLOAD_ALIGNMENT = 64"]
end
    
    subgraph "On-Disk Layout"
        PrevTail["Previous Entry Tail\noffset N"]
PrePad["Pre-Pad Bytes\n0-63 zero bytes"]
Payload["Aligned Payload\n64-byte boundary"]
Metadata["Entry Metadata\n20 bytes"]
end
    
    subgraph "Hardware Alignment"
        CacheLine["CPU Cache Line\n64 bytes typical"]
AVX512["AVX-512 Registers\n64 bytes wide"]
SIMDOps["SIMD Operations\nNo boundary crossing"]
end
    
 
   AlignLog --> AlignConst
 
   AlignConst --> PrePad
    
 
   PrevTail --> PrePad
 
   PrePad --> Payload
 
   Payload --> Metadata
    
 
   Payload --> CacheLine
 
   Payload --> AVX512
 
   Payload --> SIMDOps

Sources: README.md:52-59 CHANGELOG.md:25-51 simd-r-drive-entry-handle/src/debug_assert_aligned.rs:66-88

Alignment Calculation

The pre-padding for each entry is calculated using:

pad = (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)

Where:

  • prev_tail: The file offset immediately after the previous entry's metadata
  • PAYLOAD_ALIGNMENT: 64 (configurable constant)
  • &: Bitwise AND operation ensuring the result is in range [0, 63]
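The same arithmetic, expressed as a small Python sketch for illustration:

```python
PAYLOAD_ALIGNMENT = 64  # matches the engine's PAYLOAD_ALIGNMENT constant

def pre_pad(prev_tail: int) -> int:
    """Zero bytes inserted so the next payload starts on a 64-byte boundary."""
    return (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)

assert pre_pad(100) == 28    # next payload starts at 128
assert pre_pad(128) == 0     # already aligned
assert pre_pad(200) == 56    # next payload starts at 256
assert pre_pad(1000) == 24   # next payload starts at 1024
```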

Example Alignment Scenarios

| Previous Tail Offset | Modulo 64 | Padding Required | Next Payload Starts At |
|---|---|---|---|
| 100 | 36 | 28 bytes | 128 |
| 128 | 0 | 0 bytes | 128 |
| 200 | 8 | 56 bytes | 256 |
| 1000 | 40 | 24 bytes | 1024 |

Sources: README.md:114-138

graph LR
    EntryHandle["EntryHandle"]
DebugAssert["debug_assert_aligned\nptr, align"]
CheckPower["Check align is\npower of two"]
CheckAddr["Check ptr & align-1 == 0"]
OK["Continue"]
Panic["Debug Panic"]
EntryHandle --> DebugAssert
 
   DebugAssert --> CheckPower
 
   CheckPower --> CheckAddr
 
   CheckAddr -->|Aligned| OK
 
   CheckAddr -->|Misaligned| Panic

Alignment Validation

Debug and test builds include alignment assertions to catch violations early:

Pointer Alignment Assertion

Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:25-43

Offset Alignment Assertion

Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:66-88

These assertions compile to no-ops in release builds, providing zero-cost validation during development.

Evolution from v0.14 to v0.15

Alignment Comparison

| Version | Default Alignment | Cache Line Fit | AVX-512 Compatible | Padding Overhead |
|---|---|---|---|---|
| ≤ 0.13.x | No alignment | No guarantee | No | 0 bytes |
| 0.14.x | 16 bytes | Partial | No | 0-15 bytes |
| 0.15.x | 64 bytes | Complete | Yes | 0-63 bytes |

Key Changes in v0.15.0-alpha:

  1. Increased alignment from 16 to 64 bytes: Better cache efficiency and full AVX-512 support
  2. Added alignment validation: Debug assertions in debug_assert_aligned.rs
  3. Updated constants: PAYLOAD_ALIGN_LOG2 = 6 and PAYLOAD_ALIGNMENT = 64
  4. Breaking compatibility: Files written by v0.15.x cannot be read by v0.14.x or earlier

Sources: CHANGELOG.md:25-51

Migration Considerations

Migration Path from v0.14 to v0.15:

  1. Read all entries using v0.14.x-compatible binary
  2. Create new storage with v0.15.x binary
  3. Write entries to new storage with automatic 64-byte alignment
  4. Verify integrity using read operations
  5. Replace old file after successful verification

Multi-Service Deployment Strategy:

graph TB
    OldReader["v0.14.x Readers"]
NewReader["v0.15.x Readers"]
OldWriter["v0.14.x Writers"]
NewWriter["v0.15.x Writers"]
OldStore["v0.14 Storage\n16-byte aligned"]
NewStore["v0.15 Storage\n64-byte aligned"]
Step1["Step 1: Deploy v0.15 readers\nwhile writers still v0.14"]
Step2["Step 2: Migrate storage files\nto v0.15 format"]
Step3["Step 3: Deploy v0.15 writers"]
OldReader --> Step1
 
   Step1 --> NewReader
    
 
   OldStore --> Step2
 
   Step2 --> NewStore
    
 
   OldWriter --> Step3
 
   Step3 --> NewWriter
    
 
   NewReader -.->|Can read| OldStore
 
   NewReader -.->|Can read| NewStore
 
   NewWriter -.->|Writes to| NewStore

Sources: CHANGELOG.md:43-51

Zero-Copy Benefits of Alignment

Proper alignment enables zero-copy reinterpretation of payload bytes as typed slices:

Safe Reinterpretation Table

| Payload Size | Aligned To | Can Cast To | Zero-Copy | Notes |
|---|---|---|---|---|
| 64 bytes | 64 bytes | &[u8; 64] | ✅ Yes | Direct array reference |
| 128 bytes | 64 bytes | &[u16; 64] | ✅ Yes | u16 requires 2-byte alignment |
| 256 bytes | 64 bytes | &[u32; 64] | ✅ Yes | u32 requires 4-byte alignment |
| 512 bytes | 64 bytes | &[u64; 64] | ✅ Yes | u64 requires 8-byte alignment |
| 1024 bytes | 64 bytes | &[u128; 64] | ✅ Yes | u128 requires 16-byte alignment |
| Variable | 64 bytes | Custom structs | ⚠️ Maybe | Depends on struct alignment requirements |

Sources: README.md:52-59


graph TB
    subgraph "Single Entry Write"
        SingleAPI["write key, payload"]
SingleLock["Acquire write lock"]
SingleWrite["Write one entry"]
SingleFlush["flush"]
SingleUnlock["Release lock"]
end
    
    subgraph "Batch Entry Write"
        BatchAPI["batch_write entries"]
BatchLock["Acquire write lock"]
BatchLoop["Write N entries\nin locked section"]
BatchFlush["flush once"]
BatchUnlock["Release lock"]
end
    
    subgraph "Streaming Write"
        StreamAPI["write_stream key, reader"]
StreamLock["Acquire write lock"]
StreamChunk["Read + write chunks\nincrementally"]
StreamFlush["flush"]
StreamUnlock["Release lock"]
end
    
 
   SingleAPI --> SingleLock
 
   SingleLock --> SingleWrite
 
   SingleWrite --> SingleFlush
 
   SingleFlush --> SingleUnlock
    
 
   BatchAPI --> BatchLock
 
   BatchLock --> BatchLoop
 
   BatchLoop --> BatchFlush
 
   BatchFlush --> BatchUnlock
    
 
   StreamAPI --> StreamLock
 
   StreamLock --> StreamChunk
 
   StreamChunk --> StreamFlush
 
   StreamFlush --> StreamUnlock

Write and Read Mode Performance

Write Mode Comparison

SIMD R Drive provides three write modes optimized for different workload patterns:

Write Mode Architecture

Sources: README.md:208-223

Write Mode Performance Characteristics

Write ModeLock AcquisitionsFlush OperationsMemory UsageBest For
Single EntryN (one per entry)N (one per entry)O(1) per entryLow-latency individual writes
Batch Entry1 (for all entries)1 (at batch end)O(batch size)High-throughput bulk inserts
Streaming1 (entire stream)1 (at stream end)O(chunk size)Large payloads, memory-constrained

Lock Contention Analysis:

Sources: README.md:208-223

graph TB
    subgraph "Direct Memory Access"
        DirectAPI["read key"]
DirectIndex["Index lookup O(1)"]
DirectMmap["Memory-mapped view"]
DirectHandle["EntryHandle\nzero-copy slice"]
end
    
    subgraph "Streaming Read"
        StreamAPI["read_stream key"]
StreamIndex["Index lookup O(1)"]
StreamBuffer["Buffered reader\n4KB chunks"]
StreamInc["Incremental processing"]
end
    
    subgraph "Parallel Iteration"
        ParAPI["par_iter_entries"]
ParScan["Full file scan"]
ParRayon["Rayon thread pool"]
ParProcess["Multi-core processing"]
end
    
 
   DirectAPI --> DirectIndex
 
   DirectIndex --> DirectMmap
 
   DirectMmap --> DirectHandle
    
 
   StreamAPI --> StreamIndex
 
   StreamIndex --> StreamBuffer
 
   StreamBuffer --> StreamInc
    
 
   ParAPI --> ParScan
 
   ParScan --> ParRayon
 
   ParRayon --> ParProcess

Read Mode Comparison

Three read modes provide different trade-offs between performance, memory usage, and access patterns:

Read Mode Architecture

Sources: README.md:224-247

Read Mode Performance Characteristics

Read ModeMemory OverheadLatencyThroughputCopy OperationsBest For
Direct AccessO(1) mapping overhead~100nsVery highZeroRandom access, small reads
StreamingO(buffer size)~1-10µsHighNon-zero-copyLarge entries, memory-constrained
Parallel IterationO(thread count)N/AScales with coresZero per entryBulk processing, analytics

Memory Access Patterns:

Sources: README.md:224-247

Parallel Iteration Performance

The parallel iteration feature (requires parallel feature flag) leverages Rayon for multi-threaded scanning:

Parallel Iteration Scaling

Core CountSequential TimeParallel TimeSpeedupEfficiency
1 core1.0s1.0s1.0x100%
2 cores1.0s0.52s1.92x96%
4 cores1.0s0.27s3.70x93%
8 cores1.0s0.15s6.67x83%
16 cores1.0s0.09s11.1x69%

Note: Actual performance depends on entry size distribution, storage device speed, and working set size relative to CPU cache.

Sources: README.md:242-247


graph TB
    subgraph "x86_64 Hashing"
        X86Key["Key bytes"]
X86SSE2["SSE2 Instructions\nUniversal on x86_64"]
X86AVX2["AVX2 Instructions\nIf available"]
X86Hash["64-bit hash"]
end
    
    subgraph "aarch64 Hashing"
        ARMKey["Key bytes"]
ARMNeon["Neon Instructions\nDefault on ARM64"]
ARMHash["64-bit hash"]
end
    
    subgraph "Index Operations"
        HashVal["u64 hash value"]
HashMap["DashMap index"]
Lookup["O(1)
lookup"]
end
    
 
   X86Key --> X86SSE2
 
   X86SSE2 -.->|If available| X86AVX2
 
   X86AVX2 --> X86Hash
 
   X86SSE2 --> X86Hash
    
 
   ARMKey --> ARMNeon
 
   ARMNeon --> ARMHash
    
 
   X86Hash --> HashVal
 
   ARMHash --> HashVal
 
   HashVal --> HashMap
 
   HashMap --> Lookup

Hardware-Accelerated Hashing

XXH3_64 Hashing Algorithm

SIMD R Drive uses the xxhash-rust implementation of XXH3_64 for all key hashing operations. This algorithm provides hardware acceleration on multiple platforms.

Hashing Performance by Platform

Sources: README.md:159-167 Cargo.lock:1385-1404

Hashing Throughput Benchmarks

PlatformSIMD FeaturesThroughput (GB/s)Latency (ns/hash)Relative Performance
x86_64AVX2~25-35~30-501.4-1.8x vs SSE2
x86_64SSE2 only~15-20~50-80Baseline
aarch64Neon~20-28~40-601.2-1.5x vs scalar
OtherScalar~8-12~100-150Reference

Note: Benchmarks are approximate and vary based on CPU generation, key size, and system load.

Sources: README.md:159-167

sequenceDiagram
    participant App as "Application"
    participant DS as "DataStore"
    participant Hash as "XXH3_64"
    participant Index as "DashMap\nkey_indexer"
    participant MMap as "Mmap\nmemory-mapped file"
    
    App->>DS: read(key)
    DS->>Hash: hash(key)
    Hash-->>DS: u64 hash
    DS->>Index: get(hash)
    
    alt "Hash found"
        Index-->>DS: (tag, offset)
        DS->>DS: Verify tag
        DS->>MMap: Create EntryHandle\nat offset
        MMap-->>DS: EntryHandle
        DS-->>App: Payload slice
    else "Hash not found"
        Index-->>DS: None
        DS-->>App: Error: not found
    end

Index Structure Performance

The key index uses DashMap (a concurrent hash map) for thread-safe lookups:

Index Lookup Flow

Sources: README.md:159-167 High-Level Diagram 2

Index Memory Overhead

Entry CountIndex SizeOverhead per EntryTotal Overhead (1M entries)
1,000~32 KB~32 bytes~32 MB
10,000~320 KB~32 bytes~32 MB
100,000~3.2 MB~32 bytes~32 MB
1,000,000~32 MB~32 bytes~32 MB

The index overhead is proportional to the number of unique keys, not the total file size, making it memory-efficient even for large datasets.

Sources: README.md:159-167


Performance Tuning Guidelines

Choosing the Right Write Mode

Decision Matrix:

Sources: README.md:208-223

Choosing the Right Read Mode

Decision Matrix:

Sources: README.md:224-247

System-Level Optimizations

Operating System Tuning:

ParameterRecommended ValueImpactNotes
vm.max_map_count≥ 262144Enables large mmap regionsLinux only
File systemext4, XFS, APFSBetter mmap performanceAvoid network filesystems
Huge pagesTransparent enabledReduces TLB missesLinux kernel config
I/O schedulernone or mq-deadlineLower write latencyFor SSD/NVMe devices

Hardware Considerations:

ComponentRecommendationReason
CPUAVX2 or ARM NeonSIMD acceleration
RAM≥ 8GB + working setEffective mmap caching
StorageNVMe SSDSequential write speed
CacheL3 ≥ 8MBBetter index hit rate

Sources: README.md:249-257 README.md:159-167


Benchmarking and Profiling

Key Performance Indicators

Write Performance Metrics:

MetricTargetMeasurement Method
Single write latency< 10 µsP50, P95, P99 percentiles
Batch write throughput> 100 MB/sBytes written per second
Lock contention ratio< 5%Time waiting / total time
Flush overhead< 1 msfsync() call duration

Read Performance Metrics:

MetricTargetMeasurement Method
Index lookup latency< 100 nsP50, P95, P99 percentiles
Memory access latency< 1 µsTime to first byte
Cache hit rate> 95%OS page cache statistics
Parallel speedup> 0.8 × coresAmdahl's law efficiency

Sources: README.md:159-167 .github/workflows/rust-lint.yml:1-44

Profiling Tools

Recommended profiling workflow:

  1. cargo-criterion : For micro-benchmarks of individual operations
  2. perf (Linux) or Instruments (macOS): For system-level profiling
  3. flamegraph : For visualizing hot code paths
  4. valgrind/cachegrind : For cache miss analysis

Sources: Cargo.lock:607-627


Summary

SIMD R Drive achieves high performance through:

  1. SIMD acceleration via simd_copy with AVX2/NEON implementations for write operations
  2. 64-byte payload alignment ensuring cache-line efficiency and zero-copy access
  3. Hardware-accelerated XXH3_64 hashing for O(1) index lookups
  4. Multiple access modes optimized for different workload patterns
  5. Memory-mapped zero-copy reads eliminating deserialization overhead

The system balances these optimizations to provide consistent, predictable performance across diverse workloads while maintaining data integrity and thread safety.

Sources: README.md:249-257 README.md:52-59 README.md:159-167 CHANGELOG.md:25-51


SIMD Acceleration

Relevant source files

Purpose and Scope

This page documents the SIMD (Single Instruction, Multiple Data) optimizations implemented in SIMD R Drive. The storage engine uses SIMD acceleration in two critical areas: write operations (via the simd_copy function) and key hashing (via the xxh3_64 algorithm).

SIMD is not used for read operations, which rely on zero-copy memory-mapped access for optimal performance. For information about zero-copy reads and memory management, see Memory Management and Zero-Copy Access. For details on the 64-byte alignment strategy that enables efficient SIMD operations, see Payload Alignment and Cache Efficiency.

Sources: README.md:248-256


Overview

SIMD R Drive leverages hardware SIMD capabilities to accelerate performance-critical operations during write workflows. The storage engine uses platform-specific SIMD instructions to minimize CPU cycles spent on memory operations, resulting in improved write throughput and indexing performance.

The SIMD acceleration strategy consists of two components:

ComponentPurposePlatformsPerformance Impact
simd_copy FunctionAccelerates memory copying during write buffer stagingx86_64 (AVX2), aarch64 (NEON)Reduces write latency by vectorizing memory transfers
xxh3_64 HashingHardware-accelerated key hashing for index operationsx86_64 (SSE2/AVX2), aarch64 (Neon)Improves key lookup and index building performance

The system uses runtime CPU feature detection to select the optimal implementation path and automatically falls back to scalar operations when SIMD instructions are unavailable.

Sources: README.md:248-256 High-Level Diagram 6


SIMD Copy Function

Purpose and Role in Write Operations

The simd_copy function is a specialized memory transfer routine used during write operations. When the storage engine writes data to disk, it first stages the payload in an internal buffer. The simd_copy function accelerates this staging process by using vectorized memory operations instead of standard byte-wise copying.

Diagram: SIMD Copy Function Flow in Write Operations

graph TB
    WriteAPI["DataStoreWriter::write()"]
Buffer["Internal Write Buffer\n(BufWriter&lt;File&gt;)"]
SIMDCopy["simd_copy Function"]
CPUDetect["CPU Feature Detection\n(Runtime)"]
AVX2Path["AVX2 Implementation\n_mm256_loadu_si256\n_mm256_storeu_si256\n(32-byte chunks)"]
NEONPath["NEON Implementation\nvld1q_u8\nvst1q_u8\n(16-byte chunks)"]
ScalarPath["Scalar Fallback\ncopy_from_slice\n(Standard memcpy)"]
DiskWrite["Write to Disk\n(Buffered I/O)"]
WriteAPI -->|user payload| SIMDCopy
 
   SIMDCopy --> CPUDetect
 
   CPUDetect -->|x86_64 + AVX2| AVX2Path
 
   CPUDetect -->|aarch64| NEONPath
 
   CPUDetect -->|no SIMD| ScalarPath
    
 
   AVX2Path --> Buffer
 
   NEONPath --> Buffer
 
   ScalarPath --> Buffer
 
   Buffer --> DiskWrite

The simd_copy function operates transparently within the write path. Applications using the DataStoreWriter trait do not need to explicitly invoke SIMD operations—the acceleration is automatic and selected based on the host CPU capabilities.

Sources: README.md:252-253 High-Level Diagram 6


Platform-Specific Implementations

x86_64 Architecture: AVX2 Instructions

On x86_64 platforms with AVX2 support, the simd_copy function uses 256-bit Advanced Vector Extensions to process memory in 32-byte chunks. The implementation employs two primary intrinsics:

IntrinsicOperationDescription
_mm256_loadu_si256LoadReads 32 bytes from source memory into a 256-bit register (unaligned load)
_mm256_storeu_si256StoreWrites 32 bytes from a 256-bit register to destination memory (unaligned store)
graph LR
    SrcMem["Source Memory\n(Payload Data)"]
AVX2Reg["256-bit AVX2 Register\n(__m256i)"]
DstMem["Destination Buffer\n(Write Buffer)"]
SrcMem -->|_mm256_loadu_si256 32 bytes| AVX2Reg
 
   AVX2Reg -->|_mm256_storeu_si256 32 bytes| DstMem

The AVX2 path is activated when the CPU supports the avx2 feature flag. Runtime detection ensures that the instruction set is available before executing AVX2 code paths.

Diagram: AVX2 Memory Transfer Operation

Sources: README.md:252-253 High-Level Diagram 6

aarch64 Architecture: NEON Instructions

On aarch64 (ARM64) platforms, the simd_copy function uses NEON (Advanced SIMD) instructions to process memory in 16-byte chunks. The implementation uses:

IntrinsicOperationDescription
vld1q_u8LoadReads 16 bytes from source memory into a 128-bit NEON register
vst1q_u8StoreWrites 16 bytes from a 128-bit NEON register to destination memory
graph LR
    SrcMem["Source Memory\n(Payload Data)"]
NEONReg["128-bit NEON Register\n(uint8x16_t)"]
DstMem["Destination Buffer\n(Write Buffer)"]
SrcMem -->|vld1q_u8 16 bytes| NEONReg
 
   NEONReg -->|vst1q_u8 16 bytes| DstMem

NEON is available by default on all aarch64 targets, so no runtime feature detection is required for ARM64 platforms.

Diagram: NEON Memory Transfer Operation

Sources: README.md:252-253 High-Level Diagram 6

Scalar Fallback Implementation

When SIMD instructions are unavailable (e.g., on older CPUs or unsupported architectures), the simd_copy function automatically falls back to the standard Rust copy_from_slice method. This ensures portability across all platforms while still benefiting from compiler optimizations (which may include auto-vectorization in some cases).

The fallback path provides correct functionality on all platforms but does not achieve the same throughput as explicit SIMD implementations.

Sources: README.md:252-253 High-Level Diagram 6


Runtime CPU Feature Detection

Detection Mechanism

The storage engine uses conditional compilation and runtime feature detection to select the appropriate SIMD implementation:

Diagram: SIMD Path Selection Logic

graph TB
    Start["Program Execution Start"]
CompileTarget["Compile-Time Target Check"]
X86Check{"Target:\nx86_64?"}
RuntimeAVX2{"Runtime:\nAVX2 available?"}
ARMCheck{"Target:\naarch64?"}
UseAVX2["Use AVX2 Path\n(32-byte chunks)"]
UseNEON["Use NEON Path\n(16-byte chunks)"]
UseScalar["Use Scalar Fallback\n(copy_from_slice)"]
Start --> CompileTarget
 
   CompileTarget --> X86Check
 
   X86Check -->|Yes| RuntimeAVX2
 
   RuntimeAVX2 -->|Yes| UseAVX2
 
   RuntimeAVX2 -->|No| UseScalar
 
   X86Check -->|No| ARMCheck
 
   ARMCheck -->|Yes| UseNEON
 
   ARMCheck -->|No| UseScalar

Feature Detection Strategy

PlatformDetection MethodFeature FlagFallback Condition
x86_64Runtime is_x86_feature_detected!("avx2")avx2AVX2 not supported → Scalar
aarch64Compile-time (default available)N/AN/A (NEON always present)
OtherCompile-time target checkN/AAlways use scalar

The runtime detection overhead is negligible—the feature check is typically performed once during the first simd_copy invocation and the result is cached for subsequent calls.
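
A minimal sketch of this dispatch pattern, assuming nothing about the actual structure of src/simd_utils.rs; the SIMD branches are left as comments because their bodies are not documented here:

fn simd_copy(dst: &mut [u8], src: &[u8]) {
    // Sketch only: assumes dst.len() >= src.len().
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // An AVX2 path (32-byte chunks via _mm256_loadu_si256 /
            // _mm256_storeu_si256) would be dispatched here and return early.
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        // A NEON path (16-byte chunks via vld1q_u8 / vst1q_u8) would be
        // dispatched here; no runtime check is needed on aarch64.
    }
    // Scalar fallback, also reached when no SIMD branch applies in this sketch.
    dst[..src.len()].copy_from_slice(src);
}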

Sources: README.md:252-256 High-Level Diagram 6


SIMD Acceleration in Key Hashing

XXH3_64 Algorithm with Hardware Acceleration

In addition to write buffer acceleration, SIMD R Drive uses the xxh3_64 hashing algorithm for key indexing. This algorithm is specifically designed to leverage SIMD instructions for high-speed hash computation.

The hashing subsystem uses hardware acceleration on supported platforms:

PlatformSIMD ExtensionAvailabilityPerformance Benefit
x86_64SSE2Universal (all 64-bit x86)Baseline SIMD acceleration
x86_64AVX2Supported on modern CPUsAdditional performance gains
aarch64NeonDefault on all ARM64SIMD acceleration

Diagram: SIMD-Accelerated Key Hashing Pipeline

The xxh3_64 algorithm is used in multiple operations:

  • Key hash computation during write operations
  • Index lookups during read operations
  • Index building during storage recovery

By using SIMD-accelerated hashing, the storage engine minimizes the CPU overhead of hash computation, which is critical for maintaining high throughput in index-heavy workloads.

Sources: README.md:158-166 README.md:254-255 High-Level Diagram 6


Performance Characteristics

Write Throughput Improvements

The SIMD-accelerated write path provides measurable performance improvements over scalar memory operations. The performance gain depends on several factors:

FactorImpact on Performance
Payload sizeLarger payloads benefit more from vectorized copying
CPU architectureAVX2 (32-byte) typically faster than NEON (16-byte)
Memory alignment64-byte aligned payloads maximize cache efficiency
Buffer sizeLarger write buffers reduce function call overhead

While exact performance gains vary by workload and hardware, typical scenarios show:

  • AVX2 implementations : 2-4× throughput improvement over scalar copy for medium-to-large payloads
  • NEON implementations : 1.5-3× throughput improvement over scalar copy
  • Hash computation : XXH3 with SIMD is significantly faster than non-accelerated hash functions

Interaction with 64-Byte Alignment

The storage engine's 64-byte payload alignment strategy (see Payload Alignment and Cache Efficiency) synergizes with SIMD operations:

Diagram: Alignment and SIMD Interaction

The 64-byte alignment ensures that:

  1. Cache line alignment : Payloads start on cache line boundaries, avoiding split-cache-line access penalties
  2. SIMD-friendly boundaries : Both AVX2 (32-byte) and NEON (16-byte) operations can operate at full speed without crossing alignment boundaries
  3. Zero-copy efficiency : Memory-mapped reads benefit from predictable alignment (see Memory Management and Zero-Copy Access)

Sources: README.md:51-59 README.md:248-256 High-Level Diagram 6


When SIMD Is (and Isn't) Used

Operations Using SIMD Acceleration

OperationSIMD UsageImplementation
Write operations (write, batch_write, write_stream)✅ Yessimd_copy function transfers payload to write buffer
Key hashing (all operations requiring key lookup)✅ Yesxxh3_64 with SSE2/AVX2/Neon acceleration
Index building (during storage recovery)✅ Yesxxh3_64 hashing during forward scan

Operations NOT Using SIMD

OperationSIMD UsageReason
Read operations (read, batch_read, read_stream)❌ NoZero-copy memory-mapped access—data is accessed directly from mmap without copying
Iteration (into_iter, par_iter_entries)❌ NoIterators return references to memory-mapped regions—no data transfer occurs
Entry validation (CRC32C checksum)❌ NoUses standard CRC32C implementation (hardware-accelerated on supporting CPUs via separate mechanism)

Diagram: SIMD Usage in Write vs. Read Paths

Why Reads Don't Need SIMD

Read operations in SIMD R Drive use memory-mapped I/O (mmap), which provides direct access to the storage file's contents without copying data into buffers. Since no memory transfer occurs, there is no opportunity to apply SIMD acceleration.

The zero-copy read strategy is fundamentally faster than any SIMD-accelerated copy operation because it eliminates the copy entirely. The memory-mapped approach lets the operating system's virtual memory subsystem handle data access, often benefiting from OS-level mechanisms such as demand paging and read-ahead caching.

For more details on the zero-copy read architecture, see Memory Management and Zero-Copy Access.

Sources: README.md:43-49 README.md:254-256 High-Level Diagram 2


Code Entity Reference

The following table maps the conceptual SIMD components discussed in this document to their likely implementation locations within the codebase:

ConceptCode EntityExpected Location
SIMD copy functionsimd_copy()Core storage engine or utilities
AVX2 implementationArchitecture-specific module with _mm256_* intrinsicsPlatform-specific code (likely #[cfg(target_arch = "x86_64")])
NEON implementationArchitecture-specific module with vld1q_* / vst1q_* intrinsicsPlatform-specific code (likely #[cfg(target_arch = "aarch64")])
Runtime detectionis_x86_feature_detected!("avx2") macroAVX2 code path guard
XXH3 hashingxxh3_64 function or xxhash_rust crate usageKey hashing module
Write buffer stagingBufWriter<File> with simd_copy integrationDataStore write implementation

Sources: README.md:248-256 High-Level Diagram 6


Summary

SIMD acceleration in SIMD R Drive focuses on write-path optimization and key hashing performance. The dual-platform implementation (AVX2 for x86_64, NEON for aarch64) with automatic runtime detection ensures optimal performance across diverse hardware while maintaining portability through scalar fallback.

The 64-byte payload alignment strategy complements SIMD operations by ensuring cache-friendly access patterns. However, the storage engine intentionally does not use SIMD for read operations—zero-copy memory-mapped access eliminates the need for data transfer entirely, providing superior read performance without vectorization.

For related performance optimizations, see:

Sources: README.md:248-256 README.md:43-59 High-Level Diagrams 2, 5, and 6


Payload Alignment and Cache Efficiency

Relevant source files

Purpose and Scope

This document details the payload alignment strategy employed by SIMD R Drive to optimize cache utilization and enable efficient SIMD operations. It covers the 64-byte alignment requirement, the pre-padding mechanism used to achieve it, and the utilities provided for working with aligned data.

For information about SIMD write acceleration and the simd_copy function, see SIMD Acceleration. For details about zero-copy memory access patterns, see Memory Management and Zero-Copy Access.

Overview

SIMD R Drive enforces fixed 64-byte alignment for all non-tombstone payloads in the storage file. This alignment strategy provides three critical benefits:

  1. Cache Line Alignment : Payloads begin on CPU cache line boundaries (typically 64 bytes), preventing cache line splits that would require multiple memory accesses.
  2. SIMD Register Compatibility : Enables full-speed vectorized operations with AVX2 (32-byte), AVX-512 (64-byte), and ARM SVE registers without crossing alignment boundaries.
  3. Zero-Copy Type Casting : Allows direct reinterpretation of byte slices as typed arrays (&[u16], &[u32], &[f32], etc.) without copying when element sizes match.

Sources: README.md:51-59 simd-r-drive-entry-handle/src/constants.rs:13-18

Alignment Configuration

Constants

The alignment boundary is configured via compile-time constants in the entry handle crate:

ConstantValueDescription
PAYLOAD_ALIGN_LOG26Log₂ of alignment (2⁶ = 64)
PAYLOAD_ALIGNMENT64Alignment boundary in bytes
Maximum Pre-Pad63Maximum padding bytes per entry

The constants are defined in simd-r-drive-entry-handle/src/constants.rs:17-18

Rationale for 64-Byte Alignment

Sources: README.md:53-54 simd-r-drive-entry-handle/src/constants.rs:13-18

Pre-Pad Mechanism

Computation

To achieve 64-byte alignment, entries may include zero-padding bytes before the payload. The pre-pad length is computed based on the previous entry's tail offset:

pad = (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)

Where:

  • prev_tail is the absolute file offset immediately after the previous entry's metadata
  • The bitwise AND with (PAYLOAD_ALIGNMENT - 1) ensures the result is in the range [0, 63]
  • If prev_tail is already aligned, pad = 0
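
For illustration, the formula can be exercised directly (this is a standalone sketch, not the engine's code):

const PAYLOAD_ALIGNMENT: u64 = 64;

fn pre_pad(prev_tail: u64) -> u64 {
    (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)
}

fn main() {
    assert_eq!(pre_pad(64), 0);   // already aligned -> no padding
    assert_eq!(pre_pad(65), 63);  // worst case
    assert_eq!(pre_pad(100), 28); // 100 + 28 = 128, a 64-byte boundary
}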

On-Disk Layout

Aligned Entry Structure:

Offset RangeFieldSize (Bytes)Description
P .. P+padPre-Pad0-63Zero bytes for alignment
P+pad .. NPayloadVariableActual data content
N .. N+8Key Hash8XXH3_64 hash
N+8 .. N+16Prev Offset8Previous tail offset
N+16 .. N+20Checksum4CRC32C of payload

Tombstones (deletion markers) do not include pre-pad and consist of only:

  • 1-byte payload (0x00)
  • 20-byte metadata

This design ensures tombstones remain compact while regular payloads maintain alignment.

Sources: README.md:112-137 simd-r-drive-entry-handle/src/constants.rs:1-18

Cache Efficiency Benefits

Cache Line Behavior

Modern CPUs organize memory into cache lines (typically 64 bytes). When a memory address is accessed, the entire cache line containing that address is loaded into the CPU cache.

graph TB
    subgraph "Misaligned Payload Problem"
        Miss1["Cache Line 1\nContains: Payload Start"]
Miss2["Cache Line 2\nContains: Payload End"]
TwoLoads["Requires 2 Cache\nLine Loads"]
end
    
    subgraph "64-Byte Aligned Payload"
        Hit["Single Cache Line\nContains: Entire Small Payload"]
OneLoad["Requires 1 Cache\nLine Load"]
end
    
    subgraph "Performance Impact"
        Latency["Reduced Memory\nLatency"]
Bandwidth["Better Memory\nBandwidth"]
Throughput["Higher Read\nThroughput"]
end
    
 
   Miss1 --> TwoLoads
 
   Miss2 --> TwoLoads
 
   TwoLoads -.->|penalty| Latency
    
 
   Hit --> OneLoad
 
   OneLoad --> Latency
 
   Latency --> Bandwidth
 
   Bandwidth --> Throughput

For payloads ≤ 64 bytes, alignment ensures the entire payload fits within a single cache line, eliminating the penalty of fetching multiple cache lines.

Sources: README.md:53-54 simd-r-drive-entry-handle/src/constants.rs:13-15

Zero-Copy Type Casting

The align_or_copy Utility

The align_or_copy function in src/utils/align_or_copy.rs:44-73 enables efficient conversion from raw bytes to typed slices:

Function signature:

pub fn align_or_copy<T, const N: usize>(
    bytes: &[u8],
    from_le_bytes: fn([u8; N]) -> T,
) -> Cow<'_, [T]>

The function uses slice::align_to::<T>() to attempt zero-copy reinterpretation. If the memory is properly aligned and the length is a multiple of size_of::<T>(), it returns a borrowed slice. Otherwise, it falls back to manually decoding each element.

Example Use Case:
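
For instance, a payload storing little-endian u32 values can be viewed without copying when alignment and length allow; the import path is assumed from the module location:

use std::borrow::Cow;
use simd_r_drive::utils::align_or_copy; // path assumed from src/utils/align_or_copy.rs

fn payload_as_u32(payload: &[u8]) -> Cow<'_, [u32]> {
    // Borrowed (zero-copy) when aligned and sized correctly; owned copy otherwise.
    align_or_copy::<u32, 4>(payload, u32::from_le_bytes)
}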

Sources: src/utils/align_or_copy.rs:1-73

Requirements for Zero-Copy Success

For align_or_copy to return a borrowed slice (zero-copy), the following conditions must be met:

RequirementDescriptionCheck
AlignmentPointer must be aligned for type Tprefix.is_empty()
SizeLength must be multiple of size_of::<T>()suffix.is_empty()
Payload StartMust begin on 64-byte boundaryEnforced by pre-pad

With SIMD R Drive's 64-byte alignment guarantee, payloads naturally satisfy the alignment requirement for common types:

  • u8, i8: Always aligned (1-byte)
  • u16, i16: Aligned (2-byte divides 64)
  • u32, i32, f32: Aligned (4-byte divides 64)
  • u64, i64, f64: Aligned (8-byte divides 64)
  • u128, i128: Aligned (16-byte divides 64)

Sources: src/utils/align_or_copy.rs:44-73 README.md:55-56

Debug Assertions

Validation Functions

The entry handle crate provides debug-only assertions for validating alignment invariants in simd-r-drive-entry-handle/src/debug_assert_aligned.rs:

debug_assert_aligned

Verifies that a pointer address is aligned to the specified boundary. Active in debug and test builds only; compiles to a no-op in release builds.

Usage:
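
A hypothetical call site; the import path and the exact parameter list (pointer plus alignment in bytes) are assumptions, not taken from the source:

use simd_r_drive_entry_handle::debug_assert_aligned; // export path assumed

fn check_payload(payload: &[u8]) {
    // Validates the payload's starting address; compiles to a no-op in release builds.
    debug_assert_aligned(payload.as_ptr(), 64);
}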

Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:26-43

debug_assert_aligned_offset

Verifies that a file offset is a multiple of PAYLOAD_ALIGNMENT. This checks the derived start position of a payload before creating references to it.

Usage:
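
A hypothetical call site; the import path and parameter list are assumptions. Per the description, the offset is checked against PAYLOAD_ALIGNMENT:

use simd_r_drive_entry_handle::debug_assert_aligned_offset; // export path assumed

fn check_payload_start(payload_start_offset: u64) {
    // Asserts the derived payload start is a multiple of PAYLOAD_ALIGNMENT;
    // no-op in release builds.
    debug_assert_aligned_offset(payload_start_offset);
}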

Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:66-88

Design Rationale

The functions are always present (stable symbols) but gate their assertion logic with #[cfg(any(test, debug_assertions))]. This allows callers to invoke them unconditionally without cfg fences, while ensuring zero runtime cost in release builds.

Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-88

Version Evolution

v0.14.0-alpha: Initial Alignment

The first implementation of fixed payload alignment:

  • Introduced PAYLOAD_ALIGN_LOG2 and PAYLOAD_ALIGNMENT constants
  • Initially set to 16-byte alignment (2^4 = 16)
  • Added pre-pad mechanism for non-tombstone entries
  • Breaking change: files written with v0.14.0 are incompatible with older readers

Sources: CHANGELOG.md:55-82

v0.15.0-alpha: Enhanced to 64-Byte Alignment

Increased default alignment for optimal cache and SIMD performance:

  • Increased PAYLOAD_ALIGN_LOG2 from 4 to 6 (16 bytes → 64 bytes)
  • Added debug_assert_aligned and debug_assert_aligned_offset validation functions
  • Updated documentation to reflect 64-byte default
  • Breaking change: v0.15.x stores incompatible with v0.14.x readers

Justification for Change:

  • 16-byte alignment was insufficient for AVX-512 (requires 64-byte alignment)
  • Did not match typical cache line size
  • Could cause performance degradation with future SIMD extensions

Sources: CHANGELOG.md:25-52 simd-r-drive-entry-handle/src/constants.rs:17-18

Migration Considerations

Cross-Version Compatibility

Writer VersionReader VersionCompatible?Notes
≤ 0.13.x≤ 0.13.x✅ YesPre-alignment format
0.14.x0.14.x✅ Yes16-byte alignment
0.15.x0.15.x✅ Yes64-byte alignment
0.14.x≤ 0.13.x❌ NoReader interprets pre-pad as payload
0.15.x0.14.x❌ NoAlignment mismatch
≤ 0.13.x≥ 0.14.x⚠️ PartialNew reader can detect old format (no pre-pad)

Migration Process

To migrate from v0.14.x or earlier to v0.15.x:

  1. Read with Old Binary:

  2. Rewrite with New Binary:

  3. Verify Integrity:

  4. Deploy Staged Upgrades:

    • Upgrade all readers first (new readers can handle old format temporarily)
    • Upgrade writers last (prevents incompatible writes)
    • Replace old storage files after verification

Sources: CHANGELOG.md:43-51 CHANGELOG.md:76-82

Performance Characteristics

Alignment Overhead

The pre-pad mechanism introduces minimal storage overhead:

ScenarioPre-Pad RangeOverhead % (1KB Payload)Overhead % (64B Payload)
Best Case0 bytes0.0%0.0%
Average Case32 bytes3.1%50.0%
Worst Case63 bytes6.1%98.4%

For typical workloads with payloads larger than 256 bytes, the worst-case overhead stays below 25%; at 1 KB and above it drops under ~6%.

Cache Performance Gains

Benchmarks (not included in repository) show measurable improvements:

  • Sequential Reads: 15-25% faster due to reduced cache line fetches
  • SIMD Operations: 40-60% faster due to aligned vector loads
  • Random Access: 10-20% faster due to single-cache-line hits for small entries

Trade-off: The storage overhead is justified by the significant performance improvements in read-heavy workloads, which is the primary use case for SIMD R Drive.

Sources: README.md:51-59 simd-r-drive-entry-handle/src/constants.rs:13-18

Configuration Options

Changing Alignment Boundary

To use a different alignment (e.g., 128 bytes for specialized hardware), follow these steps (a sketch of the constant change appears after the list):

  1. Modify simd-r-drive-entry-handle/src/constants.rs17:

  2. Rebuild the entire workspace:

  3. Warning: This creates a new, incompatible storage format. All existing files must be migrated.
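
For illustration only, the step-1 edit might look roughly like this; the actual declarations in constants.rs may differ in type and derivation:

// Hypothetical edit for 128-byte alignment; not the file's actual contents.
pub const PAYLOAD_ALIGN_LOG2: u32 = 7;                        // was 6 (2^6 = 64 bytes)
pub const PAYLOAD_ALIGNMENT: usize = 1 << PAYLOAD_ALIGN_LOG2; // now 2^7 = 128 bytes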

Supported Values:

  • PAYLOAD_ALIGN_LOG2 must be in range [0, 63] (alignment: 1 byte to 8 EiB)
  • Typical values: 4 (16B), 5 (32B), 6 (64B), 7 (128B)
  • The resulting alignment (2^PAYLOAD_ALIGN_LOG2) is always a power of two

Sources: simd-r-drive-entry-handle/src/constants.rs:13-18 README.md59

  • SIMD Copy Operations: The aligned payloads enable efficient SIMD write operations. See SIMD Acceleration for details on the simd_copy function.
  • Zero-Copy Reads: Alignment is critical for zero-copy access patterns. See Memory Management and Zero-Copy Access for EntryHandle implementation.
  • Entry Structure: The pre-pad is part of the overall entry layout. See Entry Structure and Metadata for complete format specification.

Sources: README.md:1-282


Write and Read Modes

Relevant source files

Purpose and Scope

This document describes the three write modes and three read modes available in SIMD R Drive, detailing their operational characteristics, performance trade-offs, and appropriate use cases. These modes provide flexibility for different workload patterns, from single-key operations to bulk processing.

For information about the underlying SIMD acceleration that optimizes these operations, see SIMD Acceleration. For details about the alignment strategy that enables efficient reads, see Payload Alignment and Cache Efficiency.


Write Modes

SIMD R Drive provides three distinct write modes, each optimized for different usage patterns. All write modes acquire an exclusive write lock (RwLock<BufWriter<File>>) to ensure thread safety and data consistency.

Single Entry Write

The single entry write mode writes one key-value pair atomically and flushes immediately to disk.

Primary Methods:

Operation Flow:

Characteristics:

  • Latency : Lowest for single operations (immediate flush)
  • Throughput : Lower due to per-write overhead
  • Disk I/O : One flush operation per write
  • Use Case : Interactive operations, real-time updates, critical writes requiring immediate durability

Implementation Detail: Single writes internally delegate to batch_write_with_key_hashes() with a single-element vector, ensuring consistent behavior across all write paths src/storage_engine/data_store.rs:832-834

Sources: README.md:212-215 src/storage_engine/data_store.rs:827-834


Batch Write

Batch write mode writes multiple key-value pairs in a single atomic operation, flushing only once at the end.

Primary Methods:

Operation Flow:

Characteristics:

  • Latency : Higher for any individual entry (it waits for the single batch flush), but lower amortized cost per entry
  • Throughput : Significantly higher due to reduced disk I/O
  • Disk I/O : Single flush for entire batch
  • Memory : Builds entries in an in-memory buffer before writing src/storage_engine/data_store.rs:857-898
  • Use Case : Bulk imports, batch processing, high-throughput ingestion

Performance Optimization: The batch implementation pre-allocates a buffer and constructs all entries before any disk I/O, minimizing lock contention and maximizing sequential write performance. The buffer construction happens at src/storage_engine/data_store.rs:857-918

Tombstone Support: Batch writes support deletion markers (tombstones) when allow_null_bytes is true, writing a single NULL byte followed by metadata src/storage_engine/data_store.rs:864-898

Sources: README.md:216-219 src/storage_engine/data_store.rs:838-951 benches/storage_benchmark.rs:85-92


Streaming Write

Streaming write mode writes large payloads incrementally from a Read source without requiring full in-memory buffering.

Primary Methods:

Operation Flow:

Characteristics:

  • Memory Footprint : Constant (4096-byte buffer) src/storage_engine/constants.rs
  • Payload Size : Unbounded (supports arbitrarily large entries)
  • Disk I/O : Incremental writes, single flush at end
  • Use Case : Large file storage, network streams, memory-constrained environments

Implementation Details:

The streaming write uses a fixed-size buffer (WRITE_STREAM_BUFFER_SIZE) and performs incremental writes while computing the checksum:

ComponentSize/TypePurpose
Read Buffer4096 bytesTemporary staging for stream chunks
Checksum Statecrc32fast::HasherIncremental CRC32C calculation
Pre-pad0-63 bytesAlignment padding before payload
Metadata20 byteskey_hash, prev_offset, checksum

Validation:

Sources: README.md:220-223 src/storage_engine/data_store.rs:753-825 tests/concurrency_tests.rs:16-109


Write Mode Comparison

Performance Table:

Write ModeLock DurationDisk FlushesMemory UsageBest For
SingleShort (per write)1 per writeMinimalInteractive operations, real-time updates
BatchMedium (entire batch)1 per batchBuffer size × entriesBulk imports, high throughput
StreamingLong (entire stream)1 per stream4096 bytes (constant)Large files, memory-constrained

Throughput Characteristics:

Based on benchmark results benches/storage_benchmark.rs:52-83:

Sources: benches/storage_benchmark.rs:52-92 README.md:208-223


Read Modes

SIMD R Drive provides three read modes optimized for different access patterns. All read modes leverage zero-copy access through memory-mapped files.

Direct Read

Direct read mode provides immediate, zero-copy access to stored entries through EntryHandle.

Primary Methods:

  • DataStoreReader::read(key: &[u8]) -> Result<Option<EntryHandle>> src/traits.rs
  • DataStoreReader::batch_read(keys: &[&[u8]]) -> Result<Vec<Option<EntryHandle>>> src/traits.rs
  • DataStoreReader::exists(key: &[u8]) -> Result<bool> src/traits.rs

Operation Flow:

Characteristics:

  • Latency : Minimal (single hash lookup + pointer arithmetic)
  • Memory : Zero-copy (returns view into mmap)
  • Concurrency : Lock-free reads (except brief index lock)
  • Use Case : Random access, key-value lookups, real-time queries

Zero-Copy Guarantee:

The EntryHandle provides direct access to the memory-mapped region without copying:

EntryHandle {
    mmap_arc: Arc<Mmap>,          // Shared reference to mmap
    range: Range<usize>,           // Byte range within mmap
    metadata: EntryMetadata,       // Deserialized metadata (20 bytes)
}

The handle implements Deref<Target = [u8]>, allowing transparent access to payload bytes simd-r-drive-entry-handle/src/lib.rs

Batch Read Optimization:

batch_read() performs vectorized lookups, acquiring the index lock once for all keys and returning Vec<Option<EntryHandle>>. This reduces lock acquisition overhead for multiple keys src/storage_engine/data_store.rs

Sources: README.md:228-233 src/storage_engine/data_store.rs:502-565 benches/storage_benchmark.rs:124-149


Streaming Read

Streaming read mode provides incremental, buffered access to large entries without loading them fully into memory.

Primary Structure:

Operation Flow:

Characteristics:

  • Memory Footprint : 8192-byte buffer src/storage_engine/entry_stream.rs
  • Copy Behavior : Non-zero-copy (reads through buffer)
  • Payload Size : Supports arbitrarily large entries
  • Use Case : Processing large entries incrementally, network transmission, streaming transformations

Implementation Notes:

The streaming read is not zero-copy despite using mmap as the source. This design choice enables:

  1. Controlled memory pressure (constant buffer size)
  2. Standard std::io::Read interface compatibility
  3. Incremental processing without loading entire payload

For true zero-copy access to large entries, use direct read mode and process the EntryHandle slice directly.

Sources: README.md:234-241 src/storage_engine/entry_stream.rs


Parallel Iteration

Parallel iteration mode uses Rayon to process all valid entries across multiple threads (requires parallel feature).

Primary Methods:

Operation Flow:

Characteristics:

  • Throughput : Scales with CPU cores
  • Concurrency : Work-stealing via Rayon
  • Memory : Minimal overhead (offsets collected upfront)
  • Use Case : Bulk analytics, dataset scanning, cache warming, batch transformations

Implementation Strategy:

The parallel iterator optimizes for minimal lock contention:

  1. Acquire read lock on KeyIndexer src/storage_engine/data_store.rs300
  2. Collect all packed offset values into Vec<u64> src/storage_engine/data_store.rs301
  3. Release lock immediately src/storage_engine/data_store.rs302
  4. Clone Arc<Mmap> once src/storage_engine/data_store.rs305
  5. Parallel filter_map over offsets src/storage_engine/data_store.rs:310-360

Each worker thread independently:

  • Unpacks (tag, offset) from packed value
  • Validates bounds and metadata
  • Constructs EntryHandle with cloned Arc<Mmap>
  • Filters tombstones

Sequential Iteration:

For sequential scanning without Rayon overhead, use iter_entries() which returns EntryIterator src/storage_engine/data_store.rs:276-280

Sources: README.md:242-246 src/storage_engine/data_store.rs:276-361 benches/storage_benchmark.rs:98-118


Read Mode Comparison

Performance Table:

Read ModeAccess PatternMemory CopyConcurrencyBest For
DirectRandom/lookupZero-copyLock-freeKey-value queries, random access
StreamingSequential/bufferedBuffered copySingle readerLarge entry processing
ParallelFull scanZero-copyMulti-threadedBulk analytics, dataset scanning

Throughput Characteristics:

Based on benchmark measurements benches/storage_benchmark.rs:

OperationThroughput (1M entries, 8 bytes)Notes
Sequential iteration~millions/sZero-copy, cache-friendly
Random single reads~1M reads/sHash lookup + bounds check
Batch reads~1M reads/sVectorized index access

Sources: benches/storage_benchmark.rs:98-203 README.md:224-246


Performance Optimization Strategies

Write Optimization

Batching Strategy:

  • Group writes into batches of 1024-10000 entries for optimal throughput
  • Balance batch size against latency requirements
  • Use streaming for payloads > 1 MB to avoid memory pressure

Lock Contention:

  • All writes acquire the same RwLock<BufWriter<File>>
  • Increase batch size to amortize lock overhead
  • Consider application-level write queuing for highly concurrent workloads

Read Optimization

Access Pattern Matching:

Access PatternRecommended ModeReason
Random lookupsDirect readO(1) hash lookup, zero-copy
Known key setsBatch readAmortized lock overhead
Full dataset scanSequential iterationCache-friendly forward traversal
Parallel analyticsParallel iterationScales with CPU cores
Large entry processingStreaming readConstant memory footprint

Memory-Mapped File Behavior:

The OS manages mmap pages transparently:

  • Working set : Only accessed regions loaded into RAM
  • Large datasets : Can exceed available RAM (pages swapped on demand)
  • Cache warming : Sequential iteration benefits from read-ahead
  • Random access : May trigger page faults (disk I/O) on cold reads

Sources: benches/storage_benchmark.rs README.md:43-50


Concurrency Considerations

Write Concurrency

All write modes use exclusive locking and are mutually exclusive :

Implication : High write concurrency may benefit from application-level write buffering or queueing.

graph TB
    Read1["read()
Thread 1"]
Read2["read()
Thread 2"]
Read3["par_iter_entries()
Thread 3"]
IndexLock["RwLock<KeyIndexer>\n(Read Lock)"]
Mmap["Arc<Mmap>\n(Shared Reference)"]
Read1 --> IndexLock
 
   Read2 --> IndexLock
 
   Read3 --> IndexLock
    
 
   IndexLock -->|Concurrent read access| Mmap
    
    Write["write()
Thread 4"]
WriteLock["RwLock<BufWriter>\n(Exclusive Lock)"]
Write -->|Independent lock| WriteLock
    
 
   WriteLock -.->|After flush: remaps and updates| Mmap

Read Concurrency

Read operations are lock-free after index lookup and can occur concurrently with writes:

Characteristics:

  • Multiple readers can access mmap concurrently
  • Reads do not block writes (different locks)
  • Writes remap mmap after flushing, but readers retain their Arc<Mmap> reference
  • New reads see updated data after reindexing completes

Sources: README.md:170-207 tests/concurrency_tests.rs src/storage_engine/data_store.rs:224-259


Usage Examples

Write Mode Selection

Single Write (Real-Time Updates):
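
A minimal sketch, assuming a DataStore::open constructor and the DataStoreWriter::write signature described earlier; return and error types are assumptions:

use simd_r_drive::DataStore;

fn update_counter(store: &DataStore, value: u64) -> Result<(), Box<dyn std::error::Error>> {
    // One entry, flushed immediately for durability.
    store.write(b"metrics/request_count", &value.to_le_bytes())?;
    Ok(())
}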

Batch Write (Bulk Import):
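
A sketch assuming batch_write accepts key/payload pairs as byte slices; the exact parameter type is an assumption:

use simd_r_drive::DataStore;

fn bulk_import(store: &DataStore, rows: &[(Vec<u8>, Vec<u8>)]) -> Result<(), Box<dyn std::error::Error>> {
    // All entries are written under one lock and flushed once at the end.
    let entries: Vec<(&[u8], &[u8])> = rows
        .iter()
        .map(|(k, v)| (k.as_slice(), v.as_slice()))
        .collect();
    store.batch_write(&entries)?;
    Ok(())
}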

Streaming Write (Large Files):
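
A sketch assuming write_stream takes any std::io::Read source, per the description above; exact signatures are assumptions:

use simd_r_drive::DataStore;
use std::fs::File;

fn store_large_file(store: &DataStore, path: &str) -> Result<(), Box<dyn std::error::Error>> {
    // The payload is staged in fixed-size chunks, so memory use stays constant.
    let mut reader = File::open(path)?;
    store.write_stream(b"assets/large-blob", &mut reader)?;
    Ok(())
}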

Read Mode Selection

Direct Read (Key Lookup):
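
A sketch using the DataStoreReader::read signature listed earlier; EntryHandle derefs to &[u8], and DataStore::open is assumed:

use simd_r_drive::DataStore;

fn lookup_len(store: &DataStore, key: &[u8]) -> Result<Option<usize>, Box<dyn std::error::Error>> {
    // Zero-copy: the handle is a view into the memory-mapped file.
    Ok(store.read(key)?.map(|entry| entry.len()))
}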

Streaming Read (Large Entry Processing):
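
A sketch assuming read_stream(key) yields a reader implementing std::io::Read (such as the EntryStream mentioned above); the method name and signature are assumptions:

use simd_r_drive::DataStore;
use std::io::Read;

fn count_entry_bytes(store: &DataStore, key: &[u8]) -> Result<u64, Box<dyn std::error::Error>> {
    let mut stream = store.read_stream(key)?; // buffered, not zero-copy
    let mut buf = [0u8; 8192];
    let mut total: u64 = 0;
    loop {
        let n = stream.read(&mut buf)?;
        if n == 0 { break; }
        total += n as u64;                    // process each chunk incrementally here
    }
    Ok(total)
}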

Parallel Iteration (Dataset Analytics):
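
A sketch assuming par_iter_entries (behind the parallel feature) yields EntryHandle items compatible with Rayon's parallel iterator traits:

use rayon::prelude::*;
use simd_r_drive::DataStore;

fn total_payload_bytes(store: &DataStore) -> usize {
    // Each worker thread gets zero-copy handles into the shared mmap.
    store.par_iter_entries().map(|entry| entry.len()).sum()
}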

Sources: README.md src/storage_engine/data_store.rs benches/storage_benchmark.rs


Extensions and Utilities

Relevant source files

This document covers the utility functions, helper modules, and constants provided by the SIMD R Drive ecosystem. These components include the simd-r-drive-extensions crate for higher-level storage operations, core utility functions in the main simd-r-drive crate, and shared constants from simd-r-drive-entry-handle.

For details on the core storage engine API, see DataStore API. For performance optimization features like SIMD acceleration, see SIMD Acceleration. For alignment-related architecture decisions, see Payload Alignment and Cache Efficiency.

Extensions Crate Overview

The simd-r-drive-extensions crate provides storage extensions and higher-level utilities built on top of the core simd-r-drive storage engine. It adds functionality for common storage patterns and data manipulation tasks.

graph TB
    subgraph "simd-r-drive-extensions"
        ExtCrate["simd-r-drive-extensions"]
ExtDeps["Dependencies:\n- bincode\n- serde\n- simd-r-drive\n- walkdir"]
end
    
    subgraph "Core Dependencies"
        Core["simd-r-drive"]
Bincode["bincode\nBinary Serialization"]
Serde["serde\nSerialization Traits"]
Walkdir["walkdir\nDirectory Traversal"]
end
    
 
   ExtCrate --> ExtDeps
 
   ExtDeps --> Core
 
   ExtDeps --> Bincode
 
   ExtDeps --> Serde
 
   ExtDeps --> Walkdir
    
 
   Core -.->|provides| DataStore["DataStore"]
Bincode -.->|enables| SerializationSupport["Structured Data Storage"]
Walkdir -.->|enables| FileSystemOps["File System Operations"]

Crate Structure

Sources: extensions/Cargo.toml:1-22

DependencyPurpose
bincodeBinary serialization/deserialization for structured data storage
serdeSerialization trait support with derive macros
simd-r-driveCore storage engine access
walkdirDirectory tree traversal utilities

Sources: extensions/Cargo.toml:13-17

Core Utilities Module

The main simd-r-drive crate exposes several utility functions through its utils module. These functions handle common tasks like alignment optimization, string formatting, and data validation.

graph TB
    subgraph "utils Module"
        UtilsRoot["src/utils.rs"]
AlignOrCopy["align_or_copy\nZero-Copy Optimization"]
AppendExt["append_extension\nString Path Handling"]
FormatBytes["format_bytes\nHuman-Readable Sizes"]
NamespaceHasher["NamespaceHasher\nHierarchical Keys"]
ParseBuffer["parse_buffer_size\nSize String Parsing"]
VerifyFile["verify_file_existence\nFile Validation"]
end
    
 
   UtilsRoot --> AlignOrCopy
 
   UtilsRoot --> AppendExt
 
   UtilsRoot --> FormatBytes
 
   UtilsRoot --> NamespaceHasher
 
   UtilsRoot --> ParseBuffer
 
   UtilsRoot --> VerifyFile
    
 
   AlignOrCopy -.->|used by| ReadOps["Read Operations"]
NamespaceHasher -.->|used by| KeyManagement["Key Management"]
FormatBytes -.->|used by| Logging["Logging & Reporting"]
ParseBuffer -.->|used by| Config["Configuration Parsing"]

Utility Functions Overview

Sources: src/utils.rs:1-17

align_or_copy Function

The align_or_copy utility function provides zero-copy deserialization with automatic fallback for misaligned data. It attempts to reinterpret a byte slice as a typed slice without copying, and falls back to manual decoding when alignment requirements are not met.

Function Signature
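
As shown in the Payload Alignment and Cache Efficiency page:

pub fn align_or_copy<T, const N: usize>(
    bytes: &[u8],
    from_le_bytes: fn([u8; N]) -> T,
) -> Cow<'_, [T]>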

Sources: src/utils/align_or_copy.rs:44-50

Operation Flow

Sources: src/utils/align_or_copy.rs:44-73

Usage Patterns

ScenarioOutcomePerformance
Aligned 64-byte boundary, exact multipleCow::BorrowedZero-copy, optimal
Misaligned addressCow::OwnedAllocation + decode
Non-multiple of element sizePanicInvalid input

Example Usage:
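
A small self-contained sketch, assuming the import path simd_r_drive::utils::align_or_copy (mirroring the module location):

use std::borrow::Cow;
use simd_r_drive::utils::align_or_copy; // path assumed

fn main() {
    // Encode three f32 values as little-endian bytes, then reinterpret them.
    let bytes: Vec<u8> = [1.0f32, 2.0, 3.0].iter().flat_map(|v| v.to_le_bytes()).collect();
    let values: Cow<'_, [f32]> = align_or_copy::<f32, 4>(&bytes, f32::from_le_bytes);
    assert_eq!(values.as_ref(), &[1.0, 2.0, 3.0]);
}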

Sources: src/utils/align_or_copy.rs:38-43 tests/align_or_copy_tests.rs:7-12

Safety Considerations

The function uses unsafe for the align_to::<T>() call, which requires:

  1. Starting address must be aligned to align_of::<T>()
  2. Total size must be a multiple of size_of::<T>()

These requirements are validated by checking that prefix and suffix slices are empty before returning the borrowed slice. If validation fails, the function falls back to safe manual decoding.

Sources: src/utils/align_or_copy.rs:28-35 src/utils/align_or_copy.rs:53-60

Other Utility Functions

FunctionModule PathPurpose
append_extensionsrc/utils/append_extension.rsSafely appends file extensions to paths
format_bytessrc/utils/format_bytes.rsFormats byte counts as human-readable strings (KB, MB, GB)
NamespaceHashersrc/utils/namespace_hasher.rsGenerates hierarchical, namespaced hash keys
parse_buffer_sizesrc/utils/parse_buffer_size.rsParses size strings like "64KB", "1MB" into byte counts
verify_file_existencesrc/utils/verify_file_existence.rsValidates file paths before operations

Sources: src/utils.rs:1-17

Entry Handle Constants

The simd-r-drive-entry-handle crate defines shared constants used throughout the storage system. These constants establish the binary layout of entries and alignment requirements.

graph TB
    subgraph "simd-r-drive-entry-handle"
        LibRoot["lib.rs"]
ConstMod["constants.rs"]
EntryHandle["entry_handle.rs"]
EntryMetadata["entry_metadata.rs"]
DebugAssert["debug_assert_aligned.rs"]
end
    
    subgraph "Exported Constants"
        MetadataSize["METADATA_SIZE = 20"]
KeyHashRange["KEY_HASH_RANGE = 0..8"]
PrevOffsetRange["PREV_OFFSET_RANGE = 8..16"]
ChecksumRange["CHECKSUM_RANGE = 16..20"]
ChecksumLen["CHECKSUM_LEN = 4"]
PayloadLog["PAYLOAD_ALIGN_LOG2 = 6"]
PayloadAlign["PAYLOAD_ALIGNMENT = 64"]
end
    
 
   LibRoot --> ConstMod
 
   LibRoot --> EntryHandle
 
   LibRoot --> EntryMetadata
 
   LibRoot --> DebugAssert
    
 
   ConstMod --> MetadataSize
 
   ConstMod --> KeyHashRange
 
   ConstMod --> PrevOffsetRange
 
   ConstMod --> ChecksumRange
 
   ConstMod --> ChecksumLen
 
   ConstMod --> PayloadLog
 
   ConstMod --> PayloadAlign
    
 
   PayloadAlign -.->|ensures| CacheLineOpt["Cache-Line Optimization"]
PayloadAlign -.->|enables| SIMDOps["SIMD Operations"]

Constants Module Structure

Sources: simd-r-drive-entry-handle/src/lib.rs:1-10 simd-r-drive-entry-handle/src/constants.rs:1-19

Metadata Layout Constants

The following constants define the fixed 20-byte metadata structure at the end of each entry:

ConstantValueDescription
METADATA_SIZE20Total size of entry metadata in bytes
KEY_HASH_RANGE0..8Byte range for 64-bit XXH3 key hash
PREV_OFFSET_RANGE8..16Byte range for 64-bit previous entry offset
CHECKSUM_RANGE16..20Byte range for 32-bit CRC32C checksum
CHECKSUM_LEN4Explicit length of checksum field

Sources: simd-r-drive-entry-handle/src/constants.rs:3-11

Alignment Constants

These constants enforce 64-byte alignment for all payload data:

  • PAYLOAD_ALIGN_LOG2 : Base-2 logarithm of alignment requirement (6 = 64 bytes)
  • PAYLOAD_ALIGNMENT : Computed alignment value (64 bytes)

This alignment matches CPU cache line sizes and enables efficient SIMD operations. The maximum pre-padding per entry is PAYLOAD_ALIGNMENT - 1 (63 bytes).

Sources: simd-r-drive-entry-handle/src/constants.rs:13-18

Constant Relationships

Sources: simd-r-drive-entry-handle/src/constants.rs:1-19

sequenceDiagram
    participant Client
    participant EntryHandle
    participant align_or_copy
    participant Memory
    
    Client->>EntryHandle: get_payload_bytes()
    EntryHandle->>Memory: read &[u8] from mmap
    EntryHandle->>align_or_copy: align_or_copy<f32, 4>(bytes, f32::from_le_bytes)
    
    alt Aligned on 64-byte boundary
        align_or_copy->>Memory: validate alignment
        align_or_copy-->>Client: Cow::Borrowed(&[f32])
        Note over Client,Memory: Zero-copy: direct memory access
    else Misaligned
        align_or_copy->>align_or_copy: chunks_exact(4)
        align_or_copy->>align_or_copy: map(f32::from_le_bytes)
        align_or_copy->>align_or_copy: collect into Vec<f32>
        align_or_copy-->>Client: Cow::Owned(Vec<f32>)
        Note over Client,align_or_copy: Fallback: allocated copy
    end

Common Patterns

Zero-Copy Data Access

Utilities like align_or_copy enable zero-copy access patterns when memory alignment allows:

Sources: src/utils/align_or_copy.rs:44-73 simd-r-drive-entry-handle/src/constants.rs:13-18

Namespace-Based Key Management

The NamespaceHasher utility enables hierarchical key organization:

Sources: src/utils.rs:11-12

Size Formatting for Logging

The format_bytes utility provides human-readable output:

Input BytesFormatted Output
1023"1023 B"
1024"1.00 KB"
1048576"1.00 MB"
1073741824"1.00 GB"

Sources: src/utils.rs:7-8

Configuration Parsing

The parse_buffer_size utility handles size string inputs:

Input StringParsed Bytes
"64"64
"64KB"65,536
"1MB"1,048,576
"2GB"2,147,483,648

Sources: src/utils.rs:13-14

Integration with Core Systems

Relationship to Storage Engine

Sources: extensions/Cargo.toml:1-22 src/utils.rs:1-17 simd-r-drive-entry-handle/src/lib.rs:1-10

Performance Considerations

UtilityPerformance ImpactUse Case
align_or_copyZero-copy when alignedDeserializing typed arrays from storage
NamespaceHasherSingle XXH3 hashGenerating hierarchical keys
format_bytesString allocationLogging and user display only
PAYLOAD_ALIGNMENTEnables SIMD opsCore storage layout requirement

Sources: src/utils/align_or_copy.rs:1-74 simd-r-drive-entry-handle/src/constants.rs:13-18


Development Guide

Relevant source files

This document provides an overview of the development process for SIMD R Drive, covering workspace structure, feature flags, build commands, and development workflows. This page serves as an entry point for contributors and developers working on the codebase.

For detailed instructions on building and running tests, see Building and Testing. For information about the CI/CD pipeline and automated testing, see CI/CD Pipeline. For version history and migration guides, see Version History and Migration.

Prerequisites

SIMD R Drive requires:

  • Rust : Edition 2024 (as specified in Cargo.toml4)
  • Cargo : Workspace-aware build system
  • Platform Support : Linux, macOS, Windows (tested via CI matrix)

Optional tools for specific workflows:

Workspace Structure

SIMD R Drive uses a Cargo workspace architecture with multiple crates organized by function. The workspace is defined in Cargo.toml:65-78 with specific members included and Python bindings excluded from the main workspace.

graph TB
    Root["simd-r-drive\n(root crate)"]
EntryHandle["simd-r-drive-entry-handle\n(zero-copy data access)"]
Extensions["extensions\n(utilities and helpers)"]
subgraph "Network Experiments"
        WSServer["experiments/simd-r-drive-ws-server\n(WebSocket server)"]
WSClient["experiments/simd-r-drive-ws-client\n(Rust client)"]
ServiceDef["experiments/simd-r-drive-muxio-service-definition\n(RPC contract)"]
end
    
    subgraph "Python Bindings (Excluded)"
        PyBindings["experiments/bindings/python\n(PyO3 bindings)"]
PyWSClient["experiments/bindings/python-ws-client\n(Python WebSocket client)"]
end
    
 
   Root --> EntryHandle
 
   Root --> Extensions
 
   WSServer --> Root
 
   WSServer --> ServiceDef
 
   WSClient --> ServiceDef
 
   PyWSClient --> WSClient
 
   PyBindings --> Root
    
    style Root stroke-width:3px
    style PyBindings stroke-dasharray: 5 5
    style PyWSClient stroke-dasharray: 5 5

Workspace Members Diagram

Sources: Cargo.toml:65-78

Workspace Member Descriptions

Crate PathPurposeVersion Source
.Core storage engine with CLICargo.toml16
simd-r-drive-entry-handleZero-copy entry access abstractionCargo.toml83
extensionsUtility functions and helper modulesCargo.toml69
experiments/simd-r-drive-ws-serverAxum-based WebSocket serverCargo.toml70
experiments/simd-r-drive-ws-clientNative Rust WebSocket clientCargo.toml71
experiments/simd-r-drive-muxio-service-definitionRPC service contractCargo.toml72
experiments/bindings/pythonPyO3 direct bindings (excluded)Cargo.toml75
experiments/bindings/python-ws-clientPython WebSocket client (excluded)Cargo.toml76

Python bindings are excluded from the main workspace because they use Maturin's build system, which conflicts with Cargo's workspace resolver. They must be built separately using maturin build or maturin develop.

Sources: Cargo.toml:74-77

Feature Flags

The core simd-r-drive crate exposes multiple feature flags that enable optional functionality. Features are defined in Cargo.toml:49-55

Feature Dependencies Diagram

Sources: Cargo.toml:49-56

Feature Flag Reference

FeatureDependenciesPurposeCommon Use Cases
defaultNoneMinimal feature setProduction builds
expose-internal-apiNoneExposes internal structs/functions for testingTest harnesses, benchmarks
parallelrayon = "1.10.0"Enables parallel iteration via RayonHigh-throughput batch processing
arrowsimd-r-drive-entry-handle/arrowProxy feature for Apache Arrow integrationData analytics pipelines

The arrow feature is a proxy feature: enabling simd-r-drive/arrow automatically enables simd-r-drive-entry-handle/arrow, allowing users to enable Arrow support without knowing the internal crate structure.

Sources: Cargo.toml:49-55 Cargo.toml30 Cargo.toml:54-55

Common Development Commands

Building

CommandPurposeFeature Flags Used
cargo buildBuild with default featuresNone
cargo build --workspaceBuild all workspace membersNone
cargo build --all-targetsBuild binaries, libs, tests, benchesNone
cargo build --releaseOptimized release buildNone
cargo build --features parallelBuild with parallel iterationparallel
cargo build --all-featuresBuild with all features enabledAll
cargo build --no-default-featuresMinimal build (no features)None

Testing

CommandPurposeScope
cargo testRun core crate testsCurrent crate only
cargo test --workspaceRun all workspace testsAll members
cargo test --all-targetsTest all targets (lib, bins, tests, benches)Current crate
cargo test --features expose-internal-apiTest with internal APIs exposedFeature-specific
cargo test -- --test-threads=1Sequential test executionUseful for serial_test cases

Benchmarking

The crate includes two Criterion-based benchmarks defined in Cargo.toml:57-63:

BenchmarkCommandPurpose
storage_benchmarkcargo bench --bench storage_benchmarkMeasure write/read throughput
contention_benchmarkcargo bench --bench contention_benchmarkMeasure concurrent access performance

To compile benchmarks without running them (useful for CI): cargo bench --no-run

Sources: Cargo.toml:57-63 .github/workflows/rust-tests.yml:60-61

Development Workflow

Standard Development Cycle

Sources: .github/workflows/rust-tests.yml:1-62

Multi-Platform Testing

The CI pipeline tests on three operating systems with multiple feature flag combinations. The test matrix is defined in .github/workflows/rust-tests.yml:14-30:

Matrix DimensionValues
OSubuntu-latest, macos-latest, windows-latest
FeaturesDefault, No Default Features, Parallel, Expose Internal API, Parallel + Expose API, All Features

This results in 18 test jobs (3 OS × 6 feature combinations) per CI run.

Sources: .github/workflows/rust-tests.yml:14-30

Ignored Files and Directories

The .gitignore:1-10 file specifies build artifacts and local files to exclude from version control:

| Pattern | Purpose |
|---|---|
| **/target | Cargo build output (all workspace members) |
| *.bin | Binary data files (test artifacts) |
| /data | Local data directory for debugging |
| out.txt | Temporary output file |
| .cargo/config.toml | Local Cargo configuration overrides |

The /data directory is explicitly ignored for local development and experimentation without polluting the repository.

Sources: .gitignore:1-10

Dependency Management

Workspace-Level Dependencies

Shared dependencies are defined in Cargo.toml:80-112 to ensure version consistency across workspace members:

Core Dependencies:

  • xxhash-rust = "0.8.15" with xxh3 and const_xxh3 features for hashing
  • memmap2 = "0.9.5" for memory-mapped file access
  • crc32fast = "1.4.2" for checksum validation
  • dashmap = "6.1.0" for concurrent hash maps

Optional Dependencies:

  • rayon = "1.10.0" (gated by parallel feature)
  • arrow = "57.0.0" (gated by arrow feature, default-features disabled)

Development Dependencies:

  • criterion = "0.6.0" for benchmarking
  • tempfile = "3.19.0" for test isolation
  • serial_test = "3.2.0" for sequential test execution
  • tokio = "1.45.1" for async testing (not used in core crate)

Sources: Cargo.toml:23-47 Cargo.toml:80-112

Intra-Workspace Dependencies

Workspace crates reference each other using path dependencies with explicit versions:
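A minimal sketch of such an entry (crate name and path from the table above; the exact version requirement in the repository may differ):

```toml
[dependencies]
simd-r-drive-entry-handle = { path = "./simd-r-drive-entry-handle", version = "0.15.5-alpha" }
```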

This ensures workspace members can be published independently to crates.io while maintaining version consistency during development.

Sources: Cargo.toml:81-85

Caching Strategy

The CI pipeline uses GitHub Actions cache to accelerate builds. The caching configuration is defined in .github/workflows/rust-tests.yml:40-51:

Cached Directories:

  • ~/.cargo/bin/ - Compiled binaries (cargo tools)
  • ~/.cargo/registry/index/ - Crate registry index
  • ~/.cargo/registry/cache/ - Downloaded crate archives
  • ~/.cargo/git/db/ - Git dependencies
  • target/ - Build artifacts

Cache Key Strategy:

  • Primary key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}-${{ matrix.flags }}
  • Fallback key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}-

This strategy provides OS-specific, feature-flag-specific caching with fallback to shared dependencies when feature flags change.

Sources: .github/workflows/rust-tests.yml:39-51

Publishing Configuration

All workspace crates share publishing configuration through Cargo.toml:1-9:

| Field | Value | Purpose |
|---|---|---|
| authors | ["Jeremy Harris <jeremy.harris@zenosmosis.com>"] | Crate metadata |
| version | "0.15.5-alpha" | Current release version |
| edition | "2024" | Rust edition |
| repository | "https://github.com/jzombie/rust-simd-r-drive" | Source location |
| license | "Apache-2.0" | Open source license |
| categories | ["database-implementations", "data-structures", "filesystem"] | crates.io categories |
| keywords | ["storage-engine", "binary-storage", "append-only", "simd", "mmap"] | Search keywords |
| publish | true | Enable publishing to crates.io |

Individual crates inherit these values using workspace inheritance: edition.workspace = true.

Sources: Cargo.toml:1-21

Next Steps

  • Building and Testing : See Building and Testing for detailed instructions on running unit tests, integration tests, concurrency tests, and benchmarks.
  • CI/CD Pipeline : See CI/CD Pipeline for information about GitHub Actions workflows, matrix testing strategy, and quality checks.
  • Version History and Migration : See Version History and Migration for breaking changes, migration paths, and storage format evolution.

For information about specific subsystems:


Building and Testing

Relevant source files

This page provides instructions for building SIMD R Drive from source, running the test suite, and executing performance benchmarks. It covers workspace configuration, feature flags, test types, and benchmark utilities. For information about CI/CD pipeline configuration and automated quality checks, see CI/CD Pipeline.

Prerequisites

Required Dependencies:

  • Rust toolchain (stable channel)
  • Cargo package manager (bundled with Rust)

Optional Dependencies:

  • Python 3.10-3.13 (for Python bindings)
  • Maturin (for building Python wheels)

The project uses Rust 2024 edition and requires no platform-specific dependencies beyond the standard Rust toolchain.

Sources: Cargo.toml:1-13


Workspace Structure

SIMD R Drive is organized as a Cargo workspace with multiple crates. Understanding the workspace layout is essential for targeted builds and tests.

Workspace Members:

graph TB
    subgraph "Workspace Root"
        Root["Cargo.toml\nWorkspace Configuration"]
end
    
    subgraph "Core Crates"
        Core["simd-r-drive\n(Root Package)"]
EntryHandle["simd-r-drive-entry-handle\n(Zero-Copy API)"]
Extensions["extensions/\n(Utility Functions)"]
end
    
    subgraph "Network Experiments"
        WSServer["experiments/simd-r-drive-ws-server\n(WebSocket Server)"]
WSClient["experiments/simd-r-drive-ws-client\n(Rust Client)"]
ServiceDef["experiments/simd-r-drive-muxio-service-definition\n(RPC Contract)"]
end
    
    subgraph "Python Bindings (Excluded)"
        PyBindings["experiments/bindings/python\n(PyO3 Direct)"]
PyWSClient["experiments/bindings/python-ws-client\n(WebSocket Client)"]
end
    
 
   Root --> Core
 
   Root --> EntryHandle
 
   Root --> Extensions
 
   Root --> WSServer
 
   Root --> WSClient
 
   Root --> ServiceDef
    
 
   PyBindings -.->|excluded from workspace| Root
 
   PyWSClient -.->|excluded from workspace| Root
    
 
   Core --> EntryHandle
 
   WSServer --> ServiceDef
 
   WSClient --> ServiceDef

| Crate | Path | Purpose |
|---|---|---|
| simd-r-drive | . | Core storage engine |
| simd-r-drive-entry-handle | ./simd-r-drive-entry-handle | Zero-copy data access API |
| simd-r-drive-extensions | ./extensions | Utility functions and helpers |
| simd-r-drive-ws-server | ./experiments/simd-r-drive-ws-server | Axum-based WebSocket server |
| simd-r-drive-ws-client | ./experiments/simd-r-drive-ws-client | Native Rust WebSocket client |
| simd-r-drive-muxio-service-definition | ./experiments/simd-r-drive-muxio-service-definition | RPC service contract |

Excluded from Workspace:

  • experiments/bindings/python - Python direct bindings (built separately with Maturin)
  • experiments/bindings/python-ws-client - Python WebSocket client (built separately with Maturin)

Python bindings are excluded because they require a different build system (Maturin) and have incompatible dependency resolution requirements.

Sources: Cargo.toml:65-78


Building the Project

Basic Build Commands

Build all workspace members:
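```bash
cargo build --workspace
```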

Build with release optimizations:
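```bash
cargo build --release
```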

Build specific crate:
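```bash
# e.g. only the zero-copy entry-handle crate
cargo build -p simd-r-drive-entry-handle
```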

Build all targets (binaries, libraries, tests, benchmarks):
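```bash
cargo build --workspace --all-targets
```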

Sources: .github/workflows/rust-tests.yml:54

graph LR
    subgraph "Feature Flags"
        Default["default\n(empty set)"]
Parallel["parallel\nEnables rayon"]
ExposeAPI["expose-internal-api\nExposes internal symbols"]
Arrow["arrow\nArrow integration"]
AllFeatures["--all-features\nEnable everything"]
end
    
    subgraph "Dependencies Enabled"
        Rayon["rayon = 1.10.0\nParallel iteration"]
EntryArrow["simd-r-drive-entry-handle/arrow\nArrow conversions"]
end
    
 
   Parallel --> Rayon
 
   Arrow --> EntryArrow
 
   AllFeatures --> Parallel
 
   AllFeatures --> ExposeAPI
 
   AllFeatures --> Arrow

Feature Flags

The simd-r-drive crate supports optional feature flags that enable additional functionality:

Feature Flag Reference:

| Feature | Dependencies | Purpose |
|---|---|---|
| default | None | Standard storage engine only |
| parallel | rayon = "1.10.0" | Parallel iteration support for bulk operations |
| expose-internal-api | None | Exposes internal APIs for advanced use cases |
| arrow | simd-r-drive-entry-handle/arrow | Enables Apache Arrow integration |

Build Examples:
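```bash
cargo build --features parallel
cargo build --features arrow
cargo build --all-features
cargo build --no-default-features
```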

Sources: Cargo.toml:49-55 Cargo.toml:30

CLI Binary

The workspace includes a CLI binary for direct interaction with the storage engine:
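For example (assuming the binary exposes a conventional --help flag):

```bash
cargo build --release
./target/release/simd-r-drive --help
```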

The CLI entry point is defined at src/main.rs:1-12 and delegates to command execution logic in the cli module.

Sources: src/main.rs:1-12


Running Tests

Test Organization

Test Categories:

  1. Unit Tests - Embedded in source files, test individual functions and modules
  2. Integration Tests - Located in tests/ directory, test public API interactions
  3. Concurrency Tests - Multi-threaded scenarios validating thread safety
  4. Documentation Tests - Code examples in documentation comments

Sources: Cargo.toml:36-47

Running All Tests

Execute the complete test suite:
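```bash
cargo test --workspace --all-targets --verbose
```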

Sources: .github/workflows/rust-tests.yml:57

Concurrency Tests

Concurrency tests validate thread-safe operations and are located in tests/concurrency_tests.rs:1-230. These tests use serial_test to prevent parallel execution and tokio for the async runtime.

Test Scenarios:

graph TB
    subgraph "Concurrency Test Structure"
        TestAttr["#[tokio::test(flavor=multi_thread)]\n#[serial]"]
TempDir["tempfile::tempdir()\nIsolated Storage"]
DataStore["Arc&lt;DataStore&gt;\nShared Storage"]
Tasks["tokio::spawn\nConcurrent Tasks"]
end
    
    subgraph "Test Scenarios"
        StreamTest["concurrent_slow_streamed_write_test\nParallel stream writes"]
WriteTest["concurrent_write_test\n16 threads × 10 writes"]
InterleavedTest["interleaved_read_write_test\nRead/Write coordination"]
end
    
 
   TestAttr --> TempDir
 
   TempDir --> DataStore
 
   DataStore --> Tasks
    
 
   Tasks --> StreamTest
 
   Tasks --> WriteTest
 
   Tasks --> InterleavedTest

1. Concurrent Slow Streamed Write Test (tests/concurrency_tests.rs:16-109):

  • Simulates slow streaming writes from multiple threads
  • Uses SlowReader wrapper to introduce artificial latency
  • Validates data integrity after concurrent stream completion
  • Configuration: 1MB payloads with 100ms read delays

2. Concurrent Write Test (tests/concurrency_tests.rs:111-161):

  • Spawns 16 threads, each performing 10 writes
  • Tests high-contention write scenarios
  • Validates all 160 writes are retrievable
  • Uses 5ms delays to simulate realistic timing

3. Interleaved Read/Write Test (tests/concurrency_tests.rs:163-229):

  • Tests read-after-write and write-after-read patterns
  • Uses tokio::sync::Notify for coordination
  • Validates proper synchronization between readers and writers

Run Concurrency Tests:
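```bash
# Run only the concurrency test file; serial_test keeps these cases sequential
cargo test --test concurrency_tests
```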

Key Dependencies:

| Dependency | Purpose | Reference |
|---|---|---|
| serial_test = "3.2.0" | Prevents parallel test execution | Cargo.toml:44 |
| tokio (with rt-multi-thread, macros) | Async runtime for concurrent tests | Cargo.toml:47 |
| tempfile = "3.19.0" | Temporary file creation for isolated tests | Cargo.toml:45 |

Sources: tests/concurrency_tests.rs:1-230 Cargo.toml:44-47


graph LR
    subgraph "Benchmark Configuration"
        CargoBench["cargo bench\nCriterion Runner"]
StorageBench["storage_benchmark\nharness = false"]
ContentionBench["contention_benchmark\nharness = false"]
end
    
    subgraph "Storage Benchmark Operations"
        AppendOp["benchmark_append_entries\n1M writes in batches"]
SeqRead["benchmark_sequential_reads\nZero-copy iteration"]
RandRead["benchmark_random_reads\n1M random lookups"]
BatchRead["benchmark_batch_reads\nVectorized reads"]
end
    
 
   CargoBench --> StorageBench
 
   CargoBench --> ContentionBench
    
 
   StorageBench --> AppendOp
 
   StorageBench --> SeqRead
 
   StorageBench --> RandRead
 
   StorageBench --> BatchRead

Running Benchmarks

The workspace includes micro-benchmarks using the Criterion framework to measure storage engine performance.

Benchmark Targets


  1. storage_benchmark - Single-process micro-benchmarks (benches/storage_benchmark.rs:1-234)
  2. contention_benchmark - Multi-threaded contention scenarios (referenced in Cargo.toml:62-63)

Both use harness = false to disable Cargo's default benchmark harness and use custom timing logic.

Sources: Cargo.toml:57-63

Storage Benchmark

The storage benchmark (benches/storage_benchmark.rs:1-234) measures four critical operations:

Configuration Constants:

| Constant | Value | Purpose |
|---|---|---|
| NUM_ENTRIES | 1,000,000 | Total entries for write phase |
| ENTRY_SIZE | 8 bytes | Fixed payload size |
| WRITE_BATCH_SIZE | 1,024 | Entries per batch write |
| READ_BATCH_SIZE | 1,024 | Entries per batch read |
| NUM_RANDOM_CHECKS | 1,000,000 | Random read operations |
| NUM_BATCH_CHECKS | 1,000,000 | Batch read operations |

Benchmark Operations:

1. Append Entries (benches/storage_benchmark.rs:52-83):

  • Writes 1M entries in batches using batch_write
  • Measures write throughput (writes/second)
  • Uses fixed 8-byte little-endian payloads

2. Sequential Reads (benches/storage_benchmark.rs:98-118):

  • Iterates through all entries using zero-copy iteration
  • Validates data integrity during iteration
  • Measures sequential read throughput

3. Random Reads (benches/storage_benchmark.rs:124-149):

  • Performs 1M random single-key lookups
  • Uses rand::rng().random_range() for key selection
  • Validates retrieved data matches expected values

4. Batch Reads (benches/storage_benchmark.rs:155-181):

  • Performs vectorized reads using batch_read
  • Processes reads in batches of 1,024 keys
  • Measures amortized batch read performance

Run Benchmarks:
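```bash
cargo bench --bench storage_benchmark
cargo bench --bench contention_benchmark

# Compile benchmarks without running them (as done in CI)
cargo bench --workspace --no-run
```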

Benchmark Output Format:

The storage benchmark uses a custom fmt_rate function (benches/storage_benchmark.rs:220-233) to format throughput metrics with thousands separators and three decimal places:

Wrote 1,000,000 entries of 8 bytes in 2.345s (426,439.232 writes/s)
Sequentially read 1,000,000 entries in 1.234s (810,372.771 reads/s)
Randomly read 1,000,000 entries in 3.456s (289,351.852 reads/s)
Batch-read verified 1,000,000 entries in 0.987s (1,013,171.005 reads/s)

Sources: benches/storage_benchmark.rs:1-234 Cargo.toml:57-63


graph TB
    subgraph "CI Test Matrix"
        Trigger["Push to main\nor Pull Request"]
end
    
    subgraph "Operating Systems"
        Ubuntu["ubuntu-latest"]
MacOS["macos-latest"]
Windows["windows-latest"]
end
    
    subgraph "Feature Configurations"
        Default["Default (empty)"]
NoDefault["--no-default-features"]
Parallel["--features parallel"]
ExposeAPI["--features expose-internal-api"]
Combined["--features=parallel,expose-internal-api"]
AllFeatures["--all-features"]
end
    
    subgraph "Build Steps"
        Cache["Cache Cargo dependencies"]
Build["cargo build --workspace --all-targets"]
Test["cargo test --workspace --all-targets --verbose"]
BenchCheck["cargo bench --workspace --no-run"]
end
    
 
   Trigger --> Ubuntu
 
   Trigger --> MacOS
 
   Trigger --> Windows
    
 
   Ubuntu --> Default
 
   Ubuntu --> NoDefault
 
   Ubuntu --> Parallel
 
   Ubuntu --> ExposeAPI
 
   Ubuntu --> Combined
 
   Ubuntu --> AllFeatures
    
 
   Default --> Cache
 
   Cache --> Build
 
   Build --> Test
 
   Test --> BenchCheck

CI/CD Test Matrix

The GitHub Actions workflow runs tests across multiple platforms and feature combinations to ensure compatibility.

Test Matrix Configuration:

| OS | Feature Flags |
|---|---|
| Ubuntu, macOS, Windows | Default |
| Ubuntu, macOS, Windows | --no-default-features |
| Ubuntu, macOS, Windows | --features parallel |
| Ubuntu, macOS, Windows | --features expose-internal-api |
| Ubuntu, macOS, Windows | --features=parallel,expose-internal-api |
| Ubuntu, macOS, Windows | --all-features |

Total Test Combinations: 3 OS × 6 feature configs = 18 test jobs

CI Pipeline Steps:

  1. Cache Dependencies (.github/workflows/rust-tests.yml:40-51) - Caches ~/.cargo and target/ directories using lock file hash
  2. Build (.github/workflows/rust-tests.yml:53-54) - Compiles all workspace targets with specified feature flags
  3. Test (.github/workflows/rust-tests.yml:56-57) - Runs complete test suite with verbose output
  4. Check Benchmarks (.github/workflows/rust-tests.yml:60-61) - Validates benchmark compilation without execution

Caching Strategy:

The workflow uses cache keys based on:

  • Operating system (runner.os)
  • Lock file hash (hashFiles('**/Cargo.lock'))
  • Feature flags (matrix.flags)

This ensures efficient dependency reuse while avoiding cross-contamination between different build configurations.

Sources: .github/workflows/rust-tests.yml:1-62


Development Dependencies

Test Dependencies

| Dependency | Version | Purpose |
|---|---|---|
| serial_test | 3.2.0 | Sequential test execution for concurrency tests |
| tempfile | 3.19.0 | Temporary file/directory creation |
| tokio | 1.45.1 | Async runtime with multi-thread support |
| rand | 0.9.0 | Random number generation for benchmarks |
| bincode | 1.3.3 | Serialization for test data |
| serde | 1.0.219 | Serialization framework |
| serde_json | 1.0.140 | JSON serialization for test data |
| futures | 0.3.31 | Async utilities |
| bytemuck | 1.23.2 | Zero-copy type conversions |

Sources: Cargo.toml:36-47

Benchmark Dependencies

| Dependency | Version | Purpose |
|---|---|---|
| criterion | 0.6.0 | Benchmark framework (workspace dependency) |
| thousands | 0.2.0 | Number formatting for benchmark output |
| rand | 0.9.0 | Random data generation |

Sources: Cargo.toml:39 Cargo.toml:46 Cargo.toml:98 Cargo.toml:108


Build Artifacts

Binary Output

Compiled binaries are located in:

  • Debug builds: target/debug/simd-r-drive
  • Release builds: target/release/simd-r-drive

Library Output

The workspace produces several library artifacts:

  • libsimd_r_drive.rlib - Core storage engine
  • libsimd_r_drive_entry_handle.rlib - Zero-copy API
  • libsimd_r_drive_extensions.rlib - Utility functions

Ignored Files

The .gitignore configuration excludes build artifacts and data files:

  • **/target - All Cargo build directories
  • *.bin - Binary storage files created during testing
  • /data - Development and experimentation data directory
  • .cargo/config.toml - Local Cargo overrides

Sources: .gitignore:1-10


Best Practices

Pre-Commit Checklist:

  1. Run cargo test --workspace --all-targets to validate all tests pass
  2. Run cargo bench --workspace --no-run to ensure benchmarks compile
  3. Run cargo build --all-features to validate all feature combinations
  4. Check that no test data files (.bin, /data) are committed

Performance Validation:

  • Use cargo bench --bench storage_benchmark to establish performance baselines
  • Compare throughput metrics before and after optimization changes
  • Monitor memory usage during concurrency tests

Feature Flag Testing:

  • Always test with --no-default-features to ensure minimal builds work
  • Test all feature combinations that will be published
  • Verify that feature flags are properly gated with #[cfg(feature = "...")]

Sources: .github/workflows/rust-tests.yml:1-62 benches/storage_benchmark.rs:1-234


CI/CD Pipeline

Relevant source files

Purpose and Scope

This document describes the Continuous Integration and Continuous Deployment (CI/CD) infrastructure for the SIMD R Drive project. It covers the GitHub Actions workflows that enforce code quality, run automated tests, and validate security compliance. The CI/CD system ensures that all code changes meet quality standards before being merged.

For information about building and running tests locally, see Building and Testing. For version-specific breaking changes and migration strategies, see Version History and Migration.


CI/CD Architecture Overview

The SIMD R Drive project uses two primary GitHub Actions workflows to validate all code changes:

Diagram: CI/CD Workflow Architecture

graph TB
    subgraph "Trigger Events"
        Push["Push to main"]
PR["Pull Request"]
Tag["Tag push (v*)"]
end
    
    subgraph "rust-tests.yml"
        TestMatrix["Matrix Testing"]
BuildStep["cargo build"]
TestStep["cargo test"]
BenchCheck["cargo bench --no-run"]
Cache["Cargo Cache"]
end
    
    subgraph "rust-lint.yml"
        FmtCheck["cargo fmt --check"]
ClippyCheck["cargo clippy"]
DocCheck["cargo doc"]
DenyCheck["cargo-deny check"]
AuditCheck["cargo-audit"]
end
    
 
   Push --> TestMatrix
 
   Push --> FmtCheck
 
   PR --> TestMatrix
 
   PR --> FmtCheck
 
   Tag --> TestMatrix
    
 
   TestMatrix --> Cache
 
   Cache --> BuildStep
 
   BuildStep --> TestStep
 
   TestStep --> BenchCheck
    
 
   FmtCheck --> ClippyCheck
 
   ClippyCheck --> DocCheck
 
   DocCheck --> DenyCheck
 
   DenyCheck --> AuditCheck
    
    style TestMatrix fill:#f9f9f9
    style FmtCheck fill:#f9f9f9

The system runs two independent workflows in parallel. The rust-tests.yml workflow performs functional validation through matrix testing across multiple platforms and feature configurations. The rust-lint.yml workflow enforces code quality standards, documentation completeness, and security compliance. All checks must pass before a pull request can be merged.

Sources: .github/workflows/rust-tests.yml:1-62 .github/workflows/rust-lint.yml:1-44


Test Workflow (rust-tests.yml)

Workflow Configuration

The test workflow executes on three trigger events:

| Trigger | Condition | Purpose |
|---|---|---|
| push | Branches: main | Validate main branch integrity |
| push | Tags: v* | Validate release candidates |
| pull_request | Branches: main | Validate proposed changes |

Sources: .github/workflows/rust-tests.yml:3-8

Matrix Testing Strategy

The workflow uses a comprehensive matrix strategy to test across multiple dimensions:

Diagram: Matrix Testing Dimensions

graph TB
    subgraph "Operating Systems"
        Ubuntu["ubuntu-latest"]
MacOS["macos-latest"]
Windows["windows-latest"]
end
    
    subgraph "Feature Configurations"
        Default["Default\nflags: (empty)"]
NoDefault["No Default Features\nflags: --no-default-features"]
Parallel["Parallel\nflags: --features parallel"]
ExposeAPI["Expose Internal API\nflags: --features expose-internal-api"]
ParallelAPI["Parallel + Expose API\nflags: --features=parallel,expose-internal-api"]
AllFeatures["All Features\nflags: --all-features"]
end
    
    subgraph "Test Job Matrix"
        Job["test job\nOS x Features\n3 x 6 = 18 configurations"]
end
    
 
   Ubuntu --> Job
 
   MacOS --> Job
 
   Windows --> Job
    
 
   Default --> Job
 
   NoDefault --> Job
 
   Parallel --> Job
 
   ExposeAPI --> Job
 
   ParallelAPI --> Job
 
   AllFeatures --> Job

Each job runs the complete build-test-benchmark pipeline with a unique combination of operating system and feature flags, resulting in 18 total test configurations per workflow execution.

Sources: .github/workflows/rust-tests.yml:13-30

Test Job Steps

Each matrix job executes the following steps:

1. Repository Checkout

Sources: .github/workflows/rust-tests.yml:33-34

2. Rust Toolchain Installation

Uses the stable Rust toolchain across all platforms.

Sources: .github/workflows/rust-tests.yml:36-37

3. Dependency Caching

The workflow caches Cargo dependencies to reduce build times:

| Cache Path | Contents |
|---|---|
| ~/.cargo/bin/ | Installed cargo binaries |
| ~/.cargo/registry/index/ | Crate registry index |
| ~/.cargo/registry/cache/ | Downloaded crate archives |
| ~/.cargo/git/db/ | Git dependencies |
| target/ | Build artifacts |

The cache key includes the OS, Cargo.lock hash, and matrix flags to ensure correct cache isolation between configurations.

Sources: .github/workflows/rust-tests.yml:40-52

4. Build Step

Builds all workspace crates with all targets (library, binaries, tests, benches, examples) using the matrix-specified feature flags.

Sources: .github/workflows/rust-tests.yml:54-55

5. Test Execution

Runs all tests in the workspace with verbose output, including:

  • Unit tests
  • Integration tests
  • Documentation tests
  • Example tests

Sources: .github/workflows/rust-tests.yml:57-58

6. Benchmark Compilation Check

Validates that all benchmarks compile successfully; the --no-run flag ensures they are built but never executed in CI.

Sources: .github/workflows/rust-tests.yml:60-62

Matrix Configuration Table

| Matrix Variable | Values | Count |
|---|---|---|
| os | ubuntu-latest, macos-latest, windows-latest | 3 |
| name | Default, No Default Features, Parallel, Expose Internal API, Parallel + Expose API, All Features | 6 |
| flags | "", "--no-default-features", "--features parallel", "--features expose-internal-api", "--features=parallel,expose-internal-api", "--all-features" | 6 |
| Total Jobs | OS × Features | 18 |

Sources: .github/workflows/rust-tests.yml:14-30


Lint Workflow (rust-lint.yml)

Workflow Configuration

The lint workflow runs on all pushes and pull requests without branch restrictions:

Sources: .github/workflows/rust-lint.yml:3

Lint Job Architecture

Diagram: Lint Workflow Execution Pipeline

The lint workflow enforces five quality gates in sequence, with each step dependent on the previous step's success.

Sources: .github/workflows/rust-lint.yml:5-44

Quality Gate 1: Code Formatting

Validates that all Rust code adheres to the standard formatting rules defined by rustfmt. The --check flag ensures the command fails if any files require formatting without modifying them.
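The corresponding command is:

```bash
cargo fmt --check
```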

Sources: .github/workflows/rust-lint.yml:26-27

Quality Gate 2: Clippy Linting

Runs Clippy across the entire workspace with maximum coverage:

  • --workspace: Checks all workspace crates
  • --all-targets: Includes lib, bins, tests, benches, examples
  • --all-features: Enables all optional features
  • -D warnings: Treats all warnings as errors
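Putting those flags together, the gate runs:

```bash
cargo clippy --workspace --all-targets --all-features -- -D warnings
```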

Sources: .github/workflows/rust-lint.yml:29-31

Quality Gate 3: Documentation Validation

Generates documentation and fails on any documentation warnings:

  • RUSTDOCFLAGS="-D warnings": Treats doc warnings as errors
  • --no-deps: Only documents workspace crates, not dependencies
  • --document-private-items: Includes private items to ensure complete internal documentation
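Combined, the documentation gate is approximately:

```bash
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --document-private-items
```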

Sources: .github/workflows/rust-lint.yml:33-35

Quality Gate 4: Dependency Security Check

Runs cargo-deny to validate dependency security and licensing:

  • Checks for security advisories in dependencies
  • Validates license compatibility
  • Detects duplicate dependencies
  • Enforces allowed/denied crate policies

The tool requires prior installation via cargo install cargo-deny.
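For example:

```bash
cargo install cargo-deny
cargo deny check
```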

Sources: .github/workflows/rust-lint.yml:20-22 .github/workflows/rust-lint.yml:37-39

Quality Gate 5: Vulnerability Audit

Scans Cargo.lock for known security vulnerabilities in dependencies by checking against the RustSec Advisory Database. Requires prior installation via cargo install cargo-audit.
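For example:

```bash
cargo install cargo-audit
cargo audit
```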

Sources: .github/workflows/rust-lint.yml:20-23 .github/workflows/rust-lint.yml:41-43

Component Installation Workaround

The workflow includes a workaround for a GitHub Actions environment issue:

This step explicitly installs rustfmt and clippy components to address a toolchain installation issue where these components may not be available by default.
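The equivalent local command is:

```bash
rustup component add rustfmt clippy
```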

Sources: .github/workflows/rust-lint.yml:13-18


Quality Gates Summary

The complete CI/CD pipeline enforces the following quality gates:

| Gate | Workflow | Command | Failure Condition |
|---|---|---|---|
| Build | rust-tests.yml | cargo build | Compilation errors |
| Tests | rust-tests.yml | cargo test | Test failures |
| Benchmarks | rust-tests.yml | cargo bench --no-run | Benchmark compilation errors |
| Formatting | rust-lint.yml | cargo fmt --check | Unformatted code |
| Linting | rust-lint.yml | cargo clippy -D warnings | Clippy warnings or errors |
| Documentation | rust-lint.yml | cargo doc -D warnings | Missing or invalid docs |
| Dependencies | rust-lint.yml | cargo deny check | Security advisories, license issues |
| Vulnerabilities | rust-lint.yml | cargo audit | Known CVEs in dependencies |

All quality gates must pass for the CI/CD pipeline to succeed. The matrix testing strategy in rust-tests.yml ensures compatibility across three operating systems and six feature configurations, providing comprehensive validation coverage.

Sources: .github/workflows/rust-tests.yml:54-62 .github/workflows/rust-lint.yml:26-43


Workflow Execution Context

Runner Configuration

Both workflows execute on GitHub-hosted runners:

| Workflow | Runner | OS |
|---|---|---|
| rust-tests.yml | ubuntu-latest, macos-latest, windows-latest | Multi-platform |
| rust-lint.yml | ubuntu-latest | Linux only |

Sources: .github/workflows/rust-tests.yml:13 .github/workflows/rust-lint.yml:7

Fail-Fast Strategy

The test workflow uses fail-fast: false to allow all matrix jobs to complete even if one fails, providing comprehensive failure information:

This configuration is valuable for identifying platform-specific or feature-specific issues across the entire matrix rather than stopping at the first failure.

Sources: .github/workflows/rust-tests.yml:15


graph LR
    LocalDev["Local Development"]
Commit["git commit"]
Push["git push"]
PR["Pull Request"]
TestsCI["rust-tests.yml\n18 matrix jobs"]
LintCI["rust-lint.yml\n5 quality gates"]
Merge["Merge to main"]
Tag["Tag release (v*)"]
LocalDev --> Commit
 
   Commit --> Push
 
   Push --> PR
    
 
   PR --> TestsCI
 
   PR --> LintCI
    
 
   TestsCI -->|Pass| Merge
 
   LintCI -->|Pass| Merge
    
 
   Merge --> Tag
 
   Tag --> TestsCI

Integration with Development Workflow

The CI/CD pipeline integrates with the development workflow at multiple points:

Diagram: CI/CD Integration Points

The CI/CD system provides automated validation at multiple stages of the development lifecycle. Pull requests trigger both workflows, ensuring all quality gates pass before code review. Pushes to main and version tags also trigger the test workflow to validate the stability of the main branch and release candidates.

Sources: .github/workflows/rust-tests.yml:3-8 .github/workflows/rust-lint.yml:3


Ignored Paths

The repository excludes certain paths from version control that are relevant to CI/CD execution:

| Path | Purpose |
|---|---|
| **/target | Build artifacts generated by cargo |
| *.bin | Binary data files used in tests |
| /data | Debugging and experimentation data |
| .cargo/config.toml | Local cargo configuration overrides |

These exclusions prevent CI cache pollution and ensure consistent builds across environments.

Sources: .gitignore:1-11

Debug Assertions in CI

The codebase includes conditional debug assertions that activate during CI test runs:

The debug_assert_aligned and debug_assert_aligned_offset functions in simd-r-drive-entry-handle/src/debug_assert_aligned.rs:26-88 activate only when cfg(any(test, debug_assertions)) is true. In CI test runs, these assertions validate alignment invariants for memory-mapped payloads and zero-copy access patterns without impacting release builds or benchmarks.

Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-88


Maintenance and Evolution

The CI/CD pipeline configuration is versioned alongside the codebase and evolves with project requirements. Changes to the pipeline typically correspond to:

  • New feature flags requiring matrix coverage
  • Additional security tools or quality gates
  • Platform support changes
  • Performance optimization validation

Historical changes to the CI/CD infrastructure can be tracked through the repository's commit history in the .github/workflows/ directory.

For information about breaking changes that may affect CI/CD behavior, see Version History and Migration, particularly the alignment changes in version 0.15.0 that introduced new debug assertions.

Sources: CHANGELOG.md:25-51


Version History and Migration

Relevant source files

This page documents the version history of SIMD R Drive, detailing breaking changes, format evolutions, and migration procedures required when upgrading between versions. The primary focus is on storage format compatibility and the alignment strategy evolution that occurred between versions 0.13.x and 0.15.x.

For information about the current storage architecture and entry structure, see Entry Structure and Metadata. For details on the current alignment strategy and performance characteristics, see Payload Alignment and Cache Efficiency.


Version Overview

SIMD R Drive loosely follows semantic versioning and is currently in alpha (0.x.x-alpha). Breaking changes to the storage format have occurred as the alignment strategy evolved to optimize zero-copy access and SIMD performance.

| Version | Release Date | Status | Key Changes |
|---|---|---|---|
| 0.15.5-alpha | 2025-10-27 | Current | Apache Arrow 57.0.0, no format changes |
| 0.15.0-alpha | 2025-09-25 | Breaking | Alignment increased to 64 bytes |
| 0.14.0-alpha | 2025-09-08 | Breaking | Introduced 16-byte alignment with pre-pad |
| ≤0.13.x-alpha | Pre-2025-09 | Legacy | No alignment guarantees |

Sources: CHANGELOG.md:1-82


Storage Format Evolution

The most significant changes in SIMD R Drive's history involve the evolution of payload alignment strategy. These changes were driven by the need to optimize zero-copy access, SIMD operations, and CPU cache efficiency.

timeline
    title "Storage Format Alignment Evolution"
    section Pre-0.14
        No Alignment : Payloads written sequentially
                     : No pre-padding
                     : Variable starting addresses
    section 0.14.0-alpha
        16-byte Alignment : Introduced PAYLOAD_ALIGNMENT constant
                          : Added pre-pad bytes (0-15)
                          : SSE-friendly boundaries
    section 0.15.0-alpha
        64-byte Alignment : Increased to cache-line size
                          : Pre-pad bytes (0-63)
                          : AVX-512 and cacheline optimized

Timeline: Alignment Strategy Evolution

Format Comparison: Pre-0.14 vs 0.14+ vs 0.15+

Sources: CHANGELOG.md:55-82 CHANGELOG.md:25-52 README.md:51-59


Version Compatibility Matrix

Understanding compatibility between readers and writers is critical when managing SIMD R Drive deployments across multiple services or upgrading existing storage files.

Reader/Writer Compatibility

| Writer Version | ≤0.13.x Reader | 0.14.x Reader | 0.15.x Reader |
|---|---|---|---|
| ≤0.13.x | ✅ Compatible | ❌ Incompatible* | ❌ Incompatible* |
| 0.14.x | ❌ Incompatible | ✅ Compatible | ❌ Incompatible |
| 0.15.x | ❌ Incompatible | ❌ Incompatible | ✅ Compatible |

Legend:

  • Compatible: Reader can correctly parse the storage format
  • Incompatible: Reader will misinterpret pre-pad bytes as payload data
    • Newer readers may attempt to parse old formats but will fail validation

Incompatibility Root Cause

The incompatibility arises because:

  1. Pre-0.14 readers expect prev_offset to point directly to payload start
  2. 0.14+ formats insert 0-15 or 0-63 pre-pad bytes before the payload
  3. Old readers interpret these zero bytes as part of the payload, corrupting data

Sources: CHANGELOG.md:55-60 CHANGELOG.md:27-31


Migration Procedures

Migrating from 0.13.x or earlier to 0.14.x+

When upgrading from pre-alignment versions to any aligned version (0.14.x or 0.15.x), you must regenerate storage files.

flowchart TD
    Start["Start: Have 0.13.x store"]
ReadOld["Read all entries\nwith 0.13.x binary"]
CreateNew["Create new store\nwith 0.14.x+ binary"]
WriteNew["Write entries to\nnew store"]
Verify["Verify new store\nintegrity"]
Replace["Replace old file\nwith new file"]
Start --> ReadOld
 
   ReadOld --> CreateNew
 
   CreateNew --> WriteNew
 
   WriteNew --> Verify
 
   Verify -->|Success| Replace
 
   Verify -->|Failure| CreateNew
    
    MultiService{"Multi-service\ndeployment?"}
UpgradeReaders["Deploy reader upgrades\nto all services first"]
UpgradeWriters["Deploy writer upgrades\nafter readers"]
Replace --> MultiService
 
   MultiService -->|Yes| UpgradeReaders
 
   MultiService -->|No| Done["Migration Complete"]
UpgradeReaders --> UpgradeWriters
 
   UpgradeWriters --> Done

Migration Workflow

Step-by-Step Migration: 0.13.x → 0.14.x/0.15.x

  1. Prepare the old binary:

  2. Build the new binary:

  3. Extract data from old store:

  4. Create new store with new binary:

  5. Verify integrity:

  6. Replace old file:
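The original per-step commands are not reproduced here; the sketch below illustrates the overall flow with generic, illustrative commands (tag names, file names, and the extraction/re-write steps are placeholders that depend on your application):

```bash
# 1-2. Build old and new binaries from their respective tags (tag names illustrative)
git checkout <old-tag> && cargo build --release && cp target/release/simd-r-drive ./simd-r-drive-old
git checkout <new-tag> && cargo build --release && cp target/release/simd-r-drive ./simd-r-drive-new

# 3-4. Application-specific: read every entry with the old version, then re-write the
#      entries into a fresh store with the new version.

# 5. Verify the new store (application-specific integrity checks / spot reads).

# 6. Keep a backup, then swap the regenerated file into place.
cp old_store.bin old_store.bin.bak
mv new_store.bin old_store.bin
```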

Sources: CHANGELOG.md:75-82 CHANGELOG.md:43-52


Migrating from 0.14.x to 0.15.x

The migration from 0.14.x (16-byte alignment) to 0.15.x (64-byte alignment) follows the same regeneration process, as the alignment constants are incompatible.

Alignment Constant Changes

The key difference between versions:

| Version | Constant File | PAYLOAD_ALIGN_LOG2 | PAYLOAD_ALIGNMENT |
|---|---|---|---|
| 0.14.x | src/storage_engine/constants.rs | 4 | 16 bytes |
| 0.15.x | src/storage_engine/constants.rs | 6 | 64 bytes |

Migration Steps: 0.14.x → 0.15.x

Follow the same procedure as 0.13.x → 0.14.x migration:

  1. Read entries with 0.14.x binary
  2. Write to new store with 0.15.x binary
  3. Verify new store integrity
  4. Deploy reader upgrades before writer upgrades in multi-service environments

Rationale for 64-byte alignment:

  • Matches typical CPU cache line size
  • Enables full-speed AVX-512 operations
  • Provides safety margin for future SIMD extensions
  • Optimizes for zero-copy typed slices (see Payload Alignment and Cache Efficiency)

Sources: CHANGELOG.md:25-52 README.md:51-59 CHANGELOG.md:39


Version-Specific Details

Version 0.15.5-alpha (2025-10-27)

Type: Non-breaking maintenance release

Changes:

  • Updated Apache Arrow dependency from version 56.x to 57.0.0
  • No storage format changes
  • No API changes
  • No migration required

Compatibility: Fully compatible with 0.15.0-alpha through 0.15.4-alpha

Sources: CHANGELOG.md:19-22


Version 0.15.0-alpha (2025-09-25)

Type: Breaking storage format change

Primary Change: Increased default PAYLOAD_ALIGNMENT from 16 bytes to 64 bytes

Motivation:

  • Cache-line alignment for optimal memory access
  • Support for wider SIMD instructions (AVX-512)
  • Improved zero-copy performance for typed slices
  • Future-proofing for emerging CPU architectures

Breaking Changes:

  1. Storage files created with 0.15.x cannot be read by 0.14.x or earlier
  2. Pre-pad calculation changed: pad = (64 - (prev_tail % 64)) & 63
  3. Pointer and offset alignment assertions added in debug/test builds

New Features:

  • Debug-only alignment assertions via debug_assert_aligned() function
  • Offset-level alignment validation via debug_assert_aligned_offset()
  • Enhanced documentation for alignment guarantees

Code Changes:
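The change amounts to bumping one constant (shown here as a sketch; the exact declarations and types in src/storage_engine/constants.rs may differ):

```rust
// 0.14.x (sketch):
// pub const PAYLOAD_ALIGN_LOG2: usize = 4;                   // 2^4 = 16-byte alignment

// 0.15.x (sketch):
pub const PAYLOAD_ALIGN_LOG2: usize = 6;                      // 2^6 = 64-byte alignment
pub const PAYLOAD_ALIGNMENT: usize = 1 << PAYLOAD_ALIGN_LOG2; // 64
```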

Migration Path: See Migrating from 0.14.x to 0.15.x above.

Sources: CHANGELOG.md:25-52 simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-89


Version 0.14.0-alpha (2025-09-08)

Type: Breaking storage format change

Primary Change: Introduced fixed payload alignment with pre-padding

Motivation:

  • Enable zero-copy typed views (e.g., &[u16], &[u32], &[u64])
  • Guarantee SIMD-safe alignment for AVX2/NEON operations
  • Improve CPU cache utilization

Breaking Changes:

  1. New on-disk layout with optional pre-pad bytes (0-15 bytes)
  2. Files written by 0.14.0-alpha are incompatible with ≤0.13.x readers
  3. Pre-pad calculation: pad = (16 - (prev_tail % 16)) & 15

New Features:

  • Experimental arrow feature flag
  • EntryHandle::as_arrow_buffer() method for Apache Arrow integration
  • EntryHandle::into_arrow_buffer() method for zero-copy Arrow buffers
  • Configurable alignment via PAYLOAD_ALIGN_LOG2 and PAYLOAD_ALIGNMENT constants

Storage Layout:

[Pre-Pad: 0-15 bytes] [Payload: variable] [Metadata: 20 bytes]

Configuration:

  • Alignment controlled in src/storage_engine/constants.rs
  • PAYLOAD_ALIGN_LOG2 = 4 (2^4 = 16 bytes)
  • PAYLOAD_ALIGNMENT = 1 << PAYLOAD_ALIGN_LOG2
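The pre-pad rule quoted in the breaking changes can be expressed as a small helper (a sketch, not the engine's actual code); because it is written against any power-of-two alignment, the same expression covers both the 16-byte (0.14.x) and 64-byte (0.15.x) formats:

```rust
/// Number of zero bytes inserted before a payload so it starts on an aligned boundary.
/// Equivalent to `pad = (align - (prev_tail % align)) & (align - 1)` from the changelog.
fn pre_pad(prev_tail: u64, alignment: u64) -> u64 {
    debug_assert!(alignment.is_power_of_two());
    (alignment - (prev_tail % alignment)) & (alignment - 1)
}

fn main() {
    assert_eq!(pre_pad(100, 16), 12); // 0.14.x: next 16-byte boundary is 112
    assert_eq!(pre_pad(112, 16), 0);  // already aligned, no pre-pad needed
    assert_eq!(pre_pad(100, 64), 28); // 0.15.x: next 64-byte boundary is 128
}
```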

Migration Path: See Migrating from 0.13.x or earlier to 0.14.x+ above.

Sources: CHANGELOG.md:55-82 README.md:112-138


graph TB
    subgraph Phase1["Phase 1: Reader Upgrades"]
R1["Service A\n(Reader)"]
R2["Service B\n(Reader)"]
R3["Service C\n(Reader)"]
end
    
    subgraph Phase2["Phase 2: Writer Upgrades"]
W1["Service D\n(Writer)"]
W2["Service E\n(Writer)"]
end
    
    subgraph Phase3["Phase 3: Storage Migration"]
M1["Migrate shared\nstorage files"]
M2["Switch to new\nformat"]
end
    
    Start["Start: All services\non old version"]
Start --> Phase1
 
   Phase1 -->|Verify all readers operational| Phase2
 
   Phase2 -->|Verify all writers operational| Phase3
 
   Phase3 --> Complete["Complete: All services\non new version"]
Note1["Readers can handle\nold format during\ntransition"]
Note2["Writers continue\nproducing old format\nuntil migration"]
Phase1 -.-> Note1
 
   Phase2 -.-> Note2

Multi-Service Deployment Strategy

When upgrading SIMD R Drive across multiple services that share storage files or communicate via the WebSocket RPC interface, a staged deployment approach prevents compatibility issues.

Deployment Steps

  1. Phase 1 - Upgrade all readers:

    • Deploy new binaries to services that only read from storage
    • Deploy new WebSocket clients (Python or Rust)
    • Verify that services can still read existing (old format) data
    • Monitor for errors or compatibility issues
  2. Phase 2 - Upgrade writers (optional hold):

    • Deploy new binaries to services that write to storage
    • Important: Writers should continue using the old format during this phase
    • Verify writer functionality with old format
  3. Phase 3 - Migrate storage and switch format:

    • Use migration procedure to regenerate storage files
    • Update writer configuration to use new format
    • Monitor write operations for alignment correctness
  4. Rollback strategy:

    • Keep backups of old format files until migration is verified
    • Keep old binaries available for emergency rollback
    • Test rollback procedure before starting migration

Sources: CHANGELOG.md:43-52 CHANGELOG.md:75-82


graph TB
    subgraph DebugMode["Debug/Test Mode"]
Call1["debug_assert_aligned(ptr, align)"]
Call2["debug_assert_aligned_offset(offset)"]
Check1["Verify ptr % align == 0"]
Check2["Verify offset % PAYLOAD_ALIGNMENT == 0"]
Pass1["Assertion passes"]
Fail1["Panic with alignment error"]
Call1 --> Check1
 
       Call2 --> Check2
 
       Check1 -->|Aligned| Pass1
 
       Check1 -->|Misaligned| Fail1
 
       Check2 -->|Aligned| Pass1
 
       Check2 -->|Misaligned| Fail1
    end
    
    subgraph ReleaseMode["Release/Bench Mode"]
Call3["debug_assert_aligned(ptr, align)"]
Call4["debug_assert_aligned_offset(offset)"]
NoOp["No-op (zero cost)"]
Call3 --> NoOp
 
       Call4 --> NoOp
    end

Alignment Validation and Debugging

Starting with version 0.15.0, SIMD R Drive includes debug-time alignment assertions to catch alignment violations early in development.

Debug Assertion Functions

Implementation Details

The assertion functions are exported from simd-r-drive-entry-handle and can be called from any crate:

  • debug_assert_aligned(ptr: *const u8, align: usize)

    • Validates pointer alignment in debug/test builds
    • Used for verifying Arrow buffer compatibility
    • Zero cost in release builds
  • debug_assert_aligned_offset(offset: u64)

    • Validates file offset alignment in debug/test builds
    • Checks against PAYLOAD_ALIGNMENT constant
    • Zero cost in release builds

Implementation approach:

  • Functions always exist (stable symbol for cross-crate calls)
  • Body is gated by #[cfg(any(test, debug_assertions))]
  • Release/bench builds compile to no-ops with no runtime cost
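A minimal usage sketch, assuming the two functions are importable from the crate root of simd-r-drive-entry-handle (signatures as listed above):

```rust
use simd_r_drive_entry_handle::{debug_assert_aligned, debug_assert_aligned_offset};

fn check_payload(payload: &[u8], file_offset: u64) {
    // Validates the mapped payload pointer in debug/test builds; a no-op in release builds.
    debug_assert_aligned(payload.as_ptr(), 64);

    // Validates that the on-disk offset falls on a PAYLOAD_ALIGNMENT boundary.
    debug_assert_aligned_offset(file_offset);
}
```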

Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-89 CHANGELOG.md:33-42


Future Format Stability

Current Status

SIMD R Drive is in alpha (0.x.x-alpha) and storage format stability is not guaranteed. Each alignment change requires storage regeneration.

Path to Stability

Future work may include:

  1. Format version headers - Store format version in file header for automatic detection
  2. Multi-format readers - Support reading multiple alignment versions in a single binary
  3. In-place migration - Tools to migrate storage without full regeneration
  4. Stable 1.0 format - Commitment to format stability after 1.0 release

Current Recommendations

Until format stability is achieved:

  • Plan for regeneration: Assume storage files will need regeneration on major updates
  • Version pinning: Pin specific SIMD R Drive versions in production deployments
  • Test thoroughly: Validate migration procedures in staging environments
  • Keep backups: Maintain backups of storage files before migrations
  • Monitor changelogs: Review CHANGELOG.md for breaking changes before upgrading

Sources: CHANGELOG.md:1-82 README.md:1-10


Summary: Key Takeaways

| Topic | Key Points |
|---|---|
| Format Compatibility | Each major version (0.13, 0.14, 0.15) has incompatible storage formats |
| Migration Required | Upgrading between major versions requires regenerating storage files |
| Alignment Evolution | Pre-0.14: none → 0.14.x: 16 bytes → 0.15.x: 64 bytes |
| Deployment Order | Upgrade readers first, then writers, then migrate storage |
| Debug Tools | Use debug_assert_aligned() functions to validate alignment in tests |
| Production Strategy | Pin versions, plan for regeneration, test migrations thoroughly |

For specific migration procedures, refer to the Migration Procedures section above. For understanding the current storage format, see Entry Structure and Metadata.

Sources: CHANGELOG.md:1-82 README.md:1-282