Overview
Relevant source files
Purpose and Scope
This document provides a high-level introduction to SIMD R Drive, describing its core purpose, architectural components, access methods, and key features. It serves as the entry point for understanding the system before diving into detailed subsystem documentation.
For details on the core storage engine internals, see Core Storage Engine. For network-based remote access, see Network Layer and RPC. For Python integration, see Python Integration. For performance optimization details, see Performance Optimizations.
Sources: README.md:1-42
What is SIMD R Drive?
SIMD R Drive is a high-performance, append-only, schema-less storage engine designed for zero-copy binary data access. It stores arbitrary binary payloads in a single-file container without imposing serialization formats, schemas, or data interpretation. All data is treated as raw bytes (&[u8]), providing maximum flexibility for applications that require high-speed storage and retrieval of binary data.
Core Characteristics
| Characteristic | Description |
|---|---|
| Storage Model | Append-only, single-file container |
| Data Format | Schema-less binary (&[u8]) |
| Access Pattern | Zero-copy memory-mapped reads |
| Alignment | 64-byte boundaries (configurable via PAYLOAD_ALIGNMENT) |
| Concurrency | Thread-safe reads and writes within a single process |
| Indexing | Hardware-accelerated XXH3_64 hash-based key lookup |
| Integrity | CRC32C checksums and validation chain |
The storage engine is optimized for workloads that benefit from SIMD operations, cache-line efficiency, and direct memory access. By enforcing 64-byte payload alignment, it enables efficient typed slice reinterpretation (e.g., &[u32], &[u64]) without copying.
Sources: README.md:5-8 README.md:43-87 Cargo.toml:13
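As a minimal sketch of what this alignment enables, the snippet below reinterprets an aligned byte payload as a `&[u64]` slice. It is illustrative only; the function name is an assumption, and the crate's own helper (align_or_copy in simd-r-drive-extensions) is the intended way to do this safely.

```rust
// Illustrative only: reinterpret an aligned byte payload as a typed slice.
fn as_u64_slice(payload: &[u8]) -> Option<&[u64]> {
    // align_to splits off any misaligned prefix/suffix. With 64-byte-aligned
    // payloads whose length is a multiple of 8, both prefix and suffix are empty.
    let (prefix, typed, suffix) = unsafe { payload.align_to::<u64>() };
    if prefix.is_empty() && suffix.is_empty() {
        Some(typed)
    } else {
        None
    }
}
```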
Core Architecture Components
The system consists of three primary layers: the storage engine core, network access layer, and language bindings. The following diagram maps high-level architectural concepts to specific code entities.
graph TB
subgraph "User Interfaces"
CLI["CLI Binary\nsimd-r-drive crate\nmain.rs"]
PythonApp["Python Applications\nsimd_r_drive.DataStoreWsClient"]
RustApp["Native Rust Clients\nDataStoreWsClient struct"]
end
subgraph "Network Layer"
WSServer["WebSocket Server\nsimd-r-drive-ws-server\nAxum HTTP server"]
RPCDef["Service Definition\nsimd-r-drive-muxio-service-definition\nDataStoreService trait"]
end
subgraph "Core Storage Engine"
DataStore["DataStore struct\nsrc/data_store.rs"]
Traits["DataStoreReader trait\nDataStoreWriter trait"]
KeyIndexer["KeyIndexer struct\nsrc/key_indexer.rs"]
EntryHandle["EntryHandle struct\nsimd-r-drive-entry-handle crate"]
end
subgraph "Storage Infrastructure"
Mmap["Arc<Mmap>\nmemmap2 crate"]
FileHandle["BufWriter<File>\nstd::fs::File"]
AtomicOffset["AtomicU64\ntail_offset field"]
end
subgraph "Performance Layer"
SIMDCopy["simd_copy function\nsrc/simd_utils.rs"]
XXH3["xxhash-rust crate\nXXH3_64 algorithm"]
end
CLI --> DataStore
PythonApp --> WSServer
RustApp --> WSServer
WSServer --> RPCDef
RPCDef --> DataStore
DataStore --> Traits
DataStore --> KeyIndexer
DataStore --> EntryHandle
DataStore --> Mmap
DataStore --> FileHandle
DataStore --> AtomicOffset
DataStore --> SIMDCopy
KeyIndexer --> XXH3
EntryHandle --> Mmap
style DataStore fill:#f9f9f9,stroke:#333,stroke-width:3px
style Traits fill:#f9f9f9,stroke:#333,stroke-width:2px
System Architecture with Code Entities
Diagram: System architecture showing code entity mappings
Component Descriptions
| Component | Code Entity | Purpose |
|---|---|---|
| DataStore | DataStore struct in src/data_store.rs | Main storage interface implementing read/write operations |
| DataStoreReader | DataStoreReader trait | Defines zero-copy read operations (read, exists, batch_read) |
| DataStoreWriter | DataStoreWriter trait | Defines synchronized write operations (write, delete, batch_write) |
| KeyIndexer | KeyIndexer struct in src/key_indexer.rs | Hash-based index mapping u64 hashes to (tag, offset) tuples |
| EntryHandle | EntryHandle struct in simd-r-drive-entry-handle/src/lib.rs | Zero-copy reference to memory-mapped payload data |
| Memory Mapping | Arc<Mmap> wrapped in Mutex | Shared memory-mapped file reference for zero-copy reads |
| File Handle | Arc<RwLock<BufWriter<File>>> | Synchronized buffered writer for append operations |
| Tail Offset | AtomicU64 field tail_offset | Atomic counter tracking the current end-of-file position |
Sources: Cargo.toml:66-73 src/data_store.rs (inferred from architecture diagrams), High-level diagrams provided
graph LR
subgraph "Direct Access"
DirectApp["Rust Application"]
DirectDS["DataStore::open\nDataStoreReader\nDataStoreWriter"]
end
subgraph "CLI Access"
CLIApp["Command Line"]
CLIBin["simd-r-drive binary\nclap::Parser"]
end
subgraph "Remote Access"
PyClient["Python Client\nDataStoreWsClient class"]
RustClient["Rust Client\nDataStoreWsClient struct"]
WSServer["WebSocket Server\nmuxio-tokio-rpc-server\nAxum router"]
BackendDS["DataStore instance"]
end
DirectApp --> DirectDS
CLIApp --> CLIBin
CLIBin --> DirectDS
PyClient --> WSServer
RustClient --> WSServer
WSServer --> BackendDS
style DirectDS fill:#f9f9f9,stroke:#333,stroke-width:2px
style WSServer fill:#f9f9f9,stroke:#333,stroke-width:2px
style BackendDS fill:#f9f9f9,stroke:#333,stroke-width:2px
Access Methods
SIMD R Drive can be accessed through three primary interfaces, each optimized for different use cases.
Access Method Architecture
Diagram: Access methods with code entity mappings
Access Method Comparison
| Method | Use Case | Code Entry Point | Latency | Throughput |
|---|---|---|---|---|
| Direct Library | Embedded in Rust applications | DataStore::open() | Microseconds | Highest (zero-copy) |
| CLI | Command-line operations, scripting | simd-r-drive binary with clap | Milliseconds | Process-bound |
| WebSocket RPC | Remote access, language bindings | DataStoreWsClient (Rust/Python) | Network-dependent | RPC-serialization-bound |
Direct Library Access:
- Applications link against the simd-r-drive crate directly
- Call DataStore::open() to obtain a storage instance
- Use the DataStoreReader and DataStoreWriter traits for operations
- Provides the lowest latency and highest throughput
CLI Access:
- The simd-r-drive binary provides a command-line interface
- Built using clap for argument parsing
- Useful for scripting, testing, and manual operations
- Each invocation opens the storage, performs the operation, and closes
Remote Access (WebSocket RPC):
- simd-r-drive-ws-server provides network access via WebSocket
- Uses the Muxio RPC framework with bitcode serialization
- DataStoreWsClient is available for both Rust and Python clients
- Enables multi-language access and distributed architectures
For CLI details, see Repository Structure. For WebSocket server architecture, see WebSocket Server. For Python client usage, see Python WebSocket Client API.
Sources: Cargo.toml:23-26 README.md:9 README.md:262-266 High-level diagrams
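A hedged usage sketch of direct library access follows. Method names come from the tables above; the exact open() signature and the EntryHandle accessor (as_slice here) are assumptions rather than verified API.

```rust
use simd_r_drive::{DataStore, DataStoreReader, DataStoreWriter};
use std::path::Path;

fn main() -> std::io::Result<()> {
    // Open (or create) the single-file storage container.
    let store = DataStore::open(Path::new("example.bin"))?;

    // Append a raw binary payload under a raw byte key.
    store.write(b"greeting", b"hello, world")?;

    // Zero-copy read: the returned handle borrows the memory-mapped file.
    if let Some(entry) = store.read(b"greeting")? {
        let bytes: &[u8] = entry.as_slice(); // accessor name assumed
        assert_eq!(bytes, b"hello, world");
    }
    Ok(())
}
```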
Key Features
Zero-Copy Memory-Mapped Access
SIMD R Drive uses memmap2 to memory-map the storage file, allowing direct access to stored data without deserialization or copying. The EntryHandle struct provides a zero-copy view into the memory-mapped region, returning &[u8] slices that point directly into the mapped file.
This approach enables:
- Sub-microsecond reads for indexed lookups
- Minimal memory overhead for large entries
- Efficient processing of datasets larger than available RAM
The memory-mapped file is wrapped in Arc<Mutex<Arc<Mmap>>> to ensure thread-safe access during concurrent reads and remap operations.
Sources: README.md:43-49 simd-r-drive-entry-handle/ (inferred)
Fixed 64-Byte Payload Alignment
Every non-tombstone payload begins on a 64-byte boundary (defined by PAYLOAD_ALIGNMENT constant). This alignment matches typical CPU cache line sizes and enables:
- Cache-friendly access with reduced cache line splits
- Full-speed SIMD operations (AVX2, AVX-512, NEON) without misalignment penalties
- Zero-copy typed slices when the payload length is a multiple of the element size (e.g., &[u64])
Pre-padding bytes are inserted before payloads to maintain this alignment; for example, if the previous entry's metadata ends at offset 100, 28 zero bytes are inserted so the next payload begins at offset 128. Tombstones (deletion markers) do not require alignment.
Sources: README.md:51-59 simd-r-drive-entry-handle/src/constants.rs (inferred)
Single-File Storage Container
All data is stored in a single append-only file with the following characteristics:
| Aspect | Description |
|---|---|
| File Structure | Sequential entries: [pre-pad] [payload] [metadata] |
| Metadata Size | Fixed 20 bytes: key_hash (8) + prev_offset (8) + checksum (4) |
| Entry Chaining | Each metadata contains prev_offset pointing to previous entry's tail |
| Validation | CRC32C checksums and backward chain traversal |
| Recovery | Automatic truncation of incomplete writes on open |
The storage format is detailed in Entry Structure and Metadata.
Sources: README.md:62-147 README.md:104-150
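To make the 20-byte metadata record concrete, here is a hedged sketch of its shape. Field order follows the table above, but the struct name, method name, and little-endian byte order are assumptions, not the crate's actual code.

```rust
// Sketch only: models the fixed 20-byte metadata record described above.
struct MetadataSketch {
    key_hash: u64,    // XXH3_64 hash of the key
    prev_offset: u64, // absolute offset of the previous entry's tail
    checksum: u32,    // CRC32C of the payload
}

impl MetadataSketch {
    fn serialize(&self) -> [u8; 20] {
        let mut buf = [0u8; 20];
        // Byte order is an assumption for illustration.
        buf[0..8].copy_from_slice(&self.key_hash.to_le_bytes());
        buf[8..16].copy_from_slice(&self.prev_offset.to_le_bytes());
        buf[16..20].copy_from_slice(&self.checksum.to_le_bytes());
        buf
    }
}
```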
Thread-Safe Concurrency
SIMD R Drive supports concurrent operations within a single process using:
| Mechanism | Code Entity | Purpose |
|---|---|---|
| Read Lock | RwLock (reads) | Allows multiple concurrent readers |
| Write Lock | RwLock (writes) | Ensures exclusive write access |
| Atomic Offset | AtomicU64 (tail_offset) | Tracks file end without locking |
| Index Lock | RwLock<HashMap> | Protects key index updates |
| Mmap Lock | Mutex<Arc<Mmap>> | Prevents concurrent remapping |
Concurrency Guarantees:
- ✅ Multiple threads can read concurrently (zero-copy, lock-free)
- ✅ Write operations are serialized via RwLock
- ✅ Index updates are synchronized
- ❌ Multiple processes require external file locking
For detailed concurrency model, see Concurrency and Thread Safety.
Sources: README.md:170-206
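The sketch below shows the intended usage pattern: one store shared across reader threads. It assumes DataStore is Send + Sync, which the guarantees above imply but this example does not verify.

```rust
use simd_r_drive::{DataStore, DataStoreReader};
use std::sync::Arc;
use std::thread;

fn concurrent_reads(store: Arc<DataStore>) {
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let store = Arc::clone(&store);
            thread::spawn(move || {
                // Readers take no write lock; they borrow the shared mmap.
                let key = format!("key-{i}");
                let _ = store.read(key.as_bytes());
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```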
Hardware-Accelerated Indexing
The KeyIndexer uses the xxhash-rust crate with XXH3_64 algorithm, which provides hardware acceleration:
- SSE2 on x86_64 (universally supported)
- AVX2 on capable x86_64 CPUs (runtime detection)
- NEON on aarch64 (default)
Key lookups are O(1) via HashMap, with benchmarks showing ~1 million random 8-byte lookups completing in under 1 second.
Sources: README.md:158-168 Cargo.toml:34
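For reference, the hash family in question is available directly from the xxhash-rust crate (with its xxh3 feature enabled). This standalone snippet only demonstrates the hash function, not the KeyIndexer itself.

```rust
use xxhash_rust::xxh3::xxh3_64;

fn main() {
    // Keys are hashed to 64-bit values before being stored in the index.
    let hash: u64 = xxh3_64(b"example-key");
    println!("key hash = {hash:#018x}");
}
```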
SIMD Write Acceleration
The simd_copy function (in src/simd_utils.rs) accelerates memory copying during write operations:
- x86_64 with AVX2 : 32-byte SIMD chunks using _mm256_loadu_si256 / _mm256_storeu_si256
- aarch64 : 16-byte NEON chunks using vld1q_u8 / vst1q_u8
- Fallback : Standard copy_from_slice when SIMD is unavailable
This optimization reduces CPU cycles during buffer staging before disk writes.
For SIMD implementation details, see SIMD Acceleration.
Sources: README.md:249-257 src/simd_utils.rs (inferred)
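The following is an illustrative runtime-dispatch sketch in the spirit of simd_copy, not the crate's actual implementation: it uses the AVX2 intrinsics named above on x86_64 when available and falls back to copy_from_slice otherwise (the NEON path is omitted).

```rust
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn copy_avx2(dst: &mut [u8], src: &[u8]) {
    use std::arch::x86_64::{__m256i, _mm256_loadu_si256, _mm256_storeu_si256};
    debug_assert_eq!(dst.len(), src.len());
    let mut i = 0;
    // Copy 32-byte chunks with unaligned AVX2 loads and stores.
    while i + 32 <= src.len() {
        unsafe {
            let chunk = _mm256_loadu_si256(src.as_ptr().add(i) as *const __m256i);
            _mm256_storeu_si256(dst.as_mut_ptr().add(i) as *mut __m256i, chunk);
        }
        i += 32;
    }
    // Copy any remaining tail bytes with the scalar path.
    dst[i..].copy_from_slice(&src[i..]);
}

fn simd_copy(dst: &mut [u8], src: &[u8]) {
    assert_eq!(dst.len(), src.len());
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just verified at runtime.
            unsafe { copy_avx2(dst, src) };
            return;
        }
    }
    // Portable fallback when no SIMD path applies.
    dst.copy_from_slice(src);
}
```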
Write and Read Modes
Write Modes
| Mode | Method | Use Case | Flush Behavior |
|---|---|---|---|
| Single Entry | write(key, payload) | Individual writes | Immediate flush |
| Batch | batch_write(&[(key, payload)]) | Multiple entries | Single flush at end |
| Streaming | write_large_entry(key, Read) | Large payloads | Streaming with immediate flush |
Batch writes reduce disk I/O overhead by grouping multiple entries under a single write lock and flushing once.
Sources: README.md:208-223
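A hedged usage sketch of batch writes, following the signatures listed in the table above; parameter and return types are assumptions.

```rust
use simd_r_drive::{DataStore, DataStoreWriter};

fn write_two_entries(store: &DataStore) -> std::io::Result<()> {
    let entries: [(&[u8], &[u8]); 2] = [
        (b"alpha", b"payload-a"),
        (b"beta", b"payload-b"),
    ];
    // One write-lock acquisition and one flush for the whole batch.
    store.batch_write(&entries)?;
    Ok(())
}
```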
Read Modes
| Mode | Method | Memory Behavior | Use Case |
|---|---|---|---|
| Direct | read(key) -> EntryHandle | Zero-copy mmap reference | Standard reads |
| Streaming | read_stream(key) -> impl Read | Buffered, non-zero-copy | Large entries |
| Parallel Iteration | par_iter_entries() (Rayon) | Parallel processing | Bulk analytics |
Direct reads return EntryHandle with zero-copy &[u8] access. Streaming reads process data incrementally through a buffer. Parallel iteration is available via the optional parallel feature.
For iteration details, see Parallel Iteration (via Rayon).
Sources: README.md:225-247
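A hedged sketch of the streaming read mode from the table above. The method name and return type follow the table (read_stream(key) -> impl Read); error handling details are assumptions.

```rust
use simd_r_drive::{DataStore, DataStoreReader};
use std::io::Read;

fn read_large_entry(store: &DataStore, key: &[u8]) -> std::io::Result<Vec<u8>> {
    let mut payload = Vec::new();
    // Streams the payload through an internal buffer instead of borrowing the mmap.
    let mut reader = store.read_stream(key);
    reader.read_to_end(&mut payload)?;
    Ok(payload)
}
```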
Repository Structure
The project is organized as a Cargo workspace with the following crates:
Diagram: Workspace structure with crate relationships
Crate Descriptions
| Crate | Path | Purpose |
|---|---|---|
| simd-r-drive | ./ | Core storage engine with DataStore, KeyIndexer, SIMD utilities |
| simd-r-drive-entry-handle | ./simd-r-drive-entry-handle/ | Zero-copy EntryHandle and metadata structures |
| simd-r-drive-extensions | ./extensions/ | Utility functions and helper modules |
| simd-r-drive-muxio-service-definition | ./experiments/simd-r-drive-muxio-service-definition/ | RPC service trait definitions using bitcode |
| simd-r-drive-ws-server | ./experiments/simd-r-drive-ws-server/ | Axum-based WebSocket RPC server |
| simd-r-drive-ws-client | ./experiments/simd-r-drive-ws-client/ | Native Rust WebSocket client |
| simd-r-drive-py | ./experiments/bindings/python/ | PyO3-based Python bindings for direct access |
| simd-r-drive-ws-client-py | ./experiments/bindings/python-ws-client/ | Python WebSocket client wrapper |
The workspace is defined in Cargo.toml:65-78 with version 0.15.5-alpha specified in Cargo.toml:3
For detailed repository structure, see Repository Structure.
Sources: Cargo.toml:65-78 Cargo.toml:1-10 README.md:259-266
Performance Characteristics
SIMD R Drive is designed for high-performance workloads with the following characteristics:
Benchmark Context
| Metric | Typical Performance |
|---|---|
| Random Read (8-byte) | ~1M lookups in < 1 second |
| Sequential Write | Limited by disk I/O and flush frequency |
| Memory Overhead | Minimal (mmap-based, on-demand paging) |
| Index Lookup | O(1) via HashMap with XXH3_64 |
Optimization Strategies
- SIMD Copy Operations : The simd_copy function uses AVX2/NEON for bulk memory transfers during writes
- Hardware-Accelerated Hashing : XXH3_64 with SSE2/AVX2/NEON for fast key hashing
- Zero-Copy Reads : Memory-mapped access eliminates deserialization overhead
- Cache-Line Alignment : 64-byte boundaries reduce cache misses
- Batch Operations : Grouping writes reduces lock contention and flush overhead
For detailed performance optimization documentation, see Performance Optimizations.
Sources: README.md:158-168 README.md:249-257
Next Steps
This overview provides a foundation for understanding SIMD R Drive. For deeper exploration:
- Storage Internals : See Core Storage Engine for DataStore implementation details
- Data Format : See Entry Structure and Metadata for on-disk layout
- Network Access : See Network Layer and RPC for remote access architecture
- Python Usage : See Python Integration for PyO3 bindings and client APIs
- Performance Tuning : See Performance Optimizations for SIMD and alignment strategies
Sources: All sections above
Core Storage Engine
Relevant source files
- README.md
- src/lib.rs
- src/storage_engine.rs
- src/storage_engine/data_store.rs
- src/storage_engine/entry_iterator.rs
The core storage engine is the foundational layer of SIMD R Drive, providing append-only, memory-mapped binary storage with zero-copy access. This document covers the high-level architecture and key components that implement the storage functionality.
For network-based access to the storage engine, see Network Layer and RPC. For Python integration, see Python Integration. For performance optimizations including SIMD acceleration, see Performance Optimizations.
Purpose and Scope
This page introduces the core storage engine's architecture, primary components, and design principles. Detailed information about specific subsystems is covered in child pages such as Storage Architecture and DataStore API.
Architecture Overview
The storage engine implements an append-only binary store with in-memory indexing and memory-mapped file access. The architecture consists of four primary structures that work together to provide concurrent, high-performance storage operations.
Sources: src/storage_engine/data_store.rs:26-33 src/storage_engine.rs:1-25
graph TB
subgraph "Public Traits"
Reader["DataStoreReader\nread, exists, batch_read"]
Writer["DataStoreWriter\nwrite, delete, batch_write"]
end
subgraph "DataStore Structure"
DS["DataStore"]
FileHandle["Arc<RwLock<BufWriter<File>>>\nfile\nSequential write operations"]
MmapHandle["Arc<Mutex<Arc<Mmap>>>\nmmap\nMemory-mapped read access"]
TailOffset["AtomicU64\ntail_offset\nCurrent end position"]
KeyIndexer["Arc<RwLock<KeyIndexer>>\nkey_indexer\nHash to offset mapping"]
PathBuf["PathBuf\npath\nFile system location"]
end
subgraph "Support Structures"
EH["EntryHandle\nZero-copy data view"]
EM["EntryMetadata\nkey_hash, prev_offset, checksum"]
EI["EntryIterator\nSequential traversal"]
ES["EntryStream\nStreaming reads"]
end
Reader -.implements.-> DS
Writer -.implements.-> DS
DS --> FileHandle
DS --> MmapHandle
DS --> TailOffset
DS --> KeyIndexer
DS --> PathBuf
DS --> EH
DS --> EI
EH --> EM
EH --> ES
Core Components
DataStore
The DataStore struct is the primary interface to the storage engine. It maintains four essential shared resources protected by different synchronization primitives:
| Field | Type | Purpose | Synchronization |
|---|---|---|---|
| file | Arc<RwLock<BufWriter<File>>> | Write operations to disk | RwLock for exclusive writes |
| mmap | Arc<Mutex<Arc<Mmap>>> | Memory-mapped view for reads | Mutex for atomic remapping |
| tail_offset | AtomicU64 | Current end-of-file position | Atomic operations |
| key_indexer | Arc<RwLock<KeyIndexer>> | Hash-based key lookup | RwLock for concurrent reads |
| path | PathBuf | Storage file location | Immutable after creation |
Sources: src/storage_engine/data_store.rs:26-33
Traits
The storage engine exposes two primary traits that define its public API:
- DataStoreReader: Provides zero-copy read operations including read(), batch_read(), exists(), and iteration methods. Implemented at src/storage_engine/data_store.rs:1027-1182
- DataStoreWriter: Provides write operations including write(), batch_write(), delete(), and streaming writes. Implemented at src/storage_engine/data_store.rs:752-1025
Sources: src/lib.rs:21 src/storage_engine.rs:21
EntryHandle
EntryHandle provides a zero-copy view into stored data by maintaining a reference to the memory-mapped file and a byte range. It includes the associated EntryMetadata for accessing checksums and hash values without additional lookups.
Sources: src/storage_engine.rs:24 README.md:43-49
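Conceptually, an entry handle is a byte range plus a shared reference that keeps the mapping alive. The sketch below models that idea with assumed names; it is not the crate's EntryHandle.

```rust
use memmap2::Mmap;
use std::ops::Range;
use std::sync::Arc;

// Conceptual model only; field and method names are assumptions.
struct EntryView {
    mmap: Arc<Mmap>,     // keeps the mapping alive while the view exists
    range: Range<usize>, // byte range of the payload within the mapping
}

impl EntryView {
    fn as_bytes(&self) -> &[u8] {
        // No copy: this borrows directly from the memory-mapped file.
        &self.mmap[self.range.clone()]
    }
}
```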
KeyIndexer
KeyIndexer maintains an in-memory hash map from 64-bit XXH3 key hashes to packed values containing a 16-bit collision detection tag and a 48-bit file offset. This structure enables O(1) key lookups while detecting hash collisions.
Sources: src/storage_engine/data_store.rs:31 README.md:158-169
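A sketch of the tag-plus-offset packing described above; the 16/48-bit split follows the text, but the exact bit layout used by KeyIndexer is an assumption.

```rust
const OFFSET_BITS: u32 = 48;
const OFFSET_MASK: u64 = (1u64 << OFFSET_BITS) - 1;

fn pack(tag: u16, offset: u64) -> u64 {
    debug_assert!(offset <= OFFSET_MASK, "offset must fit in 48 bits");
    ((tag as u64) << OFFSET_BITS) | offset
}

fn unpack(packed: u64) -> (u16, u64) {
    ((packed >> OFFSET_BITS) as u16, packed & OFFSET_MASK)
}

fn main() {
    let packed = pack(0xBEEF, 4096);
    assert_eq!(unpack(packed), (0xBEEF, 4096));
}
```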
Component Interaction Flow
The following diagram illustrates how the core components interact during typical read and write operations, showing the actual method calls and data flow between structures.
Sources: src/storage_engine/data_store.rs:752-825 src/storage_engine/data_store.rs:1040-1049 src/storage_engine/data_store.rs:224-259
sequenceDiagram
participant Client
participant DS as "DataStore"
participant FileHandle as "file: RwLock<BufWriter<File>>"
participant TailOffset as "tail_offset: AtomicU64"
participant MmapHandle as "mmap: Mutex<Arc<Mmap>>"
participant KeyIndexer as "key_indexer: RwLock<KeyIndexer>"
Note over Client,KeyIndexer: Write Operation
Client->>DS: write(key, payload)
DS->>DS: compute_hash(key)
DS->>TailOffset: load(Ordering::Acquire)
DS->>FileHandle: write().unwrap()
DS->>FileHandle: write_all(pre_pad + payload + metadata)
DS->>FileHandle: flush()
DS->>DS: reindex()
DS->>FileHandle: init_mmap()
DS->>MmapHandle: lock().unwrap()
DS->>MmapHandle: *guard = Arc::new(new_mmap)
DS->>KeyIndexer: write().unwrap()
DS->>KeyIndexer: insert(key_hash, offset)
DS->>TailOffset: store(new_offset, Ordering::Release)
DS-->>Client: Ok(offset)
Note over Client,KeyIndexer: Read Operation
Client->>DS: read(key)
DS->>DS: compute_hash(key)
DS->>KeyIndexer: read().unwrap()
DS->>KeyIndexer: get_packed(key_hash)
KeyIndexer-->>DS: Some(packed_value)
DS->>DS: KeyIndexer::unpack(packed)
DS->>MmapHandle: lock().unwrap()
DS->>MmapHandle: clone Arc
DS->>DS: read_entry_with_context()
DS->>DS: EntryMetadata::deserialize()
DS->>DS: create EntryHandle
DS-->>Client: Ok(Some(EntryHandle))
Storage Initialization and Recovery
The DataStore::open() method performs a multi-stage initialization process that ensures data integrity even after crashes or incomplete writes.
flowchart TD
Start["DataStore::open(path)"]
OpenFile["open_file_in_append_mode()\nOpenOptions::new()\n.read(true).write(true).create(true)"]
InitMmap["init_mmap()\nmemmap2::MmapOptions::new().map()"]
Recover["recover_valid_chain(mmap, file_len)\nBackward scan validating prev_offset chain"]
CheckTrunc{"final_len < file_len?"}
Truncate["Truncate file to final_len\nfile.set_len(final_len)\nReopen storage"]
BuildIndex["KeyIndexer::build(mmap, final_len)\nForward scan to populate index"]
CreateDS["Create DataStore instance\nwith Arc-wrapped fields"]
Start --> OpenFile
OpenFile --> InitMmap
InitMmap --> Recover
Recover --> CheckTrunc
CheckTrunc -->|Yes Corruption detected| Truncate
CheckTrunc -->|No File valid| BuildIndex
Truncate --> Start
BuildIndex --> CreateDS
The recovery process validates the storage file by:
- Scanning backward from the end of the file following prev_offset links in EntryMetadata
- Verifying chain integrity, ensuring each offset points to a valid previous tail
- Truncating corruption if an invalid chain is detected
- Building the index by forward scanning the validated region
Sources: src/storage_engine/data_store.rs:84-117 src/storage_engine/data_store.rs:383-482
flowchart TD
WriteCall["write(key, payload)"]
ComputeHash["compute_hash(key)\nXXH3_64 hash"]
AcquireWrite["file.write().unwrap()\nAcquire write lock"]
LoadTail["tail_offset.load(Ordering::Acquire)"]
CalcPad["prepad_len(prev_tail)\n(A - offset % A) & (A-1)"]
WritePad["write_all(&pad[..prepad])\nZero bytes for alignment"]
WritePayload["write_all(payload)\nActual data"]
WriteMetadata["write_all(metadata.serialize())\nkey_hash + prev_offset + checksum"]
Flush["flush()\nEnsure disk persistence"]
Reindex["reindex()\nUpdate mmap and index"]
UpdateTail["tail_offset.store(new_offset)\nOrdering::Release"]
WriteCall --> ComputeHash
ComputeHash --> AcquireWrite
AcquireWrite --> LoadTail
LoadTail --> CalcPad
CalcPad --> WritePad
WritePad --> WritePayload
WritePayload --> WriteMetadata
WriteMetadata --> Flush
Flush --> Reindex
Reindex --> UpdateTail
Write Path Architecture
Write operations follow a specific sequence to maintain consistency and enable zero-copy reads. All payloads are aligned to 64-byte boundaries using pre-padding.
The reindex() method performs two critical operations after each write:
- Memory map update : Creates a new Mmap view of the file and atomically swaps it
- Index update : Inserts new key-hash to offset mappings into KeyIndexer
Sources: src/storage_engine/data_store.rs:827-834 src/storage_engine/data_store.rs:758-825 src/storage_engine/data_store.rs:224-259
Read Path Architecture
Read operations use the hash index for O(1) lookup and return EntryHandle instances that provide zero-copy access to the memory-mapped data.
flowchart TD
ReadCall["read(key)"]
ComputeHash["compute_hash(key)\nXXH3_64 hash"]
AcquireIndex["key_indexer.read().unwrap()\nAcquire read lock"]
GetMmap["get_mmap_arc()\nClone Arc<Mmap>"]
ReadContext["read_entry_with_context()"]
Lookup["key_indexer.get_packed(key_hash)"]
CheckFound{"Found?"}
Unpack["KeyIndexer::unpack(packed)\nExtract tag + offset"]
VerifyTag{"Tag matches?"}
ReadMetadata["EntryMetadata::deserialize()\nFrom mmap[offset..offset+20]"]
CalcRange["Derive entry range\nentry_start..entry_end"]
CheckTombstone{"Is tombstone?"}
CreateHandle["Create EntryHandle\nmmap_arc + range + metadata"]
ReturnNone["Return None"]
ReadCall --> ComputeHash
ComputeHash --> AcquireIndex
AcquireIndex --> GetMmap
GetMmap --> ReadContext
ReadContext --> Lookup
Lookup --> CheckFound
CheckFound -->|No| ReturnNone
CheckFound -->|Yes| Unpack
Unpack --> VerifyTag
VerifyTag -->|No Collision| ReturnNone
VerifyTag -->|Yes| ReadMetadata
ReadMetadata --> CalcRange
CalcRange --> CheckTombstone
CheckTombstone -->|Yes| ReturnNone
CheckTombstone -->|No| CreateHandle
The tag verification step at src/storage_engine/data_store.rs:513-521 prevents hash collision issues by comparing a 16-bit tag derived from the original key with the stored tag.
Sources: src/storage_engine/data_store.rs:1040-1049 src/storage_engine/data_store.rs:501-565
Batch Operations
Batch operations reduce lock acquisition overhead by processing multiple entries in a single synchronized operation.
flowchart TD
BatchWrite["batch_write(entries)"]
HashAll["compute_hash_batch(keys)\nParallel XXH3_64"]
AcquireWrite["file.write().unwrap()"]
InitBuffer["buffer = Vec::new()"]
LoopStart{"For each entry"}
CalcPad["prepad_len(tail_offset)"]
BuildEntry["Build entry:\npre_pad + payload + metadata"]
AppendBuffer["buffer.extend_from_slice(entry)"]
UpdateTail["tail_offset += entry.len()"]
LoopEnd["Next entry"]
WriteAll["file.write_all(&buffer)"]
Flush["file.flush()"]
Reindex["reindex(key_hash_offsets)"]
BatchWrite --> HashAll
HashAll --> AcquireWrite
AcquireWrite --> InitBuffer
InitBuffer --> LoopStart
LoopStart -->|More entries| CalcPad
CalcPad --> BuildEntry
BuildEntry --> AppendBuffer
AppendBuffer --> UpdateTail
UpdateTail --> LoopEnd
LoopEnd --> LoopStart
LoopStart -->|Done| WriteAll
WriteAll --> Flush
Flush --> Reindex
Batch Write
The batch_write() method accumulates all entries in an in-memory buffer before acquiring the write lock once:
Sources: src/storage_engine/data_store.rs:838-843 src/storage_engine/data_store.rs:847-939
Batch Read
The batch_read() method acquires the index lock once and performs all lookups before releasing it:
Sources: src/storage_engine/data_store.rs:1105-1109 src/storage_engine/data_store.rs:1111-1158
Iteration Support
The storage engine provides multiple iteration mechanisms for different use cases:
| Iterator | Type | Synchronization | Use Case |
|---|---|---|---|
| EntryIterator | Sequential | Single lock acquisition | Forward/backward iteration |
| par_iter_entries() | Parallel (Rayon) | Lock-free after setup | High-throughput scanning |
| IntoIterator | Consuming | Single lock acquisition | Owned iteration |
The EntryIterator scans backward through the file using the prev_offset chain, tracking seen keys in a HashSet to return only the latest version of each key.
Sources: src/storage_engine/entry_iterator.rs:8-128 src/storage_engine/data_store.rs:276-280 src/storage_engine/data_store.rs:296-361
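The deduplication logic can be illustrated with a simplified model: walk entries newest-to-oldest and keep only the first occurrence of each key hash. This is a conceptual sketch, not the EntryIterator's code.

```rust
use std::collections::HashSet;

// Entries are modeled as (key_hash, payload) pairs ordered oldest-to-newest.
fn latest_versions(entries: &[(u64, Vec<u8>)]) -> Vec<(u64, Vec<u8>)> {
    let mut seen = HashSet::new();
    let mut latest = Vec::new();
    // Newest-to-oldest: the first time a key hash is seen is its latest version.
    for (key_hash, payload) in entries.iter().rev() {
        if seen.insert(*key_hash) {
            latest.push((*key_hash, payload.clone()));
        }
    }
    latest
}
```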
Key Design Principles
Append-Only Model
All write operations append data to the end of the file. Updates and deletes create new entries rather than modifying existing data. This design ensures:
- No seeking during writes : Sequential disk I/O for maximum throughput
- Crash safety : Incomplete writes are detected and truncated during recovery
- Zero-copy reads : Memory-mapped regions remain valid as data is never modified
Sources: README.md:98-103
Fixed Alignment
Every non-tombstone payload starts on a 64-byte boundary (configurable via PAYLOAD_ALIGNMENT). This alignment enables:
- Cache-line efficiency : Payloads align with CPU cache lines
- SIMD operations : Full-speed vectorized reads without crossing boundaries
- Zero-copy typed views : Safe reinterpretation as typed slices
Sources: README.md:51-59 src/storage_engine/data_store.rs:670-673
Metadata-After-Payload
Each entry stores its metadata immediately after the payload, forming a backward-linked chain. This design:
- Minimizes seeking : Metadata and payload are adjacent on disk
- Enables recovery : The chain can be validated backward from any point
- Supports nesting : Storage containers can be stored within other containers
Sources: README.md:139-148
Lock-Free Reads
Read operations do not acquire write locks and perform zero-copy access through memory-mapped views. The AtomicU64 tail offset ensures readers see a consistent view without blocking writes.
Sources: README.md:170-183 src/storage_engine/data_store.rs:1040-1049
Repository Structure
Relevant source files
Purpose and Scope
This document provides a comprehensive overview of the SIMD R Drive repository structure, including its Cargo workspace organization, crate hierarchy, and inter-crate dependencies. It covers the core storage engine crates, experimental network components, and build configuration. For details about the storage architecture and API, see Core Storage Engine. For network communication details, see Network Layer and RPC. For Python integration specifics, see Python Integration.
Workspace Organization
The SIMD R Drive repository is organized as a Cargo workspace containing six published crates and two excluded Python binding crates. The workspace uses Cargo resolver version 2 and maintains unified versioning (0.15.5-alpha) across all workspace members.
Workspace Configuration:
| Configuration | Value |
|---|---|
| Resolver | Version 2 |
| Unified Version | 0.15.5-alpha |
| Rust Edition | 2024 |
| License | Apache-2.0 |
| Repository | https://github.com/jzombie/rust-simd-r-drive |
The workspace is defined in Cargo.toml:65-78 with members spanning core functionality, utilities, and experimental features.
Workspace Structure Diagram:
graph TB
subgraph "Core Crates"
Core["simd-r-drive\n(Root Package)\nsrc/"]
EntryHandle["simd-r-drive-entry-handle\nsimd-r-drive-entry-handle/"]
Extensions["simd-r-drive-extensions\nextensions/"]
end
subgraph "Experimental Network Crates"
MuxioDef["simd-r-drive-muxio-service-definition\nexperiments/simd-r-drive-muxio-service-definition/"]
WSClient["simd-r-drive-ws-client\nexperiments/simd-r-drive-ws-client/"]
WSServer["simd-r-drive-ws-server\nexperiments/simd-r-drive-ws-server/"]
end
subgraph "Excluded Python Bindings"
PyBinding["Python Direct Binding\nexperiments/bindings/python/\n(Excluded from workspace)"]
PyWSClient["Python WebSocket Client\nexperiments/bindings/python-ws-client/\n(Excluded from workspace)"]
end
Core --> EntryHandle
Extensions --> Core
WSServer --> Core
WSServer --> MuxioDef
WSClient --> Core
WSClient --> MuxioDef
PyBinding -.-> Core
PyWSClient -.-> WSClient
style Core fill:#f9f9f9,stroke:#333,stroke-width:3px
style EntryHandle fill:#f9f9f9,stroke:#333,stroke-width:2px
style Extensions fill:#f9f9f9,stroke:#333,stroke-width:2px
Sources: Cargo.toml:65-78
Core Crates
simd-r-drive (Root Package)
The main storage engine crate providing append-only, schema-less binary storage with SIMD optimizations. Located at the repository root with source code in src/.
Key Components:
- DataStore - Main storage interface implementing the DataStoreReader and DataStoreWriter traits
- KeyIndexer - Hash-based index for O(1) key lookups using XXH3
- Storage format implementation with 64-byte payload alignment
- SIMD-optimized memory operations (simd_copy)
- Recovery and validation logic
Features:
- default - Standard functionality
- expose-internal-api - Exposes internal APIs for testing
- parallel - Enables Rayon-based parallel iteration
- arrow - Proxy feature enabling Apache Arrow support in entry-handle
CLI Binary: The crate includes a CLI application for direct storage operations without network overhead.
Sources: Cargo.toml:11-56 Cargo.lock:1795-1820
simd-r-drive-entry-handle
Zero-copy data access abstraction located in simd-r-drive-entry-handle/. Provides safe wrappers around memory-mapped file regions without copying payload data.
Key Components:
- EntryHandle - Zero-copy view into memory-mapped payload data
- EntryMetadata - Fixed 20-byte metadata structure
- Arrow integration (optional) for columnar data access
- CRC32C checksum validation
Dependencies:
- memmap2 - Memory-mapped file support
- crc32fast - Checksum validation
- arrow (optional) - Apache Arrow integration
The crate is dependency-minimal to enable use as a standalone zero-copy accessor.
Sources: Cargo.toml:83 Cargo.lock:1823-1829
simd-r-drive-extensions
Utility functions and helper modules located in extensions/. Provides commonly-needed functionality for working with the storage engine.
Key Components:
- align_or_copy - Utility for handling alignment requirements
- Serialization helpers (bincode integration)
- Testing utilities
- File system utilities (walkdir integration)
Dependencies:
- simd-r-drive - Core storage engine
- bincode - Serialization support
- serde - Serialization framework
- tempfile - Temporary file handling
- walkdir - Directory traversal
Sources: Cargo.toml:69 Cargo.lock:1832-1841
Experimental Network Crates
These crates enable remote access to the storage engine via WebSocket-based RPC. Located in experiments/.
simd-r-drive-muxio-service-definition
RPC service contract definition using the Muxio framework. Located in experiments/simd-r-drive-muxio-service-definition/.
Purpose:
- Defines RPC service interface shared between client and server
- Uses bitcode for efficient binary serialization
- Provides type-safe method definitions
Dependencies:
- bitcode - Binary serialization
- muxio-rpc-service - RPC service framework
Sources: Cargo.toml:72 Cargo.lock:1844-1849
simd-r-drive-ws-client
Native Rust WebSocket client for remote storage access. Located in experiments/simd-r-drive-ws-client/.
Key Components:
- BaseDataStoreWsClient - WebSocket client implementation
- RPC method callers
- Connection management
- Async/await integration with Tokio
Dependencies:
- simd-r-drive - Storage engine traits
- simd-r-drive-muxio-service-definition - Shared RPC contract
- muxio-tokio-rpc-client - WebSocket RPC client runtime
- muxio-rpc-service-caller - RPC method invocation
- tokio - Async runtime
Sources: Cargo.toml:71 Cargo.lock:1852-1863
simd-r-drive-ws-server
WebSocket server exposing the storage engine over RPC. Located in experiments/simd-r-drive-ws-server/.
Key Components:
- Axum-based HTTP/WebSocket server
- RPC endpoint routing
- DataStore backend integration
- Configuration via CLI arguments
Dependencies:
- simd-r-drive - Storage engine backend
- simd-r-drive-muxio-service-definition - Shared RPC contract
- muxio-tokio-rpc-server - WebSocket RPC server runtime
- clap - CLI argument parsing
- tokio - Async runtime
Sources: Cargo.toml:70 Cargo.lock:1866-1878
Python Bindings (Excluded)
Two Python binding crates exist but are excluded from the main workspace due to their PyO3 and Maturin build requirements.
experiments/bindings/python
Direct Python bindings to the core storage engine using PyO3. Excluded at Cargo.toml:75
Purpose:
- FFI bridge between Python and Rust
- Direct local access to DataStore from Python
- PyO3-based class and method exports
experiments/bindings/python-ws-client
Python WebSocket client wrapping the Rust client. Excluded at Cargo.toml:76
Purpose:
- Remote storage access from Python
- PyO3 wrapper around simd-r-drive-ws-client
- Async/await bridge between Python and Tokio
For detailed documentation on these bindings, see Python Integration.
Sources: Cargo.toml:74-77
Dependency Graph
Inter-Crate Dependencies:
Sources: Cargo.lock:1795-1878 Cargo.toml:80-90
External Dependencies
Workspace-Level Shared Dependencies
The workspace defines common dependencies to ensure version consistency across all crates:
Storage & Performance:
| Dependency | Version | Purpose |
|---|---|---|
| memmap2 | 0.9.5 | Memory-mapped file I/O |
| xxhash-rust | 0.8.15 (xxh3) | Fast hashing with hardware acceleration |
| crc32fast | 1.4.2 | CRC32C checksums |
| arrow | 57.0.0 | Apache Arrow integration (optional) |
Async & Network:
| Dependency | Version | Purpose |
|---|---|---|
| tokio | 1.45.1 | Async runtime (experiments only) |
| muxio-rpc-service | 0.9.0-alpha | RPC framework |
| muxio-tokio-rpc-client | 0.9.0-alpha | RPC client runtime |
| muxio-tokio-rpc-server | 0.9.0-alpha | RPC server runtime |
Serialization:
| Dependency | Version | Purpose |
|---|---|---|
| bitcode | 0.6.6 | Binary serialization for RPC |
| bincode | 1.3.3 | Legacy serialization (extensions) |
| serde | 1.0.219 | Serialization framework |
CLI & Utilities:
| Dependency | Version | Purpose |
|---|---|---|
| clap | 4.5.40 | CLI argument parsing |
| tracing | 0.1.41 | Structured logging |
| tracing-subscriber | 0.3.20 | Log output configuration |
Sources: Cargo.toml:80-112
Build Configuration
Workspace Settings
All crates share common metadata (version, edition, license, repository) defined at the workspace level.
Sources: Cargo.toml:1-9
Benchmark Configuration
The root crate includes two benchmark harnesses:
storage_benchmark:
- Harness: Criterion
- Purpose: Measure write/read throughput and latency
- Path: benches/storage_benchmark.rs
contention_benchmark:
- Harness: Criterion
- Purpose: Measure concurrent access performance
- Path: benches/contention_benchmark.rs
Sources: Cargo.toml:57-63
Directory Layout
rust-simd-r-drive/
├── Cargo.toml # Workspace root
├── Cargo.lock # Dependency lock file
├── src/ # simd-r-drive core source
│ ├── lib.rs
│ ├── data_store.rs
│ ├── key_indexer.rs
│ └── ...
├── simd-r-drive-entry-handle/ # Zero-copy handle crate
│ ├── Cargo.toml
│ └── src/
├── extensions/ # Utilities crate
│ ├── Cargo.toml
│ └── src/
└── experiments/
├── simd-r-drive-ws-server/ # WebSocket server
│ ├── Cargo.toml
│ └── src/
├── simd-r-drive-ws-client/ # Rust WebSocket client
│ ├── Cargo.toml
│ └── src/
├── simd-r-drive-muxio-service-definition/ # RPC contract
│ ├── Cargo.toml
│ └── src/
└── bindings/
├── python/ # PyO3 direct binding (excluded)
│ ├── Cargo.toml
│ ├── pyproject.toml
│ └── src/
└── python-ws-client/ # PyO3 WS client (excluded)
├── Cargo.toml
├── pyproject.toml
└── src/
Sources: Cargo.toml:65-78
Publishing Strategy
All workspace crates (except Python bindings) are configured for publishing to crates.io with publish = true. The Python bindings use Maturin for building Python wheels and are distributed separately via PyPI.
Crates.io Packages:
- simd-r-drive - Core storage engine
- simd-r-drive-entry-handle - Zero-copy handle
- simd-r-drive-extensions - Utilities
- simd-r-drive-ws-server - WebSocket server
- simd-r-drive-ws-client - Rust WebSocket client
- simd-r-drive-muxio-service-definition - RPC contract
PyPI Packages:
- simd-r-drive (Python binding via Maturin)
- simd-r-drive-ws-client-py (Python WebSocket client via Maturin)
For Python package build instructions, see Building Python Bindings.
Sources: Cargo.toml:1-9 Cargo.toml:21
Storage Architecture
Relevant source files
- README.md
- src/lib.rs
- src/storage_engine.rs
- src/storage_engine/data_store.rs
- src/storage_engine/entry_iterator.rs
Purpose and Scope
This document describes the core storage architecture of SIMD R Drive, covering the append-only design principles, single-file storage model, on-disk layout, validation chain structure, and recovery mechanisms. For details about the programmatic interface, see DataStore API. For specifics on entry structure and metadata format, see Entry Structure and Metadata. For memory mapping and zero-copy access patterns, see Memory Management and Zero-Copy Access.
Append-Only Design
SIMD R Drive implements a strict append-only storage model where data is written sequentially to the end of a single file. Entries are never modified or deleted in place; instead, updates and deletions are represented by appending new entries.
Key Characteristics
| Characteristic | Implementation | Benefit |
|---|---|---|
| Sequential Writes | All writes append to file end | Minimizes disk seeks, maximizes throughput |
| Immutable Entries | Existing entries never modified | Enables concurrent reads without locks |
| Version Chain | Multiple versions stored for same key | Supports time-travel and recovery |
| Latest-Wins Semantics | Index points to most recent version | Ensures consistency in queries |
The append-only model is enforced through the tail_offset field src/storage_engine/data_store.rs:30, which tracks the current end of valid data. Write operations acquire an exclusive lock, append data, and atomically update the tail offset src/storage_engine/data_store.rs:816-824.
Write Flow
Sources: src/storage_engine/data_store.rs:752-825 src/storage_engine/data_store.rs:827-843
Single-File Storage Container
All data resides in a single binary file managed by the DataStore struct src/storage_engine/data_store.rs:27-33. This design simplifies deployment, backup, and replication while enabling efficient memory-mapped access.
graph TB
subgraph "DataStore Structure"
FileHandle["file:\nArc<RwLock<BufWriter<File>>>"]
MmapHandle["mmap:\nArc<Mutex<Arc<Mmap>>>"]
TailOffset["tail_offset:\nAtomicU64"]
Indexer["key_indexer:\nArc<RwLock<KeyIndexer>>"]
PathBuf["path:\nPathBuf"]
end
subgraph "Write Operations"
WriteOp["Write/Delete Operations"]
end
subgraph "Read Operations"
ReadOp["Read/Exists Operations"]
end
subgraph "Physical Storage"
DiskFile["Storage File on Disk"]
end
WriteOp --> FileHandle
FileHandle --> DiskFile
ReadOp --> MmapHandle
ReadOp --> Indexer
MmapHandle --> DiskFile
WriteOp -.->|updates after flush| TailOffset
WriteOp -.->|remaps after flush| MmapHandle
WriteOp -.->|updates after flush| Indexer
TailOffset -.->|tracks valid data end| DiskFile
File Handles and Memory Mapping
Sources: src/storage_engine/data_store.rs:27-33 src/storage_engine/data_store.rs:84-117
File Opening and Initialization
The DataStore::open function src/storage_engine/data_store.rs:84-117 performs the following sequence:
- Open File : Opens the file in read/write mode, creating it if necessary src/storage_engine/data_store.rs:161-170
- Initial Mapping : Creates an initial memory-mapped view src/storage_engine/data_store.rs:172-174
- Chain Recovery : Validates the storage chain and determines the valid data extent src/storage_engine/data_store.rs:383-482
- Truncation (if needed) : Truncates corrupted data if recovery finds inconsistencies src/storage_engine/data_store.rs:91-104
- Index Building : Constructs the in-memory hash index by scanning valid entries src/storage_engine/data_store.rs:108
Sources: src/storage_engine/data_store.rs:84-117 src/storage_engine/data_store.rs:141-144
Storage Layout
Entries are stored with 64-byte alignment to optimize cache efficiency and SIMD operations. The alignment constant PAYLOAD_ALIGNMENT is defined as 64 in simd-r-drive-entry-handle/src/constants.rs
Non-Tombstone Entry Layout
The pre-padding calculation ensures the payload starts on a 64-byte boundary src/storage_engine/data_store.rs:669-673:
pad = (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)
Where prev_tail is the absolute file offset of the previous entry's metadata end.
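Expressed as code, the rule above becomes a small helper; the constant and function names here are assumptions used only to make the arithmetic concrete.

```rust
const PAYLOAD_ALIGNMENT: u64 = 64;

fn prepad_len(prev_tail: u64) -> u64 {
    (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)
}

fn main() {
    assert_eq!(prepad_len(0), 0);    // already aligned: no padding needed
    assert_eq!(prepad_len(100), 28); // 100 + 28 = 128, the next 64-byte boundary
    assert_eq!(prepad_len(128), 0);  // previous metadata ended on a boundary
}
```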
graph LR
subgraph "Tombstone Entry"
NullByte["NULL Byte\n(1 byte)\n0x00"]
KeyHash["Key Hash\n(8 bytes)\nXXH3_64"]
PrevOff["Prev Offset\n(8 bytes)\nPrevious Tail"]
Checksum["Checksum\n(4 bytes)\nCRC32C"]
end
NullByte --> KeyHash
KeyHash --> PrevOff
PrevOff --> Checksum
Tombstone Entry Layout
Tombstones represent deleted keys and use a minimal layout without pre-padding:
The single NULL_BYTE src/storage_engine.rs:136 serves as both the tombstone marker and payload. This design minimizes space overhead for deletions src/storage_engine/data_store.rs:864-897
Layout Summary Table
| Component | Non-Tombstone Size | Tombstone Size | Purpose |
|---|---|---|---|
| Pre-Pad | 0-63 bytes | 0 bytes | Align payload to 64-byte boundary |
| Payload | Variable | 1 byte (0x00) | User data or deletion marker |
| Key Hash | 8 bytes | 8 bytes | XXH3_64 hash for index lookup |
| Prev Offset | 8 bytes | 8 bytes | Points to previous tail (chain link) |
| Checksum | 4 bytes | 4 bytes | CRC32C of payload for integrity |
Sources: README.md:112-137 src/storage_engine/data_store.rs:669-673 src/storage_engine/data_store.rs:864-897
Validation Chain
Each entry's metadata contains a prev_offset field that points to the absolute file offset of the previous entry's metadata end (the "previous tail"). This creates a backward-linked chain from the end of the file to offset 0.
graph RL
EOF["End of File\n(tail_offset)"]
Meta3["Entry 3 Metadata\nprev_offset = T2"]
Entry3["Entry 3 Payload\n(aligned)"]
Meta2["Entry 2 Metadata\nprev_offset = T1"]
Entry2["Entry 2 Payload\n(aligned)"]
Meta1["Entry 1 Metadata\nprev_offset = 0"]
Entry1["Entry 1 Payload\n(aligned)"]
Zero["Offset 0\n(File Start)"]
EOF --> Meta3
Meta3 -.->|prev_offset| Meta2
Meta3 --> Entry3
Meta2 -.->|prev_offset| Meta1
Meta2 --> Entry2
Meta1 -.->|prev_offset| Zero
Meta1 --> Entry1
Entry1 --> Zero
Chain Structure
Sources: README.md:139-148 src/storage_engine/data_store.rs:383-482
Chain Validation Properties
The validation chain ensures data integrity through several properties:
- Completeness : A valid chain must trace back to offset 0 without gaps src/storage_engine/data_store.rs:473
- Forward Progress : Each prev_offset must be less than the current metadata offset src/storage_engine/data_store.rs:465-468
- Bounds Checking : All offsets must be within valid file boundaries src/storage_engine/data_store.rs:435-438
- Size Consistency : Total chain size must not exceed file length src/storage_engine/data_store.rs:473
These properties are enforced during the recover_valid_chain function src/storage_engine/data_store.rs:383-482 which validates the chain structure on every file open.
Sources: src/storage_engine/data_store.rs:383-482
Recovery Mechanisms
SIMD R Drive implements automatic recovery to handle incomplete writes from crashes, power loss, or other interruptions. The recovery process occurs transparently during DataStore::open.
graph TB
Start["Open Storage File"]
CheckSize{{"File Size ≥\nMETADATA_SIZE?"}}
EmptyFile["Return offset 0\n(Empty File)"]
InitCursor["cursor = file_length"]
ReadMeta["Read metadata at\ncursor - METADATA_SIZE"]
ExtractPrev["Extract prev_offset"]
CalcBounds["Calculate entry bounds\nusing prepad_len"]
ValidateBounds{{"Entry bounds valid?"}}
DecrementCursor["cursor -= 1"]
WalkBackward["Walk chain backward\nusing prev_offset"]
CheckChain{{"Chain reaches\noffset 0?"}}
CheckSize2{{"Total size ≤\nfile_length?"}}
FoundValid["Store valid offset\nbest_valid_offset = cursor"]
MoreToCheck{{"cursor ≥\nMETADATA_SIZE?"}}
ReturnBest["Return best_valid_offset"]
Truncate["Truncate file to\nbest_valid_offset"]
Reopen["Recursively reopen\nDataStore::open"]
BuildIndex["Build KeyIndexer\nfrom valid chain"]
Complete["Recovery Complete"]
Start --> CheckSize
CheckSize -->|No| EmptyFile
CheckSize -->|Yes| InitCursor
EmptyFile --> Complete
InitCursor --> ReadMeta
ReadMeta --> ExtractPrev
ExtractPrev --> CalcBounds
CalcBounds --> ValidateBounds
ValidateBounds -->|No| DecrementCursor
ValidateBounds -->|Yes| WalkBackward
WalkBackward --> CheckChain
CheckChain -->|No| DecrementCursor
CheckChain -->|Yes| CheckSize2
CheckSize2 -->|No| DecrementCursor
CheckSize2 -->|Yes| FoundValid
FoundValid --> ReturnBest
DecrementCursor --> MoreToCheck
MoreToCheck -->|Yes| ReadMeta
MoreToCheck -->|No| ReturnBest
ReturnBest --> Truncate
Truncate --> Reopen
Reopen --> BuildIndex
BuildIndex --> Complete
Recovery Algorithm
Sources: src/storage_engine/data_store.rs:383-482 src/storage_engine/data_store.rs:91-104
Recovery Steps Detailed
Step 1: Backward Scan
Starting from the end of file, the recovery algorithm scans backward looking for valid metadata entries. For each potential metadata location src/storage_engine/data_store.rs:390-394:
- Reads 20 bytes of metadata
- Deserializes into an EntryMetadata structure
- Extracts prev_offset (the previous tail)
- Calculates entry boundaries using alignment rules
Step 2: Chain Validation
For each candidate entry, the algorithm walks the entire chain backward src/storage_engine/data_store.rs:424-471:
back_cursor = prev_tail
while back_cursor != 0:
- Read previous entry metadata
- Validate bounds and size
- Check forward progress (prev_prev_tail < prev_metadata_offset)
- Accumulate total chain size
- Update back_cursor to previous prev_offset
A chain is valid only if it reaches offset 0 and the total size doesn't exceed file length src/storage_engine/data_store.rs:473
Step 3: Truncation and Retry
If corruption is detected (file length > valid chain extent), the function:
- Logs a warning src/storage_engine/data_store.rs:92-97
- Drops existing file handles and memory map src/storage_engine/data_store.rs:98-99
- Reopens file with write access src/storage_engine/data_store.rs:100
- Truncates to valid offset src/storage_engine/data_store.rs:101
- Syncs to disk src/storage_engine/data_store.rs:102
- Recursively calls DataStore::open src/storage_engine/data_store.rs:103
Step 4: Index Rebuilding
After establishing the valid data extent, KeyIndexer::build scans forward through the file src/storage_engine/data_store.rs:108, populating the hash index with key → offset mappings. This forward scan uses the iterator pattern src/storage_engine/entry_iterator.rs:56-127 to process each entry sequentially.
Sources: src/storage_engine/data_store.rs:383-482 src/storage_engine/data_store.rs:91-104 src/storage_engine/entry_iterator.rs:21-54
graph TB
subgraph "Storage Architecture (2.1)"
AppendOnly["Append-Only Design"]
SingleFile["Single-File Container"]
Layout["Storage Layout\n(64-byte alignment)"]
Chain["Validation Chain"]
Recovery["Recovery Mechanism"]
end
subgraph "DataStore API (2.2)"
Write["write/batch_write"]
Read["read/batch_read"]
Delete["delete/batch_delete"]
end
subgraph "Entry Structure (2.3)"
EntryMetadata["EntryMetadata\n(key_hash, prev_offset, checksum)"]
PrePad["Pre-Padding Logic"]
Tombstone["Tombstone Format"]
end
subgraph "Memory Management (2.4)"
Mmap["Memory-Mapped File"]
EntryHandle["EntryHandle\n(Zero-Copy View)"]
Alignment["PAYLOAD_ALIGNMENT"]
end
subgraph "Key Indexer (2.6)"
KeyIndexer["KeyIndexer"]
HashLookup["Hash → Offset Mapping"]
end
AppendOnly -.->|enforces| Write
SingleFile -.->|provides| Mmap
Layout -.->|defines| PrePad
Layout -.->|uses| Alignment
Chain -.->|uses| EntryMetadata
Recovery -.->|validates| Chain
Recovery -.->|rebuilds| KeyIndexer
Write -.->|appends| Layout
Read -.->|uses| HashLookup
Read -.->|returns| EntryHandle
Delete -.->|writes| Tombstone
Architecture Integration
The storage architecture integrates with other subsystems through well-defined boundaries:
Component Interactions
Sources: src/storage_engine/data_store.rs:1-1183 src/storage_engine.rs:1-25
Key Design Decisions
| Decision | Rationale | Trade-off |
|---|---|---|
| 64-byte alignment | Matches CPU cache lines, enables SIMD | Wastes 0-63 bytes per entry |
| Backward-linked chain | Efficient recovery from EOF | Cannot validate forward |
| Single NULL byte tombstones | Minimal deletion overhead | Special case in code |
| Metadata-last layout | Sequential write pattern | Recovery scans backward |
| Atomic tail_offset | Lock-free progress tracking | Additional synchronization primitive |
Sources: README.md:51-59 README.md:104-148 src/storage_engine/data_store.rs:669-673
Performance Characteristics
The storage architecture exhibits the following performance properties:
Write Performance
- Sequential I/O : All writes append to file end, maximizing disk throughput
- Batching : Multiple entries written in single lock acquisition src/storage_engine/data_store.rs:847-939
- SIMD Acceleration : Payload copying uses platform-specific SIMD instructions src/storage_engine/data_store.rs:925
- Buffer Flushing : BufWriter reduces system call overhead src/storage_engine/data_store.rs:28
Read Performance
- Zero-Copy : Reads directly reference memory-mapped regions src/storage_engine/data_store.rs:1040-1049
- Index Lookup : O(1) hash table lookup to locate entries src/storage_engine/data_store.rs:1042-1046
- Concurrent Reads : Multiple readers access mmap simultaneously without locks
- Cache Efficiency : 64-byte alignment ensures single cache line access for aligned data
Recovery Performance
- Incremental Validation : Stops at first valid chain found src/storage_engine/data_store.rs:474-476
- Backward Scan : Starts from likely-valid tail, avoiding full file scan
- Indexed Rebuild : Forward scan populates index in single pass src/storage_engine/entry_iterator.rs:56-127
Sources: src/storage_engine/data_store.rs:752-939 src/storage_engine/data_store.rs:1027-1183 src/storage_engine/data_store.rs:383-482
Limitations and Considerations
Append-Only Growth
The file grows continuously with updates and deletions. Compaction is available but requires exclusive access src/storage_engine/data_store.rs:706-749
Alignment Overhead
Each non-tombstone entry may waste up to 63 bytes for alignment padding. For small payloads, this overhead can be significant relative to payload size.
Recovery Cost
Recovery time increases with file size, as chain validation requires traversing all entries. For very large files (exceeding RAM), this may cause extended startup times.
Single-Process Design
While thread-safe within a process, multiple processes cannot safely share the same file without external locking mechanisms README.md:185-206
Sources: README.md:51-59 README.md:185-206 src/storage_engine/data_store.rs:669-673 src/storage_engine/data_store.rs:706-749
DataStore API
Relevant source files
- README.md
- src/lib.rs
- src/storage_engine.rs
- src/storage_engine/data_store.rs
- src/storage_engine/entry_iterator.rs
This document describes the public API of the DataStore struct, which provides the primary interface for interacting with the SIMD R Drive storage engine. It covers the methods for opening storage files, reading and writing data, batch operations, streaming, iteration, and maintenance operations.
For details on the underlying storage architecture and design principles, see Storage Architecture. For information about entry structure and metadata format, see Entry Structure and Metadata. For details on concurrency mechanisms, see Concurrency and Thread Safety.
API Structure
The DataStore API is organized around two primary traits that separate read and write capabilities, along with additional utility methods provided directly on the DataStore struct.
Sources: src/storage_engine/data_store.rs:26-750 src/storage_engine/traits.rs src/storage_engine.rs:1-24
graph TB
subgraph "Trait Definitions"
DSReader["DataStoreReader\ntrait"]
DSWriter["DataStoreWriter\ntrait"]
end
subgraph "Core Implementation"
DS["DataStore\nstruct"]
end
subgraph "Associated Types"
EH["EntryHandle"]
EM["EntryMetadata"]
EI["EntryIterator"]
ES["EntryStream"]
end
DSReader -->|implemented by| DS
DSWriter -->|implemented by| DS
DS -->|returns| EH
DS -->|returns| EM
DS -->|provides| EI
EH -->|converts to| ES
DSReader -->|defines read ops| ReadMethods["read()\nbatch_read()\nexists()\nread_metadata()\nread_last_entry()"]
DSWriter -->|defines write ops| WriteMethods["write()\nbatch_write()\nwrite_stream()\ndelete()\nrename()\ncopy()"]
DS -->|direct methods| DirectMethods["open()\niter_entries()\ncompact()\nget_path()"]
Core Types
| Type | Purpose | Defined In |
|---|---|---|
| DataStore | Main storage engine implementation | src/storage_engine/data_store.rs:27-33 |
| DataStoreReader | Trait defining read operations | Trait definition |
| DataStoreWriter | Trait defining write operations | Trait definition |
| EntryHandle | Zero-copy reference to stored data | simd_r_drive_entry_handle crate |
| EntryMetadata | Entry metadata structure (key hash, prev offset, checksum) | simd_r_drive_entry_handle crate |
| EntryIterator | Sequential iterator over entries | src/storage_engine/entry_iterator.rs:21-127 |
| EntryStream | Streaming reader for entry data | src/storage_engine/entry_stream.rs |
Sources: src/storage_engine/data_store.rs:27-33 src/storage_engine/entry_iterator.rs:8-25 src/storage_engine.rs:1-24
Opening Storage Files
The DataStore provides two methods for opening storage files, both returning Result<DataStore>.
Sources: src/storage_engine/data_store.rs:84-117 src/storage_engine/data_store.rs:141-144 src/storage_engine/data_store.rs:161-170
graph LR
subgraph "Open Methods"
Open["open(path)"]
OpenExisting["open_existing(path)"]
end
subgraph "Process Flow"
OpenFile["open_file_in_append_mode()"]
InitMmap["init_mmap()"]
RecoverChain["recover_valid_chain()"]
BuildIndex["KeyIndexer::build()"]
end
subgraph "Result"
DSInstance["DataStore instance"]
end
Open -->|creates if missing| OpenFile
OpenExisting -->|requires existing file| VerifyFile["verify_file_existence()"]
VerifyFile --> OpenFile
OpenFile --> InitMmap
InitMmap --> RecoverChain
RecoverChain -->|may truncate| BuildIndex
BuildIndex --> DSInstance
open
Opens or creates a storage file at the specified path. This method performs the following operations:
- Opens the file in read/write mode (creates if necessary)
- Memory-maps the file using mmap
- Runs recover_valid_chain() to validate data integrity
- Truncates the file if corruption is detected
- Builds the in-memory key index using KeyIndexer::build()
Parameters:
- path: Path to the storage file
Returns:
- Ok(DataStore): Successfully opened storage instance
- Err(std::io::Error): File operation failure
Sources: src/storage_engine/data_store.rs:84-117
open_existing
Opens an existing storage file. Unlike open(), this method returns an error if the file does not exist. It calls verify_file_existence() before delegating to open().
Parameters:
- path: Path to the existing storage file
Returns:
- Ok(DataStore): Successfully opened storage instance
- Err(std::io::Error): File does not exist or cannot be opened
Sources: src/storage_engine/data_store.rs:141-144
Type Conversion Constructors
Allows creating a DataStore directly from a PathBuf:
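A hedged illustration of this conversion (the shape is inferred from the description below, not copied from the crate):

```rust
use simd_r_drive::DataStore;
use std::path::PathBuf;

fn open_by_conversion() -> DataStore {
    // Equivalent to calling DataStore::open(); panics if the file cannot be opened.
    DataStore::from(PathBuf::from("example.bin"))
}
```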
This conversion calls DataStore::open() internally and panics if the file cannot be opened.
Sources: src/storage_engine/data_store.rs:53-64
Read Operations
Read operations are defined by the DataStoreReader trait and return EntryHandle instances that provide zero-copy access to stored data.
Sources: src/storage_engine/data_store.rs:1027-1182
graph TB
subgraph "Read API Methods"
Read["read(key)"]
ReadHash["read_with_key_hash(hash)"]
BatchRead["batch_read(keys)"]
BatchReadHash["batch_read_hashed_keys(hashes)"]
ReadLast["read_last_entry()"]
ReadMeta["read_metadata(key)"]
Exists["exists(key)"]
ExistsHash["exists_with_key_hash(hash)"]
end
subgraph "Internal Flow"
ComputeHash["compute_hash()"]
GetIndexLock["key_indexer.read()"]
GetMmap["get_mmap_arc()"]
ReadContext["read_entry_with_context()"]
end
subgraph "Return Types"
OptHandle["Option<EntryHandle>"]
VecOptHandle["Vec<Option<EntryHandle>>"]
OptMeta["Option<EntryMetadata>"]
Bool["bool"]
end
Read --> ComputeHash
ComputeHash --> ReadHash
ReadHash --> GetIndexLock
GetIndexLock --> GetMmap
GetMmap --> ReadContext
ReadContext --> OptHandle
BatchRead --> BatchReadHash
BatchReadHash --> ReadContext
ReadContext --> VecOptHandle
ReadMeta --> Read
Read --> OptMeta
Exists --> Read
Read --> Bool
read
Reads a single entry by key. Computes the key hash using compute_hash(), acquires a read lock on the key index, and calls read_entry_with_context() to retrieve the entry.
Parameters:
key: Raw key bytes
Returns:
- Ok(Some(EntryHandle)): Entry found and valid
- Ok(None): Entry not found or is a tombstone
- Err(std::io::Error): Lock acquisition failure
Tag Verification: When the original key is provided, the method verifies the stored tag matches KeyIndexer::tag_from_key(key) to detect hash collisions.
Sources: src/storage_engine/data_store.rs:1040-1049
read_with_key_hash
Reads an entry using a pre-computed hash. This method skips the hashing step and tag verification, useful when the hash is already known.
Parameters:
prehashed_key: Pre-computed XXH3 hash of the key
Returns:
- Ok(Some(EntryHandle)): Entry found
- Ok(None): Entry not found
- Err(std::io::Error): Lock acquisition failure
Note: Tag verification is not performed when using pre-computed hashes.
Sources: src/storage_engine/data_store.rs:1051-1059
batch_read
Reads multiple entries in a single operation. Computes all hashes using compute_hash_batch(), then delegates to batch_read_hashed_keys().
Parameters:
keys: Slice of key references
Returns:
- Ok(Vec<Option<EntryHandle>>): Vector of results, same length as input
- Err(std::io::Error): Lock acquisition failure
Efficiency: Acquires the index read lock once for all lookups, reducing lock contention compared to multiple read() calls.
Sources: src/storage_engine/data_store.rs:1105-1109
batch_read_hashed_keys
Reads multiple entries using pre-computed hashes. Optionally accepts original keys for tag verification.
Parameters:
- prehashed_keys: Slice of pre-computed key hashes
- non_hashed_keys: Optional slice of original keys for tag verification (must match length of prehashed_keys)
Returns:
- Ok(Vec<Option<EntryHandle>>): Vector of results
- Err(std::io::Error): Lock failure or length mismatch
Sources: src/storage_engine/data_store.rs:1111-1158
read_last_entry
Reads the most recently written entry (at tail_offset - METADATA_SIZE). Does not verify the key or check the index; directly reads from the tail position.
Returns:
- Ok(Some(EntryHandle)): Last entry retrieved
- Ok(None): Storage is empty or tail is invalid
- Err(std::io::Error): Not applicable (infallible after initial checks)
Sources: src/storage_engine/data_store.rs:1061-1103
read_metadata
Reads only the metadata for a key without accessing the payload data. Calls read() and extracts the metadata field.
Returns:
- Ok(Some(EntryMetadata)): Metadata retrieved
- Ok(None): Key not found
- Err(std::io::Error): Read failure
Sources: src/storage_engine/data_store.rs:1160-1162
exists
Checks if a key exists in the storage.
Returns:
- Ok(true): Key exists and is not a tombstone
- Ok(false): Key not found or is deleted
- Err(std::io::Error): Read failure
Sources: src/storage_engine/data_store.rs:1030-1032
exists_with_key_hash
Checks key existence using a pre-computed hash.
Sources: src/storage_engine/data_store.rs:1034-1038
Write Operations
Write operations are defined by the DataStoreWriter trait. All writes are append-only and atomic with respect to the file lock.
Sources: src/storage_engine/data_store.rs:752-1025
graph TB
subgraph "Write API Methods"
Write["write(key, payload)"]
WriteHash["write_with_key_hash(hash, payload)"]
BatchWrite["batch_write(entries)"]
BatchWriteHash["batch_write_with_key_hashes(entries)"]
WriteStream["write_stream(key, reader)"]
WriteStreamHash["write_stream_with_key_hash(hash, reader)"]
end
subgraph "Core Write Flow"
AcquireLock["file.write()"]
GetTail["tail_offset.load()"]
CalcPrepad["prepad_len(tail)"]
WritePrepad["write zeros for alignment"]
WritePayload["write payload data"]
WriteMeta["write EntryMetadata"]
Flush["file.flush()"]
Reindex["reindex()"]
end
subgraph "Post-Write"
RemapMmap["init_mmap()"]
UpdateIndex["key_indexer.insert()"]
UpdateTail["tail_offset.store()"]
end
Write --> WriteHash
WriteHash --> BatchWriteHash
WriteStream --> WriteStreamHash
BatchWriteHash --> AcquireLock
WriteStreamHash --> AcquireLock
AcquireLock --> GetTail
GetTail --> CalcPrepad
CalcPrepad --> WritePrepad
WritePrepad --> WritePayload
WritePayload --> WriteMeta
WriteMeta --> Flush
Flush --> Reindex
Reindex --> RemapMmap
RemapMmap --> UpdateIndex
UpdateIndex --> UpdateTail
write
Writes a single key-value pair. Computes the key hash and delegates to write_with_key_hash().
Parameters:
- key: Key bytes
- payload: Payload bytes (cannot be empty or all NULL bytes)
Returns:
- Ok(tail_offset): New tail offset after write
- Err(std::io::Error): Write failure
Sources: src/storage_engine/data_store.rs:827-830
write_with_key_hash
Writes a payload using a pre-computed key hash. Delegates to batch_write_with_key_hashes() with a single entry.
Sources: src/storage_engine/data_store.rs:832-834
batch_write
Writes multiple key-value pairs in a single atomic operation. All entries are buffered and written together, then flushed once.
Parameters:
- entries: Slice of (key, payload) tuples
Returns:
- Ok(tail_offset): New tail offset after batch write
- Err(std::io::Error): Write failure (entire batch is rejected)
Efficiency: Reduces I/O overhead and lock contention compared to multiple write() calls.
Sources: src/storage_engine/data_store.rs:838-843
batch_write_with_key_hashes
Core batch write implementation. Processes each entry, calculating alignment padding and writing pre-pad, payload, and metadata. All data is buffered before flushing to disk.
Parameters:
- prehashed_keys: Vector of (hash, payload) tuples
- allow_null_bytes: Whether to allow single NULL byte payloads (used for tombstones)
Returns:
- Ok(tail_offset): New tail offset
- Err(std::io::Error): Write failure
Write Process:
- Acquire exclusive write lock on file
- Load current tail_offset
- For each entry:
  - Calculate prepad_len(tail_offset) for 64-byte alignment
  - Buffer pre-pad zeros (if needed)
  - Buffer payload (using simd_copy())
  - Buffer metadata (key_hash, prev_offset, checksum)
  - Update tail_offset
- Write entire buffer to file
- Flush to disk
- Call reindex() to update mmap and index
Sources: src/storage_engine/data_store.rs:847-939
write_stream
Writes a payload from a streaming source without loading it entirely into memory. Useful for large entries.
Parameters:
- key: Key bytes
- reader: Any type implementing the Read trait
Returns:
- Ok(tail_offset): New tail offset after write
- Err(std::io::Error): Write or read failure
Sources: src/storage_engine/data_store.rs:753-756
write_stream_with_key_hash
Core streaming write implementation. Reads from the source in chunks of WRITE_STREAM_BUFFER_SIZE bytes, computing the checksum incrementally.
Write Process:
- Acquire write lock
- Write pre-pad bytes for alignment
- Read chunks from reader using a WRITE_STREAM_BUFFER_SIZE buffer
- Write each chunk immediately to file
- Update checksum incrementally
- Write metadata after all payload bytes
- Flush and reindex
Validation:
- Rejects empty payloads
- Rejects payloads containing only NULL bytes
Sources: src/storage_engine/data_store.rs:758-825
Delete Operations
Delete operations write tombstones (single NULL byte entries) to the storage. They are implemented using the write infrastructure with allow_null_bytes = true.
delete
Deletes a single key by writing a tombstone. Delegates to batch_delete().
Returns:
- Ok(tail_offset): New tail offset
- Err(std::io::Error): Write failure
Sources: src/storage_engine/data_store.rs:986-988
batch_delete
Deletes multiple keys in a single operation. Computes hashes and delegates to batch_delete_key_hashes().
Sources: src/storage_engine/data_store.rs:990-993
batch_delete_key_hashes
Core batch delete implementation. First checks which keys actually exist to avoid writing unnecessary tombstones, then writes tombstones for existing keys only.
Process:
- Acquire read lock on index
- Filter keys to only those that exist
- Release read lock
- If no keys exist, return immediately without I/O
- Create delete operations: (hash, &NULL_BYTE)
- Call batch_write_with_key_hashes() with allow_null_bytes = true
Optimization: Tombstones are written without pre-pad bytes since they are always exactly 1 byte.
Sources: src/storage_engine/data_store.rs:995-1024
Key Management Operations
These operations manipulate entries by key, providing higher-level functionality built on read and write operations.
rename
Renames a key by copying its data to a new key and deleting the old key.
Parameters:
- old_key: Current key
- new_key: New key (must be different from old key)
Returns:
- Ok(tail_offset): New tail offset after rename
- Err(std::io::Error): Key not found, or keys are identical
Process:
- Read entry at old_key
- Convert to EntryStream
- Write stream to new_key
- Delete old_key
Sources: src/storage_engine/data_store.rs:941-958
copy
Copies an entry from this storage to a different storage instance.
Parameters:
- key: Key to copy
- target: Target DataStore (must have a different path)
Returns:
- Ok(target_offset): Offset in target storage where entry was written
- Err(std::io::Error): Key not found, same storage path, or write failure
Process:
- Read entry handle for key
- Call copy_handle() to write to target
- Uses EntryStream for streaming copy
Sources: src/storage_engine/data_store.rs:960-979
transfer
Transfers an entry to another storage by copying then deleting.
Parameters:
- key: Key to transfer
- target: Target DataStore
Returns:
- Ok(tail_offset): New tail offset after deletion
- Err(std::io::Error): Copy or delete failure
Process:
- Copy entry to target
- Delete entry from source
Sources: src/storage_engine/data_store.rs:981-984
Iteration
The DataStore provides methods for iterating over all valid entries. Iteration yields EntryHandle instances for each unique key, providing zero-copy access.
Sources: src/storage_engine/data_store.rs:269-361 src/storage_engine/entry_iterator.rs:8-127
graph TB
subgraph "Iteration Methods"
IntoIter["into_iter()"]
IterEntries["iter_entries()"]
ParIterEntries["par_iter_entries()"]
end
subgraph "Iterator Types"
EI["EntryIterator"]
PI["ParallelIterator<EntryHandle>"]
end
subgraph "Implementation Details"
GetMmap["get_mmap_arc()"]
GetTail["tail_offset.load()"]
CollectOffsets["collect packed offsets"]
FilterMap["filter_map entries"]
end
IntoIter -->|consumes DataStore| IterEntries
IterEntries --> GetMmap
GetMmap --> GetTail
GetTail --> EI
ParIterEntries -->|requires parallel feature| CollectOffsets
CollectOffsets --> FilterMap
FilterMap --> PI
iter_entries
Creates a sequential iterator over all valid entries. The iterator:
- Starts at tail_offset and moves backward
- Skips tombstones (deleted entries)
- Ensures unique keys (only returns latest version)
- Uses zero-copy access via EntryHandle
Returns:
- EntryIterator: Sequential iterator over EntryHandle
Example:
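(A sketch assuming the iterator API documented above; the import paths are illustrative.)

```rust
use simd_r_drive::DataStore;

fn total_payload_bytes(store: &DataStore) -> usize {
    // Each EntryHandle is a zero-copy view; as_slice() borrows from the memory map.
    store.iter_entries().map(|entry| entry.as_slice().len()).sum()
}
```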
Sources: src/storage_engine/data_store.rs:276-280
IntoIterator Implementation
Allows consuming a DataStore to produce an iterator; this delegates to iter_entries() and consumes the DataStore instance.
Sources: src/storage_engine/data_store.rs:44-51
par_iter_entries (Parallel Feature)
Creates a parallel iterator using Rayon for multi-threaded processing. This method:
- Acquires a read lock on the index
- Collects all packed offset values (fast operation)
- Releases the lock immediately
- Returns a ParallelIterator that processes entries across threads
Returns:
- ParallelIterator<Item = EntryHandle>: Parallel iterator over entries
Performance: Suitable for bulk operations on multi-core systems where each entry can be processed independently.
Sources: src/storage_engine/data_store.rs:296-361
Maintenance Operations
These methods provide utilities for monitoring and optimizing storage.
compact
Compacts the storage by creating a new file containing only the latest version of each key. This operation:
- Creates temporary file at {path}.bk
- Iterates through current entries
- Copies latest version of each key to temporary file
- Swaps temporary file with original file
Requirements:
- Requires &mut self (exclusive mutable reference)
- Should only be called when no other threads access the storage
Warning: While &mut self prevents concurrent mutations, it does not prevent other threads from holding shared references (&DataStore) if wrapped in Arc<DataStore>. External synchronization may be required.
Returns:
- Ok(()): Compaction successful
- Err(std::io::Error): I/O failure
Sources: src/storage_engine/data_store.rs:706-749
estimate_compaction_savings
Calculates potential space savings from compaction by comparing total file size to the size needed for unique entries only.
Returns:
- Number of bytes that would be saved by compaction
Implementation: Iterates through entries, tracking seen keys, and sums the file size of unique entries.
Sources: src/storage_engine/data_store.rs:605-616
get_path
Returns the file path of the storage.
Sources: src/storage_engine/data_store.rs:265-267
len
Returns the number of unique keys currently stored (excluding tombstones).
Returns:
- Ok(count): Number of keys in index
- Err(std::io::Error): Lock acquisition failure
Sources: src/storage_engine/data_store.rs:1164-1171
is_empty
Checks if the storage contains any entries.
Sources: src/storage_engine/data_store.rs:1173-1177
file_size
Returns the total size of the storage file in bytes.
Sources: src/storage_engine/data_store.rs:1179-1181
Internal Methods
The following methods are internal implementation details but are important for understanding the API's behavior.
read_entry_with_context
Core read logic used by all read methods. Performs:
- Index lookup via key_indexer_guard.get_packed(&key_hash)
- Unpacking of tag and offset via KeyIndexer::unpack()
- Tag verification (if non_hashed_key is provided)
- Metadata deserialization
- Entry bounds calculation (handling pre-pad and tombstones)
- Construction of EntryHandle
Tag Verification: If non_hashed_key is provided and tag does not match KeyIndexer::tag_from_key(), returns None and logs a warning about potential hash collision.
Sources: src/storage_engine/data_store.rs:502-565
reindex
Updates the memory map and key index after write operations. This method:
- Creates new mmap via init_mmap()
- Acquires locks on mmap and key_indexer
- Inserts or removes key mappings
- Updates atomic tail_offset
Hash Collision Handling: If key_indexer_guard.insert() returns an error (collision detected), the entire operation fails to prevent inconsistent state.
Sources: src/storage_engine/data_store.rs:224-259
prepad_len
Calculates the number of padding bytes needed to align offset to PAYLOAD_ALIGNMENT (64 bytes). Formula:
prepad = (PAYLOAD_ALIGNMENT - (offset % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)
This ensures all non-tombstone payloads start on 64-byte boundaries.
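A standalone sketch of this formula (it mirrors the documented calculation and is not the crate's internal function):

```rust
const PAYLOAD_ALIGNMENT: u64 = 64;

/// Number of zero bytes needed so the next payload starts on a 64-byte boundary.
fn prepad_len(offset: u64) -> u64 {
    (PAYLOAD_ALIGNMENT - (offset % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)
}

fn main() {
    assert_eq!(prepad_len(0), 0);    // already aligned
    assert_eq!(prepad_len(64), 0);   // already aligned
    assert_eq!(prepad_len(65), 63);  // maximum padding
    assert_eq!(prepad_len(100), 28); // 100 + 28 = 128
}
```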
Sources: src/storage_engine/data_store.rs:670-673
API Usage Patterns
Basic Read-Write Pattern
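A sketch of the basic pattern using the trait methods documented above (crate/trait import paths and the &self receivers are assumptions):

```rust
use std::path::Path;
use simd_r_drive::{DataStore, DataStoreReader, DataStoreWriter};

fn main() -> std::io::Result<()> {
    let store = DataStore::open(Path::new("example.bin"))?;

    // Append a key-value pair; the Ok value is the new tail offset.
    store.write(b"config", b"payload-bytes")?;

    // Zero-copy read: the EntryHandle borrows directly from the memory map.
    if let Some(entry) = store.read(b"config")? {
        println!("read {} bytes", entry.as_slice().len());
    }

    // delete() writes a tombstone; later reads return None.
    store.delete(b"config")?;
    assert!(!store.exists(b"config")?);
    Ok(())
}
```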
Batch Operations Pattern
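A sketch of batched writes and reads (argument types follow the descriptions above and are assumptions):

```rust
use std::path::Path;
use simd_r_drive::{DataStore, DataStoreReader, DataStoreWriter};

fn main() -> std::io::Result<()> {
    let store = DataStore::open(Path::new("example.bin"))?;

    // One lock acquisition and a single flush cover the whole batch.
    let entries: [(&[u8], &[u8]); 3] = [
        (b"alpha", b"1"),
        (b"beta", b"2"),
        (b"gamma", b"3"),
    ];
    store.batch_write(&entries)?;

    // batch_read returns one Option<EntryHandle> per requested key, in order.
    let keys: [&[u8]; 3] = [b"alpha", b"beta", b"missing"];
    let results = store.batch_read(&keys)?;
    assert!(results[0].is_some());
    assert!(results[2].is_none());
    Ok(())
}
```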
Streaming Pattern
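A sketch of streaming a large payload from any std::io::Read source (passing the reader by mutable reference is an assumption):

```rust
use std::io::Cursor;
use std::path::Path;
use simd_r_drive::{DataStore, DataStoreReader, DataStoreWriter};

fn main() -> std::io::Result<()> {
    let store = DataStore::open(Path::new("example.bin"))?;

    // The payload is streamed in chunks rather than being buffered fully in memory.
    let payload = vec![42u8; 4 * 1024 * 1024];
    let mut reader = Cursor::new(payload);
    store.write_stream(b"blob", &mut reader)?;

    if let Some(entry) = store.read(b"blob")? {
        assert_eq!(entry.as_slice().len(), 4 * 1024 * 1024);
    }
    Ok(())
}
```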
Iteration Pattern
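A sketch of iterating all live entries (assumes iter_entries() and len() behave as documented above):

```rust
use std::path::Path;
use simd_r_drive::{DataStore, DataStoreReader};

fn main() -> std::io::Result<()> {
    let store = DataStore::open(Path::new("example.bin"))?;

    // iter_entries() yields only the latest version of each key and skips tombstones,
    // so the iterated count matches the number of live keys reported by len().
    let iterated = store.iter_entries().count();
    assert_eq!(iterated, store.len()?);
    Ok(())
}
```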
Sources: src/lib.rs:20-115 README.md:208-246
Entry Structure and Metadata
Relevant source files
- .github/workflows/rust-lint.yml
- CHANGELOG.md
- README.md
- simd-r-drive-entry-handle/Cargo.toml
- simd-r-drive-entry-handle/src/debug_assert_aligned.rs
- simd-r-drive-entry-handle/src/entry_metadata.rs
Purpose and Scope
This document details the on-disk binary layout of entries in the SIMD R Drive storage engine. It covers the structure of aligned entries, tombstones, metadata fields, and the alignment strategy that enables zero-copy access.
For information about how entries are read and accessed in memory, see Memory Management and Zero-Copy Access. For details on the validation chain and recovery mechanisms, see Storage Architecture.
On-Disk Entry Layout Overview
Every entry written to the storage file consists of three components:
- Pre-Pad Bytes (optional, 0-63 bytes) - Zero bytes inserted to ensure the payload starts at a 64-byte boundary
- Payload - Variable-length binary data
- Metadata - Fixed 20-byte structure containing key hash, previous offset, and checksum
The exception is tombstones (deletion markers), which use a minimal 1-byte payload with no pre-padding.
Sources: README.md:104-137 simd-r-drive-entry-handle/src/entry_metadata.rs:9-37
Aligned Entry Structure
Entry Layout Table
| Offset Range | Field | Size (Bytes) | Description |
|---|---|---|---|
| P .. P+pad | Pre-Pad (optional) | pad | Zero bytes to align payload start |
| P+pad .. N | Payload | N-(P+pad) | Variable-length data |
| N .. N+8 | Key Hash | 8 | 64-bit XXH3 key hash |
| N+8 .. N+16 | Prev Offset | 8 | Absolute offset of previous tail |
| N+16 .. N+20 | Checksum | 4 | CRC32C of payload |
Where:
- pad = (A - (prev_tail % A)) & (A - 1), with A = PAYLOAD_ALIGNMENT (64 bytes)
- The next entry starts at offset N + 20
Aligned Entry Structure Diagram
Sources: README.md:112-137 simd-r-drive-entry-handle/src/entry_metadata.rs:11-23
Tombstone Structure
Tombstones are special deletion markers that do not require payload alignment. They consist of a single zero byte followed by the standard 20-byte metadata structure.
Tombstone Layout Table
| Offset Range | Field | Size (Bytes) | Description |
|---|---|---|---|
| T .. T+1 | Payload | 1 | Single byte 0x00 |
| T+1 .. T+21 | Metadata | 20 | Key hash, prev, crc32c |
Tombstone Structure Diagram
Sources: README.md:126-131 simd-r-drive-entry-handle/src/entry_metadata.rs:25-30
EntryMetadata Structure
The EntryMetadata struct represents the fixed 20-byte metadata block that follows every payload. It is defined with #[repr(C)] layout to ensure a consistent binary representation.
graph TB
subgraph EntryMetadataStruct["EntryMetadata struct"]
field1["key_hash: u64\n8 bytes\nXXH3_64 hash"]
field2["prev_offset: u64\n8 bytes\nbackward chain link"]
field3["checksum: [u8; 4]\n4 bytes\nCRC32C payload checksum"]
end
field1 --> field2
field2 --> field3
note4["Serialized at offset N\nfollowing payload"]
note5["Total: METADATA_SIZE = 20"]
field1 -.-> note4
field3 -.-> note5
Metadata Fields
Field Descriptions
key_hash: u64 (8 bytes, offset N .. N+8)
- 64-bit XXH3 hash of the key
- Used by KeyIndexer for O(1) lookups
- Combined with a tag for collision detection
- Hardware-accelerated via SSE2/AVX2/NEON
prev_offset: u64 (8 bytes, offset N+8 .. N+16)
- Absolute file offset of the previous entry for this key
- Forms a backward-linked chain for version history
- Set to 0 for the first entry of a key
- Used during chain validation and recovery
checksum: [u8; 4] (4 bytes, offset N+16 .. N+20)
- CRC32C checksum of the payload
- Provides fast integrity verification
- Not cryptographically secure
- Used during recovery to detect corruption
Serialization and Deserialization
The EntryMetadata struct provides methods for converting to/from bytes:
- serialize() -> [u8; 20]: Converts metadata to a byte array using little-endian encoding
- deserialize(data: &[u8]) -> Self: Reconstructs metadata from a byte slice
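The following standalone sketch mirrors the documented 20-byte layout and little-endian field encoding; it is illustrative only and is not the crate's implementation:

```rust
/// Mirror of the documented 20-byte metadata layout, for illustration only.
#[repr(C)]
struct EntryMetadataSketch {
    key_hash: u64,
    prev_offset: u64,
    checksum: [u8; 4],
}

impl EntryMetadataSketch {
    fn serialize(&self) -> [u8; 20] {
        let mut buf = [0u8; 20];
        buf[0..8].copy_from_slice(&self.key_hash.to_le_bytes());     // KEY_HASH_RANGE
        buf[8..16].copy_from_slice(&self.prev_offset.to_le_bytes()); // PREV_OFFSET_RANGE
        buf[16..20].copy_from_slice(&self.checksum);                 // CHECKSUM_RANGE
        buf
    }

    fn deserialize(data: &[u8]) -> Self {
        Self {
            key_hash: u64::from_le_bytes(data[0..8].try_into().unwrap()),
            prev_offset: u64::from_le_bytes(data[8..16].try_into().unwrap()),
            checksum: data[16..20].try_into().unwrap(),
        }
    }
}
```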
Sources: simd-r-drive-entry-handle/src/entry_metadata.rs:44-113 README.md:114-120
Pre-Padding and Alignment Strategy
Alignment Purpose
All non-tombstone payloads start at a 64-byte aligned address. This alignment ensures:
- Cache-line efficiency - Matches typical CPU cache line size
- SIMD optimization - Enables full-speed AVX2/AVX-512/NEON operations
- Zero-copy typed views - Allows safe reinterpretation as typed slices (&[u16], &[u32], etc.)
graph TD
Start["Calculate padding needed"]
GetPrevTail["prev_tail = last written offset"]
CalcPad["pad = (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT))\n& (PAYLOAD_ALIGNMENT - 1)"]
CheckPad{"pad > 0?"}
WritePad["Write pad zero bytes"]
WritePayload["Write payload at aligned offset"]
Start --> GetPrevTail
GetPrevTail --> CalcPad
CalcPad --> CheckPad
CheckPad -->|Yes| WritePad
CheckPad -->|No| WritePayload
WritePad --> WritePayload
The alignment is configured via PAYLOAD_ALIGNMENT constant (64 bytes as of version 0.15.0).
Pre-Padding Calculation
The formula pad = (A - (prev_tail % A)) & (A - 1) where A = PAYLOAD_ALIGNMENT ensures:
- If prev_tail is already aligned, pad = 0
- Otherwise, pad equals the bytes needed to reach the next aligned boundary
- Maximum padding is A - 1 bytes (63 bytes for 64-byte alignment)
Constants
The alignment is defined in simd-r-drive-entry-handle/src/constants.rs:1-20:
| Constant | Value | Description |
|---|---|---|
| PAYLOAD_ALIGN_LOG2 | 6 | Log₂ of alignment (2⁶ = 64) |
| PAYLOAD_ALIGNMENT | 64 | Actual alignment boundary in bytes |
| METADATA_SIZE | 20 | Fixed size of metadata block |
Sources: README.md:51-59 simd-r-drive-entry-handle/src/entry_metadata.rs:22-23 CHANGELOG.md:25-51
Backward Chain Formation
Chain Structure
Each entry's prev_offset field creates a backward-linked chain that tracks the version history for a given key. This chain is essential for:
- Recovery and validation on file open
- Detecting incomplete writes
- Rebuilding the index
Chain Properties
- Most recent entry is at the end of the file (highest offset)
- Chain traversal moves backward from tail toward offset 0
- First entry for a key has prev_offset = 0
- Valid chain can be walked all the way back to byte 0 without gaps
- Broken chain indicates corruption or incomplete write
Usage in Recovery
During file open, the system:
- Scans backward from EOF reading metadata
- Follows prev_offset links to validate chain continuity
- Verifies checksums at each step
- Truncates file if corruption is detected
- Scans forward to rebuild the index
Sources: README.md:139-147 simd-r-drive-entry-handle/src/entry_metadata.rs:41-43
Entry Type Comparison
Aligned Entry vs. Tombstone
| Aspect | Aligned Entry (Non-Tombstone) | Tombstone (Deletion Marker) |
|---|---|---|
| Pre-padding | 0-63 bytes (alignment dependent) | None |
| Payload size | Variable (user-defined) | Fixed 1 byte (0x00) |
| Payload alignment | 64-byte boundary | No alignment requirement |
| Metadata size | 20 bytes | 20 bytes |
| Total minimum size | 21 bytes (1-byte payload + metadata) | 21 bytes (1-byte + metadata) |
| Total maximum overhead | 83 bytes (63-byte pad + 20 metadata) | 21 bytes |
| Zero-copy capable | Yes (aligned payload) | No (tombstone flag only) |
When Tombstones Are Used
Tombstones mark key deletions while maintaining chain integrity. They:
- Preserve the backward chain via prev_offset
- Use minimal space (no alignment overhead)
- Are detected during reads and filtered out
- Enable recovery to skip deleted entries
Sources: README.md:112-137 simd-r-drive-entry-handle/src/entry_metadata.rs:9-37
Metadata Serialization Format
Binary Layout in File
Constants for Range Indexing
The simd-r-drive-entry-handle/src/constants.rs:1-20 file defines range constants for metadata field access:
- KEY_HASH_RANGE = 0..8
- PREV_OFFSET_RANGE = 8..16
- CHECKSUM_RANGE = 16..20
- METADATA_SIZE = 20
These ranges are used in EntryMetadata::serialize() and deserialize() methods.
Sources: simd-r-drive-entry-handle/src/entry_metadata.rs:62-112
Alignment Evolution and Migration
Version History
v0.14.0-alpha and earlier: Used 16-byte alignment (PAYLOAD_ALIGNMENT = 16)
v0.15.0-alpha onwards: Changed to 64-byte alignment (PAYLOAD_ALIGNMENT = 64)
This change was made to:
- Ensure full cache-line alignment
- Support AVX-512 and future SIMD extensions
- Improve zero-copy performance across modern hardware
Migration Considerations
Storage files created with different alignment values are not compatible:
- v0.14.x readers cannot correctly parse v0.15.x stores
- v0.15.x readers may misinterpret v0.14.x padding
To migrate between versions:
- Read all entries using the old version binary
- Write entries to a new store using the new version binary
- Replace the old file after verification
In multi-service environments, deploy reader upgrades before writer upgrades to avoid mixed-version issues.
Sources: CHANGELOG.md:25-82 README.md:51-59
Debug Assertions for Alignment
Runtime Validation
The codebase includes debug-only alignment assertions that validate both pointer and offset alignment:
debug_assert_aligned(ptr: *const u8, align: usize) - Validates pointer alignment
- Active in debug and test builds
- Zero cost in release/bench builds
- Ensures buffer base address is properly aligned
debug_assert_aligned_offset(off: u64) - Validates file offset alignment
- Checks that the derived payload start offset is at a PAYLOAD_ALIGNMENT boundary
- Used during entry handle creation
- Defined in simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-88
These assertions help catch alignment issues during development without imposing runtime overhead in production.
Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-88 CHANGELOG.md:33-41
Summary
The SIMD R Drive entry structure uses a carefully designed binary layout that balances efficiency, integrity, and flexibility:
- Fixed 64-byte alignment ensures cache-friendly, SIMD-optimized access
- 20-byte fixed metadata provides fast integrity checks and chain traversal
- Variable pre-padding maintains alignment without complex calculations
- Minimal tombstones mark deletions efficiently
- Backward-linked chain enables robust recovery and validation
This design enables zero-copy reads, high write throughput, and automatic crash recovery while maintaining a simple, append-only storage model.
Memory Management and Zero-Copy Access
Relevant source files
- .github/workflows/rust-lint.yml
- CHANGELOG.md
- README.md
- simd-r-drive-entry-handle/src/constants.rs
- simd-r-drive-entry-handle/src/debug_assert_aligned.rs
- simd-r-drive-entry-handle/src/lib.rs
- src/lib.rs
- src/storage_engine.rs
- src/storage_engine/entry_iterator.rs
- src/utils/align_or_copy.rs
This document explains how SIMD R Drive implements zero-copy reads through memory-mapped files, manages payload alignment for optimal performance, and provides safe concurrent access to stored data. It covers the EntryHandle abstraction, the Arc<Mmap> sharing strategy, alignment requirements enforced by PAYLOAD_ALIGNMENT, and utilities for working with aligned binary data.
For details on the on-disk entry format and metadata structure, see Entry Structure and Metadata. For information on how concurrent reads and writes are coordinated, see Concurrency and Thread Safety. For performance characteristics of the alignment strategy, see Payload Alignment and Cache Efficiency.
Memory-Mapped File Architecture
SIMD R Drive uses memory-mapped files (memmap2::Mmap) to enable zero-copy reads. The storage file is mapped into the process's virtual address space, allowing direct access to payload bytes without copying them into separate buffers. This approach provides several benefits:
- Zero-copy reads : Data is accessed directly from the mapped memory region
- Larger-than-RAM datasets : The OS handles paging, so only accessed portions consume physical memory
- Efficient random access : Any offset in the file can be accessed with pointer arithmetic
- Shared memory : Multiple threads can read from the same mapped region concurrently
The memory-mapped file is wrapped in Arc<Mutex<Arc<Mmap>>> within the DataStore structure:
DataStore.mmap_arc: Arc<Mutex<Arc<Mmap>>>
The outer Arc allows the DataStore itself to be cloned and shared. The Mutex protects remapping operations (which occur after writes). The inner Arc<Mmap> is what gets cloned and handed out to readers, ensuring that even if a remap occurs, existing readers retain a valid view of the previous mapping.
Sources: README.md:43-49 README.md:174-180 src/storage_engine/entry_iterator.rs:22-23
Mmap Lifecycle and Remapping
Diagram: Memory-mapped file lifecycle across write operations
When a write operation extends the file, the following sequence occurs:
- The file is extended and flushed to disk
- The Mutex<Arc<Mmap>> is locked
- A new Mmap is created from the extended file
- The old Arc<Mmap> is replaced with a new one
- Existing EntryHandle instances continue using their old Arc<Mmap> reference until dropped
This design ensures that readers never see an invalid memory mapping, even as writes extend the file.
Sources: README.md:174-180 src/storage_engine/data_store.rs:46-48 (implied from architecture)
EntryHandle: Zero-Copy Data Access
The EntryHandle struct provides a zero-copy view into a specific payload within the memory-mapped file. It consists of three fields:
EntryHandle {
mmap_arc: Arc<Mmap>,
range: Range<usize>,
metadata: EntryMetadata,
}
Each EntryHandle holds:
- mmap_arc: A reference-counted pointer to the memory-mapped file
- range: The byte range (start..end) identifying the payload location
- metadata: The entry's EntryMetadata (key hash, previous offset, checksum)
EntryHandle Methods
Diagram: EntryHandle methods and their memory semantics
| Method | Return Type | Memory Semantics | Use Case |
|---|---|---|---|
| as_slice() | &[u8] | Zero-copy borrow | Direct byte access |
| into_vec() | Vec<u8> | Owned copy | When ownership needed |
| as_arrow_buffer() | arrow::buffer::Buffer | Zero-copy Buffer | Arrow integration (feature-gated) |
| into_arrow_buffer() | arrow::buffer::Buffer | Consumes EntryHandle | Arrow integration (feature-gated) |
| metadata() | &EntryMetadata | Borrow | Access key hash, checksum |
| get_mmap_arc() | &Arc<Mmap> | Borrow | Low-level mmap access |
| get_range() | &Range<usize> | Borrow | Get payload boundaries |
The as_slice() method is the primary zero-copy interface:
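A sketch contrasting the borrowing and owning accessors from the table above (the EntryHandle import path and the by-value into_vec() receiver are assumptions):

```rust
use simd_r_drive_entry_handle::EntryHandle;

fn inspect(entry: EntryHandle) {
    // Zero-copy: borrows the payload bytes straight from the shared Arc<Mmap>.
    let view: &[u8] = entry.as_slice();
    println!("first byte: {:?}", view.first());

    // Owned copy: allocates and copies when the caller needs ownership of the bytes.
    let owned: Vec<u8> = entry.into_vec();
    println!("owned {} bytes", owned.len());
}
```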
It returns a slice directly into the memory-mapped region with no allocation or copying.
Sources: simd-r-drive-entry-handle/src/entry_handle.rs:22-50 (method implementations)
Alignment Requirements
All non-tombstone payloads in SIMD R Drive begin on a fixed 64-byte aligned boundary. This alignment is defined by the PAYLOAD_ALIGNMENT constant:
PAYLOAD_ALIGN_LOG2 = 6
PAYLOAD_ALIGNMENT = 1 << 6 = 64 bytes
The 64-byte alignment matches typical CPU cache line sizes and ensures that SIMD operations (AVX, AVX-512, SVE) can operate at full speed without crossing cache line boundaries.
Sources: simd-r-drive-entry-handle/src/constants.rs:13-18 README.md:51-59
Alignment Calculation
The pre-padding required to align a payload is calculated using:
pad = (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)
Where prev_tail is the file offset immediately after the previous entry's metadata. This formula ensures:
- Payloads always start at multiples of
PAYLOAD_ALIGNMENT - Pre-padding ranges from 0 to
PAYLOAD_ALIGNMENT - 1bytes - The calculation works for any power-of-two alignment
Diagram: Pre-padding calculation for 64-byte alignment
The storage engine writes zero bytes for the pre-padding region, then writes the payload starting at the aligned boundary.
Sources: README.md:112-124 src/storage_engine/entry_iterator.rs:50-53
Alignment Validation
The crate provides debug-only assertions to verify alignment invariants:
debug_assert_aligned(ptr: *const u8, align: usize)
debug_assert_aligned_offset(offset: u64)
These assertions:
- Are compiled to no-ops in release builds (zero runtime cost)
- Execute in debug and test builds to catch alignment violations
- Check both pointer alignment (
debug_assert_aligned) and offset alignment (debug_assert_aligned_offset)
Diagram: Debug-only alignment assertion behavior
graph TB
subgraph "Alignment Validation"
DebugMode["Debug/Test Build"]
ReleaseMode["Release Build"]
Call["debug_assert_aligned(ptr, align)"]
CheckPowerOf2["Assert align.is_power_of_two()"]
CheckAligned["Assert (ptr as usize & (align-1)) == 0"]
NoOp["No-op (optimized away)"]
Call --> DebugMode
Call --> ReleaseMode
DebugMode --> CheckPowerOf2
CheckPowerOf2 --> CheckAligned
ReleaseMode --> NoOp
end
The Arrow integration feature uses these assertions when creating arrow::buffer::Buffer instances from EntryHandle:
Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-88 simd-r-drive-entry-handle/src/entry_handle.rs:52-85 (Arrow integration)
Typed Slice Access via align_or_copy
The align_or_copy utility function enables efficient conversion of byte slices into typed slices (e.g., &[f32], &[u32]) with automatic fallback when alignment requirements are not met.
align_or_copy<T, const N: usize>(
bytes: &[u8],
from_le_bytes: fn([u8; N]) -> T
) -> Cow<'_, [T]>
Sources: src/utils/align_or_copy.rs:1-73
Zero-Copy vs. Fallback Behavior
Diagram: align_or_copy decision flow
The function attempts zero-copy reinterpretation using slice::align_to::<T>():
| Condition | Result | Allocation |
|---|---|---|
| Memory aligned for T AND length is multiple of size_of::<T>() | Cow::Borrowed | None |
| Misaligned OR length not a multiple | Cow::Owned | Allocates Vec<T> |
Zero-copy path: slice::align_to::<T>() reinterprets the existing bytes and the result is returned as Cow::Borrowed.
Fallback path: each N-byte chunk is decoded with the provided from_le_bytes function into a newly allocated Vec<T>, returned as Cow::Owned.
Sources: src/utils/align_or_copy.rs:44-73
Usage Example
Due to the 64-byte payload alignment, EntryHandle payloads are often well-aligned for SIMD types:
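A sketch based on the signature above (the align_or_copy import path is an assumption derived from the source file location):

```rust
use std::borrow::Cow;
// Module path is an assumption based on the source layout (src/utils/align_or_copy.rs).
use simd_r_drive::utils::align_or_copy;

fn as_f32s(payload: &[u8]) -> Cow<'_, [f32]> {
    // Borrowed when the slice is suitably aligned and its length is a multiple of 4 bytes;
    // otherwise each 4-byte chunk is decoded with f32::from_le_bytes into an owned Vec.
    align_or_copy::<f32, 4>(payload, f32::from_le_bytes)
}
```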
Since PAYLOAD_ALIGNMENT = 64 is a multiple of align_of::<f32>() = 4, align_of::<f64>() = 8, and align_of::<u128>() = 16, zero-copy access is typically possible for these types when the payload length is an exact multiple of their size.
Sources: src/utils/align_or_copy.rs:37-43 README.md:51-59
graph TB
subgraph "DataStore Structure"
DS["DataStore"]
MmapMutex["Arc<Mutex<Arc<Mmap>>>"]
end
subgraph "Reader Thread 1"
R1["read(key1)"]
EH1["EntryHandle\n(Arc<Mmap> clone)"]
Slice1["as_slice() → &[u8]"]
end
subgraph "Reader Thread 2"
R2["read(key2)"]
EH2["EntryHandle\n(Arc<Mmap> clone)"]
Slice2["as_slice() → &[u8]"]
end
subgraph "Reader Thread N"
RN["read(keyN)"]
EHN["EntryHandle\n(Arc<Mmap> clone)"]
SliceN["as_slice() → &[u8]"]
end
subgraph "Memory-Mapped File"
Mmap["memmap2::Mmap\n(shared memory region)"]
end
DS --> MmapMutex
MmapMutex -.Arc::clone.-> R1
MmapMutex -.Arc::clone.-> R2
MmapMutex -.Arc::clone.-> RN
R1 --> EH1
R2 --> EH2
RN --> EHN
EH1 --> Slice1
EH2 --> Slice2
EHN --> SliceN
Slice1 -.references.-> Mmap
Slice2 -.references.-> Mmap
SliceN -.references.-> Mmap
Memory Sharing and Concurrency
The Arc<Mmap> design enables efficient memory sharing across threads while maintaining safety:
Diagram: Concurrent zero-copy reads via Arc sharing
Concurrency characteristics:
- No read locks required: Once an EntryHandle holds an Arc<Mmap>, it can access data without further synchronization
- Write safety: Writers acquire RwLock<File> to serialize writes, and Mutex<Arc<Mmap>> to safely remap after writes
- Memory efficiency: All readers share the same physical pages, regardless of thread count
- Graceful remapping: Old Arc<Mmap> instances remain valid until all references are dropped
This design allows SIMD R Drive to support thousands of concurrent readers with minimal overhead.
Sources: README.md:172-206 src/storage_engine/entry_iterator.rs:22-23
Datasets Larger Than RAM
Memory mapping enables efficient access to storage files that exceed available physical memory:
- OS-managed paging: The operating system handles paging, loading only accessed regions into physical memory
- Transparent access: Application code uses the same EntryHandle API regardless of file size
- Efficient random access: Jumping between distant file offsets does not require explicit seeking
- Memory pressure handling: Unused pages are evicted by the OS when memory is needed elsewhere
For example, a storage file containing 100 GB of data on a system with 16 GB of RAM can be accessed efficiently. Only the portions actively being read will occupy physical memory, with the OS managing page faults transparently.
This design makes SIMD R Drive suitable for:
- Large-scale data analytics workloads
- Multi-gigabyte append-only logs
- Embedded databases on resource-constrained systems
- Archive storage with infrequent random access patterns
Sources: README.md:43-49 README.md:147-148
Concurrency and Thread Safety
Relevant source files
- README.md
- benches/storage_benchmark.rs
- src/main.rs
- src/storage_engine/data_store.rs
- src/utils/format_bytes.rs
- tests/concurrency_tests.rs
Purpose and Scope
This document describes the concurrency model and thread safety mechanisms used by the SIMD R Drive storage engine. It covers the synchronization primitives, locking strategies, and guarantees provided for concurrent access in single-process, multi-threaded environments. For information about the core storage architecture and data structures, see Storage Architecture. For details on the DataStore API methods, see DataStore API.
Sources: README.md:170-207 src/storage_engine/data_store.rs:1-33
Synchronization Architecture
The DataStore structure employs a combination of synchronization primitives to enable safe concurrent access while maintaining high performance for reads and writes.
graph TB
subgraph DataStore["DataStore Structure"]
File["file\nArc<RwLock<BufWriter<File>>>\nWrite synchronization"]
Mmap["mmap\nArc<Mutex<Arc<Mmap>>>\nMemory-map coordination"]
TailOffset["tail_offset\nAtomicU64\nSequential write ordering"]
KeyIndexer["key_indexer\nArc<RwLock<KeyIndexer>>\nConcurrent index access"]
Path["path\nPathBuf\nFile location"]
end
ReadOps["Read Operations"] --> KeyIndexer
ReadOps --> Mmap
WriteOps["Write Operations"] --> File
WriteOps --> TailOffset
WriteOps -.reindex.-> Mmap
WriteOps -.reindex.-> KeyIndexer
WriteOps -.reindex.-> TailOffset
ParallelIter["par_iter_entries"] --> KeyIndexer
ParallelIter --> Mmap
DataStore Synchronization Primitives
Synchronization Primitives by Field:
| Field | Type | Purpose | Concurrency Model |
|---|---|---|---|
| file | Arc<RwLock<BufWriter<File>>> | Write operations | Exclusive write lock required |
| mmap | Arc<Mutex<Arc<Mmap>>> | Memory-mapped view | Locked during remap operations |
| tail_offset | AtomicU64 | Next write position | Lock-free atomic updates |
| key_indexer | Arc<RwLock<KeyIndexer>> | Hash-to-offset index | Read-write lock for lookups/updates |
| path | PathBuf | Storage file path | Immutable after creation |
Sources: src/storage_engine/data_store.rs:27-33 README.md:172-182
Read Path Concurrency
Read operations in SIMD R Drive are designed to be lock-free and zero-copy, enabling high throughput for concurrent readers.
sequenceDiagram
participant Thread1 as "Reader Thread 1"
participant Thread2 as "Reader Thread 2"
participant KeyIndexer as "key_indexer\nRwLock<KeyIndexer>"
participant Mmap as "mmap\nMutex<Arc<Mmap>>"
participant File as "Memory-Mapped File"
Thread1->>KeyIndexer: read_lock()
Thread2->>KeyIndexer: read_lock()
Note over Thread1,Thread2: Multiple readers can acquire lock simultaneously
KeyIndexer-->>Thread1: lookup hash → (tag, offset)
KeyIndexer-->>Thread2: lookup hash → (tag, offset)
Thread1->>Mmap: get_mmap_arc()
Thread2->>Mmap: get_mmap_arc()
Note over Mmap: Clone Arc, release Mutex immediately
Mmap-->>Thread1: Arc<Mmap>
Mmap-->>Thread2: Arc<Mmap>
Thread1->>File: Direct memory access [offset..offset+len]
Thread2->>File: Direct memory access [offset..offset+len]
Note over Thread1,Thread2: Zero-copy reads, no locking required
Lock-Free Read Architecture
Read Method Implementation
The read_entry_with_context method (lines 501-565) centralizes the read logic:
- Acquire Read Lock: Obtains a read lock on key_indexer via the provided RwLockReadGuard
- Index Lookup: Retrieves the packed (tag, offset) value for the key hash
- Tag Verification: Validates the tag to detect hash collisions (lines 513-521)
- Memory Access: Reads metadata and payload directly from the memory-mapped region
- Tombstone Check: Filters out deleted entries (single NULL byte)
- Zero-Copy Handle: Returns an EntryHandle referencing the memory-mapped data
Key Performance Characteristics:
- No Write Lock: Reads never block on the file write lock
- Concurrent Readers: Multiple threads can read simultaneously via the RwLock read lock
- No Data Copy: The EntryHandle references the memory-mapped region directly
- Immediate Lock Release: The get_mmap_arc method (lines 658-663) clones the Arc<Mmap> and immediately releases the Mutex
Sources: src/storage_engine/data_store.rs:501-565 src/storage_engine/data_store.rs:658-663 README.md:174-175
Write Path Synchronization
Write operations require exclusive access to the storage file and coordinate with the memory-mapped view and index.
sequenceDiagram
participant Writer as "Writer Thread"
participant FileRwLock as "file\nRwLock"
participant BufWriter as "BufWriter<File>"
participant TailOffset as "tail_offset\nAtomicU64"
participant Reindex as "reindex()"
participant MmapMutex as "mmap\nMutex"
participant KeyIndexerRwLock as "key_indexer\nRwLock"
Writer->>FileRwLock: write_lock()
Note over FileRwLock: Exclusive lock acquired\nOnly one writer at a time
Writer->>TailOffset: load(Ordering::Acquire)
TailOffset-->>Writer: current tail offset
Writer->>BufWriter: write pre-pad + payload + metadata
Writer->>BufWriter: flush()
Writer->>Reindex: reindex(&file, key_hash_offsets, tail_offset)
Reindex->>BufWriter: init_mmap(&file)
Note over Reindex: Create new memory-mapped view
Reindex->>MmapMutex: lock()
Reindex->>MmapMutex: Replace Arc<Mmap>
Reindex->>MmapMutex: unlock()
Reindex->>KeyIndexerRwLock: write_lock()
Reindex->>KeyIndexerRwLock: insert(key_hash → offset)
Reindex->>KeyIndexerRwLock: unlock()
Reindex->>TailOffset: store(new_tail, Ordering::Release)
Reindex-->>Writer: Ok(())
Writer->>FileRwLock: unlock()
Write Operation Flow
Write Methods and Locking
Single Write (write):
- Delegates to batch_write_with_key_hashes with a single entry (lines 827-834)
Batch Write (batch_write_with_key_hashes):
- Acquires write lock on file (lines 852-855)
- Loads tail_offset atomically (line 858)
- Writes all entries to buffer (lines 863-922)
- Writes buffer to file in single operation (line 935)
- Flushes to disk (line 936)
- Calls reindex to update mmap and index (lines 938-943)
- Write lock automatically released on file_guard drop
Streaming Write (write_stream_with_key_hash):
- Acquires write lock on file (lines 759-762)
- Streams payload to file via buffered reads (lines 778-790)
- Flushes after streaming (line 814)
- Calls reindex to update mmap and index (lines 818-823)
Sources: src/storage_engine/data_store.rs:752-843 src/storage_engine/data_store.rs:847-943
Memory Mapping Coordination
The reindex method coordinates the critical transition from write completion to read visibility by updating the memory-mapped view.
Reindex Operation
The reindex method (lines 224-259) performs four critical operations:
Step-by-Step Process:
- Create New Memory Map (line 231):
  - Calls init_mmap(&write_guard) to create a fresh Mmap from the file
  - File must be flushed before this point to ensure visibility
- Update Shared Mmap (lines 232, 255):
  - Acquires the Mutex on self.mmap
  - Replaces the Arc<Mmap> with the new mapping
  - Existing readers retain old Arc<Mmap> references until they drop
- Update Index (lines 233-253):
  - Acquires write lock on self.key_indexer
  - Inserts new (key_hash → offset) mappings
  - Removes tombstone entries if present
- Update Tail Offset (line 256):
  - Stores the new tail offset with Ordering::Release
  - Ensures all prior writes are visible before the offset update
Critical Ordering: The write lock on file is held throughout the entire reindex operation, preventing concurrent writes from starting until the index and mmap are fully updated.
Sources: src/storage_engine/data_store.rs:224-259 src/storage_engine/data_store.rs:172-174
Index Concurrency Model
The key_indexer uses RwLock<KeyIndexer> to enable concurrent read access while providing exclusive write access during updates.
Index Access Patterns
| Operation | Lock Type | Held During | Purpose |
|---|---|---|---|
| read | RwLock::read() | Index lookup only (lines 507-508) | Multiple threads can lookup concurrently |
| batch_read | RwLock::read() | All lookups in batch | Amortizes lock acquisition overhead |
| par_iter_entries | RwLock::read() | Collecting offsets only (lines 300-302) | Minimizes lock hold time, drops before parallel iteration |
| write / batch_write | RwLock::write() | Index updates in reindex (lines 233-236) | Exclusive access for inserting new entries |
| delete (tombstone) | RwLock::write() | Removing from index (line 243) | Exclusive access for deletion |
Parallel Iteration Strategy
The par_iter_entries method (lines 296-361, feature-gated by parallel) demonstrates efficient lock usage:
This approach:
- Minimizes the duration the read lock is held
- Avoids holding locks during expensive parallel processing
- Allows concurrent writes to proceed while parallel iteration is in progress
Sources: src/storage_engine/data_store.rs:296-361
Atomic Tail Offset
The tail_offset field uses AtomicU64 with acquire-release ordering to coordinate write sequencing without locks.
Atomic Operations Pattern
Load (Acquire):
- Used at start of write operations (lines 763, 858)
- Ensures all prior writes are visible before loading the offset
- Provides the base offset for the new write
Store (Release):
- Used at end of reindex after all updates (line 256)
- Ensures all writes (file, mmap, index) are visible before storing
- Prevents reordering of writes after the offset update
Sequential Consistency: Even though multiple threads may acquire the write lock sequentially, the atomic tail offset ensures that:
- Each write observes the correct starting position
- The tail offset update happens after all associated data is committed
- Readers see a consistent view of the storage
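A generic illustration of this acquire/release pattern on an AtomicU64 (standard-library only; not the crate's code):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

static TAIL_OFFSET: AtomicU64 = AtomicU64::new(0);

fn begin_write() -> u64 {
    // Acquire: everything written before the previous Release store is visible here.
    TAIL_OFFSET.load(Ordering::Acquire)
}

fn finish_write(new_tail: u64) {
    // Release: file, mmap, and index updates happen-before this store becomes visible.
    TAIL_OFFSET.store(new_tail, Ordering::Release);
}
```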
Sources: src/storage_engine/data_store.rs30 src/storage_engine/data_store.rs256 src/storage_engine/data_store.rs763 src/storage_engine/data_store.rs858
Multi-Process Considerations
SIMD R Drive's concurrency mechanisms are designed for single-process, multi-threaded environments. Multi-process access requires external coordination.
Thread Safety Matrix
| Environment | Reads | Writes | Index Updates | Storage Safety |
|---|---|---|---|---|
| Single Process, Single Thread | ✅ Safe | ✅ Safe | ✅ Safe | ✅ Safe |
| Single Process, Multi-Threaded | ✅ Safe (lock-free, zero-copy) | ✅ Safe (RwLock<File>) | ✅ Safe (RwLock<KeyIndexer>) | ✅ Safe (Mutex<Arc<Mmap>>) |
| Multiple Processes, Shared File | ⚠️ Unsafe (no cross-process coordination) | ❌ Unsafe (no external locking) | ❌ Unsafe (separate memory spaces) | ❌ Unsafe (race conditions) |
Why Multi-Process is Unsafe
Separate Memory Spaces:
- Each process has its own Arc<Mmap> instance
- Index changes in one process are not visible to others
- The tail_offset atomic is process-local
No Cross-Process Synchronization:
- RwLock and Mutex only synchronize within a single process
- File writes from multiple processes can interleave arbitrarily
- Memory mapping updates are not coordinated
Risk of Corruption:
- Two processes can both acquire file write locks independently
- Both may attempt to write at the same tail_offset
- The storage file's append-only chain can be broken
External Locking Requirements
For multi-process access, applications must implement external file locking:
- Advisory Locks: Use flock (Unix) or LockFile (Windows)
- Mandatory Locks: File system-level exclusive access
- Distributed Locks: For networked file systems
The core library intentionally does not provide cross-process locking to avoid platform-specific dependencies and maintain flexibility.
Sources: README.md:185-206 src/storage_engine/data_store.rs:27-33
Concurrency Testing
The test suite validates concurrent access patterns through multi-threaded integration tests.
Test Coverage
Concurrent Write Test (concurrent_write_test):
- Spawns 16 threads writing 10 entries each (lines 111-161)
- Uses Arc<DataStore> to share the storage instance
- Verifies all 160 entries are correctly written and retrievable
- Demonstrates write lock serialization
Interleaved Read-Write Test (interleaved_read_write_test):
- Two threads alternating reads and writes on same key (lines 163-229)
- Uses tokio::sync::Notify for coordination
- Validates that writes are immediately visible to readers
- Confirms index updates are atomic
Concurrent Streamed Write Test (concurrent_slow_streamed_write_test):
- Multiple threads streaming 1MB payloads concurrently (lines 14-109)
- Simulates slow readers with artificial delays
- Verifies write lock prevents interleaved streaming
- Validates payload integrity after concurrent writes
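A minimal multi-threaded sketch in the spirit of these tests, sharing the store through Arc (crate paths and the &self write receiver are assumptions):

```rust
use std::path::Path;
use std::sync::Arc;
use std::thread;
use simd_r_drive::{DataStore, DataStoreReader, DataStoreWriter};

fn main() -> std::io::Result<()> {
    let store = Arc::new(DataStore::open(Path::new("example.bin"))?);

    // Writers are serialized by the internal file lock; readers stay lock-free.
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let store = Arc::clone(&store);
            thread::spawn(move || {
                let key = format!("key-{i}");
                store.write(key.as_bytes(), b"value").unwrap();
                assert!(store.read(key.as_bytes()).unwrap().is_some());
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    Ok(())
}
```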
Sources: tests/concurrency_tests.rs:14-229
graph TD
A["1. file: RwLock (Write)"] --> B["2. mmap: Mutex"]
B --> C["3. key_indexer: RwLock (Write)"]
C --> D["4. tail_offset: AtomicU64"]
style A fill:#f9f9f9
style B fill:#f9f9f9
style C fill:#f9f9f9
style D fill:#f9f9f9
Locking Order and Deadlock Prevention
To prevent deadlocks, SIMD R Drive follows a consistent lock acquisition order:
Lock Hierarchy
Acquisition Order:
- File Write Lock: Acquired first in all write operations
- Mmap Mutex: Locked during reindex to update memory mapping
- Key Indexer Write Lock: Locked during reindex to update index
- Tail Offset Atomic: Updated last with Release ordering
Lock Release: Locks are released in reverse order automatically through RAII (Rust's drop semantics).
Deadlock Prevention: This fixed order ensures no circular wait conditions. Read operations only acquire the key indexer read lock, which cannot deadlock with other readers.
Sources: src/storage_engine/data_store.rs:224-259 src/storage_engine/data_store.rs:752-825 src/storage_engine/data_store.rs:847-943
Performance Characteristics
Concurrent Read Performance
- Zero-Copy: No memory allocation or copying for payload access
- Lock-Free After Index Lookup: Once the Arc<Mmap> is cloned, no locks are held
- Scalable: Read throughput increases linearly with thread count
- Cache-Friendly: 64-byte alignment optimizes cache line access
Write Serialization
- Sequential Writes: All writes are serialized by the RwLock<File>
- Minimal Lock Hold Time: Lock held only during actual I/O and index update
- Batch Amortization: batch_write amortizes lock overhead across multiple entries
- No Write Starvation: Fair RwLock implementation prevents reader starvation of writers
Benchmark Results
From benches/storage_benchmark.rs typical performance on a single-threaded benchmark:
- Write Throughput: ~1M entries (8 bytes each) written per batch operation
- Sequential Read: Iterates all entries via zero-copy handles
- Random Read: ~1M random lookups complete in under 1 second
- Batch Read: Vectorized lookups provide additional speedup
For multi-threaded workloads, the concurrency tests demonstrate that read throughput scales with core count while write throughput is limited by the sequential append-only design.
Sources: benches/storage_benchmark.rs:1-234 tests/concurrency_tests.rs:1-230
Key Indexing and Hashing
Relevant source files
Purpose and Scope
This page documents the key indexing system and hashing mechanisms used in SIMD R Drive's storage engine. It covers the KeyIndexer data structure, the XXH3 hashing algorithm, tag-based collision detection, and hardware acceleration features.
For information about how the index is accessed in concurrent operations, see Concurrency and Thread Safety. For details on how metadata is stored alongside payloads, see Entry Structure and Metadata.
Overview
The SIMD R Drive storage engine maintains an in-memory index that maps key hashes to file offsets, enabling O(1) lookup performance for stored entries. This index is critical for avoiding full file scans when retrieving data.
The indexing system consists of three main components:
- KeyIndexer: A concurrent hash map that stores packed values containing both a collision-detection tag and a file offset
- XXH3_64 hashing: A fast, hardware-accelerated hashing algorithm that generates 64-bit hashes from arbitrary keys
- Tag-based verification: A secondary collision detection mechanism that validates lookups to prevent hash collision errors
Sources: src/storage_engine/data_store.rs:1-33 README.md:158-168
KeyIndexer Structure
The KeyIndexer is the core data structure managing the key-to-offset mapping. It is wrapped in thread-safe containers to support concurrent access.
graph TB
subgraph "DataStore"
KeyIndexerField["key_indexer\nArc<RwLock<KeyIndexer>>"]
end
subgraph "KeyIndexer Internals"
HashMap["HashMap<u64, u64>\nwith Xxh3BuildHasher"]
PackedValue["Packed u64 Value\n(16-bit tag / 48-bit offset)"]
end
subgraph "Operations"
Insert["insert(key_hash, offset)\n→ Result<()>"]
Lookup["get_packed(&key_hash)\n→ Option<u64>"]
Unpack["unpack(packed)\n→ (tag, offset)"]
TagFromKey["tag_from_key(key)\n→ u16"]
Build["build(mmap, tail_offset)\n→ KeyIndexer"]
end
KeyIndexerField --> HashMap
HashMap --> PackedValue
Insert -.-> HashMap
Lookup -.-> HashMap
Unpack -.-> PackedValue
TagFromKey -.-> PackedValue
Build --> KeyIndexerField
Packed Value Format
The KeyIndexer stores a compact 64-bit packed value for each hash. This value encodes two pieces of information:
| Bits | Field | Description |
|---|---|---|
| 63-48 | Tag (16-bit) | Collision detection tag derived from key bytes |
| 47-0 | Offset (48-bit) | Absolute file offset to entry metadata |
This packing allows the index to store both the offset and a verification tag without requiring additional memory allocations.
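A standalone illustration of the documented 16-bit tag / 48-bit offset packing (not the crate's KeyIndexer code):

```rust
const OFFSET_BITS: u32 = 48;
const OFFSET_MASK: u64 = (1 << OFFSET_BITS) - 1;

fn pack(tag: u16, offset: u64) -> u64 {
    // Tag occupies bits 63-48; offset occupies bits 47-0.
    ((tag as u64) << OFFSET_BITS) | (offset & OFFSET_MASK)
}

fn unpack(packed: u64) -> (u16, u64) {
    ((packed >> OFFSET_BITS) as u16, packed & OFFSET_MASK)
}

fn main() {
    let packed = pack(0xBEEF, 1_048_576);
    assert_eq!(unpack(packed), (0xBEEF, 1_048_576));
}
```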
Sources: src/storage_engine/data_store.rs31 src/storage_engine/data_store.rs:509-511
Hashing Algorithm: XXH3_64
SIMD R Drive uses the XXH3_64 hashing algorithm from the xxhash-rust crate. XXH3 is optimized for speed and provides hardware acceleration on supported platforms.
Hardware Acceleration
The XXH3_64 implementation automatically detects and utilizes CPU-specific SIMD instructions:
graph LR
KeyBytes["Key Bytes\n&[u8]"]
subgraph "compute_hash(key)"
XXH3["xxhash-rust\nXXH3_64"]
end
subgraph "Hardware Detection"
X86Check["x86_64 CPU?"]
ARMCheck["aarch64 CPU?"]
end
subgraph "SIMD Paths"
SSE2["SSE2 Instructions\n(baseline x86_64)"]
AVX2["AVX2 Instructions\n(if available)"]
NEON["NEON Instructions\n(aarch64 default)"]
end
KeyBytes --> XXH3
XXH3 --> X86Check
XXH3 --> ARMCheck
X86Check -->|Yes| SSE2
X86Check -->|Yes + feature| AVX2
ARMCheck -->|Yes| NEON
SSE2 --> Hash64["u64 Hash"]
AVX2 --> Hash64
NEON --> Hash64
| Platform | Baseline Instructions | Optional Extensions | Performance Boost |
|---|---|---|---|
| x86_64 | SSE2 (always enabled) | AVX2 | Significant |
| aarch64 | NEON (default) | None required | Built-in |
| Fallback | Scalar operations | None | Baseline |
Sources: README.md:160-165 Cargo.lock:1814-1820 src/storage_engine/data_store.rs:2-4
Hash Computation Functions
The digest module provides batch and single-key hashing functions:
| Function | Signature | Use Case |
|---|---|---|
| compute_hash | fn(key: &[u8]) -> u64 | Single key hashing |
| compute_hash_batch | fn(keys: &[&[u8]]) -> Vec<u64> | Parallel batch hashing |
| compute_checksum | fn(payload: &[u8]) -> [u8; 4] | CRC32C payload validation |
Sources: src/storage_engine/data_store.rs:2-4 src/storage_engine/data_store.rs754 src/storage_engine/data_store.rs:839-842
Tag-Based Collision Detection
While XXH3_64 produces high-quality 64-bit hashes, the system implements an additional layer of collision detection using 16-bit tags derived from the original key bytes.
sequenceDiagram
participant Client
participant DataStore
participant KeyIndexer
participant Mmap
Note over Client,Mmap: Write Operation
Client->>DataStore: write(key, payload)
DataStore->>DataStore: key_hash = compute_hash(key)
DataStore->>KeyIndexer: tag = tag_from_key(key)
DataStore->>Mmap: write payload + metadata
DataStore->>KeyIndexer: insert(key_hash, pack(tag, offset))
Note over Client,Mmap: Read Operation with Tag Verification
Client->>DataStore: read(key)
DataStore->>DataStore: key_hash = compute_hash(key)
DataStore->>KeyIndexer: packed = get_packed(key_hash)
KeyIndexer-->>DataStore: Some(packed_value)
DataStore->>DataStore: (stored_tag, offset) = unpack(packed)
DataStore->>DataStore: expected_tag = tag_from_key(key)
alt Tag Mismatch
DataStore->>DataStore: warn!("Tag mismatch")
DataStore-->>Client: None (collision detected)
else Tag Match
DataStore->>Mmap: read entry at offset
Mmap-->>DataStore: EntryHandle
DataStore-->>Client: Some(EntryHandle)
end
How Tags Work
Tag Verification Process
When reading an entry, the system performs the following verification:
- Compute the 64-bit hash of the requested key
- Look up the hash in the KeyIndexer to get the packed value
- Unpack the stored tag and offset from the packed value
- Compute the expected tag from the original key bytes
- Compare the stored tag with the expected tag
- If tags match, proceed with reading; if not, return None (collision detected)
This two-level verification (hash + tag) dramatically reduces the probability of false positives from hash collisions.
Sources: src/storage_engine/data_store.rs:502-522 src/storage_engine/data_store.rs:513-521
Collision Detection in Batch Operations
For batch read operations, the system can perform tag verification when the original keys are provided.
Sources: src/storage_engine/data_store.rs:1105-1158
Index Building and Maintenance
Index Construction on Open
When a DataStore is opened, the KeyIndexer is constructed by scanning the validated storage file forward from offset 0 to the tail offset:
graph TB
OpenFile["DataStore::open(path)"]
RecoverChain["recover_valid_chain()"]
BuildIndex["KeyIndexer::build(mmap, final_len)"]
subgraph "Index Build Process"
ScanForward["Scan from 0 to tail_offset"]
ReadMetadata["Read EntryMetadata"]
ExtractHash["Extract key_hash"]
ComputeTag["Compute tag from prev entries"]
PackValue["pack(tag, metadata_offset)"]
InsertMap["Insert into HashMap"]
end
OpenFile --> RecoverChain
RecoverChain --> BuildIndex
BuildIndex --> ScanForward
ScanForward --> ReadMetadata
ReadMetadata --> ExtractHash
ExtractHash --> ComputeTag
ComputeTag --> PackValue
PackValue --> InsertMap
InsertMap --> ScanForward
InsertMap --> Initialized["KeyIndexer Ready"]
The index is built only once during the open() operation. Subsequent writes update the index incrementally.
Sources: src/storage_engine/data_store.rs:84-116 src/storage_engine/data_store.rs:106-108
graph LR
WriteOp["Write Operation\n(single or batch)"]
FlushFile["Flush to disk"]
Reindex["reindex()"]
subgraph "Reindex Steps"
RemapFile["Create new mmap"]
LockIndex["Acquire RwLock write"]
UpdateEntries["Update key_hash → offset"]
HandleDeletes["Remove deleted keys"]
UpdateTail["Store tail_offset (Atomic)"]
end
WriteOp --> FlushFile
FlushFile --> Reindex
Reindex --> RemapFile
RemapFile --> LockIndex
LockIndex --> UpdateEntries
UpdateEntries --> HandleDeletes
HandleDeletes --> UpdateTail
Index Updates During Writes
After each write operation, the reindex() method updates the in-memory index with new key mappings, as shown in the diagram above.
Critical : The file must be flushed before remapping to ensure newly written data is visible in the new memory-mapped view.
Sources: src/storage_engine/data_store.rs:224-259 src/storage_engine/data_store.rs:814-824
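A hedged sketch of the flush-then-remap rule, using memmap2 directly; the writer field, the sync call, and the overall shape are assumptions rather than the exact DataStore internals:
```rust
// Sketch of "flush before remap": buffered bytes must reach the file before a
// new memory map is created, otherwise the fresh data is not visible in the
// new view. Field and method usage here is illustrative only.

use std::fs::File;
use std::io::{BufWriter, Write};
use memmap2::{Mmap, MmapOptions};

fn flush_and_remap(writer: &mut BufWriter<File>) -> std::io::Result<Mmap> {
    writer.flush()?;              // push buffered bytes to the OS
    writer.get_ref().sync_all()?; // optionally force them to disk
    // SAFETY: the file must not be truncated while the map is alive.
    unsafe { MmapOptions::new().map(writer.get_ref()) }
}
```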
Concurrent Access and Locking
The KeyIndexer is protected by an Arc<RwLock<KeyIndexer>> wrapper, enabling multiple concurrent readers while ensuring exclusive access for writers.
| Operation | Lock Type | Concurrency |
|---|---|---|
read() | Read lock | Multiple threads can read in parallel |
batch_read() | Read lock | Multiple threads can read in parallel |
write() | Write lock | Single writer, blocks all readers |
batch_write() | Write lock | Single writer, blocks all readers |
delete() | Write lock | Single writer, blocks all readers |
Sources: src/storage_engine/data_store.rs31 src/storage_engine/data_store.rs:1042-1049 src/storage_engine/data_store.rs:852-856
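The locking pattern in the table above can be sketched with std::sync::RwLock around a plain HashMap standing in for KeyIndexer:
```rust
// Minimal illustration of the shared-read / exclusive-write pattern.
// A plain HashMap stands in for KeyIndexer.

use std::collections::HashMap;
use std::sync::{Arc, RwLock};

fn main() {
    let index: Arc<RwLock<HashMap<u64, u64>>> = Arc::new(RwLock::new(HashMap::new()));

    // Writers take an exclusive lock, blocking all readers for the duration.
    index.write().unwrap().insert(0xDEAD_BEEF, 42);

    // Readers take shared locks and can run in parallel across threads.
    let reader = Arc::clone(&index);
    std::thread::spawn(move || {
        let guard = reader.read().unwrap();
        let _offset = guard.get(&0xDEAD_BEEF).copied();
    })
    .join()
    .unwrap();
}
```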
Performance Characteristics
Lookup Performance
The indexing system provides O(1) average-case lookup performance. Benchmarks demonstrate:
- 1 million random seeks (8-byte entries): typically < 1 second
- Hash computation overhead : negligible due to hardware acceleration
- Tag verification overhead : minimal (single comparison)
Sources: README.md:166-167
Memory Overhead
Each index entry consumes:
- 8 bytes for the key hash (map key)
- 8 bytes for the packed value (map value)
- Total : 16 bytes per unique key in memory
For a dataset with 1 million unique keys, the index occupies approximately 16 MB of RAM.
graph LR
subgraph "Scalar Baseline"
ScalarOps["Byte-by-byte\nprocessing"]
ScalarTime["~100% time"]
end
subgraph "SSE2 (x86_64)"
SSE2Ops["16-byte chunks\nparallel"]
SSE2Time["~40-60% time"]
end
subgraph "AVX2 (x86_64)"
AVX2Ops["32-byte chunks\nparallel"]
AVX2Time["~30-50% time"]
end
subgraph "NEON (aarch64)"
NEONOps["16-byte chunks\nparallel"]
NEONTime["~40-60% time"]
end
ScalarOps --> ScalarTime
SSE2Ops --> SSE2Time
AVX2Ops --> AVX2Time
NEONOps --> NEONTime
Hardware Acceleration Impact
The use of SIMD instructions in XXH3_64 significantly improves hash computation speed:
Performance gains vary by workload but are most significant for batch operations that process many keys.
Sources: README.md:160-165 src/storage_engine/data_store.rs:839-842
Error Handling and Collision Management
Hash Collision Detection
The dual-layer verification (64-bit hash + 16-bit tag) provides strong collision resistance:
- Probability of hash collision : ~1 in 2^64 for XXH3_64
- Probability of tag collision given hash collision : ~1 in 2^16
- Combined probability : ~1 in 2^80
Write-Time Collision Handling
If a collision is detected during a write operation (same hash but different tag), the batch write operation fails entirely:
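The sketch below illustrates that all-or-nothing check; the Index type and method names are illustrative stand-ins for KeyIndexer, not the actual API:
```rust
// Sketch of write-time collision handling: the batch is validated against the
// in-memory index first, and any hash-vs-tag mismatch aborts the whole batch
// before anything is appended.

use std::collections::HashMap;
use std::io;

struct Index(HashMap<u64, u64>); // key_hash -> packed (tag << 48 | offset)

impl Index {
    /// `hashed` holds (key_hash, expected_tag) pairs for the incoming batch.
    fn check_batch(&self, hashed: &[(u64, u16)]) -> io::Result<()> {
        for (key_hash, expected_tag) in hashed {
            if let Some(packed) = self.0.get(key_hash) {
                let stored_tag = (packed >> 48) as u16;
                if stored_tag != *expected_tag {
                    // Same 64-bit hash, different key bytes: refuse the batch.
                    return Err(io::Error::new(
                        io::ErrorKind::InvalidData,
                        "hash collision with differing tag; batch write aborted",
                    ));
                }
            }
        }
        Ok(())
    }
}
```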
This prevents inconsistent index states and ensures data integrity.
Sources: src/storage_engine/data_store.rs:245-251
Read-Time Collision Handling
If a tag mismatch is detected during a read, the system logs a warning and returns None:
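A minimal sketch of that check follows; the tag derivation shown (a second seeded hash pass) is an illustrative stand-in, and the real KeyIndexer may derive tags differently:
```rust
// Read-time tag verification sketch. `tag_from_key` is a placeholder
// derivation, not the actual one used by KeyIndexer.

use std::collections::HashMap;
use xxhash_rust::xxh3::{xxh3_64, xxh3_64_with_seed};

fn tag_from_key(key: &[u8]) -> u16 {
    (xxh3_64_with_seed(key, 0xA5A5) >> 48) as u16
}

fn lookup(index: &HashMap<u64, u64>, key: &[u8]) -> Option<u64> {
    let packed = *index.get(&xxh3_64(key))?;
    let (stored_tag, offset) = ((packed >> 48) as u16, packed & ((1u64 << 48) - 1));
    if stored_tag != tag_from_key(key) {
        eprintln!("tag mismatch for key; treating as missing (possible collision)");
        return None; // collision detected
    }
    Some(offset)
}
```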
Sources: src/storage_engine/data_store.rs:513-521
Summary
The key indexing and hashing system in SIMD R Drive provides:
- Fast lookups : O(1) hash-based access to entries
- Hardware acceleration : Automatic SIMD optimization on SSE2, AVX2, and NEON platforms
- Collision resistance : Dual-layer verification with 64-bit hashes and 16-bit tags
- Thread safety : Concurrent reads with exclusive writes via RwLock
- Low memory overhead : 16 bytes per unique key
This design enables efficient storage operations even for datasets with millions of entries, while maintaining data integrity through robust collision detection.
Sources: src/storage_engine/data_store.rs:1-1183 README.md:158-168
Network Layer and RPC
Relevant source files
- Cargo.lock
- experiments/simd-r-drive-muxio-service-definition/Cargo.toml
- experiments/simd-r-drive-ws-client/Cargo.toml
- experiments/simd-r-drive-ws-server/Cargo.toml
Purpose and Scope
The network layer provides remote access to the SIMD R Drive storage engine over WebSocket connections using the Muxio RPC framework. This document covers the core RPC infrastructure, service definitions, and native Rust client implementation.
For WebSocket server deployment and configuration details, see WebSocket Server. For Muxio RPC protocol internals, see Muxio RPC Framework. For native Rust client usage, see Native Rust Client. For Python client integration, see Python Integration.
Architecture Overview
The network layer follows a symmetric client-server architecture built on three core crates that work together to enable remote DataStore access.
graph TB
subgraph "Client Applications"
RustApp["Rust Application"]
PyApp["Python Application\n(see section 4)"]
end
subgraph "simd-r-drive-ws-client"
WSClient["DataStoreWsClient"]
ClientCaller["RPC Caller\nmuxio-rpc-service-caller"]
ClientRuntime["muxio-tokio-rpc-client"]
end
subgraph "Network Transport"
WS["WebSocket\ntokio-tungstenite\nBinary Frames"]
end
subgraph "simd-r-drive-ws-server"
WSServer["Axum Server\nHTTP/WS Upgrade"]
ServerRuntime["muxio-tokio-rpc-server"]
ServerEndpoint["RPC Endpoint\nmuxio-rpc-service-endpoint"]
end
subgraph "simd-r-drive-muxio-service-definition"
ServiceDef["DataStoreService\nRPC Interface"]
Bitcode["bitcode serialization"]
end
subgraph "Core Storage"
DataStore["DataStore"]
end
RustApp --> WSClient
PyApp -.-> WSClient
WSClient --> ClientCaller
ClientCaller --> ClientRuntime
ClientRuntime --> ServiceDef
ClientRuntime --> WS
WS --> WSServer
WSServer --> ServerRuntime
ServerRuntime --> ServiceDef
ServerRuntime --> ServerEndpoint
ServerEndpoint --> DataStore
ServiceDef --> Bitcode
Component Architecture
Sources: experiments/simd-r-drive-ws-server/Cargo.toml:1-23 experiments/simd-r-drive-ws-client/Cargo.toml:1-22 experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-17
| Crate | Role | Key Dependencies |
|---|---|---|
simd-r-drive-muxio-service-definition | Shared RPC contract defining service interface | bitcode, muxio-rpc-service |
simd-r-drive-ws-server | WebSocket server hosting DataStore | muxio-tokio-rpc-server, axum, tokio |
simd-r-drive-ws-client | Native Rust client for remote access | muxio-tokio-rpc-client, muxio-rpc-service-caller |
Sources: experiments/simd-r-drive-ws-server/Cargo.toml:13-22 experiments/simd-r-drive-ws-client/Cargo.toml:14-21 experiments/simd-r-drive-muxio-service-definition/Cargo.toml:13-16
Service Definition and Contract
The simd-r-drive-muxio-service-definition crate defines the RPC interface as a shared contract between clients and servers. This crate must be used by both sides to ensure protocol compatibility.
DataStoreService Interface
The service definition declares available RPC methods that map to DataStore operations. The Muxio framework generates type-safe client and server stubs from this definition.
Typical Service Methods:
- read(key) - Retrieve payload by key
- write(key, payload) - Store payload
- delete(key) - Remove entry
- exists(key) - Check key existence
- batch_read(keys) - Bulk retrieval
- batch_write(entries) - Bulk insertion
- stream_* - Streaming operations
Sources: experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-17
Bitcode Serialization
All RPC messages use bitcode for binary serialization, providing compact representation and fast encoding/decoding.
Advantages:
- Zero-copy deserialization where possible
- Smaller message sizes than JSON or bincode
- Strong type safety via Rust's type system
- Efficient handling of large payloads
Sources: Cargo.lock:392-413 experiments/simd-r-drive-muxio-service-definition/Cargo.toml14
Communication Protocol
Request-Response Flow
Sources: experiments/simd-r-drive-ws-client/Cargo.toml:14-21 experiments/simd-r-drive-ws-server/Cargo.toml:13-22
WebSocket Transport Layer
The network layer uses tokio-tungstenite for WebSocket communication over a Tokio async runtime.
Connection Properties:
- Binary WebSocket frames (not text)
- Persistent bidirectional connection
- Automatic ping/pong for keepalive
- Multiplexed requests via Muxio protocol
Protocol Stack:
┌─────────────────────────────────┐
│ DataStore Operations │
├─────────────────────────────────┤
│ Muxio RPC (Service Methods) │
├─────────────────────────────────┤
│ Bitcode Serialization │
├─────────────────────────────────┤
│ WebSocket Binary Frames │
├─────────────────────────────────┤
│ HTTP/1.1 Upgrade │
├─────────────────────────────────┤
│ TCP (tokio runtime) │
└─────────────────────────────────┘
Sources: Cargo.lock:305-339 (axum), Cargo.lock:1302-1317 (muxio-tokio-rpc-client), Cargo.lock:1320-1335 (muxio-tokio-rpc-server)
Muxio RPC Framework
The Muxio framework provides the core RPC abstraction layer, handling request routing, response matching, and connection multiplexing.
graph LR
subgraph "Client Side"
SC["muxio-rpc-service-caller\nMethod Invocation"]
TC["muxio-tokio-rpc-client\nTransport Handler"]
end
subgraph "Shared"
MRS["muxio-rpc-service\nCore Abstractions"]
SD["Service Definition"]
end
subgraph "Server Side"
TS["muxio-tokio-rpc-server\nTransport Handler"]
SE["muxio-rpc-service-endpoint\nMethod Routing"]
end
SC --> MRS
SC --> TC
TC --> SD
TS --> MRS
TS --> SE
SE --> SD
MRS --> SD
Framework Components
Sources: Cargo.lock:1250-1271 (muxio), Cargo.lock:1261-1271 (muxio-rpc-service), Cargo.lock:1274-1284 (muxio-rpc-service-caller), Cargo.lock:1287-1299 (muxio-rpc-service-endpoint)
| Component | Purpose | Key Features |
|---|---|---|
muxio | Core multiplexing protocol | Request/response correlation, concurrent requests |
muxio-rpc-service | RPC service abstractions | Service trait definitions, async method dispatch |
muxio-rpc-service-caller | Client-side invocation | Type-safe method calls, future-based API |
muxio-rpc-service-endpoint | Server-side routing | Request demultiplexing, method dispatch |
muxio-tokio-rpc-client | Tokio client runtime | WebSocket connection management |
muxio-tokio-rpc-server | Tokio server runtime | Axum integration, connection handling |
Sources: Cargo.lock:1250-1335
Request Multiplexing
The Muxio protocol assigns unique request IDs to enable multiple concurrent RPC calls over a single WebSocket connection.
Multiplexing Benefits:
- Single persistent connection reduces overhead
- Concurrent requests improve throughput
- Out-of-order responses supported
- Connection pooling not required
Sources: Cargo.lock:1250-1258
Client-Server Integration
Server-Side Implementation
The server crate (simd-r-drive-ws-server) integrates Axum for HTTP/WebSocket handling with the Muxio RPC server runtime.
Server Architecture:
- Axum HTTP server accepts connections
- WebSocket upgrade on the /ws endpoint
- muxio-tokio-rpc-server handles the RPC protocol
- muxio-rpc-service-endpoint routes requests to DataStore methods
- Responses serialized with bitcode and returned
Sources: experiments/simd-r-drive-ws-server/Cargo.toml:13-22
Client-Side Implementation
The client crate (simd-r-drive-ws-client) provides DataStoreWsClient for remote DataStore access.
Client Architecture:
- Connect to the WebSocket endpoint
- muxio-tokio-rpc-client manages the connection
- muxio-rpc-service-caller provides the method invocation API
- Requests serialized with bitcode
- Async methods return Future<Result<T>>
Sources: experiments/simd-r-drive-ws-client/Cargo.toml:13-21
Dependency Graph
Sources: experiments/simd-r-drive-ws-server/Cargo.toml:13-22 experiments/simd-r-drive-ws-client/Cargo.toml:13-21 experiments/simd-r-drive-muxio-service-definition/Cargo.toml:13-16
Performance Characteristics
Serialization Overhead
Bitcode provides efficient binary serialization with minimal overhead:
- Small message payloads for keys and metadata
- Zero-copy where possible for large payloads
- Faster than JSON or traditional formats
Sources: Cargo.lock:392-413
Network Considerations
The WebSocket-based architecture has specific performance characteristics:
| Aspect | Impact |
|---|---|
| Connection overhead | One-time WebSocket handshake |
| Request latency | Network RTT + serialization |
| Throughput | Limited by network bandwidth |
| Concurrent requests | Multiplexed over single connection |
| Keepalive | Automatic ping/pong frames |
Trade-offs:
- Remote access enables distributed deployments
- Network latency higher than direct DataStore access
- Serialization/deserialization adds CPU overhead
- Suitable for applications with network-tolerant latency requirements
Sources: Cargo.lock:305-339 (axum), Cargo.lock:1302-1335 (muxio tokio rpc)
Async Runtime Integration
All network components run on the Tokio async runtime, providing non-blocking I/O and efficient concurrency.
Tokio Integration:
- WebSocket connections handled asynchronously
- Multiple clients served concurrently
- Non-blocking DataStore operations
- Future-based API throughout
Sources: experiments/simd-r-drive-ws-server/Cargo.toml18 experiments/simd-r-drive-ws-client/Cargo.toml19
WebSocket Server
Relevant source files
- Cargo.lock
- experiments/simd-r-drive-muxio-service-definition/Cargo.toml
- experiments/simd-r-drive-ws-client/Cargo.toml
- experiments/simd-r-drive-ws-server/Cargo.toml
Purpose and Scope
This document covers the simd-r-drive-ws-server WebSocket server implementation, which provides remote access to the SIMD R Drive storage engine over WebSocket connections using the Muxio RPC framework. The server accepts connections from both native Rust clients (see Native Rust Client) and Python applications (see Python WebSocket Client API), routing RPC requests to the underlying DataStore.
For information about the RPC protocol and serialization format, see Muxio RPC Framework. For details on the core storage operations being exposed, see DataStore API.
Architecture Overview
The WebSocket server is built on the Axum web framework and Tokio async runtime, providing a high-performance, concurrent RPC endpoint for storage operations. The server acts as a thin network wrapper around the DataStore, translating WebSocket messages into storage API calls.
graph TB
CLI["Server CLI\n(clap parser)"]
Axum["Axum Web Framework\nHTTP/WebSocket handling"]
MuxioServer["muxio-tokio-rpc-server\nRPC server runtime"]
Endpoint["muxio-rpc-service-endpoint\nRequest routing"]
ServiceDef["simd-r-drive-muxio-service-definition\nRPC contract"]
DataStore["simd-r-drive::DataStore\nStorage engine"]
Tokio["Tokio Runtime\nAsync executor"]
Tungstenite["tokio-tungstenite\nWebSocket protocol"]
CLI --> Axum
Axum --> MuxioServer
MuxioServer --> Endpoint
MuxioServer --> Tungstenite
Endpoint --> ServiceDef
Endpoint --> DataStore
Axum -.-> Tokio
MuxioServer -.-> Tokio
Tungstenite -.-> Tokio
Component Stack
Sources: experiments/simd-r-drive-ws-server/Cargo.toml:1-23 Cargo.lock:305-339 Cargo.lock:1320-1335
Crate Structure
The server is located in the experiments workspace and has a minimal dependency footprint focused on networking and RPC:
| Dependency | Purpose | Version Source |
|---|---|---|
simd-r-drive | Core storage engine | workspace |
simd-r-drive-muxio-service-definition | RPC service contract | workspace |
muxio-tokio-rpc-server | RPC server runtime | workspace (0.9.0-alpha) |
muxio-rpc-service | RPC abstractions | workspace (0.9.0-alpha) |
axum | Web framework | 0.8.4 |
tokio | Async runtime | workspace |
clap | CLI argument parsing | workspace |
tracing / tracing-subscriber | Structured logging | workspace |
Sources: experiments/simd-r-drive-ws-server/Cargo.toml:13-22 Cargo.lock:305-339 Cargo.lock:1320-1335
Server Initialization Flow
The server follows a standard initialization pattern: parse CLI arguments, configure logging, initialize the DataStore, and start the WebSocket server.
Sources: experiments/simd-r-drive-ws-server/Cargo.toml:13-22
sequenceDiagram
participant Main as "main()"
participant CLI as "clap::Parser"
participant Tracing as "tracing_subscriber"
participant DS as "DataStore::open()"
participant RPC as "muxio_tokio_rpc_server"
participant Axum as "axum::serve()"
Main->>CLI: Parse CLI arguments
CLI-->>Main: ServerArgs{port, path, log_level}
Main->>Tracing: init() with env_filter
Note over Tracing: Configure RUST_LOG levels
Main->>DS: open(data_file_path)
DS-->>Main: DataStore instance
Main->>RPC: Create endpoint with DataStore
RPC-->>Main: ServiceEndpoint
Main->>Axum: bind() and serve()
Note over Axum: Listen on 0.0.0.0:port
Axum->>Axum: Accept WebSocket connections
Request Processing Pipeline
When a client connects and sends RPC requests, the server processes them through multiple layers before reaching the storage engine.
Sources: Cargo.lock:305-339 Cargo.lock:1287-1299 Cargo.lock:1320-1335
graph LR
Client["WebSocket Client"]
WS["tokio-tungstenite\nWebSocket frame"]
Axum["Axum Router\nRoute: /ws"]
Server["muxio-tokio-rpc-server\nMessage decode"]
Router["muxio-rpc-service-endpoint\nMethod dispatch"]
Handler["Service Handler\n(read/write/delete)"]
DS["DataStore\nStorage operation"]
Client -->|Binary frame| WS
WS --> Axum
Axum --> Server
Server -->|Bitcode decode| Router
Router --> Handler
Handler --> DS
DS -.->|Result| Handler
Handler -.-> Router
Router -.->|Bitcode encode| Server
Server -.-> WS
WS -.->|Binary frame| Client
RPC Service Definition Integration
The server uses the simd-r-drive-muxio-service-definition crate to define the RPC interface contract. This crate acts as a shared dependency between the server and all clients, ensuring type-safe communication.
graph TB
subgraph "simd-r-drive-muxio-service-definition"
Service["Service Trait\nRPC method definitions"]
Types["Request/Response Types\nBitcode serializable"]
Bitcode["bitcode crate\nBinary serialization"]
end
subgraph "Server Side"
Endpoint["muxio-rpc-service-endpoint\nImplements Service Trait"]
Handler["Request Handlers\nCall DataStore methods"]
end
subgraph "Client Side"
Caller["muxio-rpc-service-caller\nInvokes Service methods"]
ClientImpl["ws-client implementation"]
end
Service --> Endpoint
Service --> Caller
Types --> Bitcode
Endpoint --> Handler
Caller --> ClientImpl
Service Definition Structure
Sources: experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-17 experiments/simd-r-drive-ws-server/Cargo.toml:14-17
The service definition uses the bitcode crate for efficient binary serialization, providing compact message sizes and high throughput compared to JSON-based protocols.
Sources: experiments/simd-r-drive-muxio-service-definition/Cargo.toml:14-15 Cargo.lock:392-402
graph TD
Server["simd-r-drive-ws-server"]
Server --> Axum["axum 0.8.4\nWeb framework"]
Server --> MuxioServer["muxio-tokio-rpc-server\nRPC runtime"]
Server --> ServiceDef["simd-r-drive-muxio-service-definition\nRPC contract"]
Server --> DataStore["simd-r-drive\nStorage engine"]
Server --> Tokio["tokio\nAsync runtime"]
Server --> Clap["clap\nCLI parsing"]
Server --> Tracing["tracing + tracing-subscriber\nLogging"]
Axum --> Hyper["hyper\nHTTP implementation"]
Axum --> TokioTung["tokio-tungstenite\nWebSocket protocol"]
MuxioServer --> Muxio["muxio\nCore RPC abstractions"]
MuxioServer --> Endpoint["muxio-rpc-service-endpoint\nRouting"]
ServiceDef --> Bitcode["bitcode\nSerialization"]
ServiceDef --> MuxioService["muxio-rpc-service\nService traits"]
Hyper --> Tokio
TokioTung --> Tokio
Endpoint --> Tokio
Dependency Graph
The server's dependency structure shows clear separation between web framework, RPC layer, and storage:
Sources: experiments/simd-r-drive-ws-server/Cargo.toml:13-22 Cargo.lock:305-339 Cargo.lock:1320-1335
Configuration and CLI
The server is configured via command-line arguments using the clap crate's derive API. The server binary accepts the following parameters:
| Argument | Type | Default | Description |
|---|---|---|---|
--port | u16 | 8080 | Port number to bind |
--path | String | Required | Path to DataStore file |
--log-level | String | "info" | Tracing log level (error/warn/info/debug/trace) |
The server supports the RUST_LOG environment variable through tracing-subscriber with env-filter for fine-grained logging control.
Sources: experiments/simd-r-drive-ws-server/Cargo.toml:19-22 experiments/simd-r-drive-ws-server/Cargo.toml21
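A hedged sketch of a clap derive struct matching this table is shown below; the actual field names, defaults, and help text in the server binary may differ:
```rust
// Illustrative clap (v4, derive API) argument struct mirroring the CLI table
// above. This is a sketch, not the server's actual ServerArgs definition.

use clap::Parser;

#[derive(Parser, Debug)]
#[command(name = "simd-r-drive-ws-server")]
struct ServerArgs {
    /// Port number to bind.
    #[arg(long, default_value_t = 8080)]
    port: u16,

    /// Path to the DataStore file.
    #[arg(long)]
    path: String,

    /// Tracing log level (error/warn/info/debug/trace).
    #[arg(long, default_value = "info")]
    log_level: String,
}

fn main() {
    let args = ServerArgs::parse();
    println!("binding 0.0.0.0:{} for {}", args.port, args.path);
}
```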
graph TB
subgraph "Tokio Runtime"
Scheduler["Work-Stealing Scheduler\nThread pool"]
end
subgraph "Connection Handlers"
Conn1["WebSocket Task 1\nRPC message loop"]
Conn2["WebSocket Task 2\nRPC message loop"]
ConnN["WebSocket Task N\nRPC message loop"]
end
subgraph "Shared State"
DS["Arc<DataStore>\nThread-safe storage"]
Endpoint["Arc<ServiceEndpoint>\nRequest router"]
end
Scheduler --> Conn1
Scheduler --> Conn2
Scheduler --> ConnN
Conn1 --> Endpoint
Conn2 --> Endpoint
ConnN --> Endpoint
Endpoint --> DS
style DS fill:#f9f9f9
style Endpoint fill:#f9f9f9
Concurrency Model
The server leverages Tokio's work-stealing scheduler to handle multiple concurrent WebSocket connections efficiently:
Each WebSocket connection runs as an independent async task, with the DataStore wrapped in Arc for shared access. The DataStore's internal locking (see Concurrency and Thread Safety) ensures safe concurrent reads and serialized writes.
Sources: experiments/simd-r-drive-ws-server/Cargo.toml18 Cargo.lock:305-339
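A minimal sketch of that sharing pattern follows; the DataStore::open signature and error type are assumptions, and the per-connection work is elided:
```rust
// Sketch of sharing one DataStore across connection tasks via Arc, as
// described above. The open() signature and error type are assumed.

use std::path::Path;
use std::sync::Arc;
use simd_r_drive::DataStore;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let store = Arc::new(DataStore::open(Path::new("data.bin"))?);

    // Each accepted WebSocket connection would get its own task holding a clone.
    for _conn in 0..4 {
        let store = Arc::clone(&store);
        tokio::spawn(async move {
            // ... route decoded RPC requests to `store` here ...
            let _ = &store;
        });
    }
    Ok(())
}
```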
Transport Protocol
The server uses binary WebSocket frames over TCP, with the following characteristics:
| Property | Value | Notes |
|---|---|---|
| Protocol | WebSocket (RFC 6455) | Via tokio-tungstenite |
| Message Format | Binary frames | Not text frames |
| Serialization | Bitcode | Compact binary format |
| Framing | Message-per-frame | Each RPC call is one frame |
| Multiplexing | Muxio protocol | Request/response correlation |
The binary format and bitcode serialization provide significantly better performance than text-based protocols like JSON over WebSocket.
Sources: Cargo.lock:305-339 experiments/simd-r-drive-muxio-service-definition/Cargo.toml14
Error Handling
The server implements error handling at multiple layers:
- WebSocket Layer : Connection errors, protocol violations handled by tokio-tungstenite
- RPC Layer : Serialization errors, invalid method calls handled by muxio-tokio-rpc-server
- Storage Layer : I/O errors, validation failures propagated from DataStore
- Tracing : All errors logged with structured context via tracing spans
Errors are serialized back to clients as RPC error responses, preserving error context across the network boundary.
Sources: experiments/simd-r-drive-ws-server/Cargo.toml:19-20 Cargo.lock:1320-1335
Performance Characteristics
The server is designed for high-throughput operation with minimal overhead:
- Zero-Copy Reads : DataStore's EntryHandle (see Memory Management and Zero-Copy Access) allows serving read responses without copying payload data
- Async I/O : Tokio's epoll/kqueue-based I/O enables efficient handling of thousands of concurrent connections
- Binary Protocol : Bitcode serialization reduces CPU overhead and bandwidth usage compared to text formats
- Write Batching : Clients can use batch operations (see Write and Read Modes) to amortize RPC overhead
Sources: experiments/simd-r-drive-ws-server/Cargo.toml14 experiments/simd-r-drive-muxio-service-definition/Cargo.toml14
Deployment Considerations
When deploying the WebSocket server, consider:
- Single DataStore Instance : The server opens one DataStore file. Multiple servers require separate files or external coordination
- Port Binding : Default bind address is 0.0.0.0 (all interfaces). Use firewall rules or a reverse proxy for access control
- No TLS : The server does not implement TLS. Use a reverse proxy (nginx, HAProxy) for encrypted connections
- Resource Limits : Memory usage scales with DataStore size (memory-mapped file) plus per-connection buffers
- Graceful Shutdown : Tokio runtime handles SIGTERM/SIGINT for clean connection closure
Sources: experiments/simd-r-drive-ws-server/Cargo.toml:1-23
Experimental Status
As indicated by its location in the experiments/ workspace directory, this server is currently experimental and subject to breaking changes. The API surface and configuration options may evolve as the Muxio RPC framework stabilizes.
Sources: experiments/simd-r-drive-ws-server/Cargo.toml2 experiments/simd-r-drive-ws-server/Cargo.toml11
Muxio RPC Framework
Relevant source files
- Cargo.lock
- experiments/simd-r-drive-muxio-service-definition/Cargo.toml
- experiments/simd-r-drive-ws-client/Cargo.toml
- experiments/simd-r-drive-ws-server/Cargo.toml
Purpose and Scope
This document describes the Muxio RPC (Remote Procedure Call) framework as implemented in SIMD R Drive for remote storage access over WebSocket connections. The framework provides a type-safe, multiplexed communication protocol using bitcode serialization for efficient binary data transfer.
For information about the WebSocket server implementation, see WebSocket Server. For the native Rust client implementation, see Native Rust Client. For Python client integration, see Python WebSocket Client API.
Sources: experiments/simd-r-drive-ws-server/Cargo.toml:1-23 experiments/simd-r-drive-ws-client/Cargo.toml:1-22 experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-17
Architecture Overview
The Muxio RPC framework consists of multiple layers that work together to provide remote procedure calls over WebSocket connections:
Muxio RPC Framework Layer Architecture
graph TB
subgraph "Client Application Layer"
App["Application Code"]
end
subgraph "Client RPC Stack"
Caller["muxio-rpc-service-caller\nMethod Invocation"]
ClientRuntime["muxio-tokio-rpc-client\nWebSocket Client Runtime"]
end
subgraph "Shared Contract"
ServiceDef["simd-r-drive-muxio-service-definition\nService Interface Contract\nMethod Signatures"]
Bitcode["bitcode\nBinary Serialization"]
end
subgraph "Server RPC Stack"
ServerRuntime["muxio-tokio-rpc-server\nWebSocket Server Runtime"]
Endpoint["muxio-rpc-service-endpoint\nRequest Router"]
end
subgraph "Server Application Layer"
Impl["DataStore Implementation"]
end
subgraph "Core Framework"
Core["muxio-rpc-service\nBase RPC Traits & Types"]
end
App --> Caller
Caller --> ClientRuntime
ClientRuntime --> ServiceDef
ClientRuntime --> Bitcode
ClientRuntime --> Core
ServiceDef --> Bitcode
ServiceDef --> Core
ServerRuntime --> ServiceDef
ServerRuntime --> Bitcode
ServerRuntime --> Core
ServerRuntime --> Endpoint
Endpoint --> Impl
style ServiceDef fill:#f9f9f9,stroke:#333,stroke-width:2px
The framework is organized into distinct layers:
| Layer | Crates | Responsibility |
|---|---|---|
| Core Framework | muxio-rpc-service | Base traits, types, and RPC protocol definitions |
| Service Definition | simd-r-drive-muxio-service-definition | Shared interface contract between client and server |
| Serialization | bitcode | Efficient binary encoding/decoding of messages |
| Client Runtime | muxio-tokio-rpc-client, muxio-rpc-service-caller | WebSocket client, method invocation, request management |
| Server Runtime | muxio-tokio-rpc-server, muxio-rpc-service-endpoint | WebSocket server, request routing, response handling |
Sources: Cargo.lock:1250-1336 experiments/simd-r-drive-ws-server/Cargo.toml:14-17 experiments/simd-r-drive-ws-client/Cargo.toml:14-21
Core Framework Components
muxio-rpc-service
The muxio-rpc-service crate provides the foundational abstractions for the RPC system:
Core RPC Framework Types
graph LR
subgraph "muxio-rpc-service Core Types"
RpcService["RpcService Trait\nService Interface"]
Request["Request Type\nmethod_id + payload"]
Response["Response Type\nresult or error"]
Error["RpcError\nError Handling"]
end
RpcService -->|defines| Request
RpcService -->|defines| Response
Response -->|contains| Error
Key components in muxio-rpc-service:
| Component | Type | Purpose |
|---|---|---|
| RpcService | Trait | Defines the service interface with method dispatch |
| Request | Struct | Contains method ID and serialized payload |
| Response | Enum | Success with payload or error variant |
| method_id | Hash | XXH3 hash of method signature for routing |
Sources: Cargo.lock:1261-1272
Service Definition Layer
simd-r-drive-muxio-service-definition
The service definition crate serves as the shared contract between clients and servers. It defines the exact interface of the DataStore RPC service:
DataStore RPC Service Methods
graph TB
subgraph "Service Definition Structure"
ServiceDef["DataStoreService\nService Definition"]
Methods["Method Definitions"]
Types["Request/Response Types"]
Write["write\n(key, payload) → offset"]
Read["read\n(key) → Option<bytes>"]
Delete["delete\n(key) → bool"]
Exists["exists\n(key) → bool"]
BatchWrite["batch_write\n(entries) → Vec<offset>"]
BatchRead["batch_read\n(keys) → Vec<Option<bytes>>"]
end
ServiceDef --> Methods
ServiceDef --> Types
Methods --> Write
Methods --> Read
Methods --> Delete
Methods --> Exists
Methods --> BatchWrite
Methods --> BatchRead
Types -->|uses| Bitcode["bitcode Serialization"]
The service definition establishes a type-safe contract for all operations:
| Method | Request Type | Response Type | Description |
|---|---|---|---|
write | (Vec<u8>, Vec<u8>) | Result<u64, Error> | Write key-value pair, return offset |
read | Vec<u8> | Result<Option<Vec<u8>>, Error> | Read value by key |
delete | Vec<u8> | Result<bool, Error> | Delete entry, return success |
exists | Vec<u8> | Result<bool, Error> | Check key existence |
batch_write | Vec<(Vec<u8>, Vec<u8>)> | Result<Vec<u64>, Error> | Write multiple entries |
batch_read | Vec<Vec<u8>> | Result<Vec<Option<Vec<u8>>>, Error> | Read multiple entries |
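An illustrative request/response pair for the read method, using bitcode's derive-based API (the derive feature is assumed), is shown below; the real types live in the service definition crate and may be shaped differently:
```rust
// Illustrative DataStore RPC types round-tripped through bitcode. These type
// names are assumptions, not the actual service definition.

use bitcode::{Decode, Encode};

#[derive(Encode, Decode, Debug, PartialEq)]
struct ReadRequest {
    key: Vec<u8>,
}

#[derive(Encode, Decode, Debug, PartialEq)]
struct ReadResponse {
    payload: Option<Vec<u8>>,
}

fn main() {
    let req = ReadRequest { key: b"user:42".to_vec() };
    let bytes = bitcode::encode(&req);                    // compact binary body
    let round_trip: ReadRequest = bitcode::decode(&bytes).unwrap();
    assert_eq!(req, round_trip);
}
```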
flowchart LR
Signature["Method Signature\n'write(key,payload)'"]
Hash["XXH3 Hash"]
MethodID["method_id: u64"]
Signature -->|hash| Hash
Hash --> MethodID
MethodID -->|used for routing| Router["Request Router"]
Method ID Generation
Each method is identified by an XXH3 hash of its signature, enabling fast routing without string comparisons:
Method Identification Flow
Sources: experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-17 Cargo.lock:1261-1272
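A minimal sketch of hash-based method identification follows; whether the hashed string is the bare method name or a fuller signature is an assumption here:
```rust
// method_id sketch: XXH3-64 of a method signature string, so the router can
// dispatch on a u64 instead of comparing strings. The exact string that is
// hashed is an assumption.

use xxhash_rust::xxh3::xxh3_64;

fn method_id(signature: &str) -> u64 {
    xxh3_64(signature.as_bytes())
}

fn main() {
    let write_id = method_id("write(key,payload)");
    let read_id = method_id("read(key)");
    assert_ne!(write_id, read_id);
}
```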
graph LR
subgraph "Bitcode Serialization Pipeline"
RustType["Rust Type\n(Request/Response)"]
Encode["bitcode::encode"]
Binary["Binary Payload\nCompact Format"]
Decode["bitcode::decode"]
RustType2["Rust Type\n(Reconstructed)"]
end
RustType -->|serialize| Encode
Encode --> Binary
Binary -->|deserialize| Decode
Decode --> RustType2
Bitcode Serialization
The framework uses the bitcode crate for efficient binary serialization with the following characteristics:
Serialization Features
Bitcode Encoding/Decoding Pipeline
| Feature | Description | Benefit |
|---|---|---|
| Zero-copy deserialization | References data without copying when possible | Minimal overhead for large payloads |
| Compact encoding | Space-efficient binary format | Reduced network bandwidth |
| Type safety | Compile-time type checking | Prevents serialization errors |
| Performance | Faster than JSON/MessagePack | Lower CPU overhead |
Integration with RPC
The serialization is integrated at multiple points:
- Request serialization : Client encodes method arguments to binary
- Wire transfer : Binary payload sent over WebSocket
- Response serialization : Server encodes return values to binary
- Deserialization : Both sides decode binary to Rust types
Sources: Cargo.lock:392-414 experiments/simd-r-drive-muxio-service-definition/Cargo.toml14
Client-Side Components
muxio-rpc-service-caller
The caller component provides the client-side method invocation interface:
Client Method Invocation Flow
Key responsibilities:
- Method call marshalling
- Request ID generation
- Response awaiting and matching
- Error propagation
Sources: Cargo.lock:1274-1285 experiments/simd-r-drive-ws-client/Cargo.toml18
muxio-tokio-rpc-client
The client runtime handles WebSocket communication and multiplexing:
Client Runtime Request/Response Multiplexing
The client runtime provides:
| Feature | Implementation | Purpose |
|---|---|---|
| Connection management | Single WebSocket connection | Persistent connection |
| Request multiplexing | Multiple concurrent requests | High throughput |
| Response routing | Map of pending requests | Match responses to callers |
| Async/await support | Tokio-based futures | Non-blocking I/O |
Sources: Cargo.lock:1302-1318 experiments/simd-r-drive-ws-client/Cargo.toml16
flowchart TB
subgraph "Server Request Processing"
Receive["Receive Request\nWebSocket"]
Deserialize["Deserialize\nbitcode::decode"]
ExtractID["Extract method_id"]
Route["Route to Handler"]
Execute["Execute Method"]
Serialize["Serialize Response\nbitcode::encode"]
Send["Send Response\nWebSocket"]
end
Receive --> Deserialize
Deserialize --> ExtractID
ExtractID --> Route
Route --> Execute
Execute --> Serialize
Serialize --> Send
Server-Side Components
muxio-rpc-service-endpoint
The endpoint component routes incoming RPC requests to service implementations:
Server Request Processing Pipeline
Endpoint responsibilities:
| Responsibility | Description |
|---|---|
| Request routing | Maps method_id to handler function |
| Execution | Invokes the appropriate service method |
| Error handling | Catches and serializes errors |
| Response formatting | Wraps results in response envelope |
Sources: Cargo.lock:1287-1300 experiments/simd-r-drive-ws-server/Cargo.toml17
muxio-tokio-rpc-server
The server runtime manages WebSocket connections and request dispatching:
Server Runtime Connection Management
Key features:
| Feature | Implementation | Purpose |
|---|---|---|
| Multi-client support | One task pair per connection | Concurrent clients |
| Backpressure handling | Bounded channels | Prevent memory exhaustion |
| Graceful shutdown | Signal-based termination | Clean connection closure |
| Error isolation | Per-connection error handling | Fault tolerance |
Sources: Cargo.lock:1320-1336 experiments/simd-r-drive-ws-server/Cargo.toml16
Request/Response Flow
Complete RPC Call Sequence
End-to-End RPC Call Flow
Message Format
The wire protocol uses a structured binary format:
| Field | Type | Size | Description |
|---|---|---|---|
request_id | u64 | 8 bytes | Unique request identifier |
method_id | u64 | 8 bytes | XXH3 hash of method signature |
payload_len | u32 | 4 bytes | Length of payload |
payload | [u8] | Variable | Bitcode-encoded arguments/result |
Sources: All component sources above
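An illustrative encoder for the fields above is sketched below; the field order follows the table, while the little-endian byte order is an assumption rather than a statement about the actual muxio framing:
```rust
// Frame layout sketch: request_id (u64) | method_id (u64) | payload_len (u32)
// | payload. Byte order here is illustrative only.

fn encode_frame(request_id: u64, method_id: u64, payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(8 + 8 + 4 + payload.len());
    frame.extend_from_slice(&request_id.to_le_bytes());
    frame.extend_from_slice(&method_id.to_le_bytes());
    frame.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    frame.extend_from_slice(payload);
    frame
}

fn main() {
    let frame = encode_frame(1, 0xABCD, b"hello");
    assert_eq!(frame.len(), 8 + 8 + 4 + 5);
}
```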
Error Handling
The framework provides comprehensive error handling across the RPC boundary:
RPC Error Classification and Propagation
Error Categories
| Category | Origin | Handling |
|---|---|---|
| Serialization errors | Bitcode encoding/decoding failure | Logged and returned as RpcError |
| Network errors | WebSocket connection issues | Automatic reconnect or error propagation |
| Application errors | DataStore operation failures | Serialized and returned to client |
| Timeout errors | Request took too long | Client-side timeout with error result |
Error Recovery
The framework implements several recovery strategies:
- Connection loss : Client automatically attempts reconnection
- Request timeout : Client cancels pending request after configured duration
- Serialization failure : Error logged and generic error returned
- Invalid method ID : Server returns "method not found" error
Sources: Cargo.lock:1261-1336
Performance Characteristics
The Muxio RPC framework is optimized for high-performance remote storage access:
| Metric | Characteristic | Impact |
|---|---|---|
| Serialization overhead | ~50-100 ns for typical payloads | Minimal CPU impact |
| Request multiplexing | Thousands of concurrent requests | High throughput |
| Binary protocol | Compact wire format | Reduced bandwidth usage |
| Zero-copy deserialization | Direct memory references | Lower latency for large payloads |
The use of bitcode serialization and WebSocket binary frames minimizes overhead compared to text-based protocols like JSON over HTTP. The multiplexed architecture allows clients to issue multiple concurrent requests without blocking, essential for high-performance batch operations.
Sources: Cargo.lock:392-414 Cargo.lock:1250-1336
Native Rust Client
Relevant source files
- Cargo.lock
- experiments/simd-r-drive-muxio-service-definition/Cargo.toml
- experiments/simd-r-drive-ws-client/Cargo.toml
- experiments/simd-r-drive-ws-server/Cargo.toml
Purpose and Scope
The simd-r-drive-ws-client crate provides a native Rust client library for remote access to the SIMD R Drive storage engine via WebSocket connections. This client enables Rust applications to interact with a remote DataStore instance through the Muxio RPC framework with bitcode serialization.
This document covers the native Rust client implementation. For the WebSocket server that this client connects to, see WebSocket Server. For the Muxio RPC protocol details, see Muxio RPC Framework. For the Python bindings that wrap this client, see Python WebSocket Client API.
Sources: experiments/simd-r-drive-ws-client/Cargo.toml:1-22
Architecture Overview
The native Rust client is structured as a thin wrapper around the Muxio RPC client infrastructure, providing type-safe access to remote DataStore operations.
Key Components:
graph TB
subgraph "Application Layer"
UserApp["User Application\nRust Code"]
end
subgraph "simd-r-drive-ws-client Crate"
ClientAPI["Client API\nDataStoreReader/Writer Traits"]
WsClient["WebSocket Client\nConnection Management"]
end
subgraph "RPC Infrastructure"
ServiceCaller["muxio-rpc-service-caller\nMethod Invocation"]
TokioRpcClient["muxio-tokio-rpc-client\nTransport Layer"]
end
subgraph "Shared Contract"
ServiceDef["simd-r-drive-muxio-service-definition\nRPC Interface"]
Bitcode["bitcode\nSerialization"]
end
subgraph "Network"
WsConnection["WebSocket Connection\ntokio-tungstenite"]
end
UserApp --> ClientAPI
ClientAPI --> WsClient
WsClient --> ServiceCaller
ServiceCaller --> TokioRpcClient
ServiceCaller --> ServiceDef
TokioRpcClient --> Bitcode
TokioRpcClient --> WsConnection
style ClientAPI fill:#f9f9f9,stroke:#333,stroke-width:2px
style ServiceDef fill:#f9f9f9,stroke:#333,stroke-width:2px
| Component | Crate | Purpose |
|---|---|---|
| Client API | simd-r-drive-ws-client | Public interface implementing DataStore traits |
| Service Caller | muxio-rpc-service-caller | RPC method invocation and request routing |
| RPC Client | muxio-tokio-rpc-client | WebSocket transport and message handling |
| Service Definition | simd-r-drive-muxio-service-definition | Shared RPC contract and type definitions |
| Async Runtime | tokio | Asynchronous I/O and task execution |
Sources: experiments/simd-r-drive-ws-client/Cargo.toml:13-21 Cargo.lock:1302-1318
Client API Structure
The client implements the same DataStoreReader and DataStoreWriter traits as the local DataStore, enabling transparent remote access with minimal API differences.
Core Traits:
graph LR
subgraph "Trait Implementations"
Reader["DataStoreReader\nread()\nexists()\nbatch_read()"]
Writer["DataStoreWriter\nwrite()\ndelete()\nbatch_write()"]
end
subgraph "Client Implementation"
WsClient["WebSocket Client\nAsync Methods"]
ConnState["Connection State\nURL, Options"]
end
subgraph "RPC Layer"
Serializer["Request Serialization\nbitcode"]
Caller["Service Caller\nCall Routing"]
Deserializer["Response Deserialization\nbitcode"]
end
Reader --> WsClient
Writer --> WsClient
WsClient --> ConnState
WsClient --> Serializer
Serializer --> Caller
Caller --> Deserializer
style Reader fill:#f9f9f9,stroke:#333,stroke-width:2px
style Writer fill:#f9f9f9,stroke:#333,stroke-width:2px
- DataStoreReader : Read-only operations (read, exists, batch_read, iteration)
- DataStoreWriter : Write operations (write, delete, batch_write)
- async-trait : All methods are asynchronous, requiring a Tokio runtime
Sources: experiments/simd-r-drive-ws-client/Cargo.toml:14-21 experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-16
Connection Management
The client manages persistent WebSocket connections to the remote server with automatic reconnection and error handling.
Connection Lifecycle:
sequenceDiagram
participant App as "Application"
participant Client as "WebSocket Client"
participant Transport as "muxio-tokio-rpc-client"
participant Server as "Remote Server"
Note over App,Server: Connection Establishment
App->>Client: connect(url)
Client->>Transport: create WebSocket connection
Transport->>Server: WebSocket handshake
Server-->>Transport: connection established
Transport-->>Client: client ready
Client-->>App: connected client
Note over App,Server: Normal Operation
App->>Client: read(key)
Client->>Transport: serialize request
Transport->>Server: send via WebSocket
Server-->>Transport: response data
Transport-->>Client: deserialize response
Client-->>App: return result
Note over App,Server: Error Handling
Server-->>Transport: connection lost
Transport-->>Client: connection error
Client->>Transport: reconnection attempt
Transport->>Server: reconnect
- Initialization : Client connects to server URL with connection options
- Authentication : Optional authentication via Muxio RPC mechanisms
- Active State : Client maintains persistent WebSocket connection
- Error Recovery : Automatic reconnection on transient failures
- Shutdown : Graceful connection termination
Sources: Cargo.lock:1302-1318 experiments/simd-r-drive-ws-client/Cargo.toml:16-19
graph TB
subgraph "Client Side"
Method["Client Method Call\nread/write/delete"]
ReqBuilder["Request Builder\nCreate RPC Request"]
Serializer["bitcode Serialization\nBinary Encoding"]
Sender["WebSocket Send\nBinary Frame"]
end
subgraph "Network"
WsFrame["WebSocket Frame\nBinary Message"]
end
subgraph "Server Side"
Receiver["WebSocket Receive\nBinary Frame"]
Deserializer["bitcode Deserialization\nBinary Decoding"]
Handler["Request Handler\nExecute DataStore Operation"]
Response["Response Builder\nCreate RPC Response"]
end
Method --> ReqBuilder
ReqBuilder --> Serializer
Serializer --> Sender
Sender --> WsFrame
WsFrame --> Receiver
Receiver --> Deserializer
Deserializer --> Handler
Handler --> Response
Response --> Serializer
style Method fill:#f9f9f9,stroke:#333,stroke-width:2px
style Handler fill:#f9f9f9,stroke:#333,stroke-width:2px
Request-Response Flow
All client operations follow a standardized request-response pattern through the Muxio RPC framework.
Request Structure:
| Field | Type | Description |
|---|---|---|
| Method ID | u64 | XXH3 hash of method name from service definition |
| Payload | Vec<u8> | Bitcode-serialized request parameters |
| Request ID | u64 | Unique identifier for request-response matching |
Response Structure:
| Field | Type | Description |
|---|---|---|
| Request ID | u64 | Matches original request ID |
| Status | enum | Success, Error, or specific error codes |
| Payload | Vec<u8> | Bitcode-serialized response data or error |
Sources: experiments/simd-r-drive-muxio-service-definition/Cargo.toml:14-15 Cargo.lock:392-402
Async Runtime Requirements
The client requires a Tokio async runtime for all operations. The async-trait crate enables async methods in trait implementations.
Runtime Configuration:
graph TB
subgraph "Application"
Main["#[tokio::main]\nasync fn main()"]
UserCode["User Code\nawait client.read()"]
end
subgraph "Client"
AsyncMethods["async-trait Methods\nDataStoreReader/Writer"]
TokioTasks["Tokio Tasks\nNetwork I/O"]
end
subgraph "Tokio Runtime"
Executor["Task Executor\nWork Stealing Scheduler"]
Reactor["I/O Reactor\nepoll/kqueue/IOCP"]
end
Main --> UserCode
UserCode --> AsyncMethods
AsyncMethods --> TokioTasks
TokioTasks --> Executor
TokioTasks --> Reactor
style AsyncMethods fill:#f9f9f9,stroke:#333,stroke-width:2px
style Executor fill:#f9f9f9,stroke:#333,stroke-width:2px
- Multi-threaded Runtime : Default for concurrent operations
- Current-thread Runtime : Available for single-threaded use cases
- Feature Flags : Requires tokio with the rt-multi-thread and net features
Sources: experiments/simd-r-drive-ws-client/Cargo.toml:19-21 Cargo.lock:279-287
Error Handling
The client propagates errors from multiple layers of the stack, providing detailed error information for debugging and recovery.
Error Types:
| Error Category | Source | Description |
|---|---|---|
| Connection Errors | muxio-tokio-rpc-client | WebSocket connection failures, timeouts |
| Serialization Errors | bitcode | Invalid data encoding/decoding |
| RPC Errors | muxio-rpc-service | Service method errors, invalid requests |
| DataStore Errors | Remote DataStore | Storage operation failures (key not found, write errors) |
Error Propagation Flow:
Sources: experiments/simd-r-drive-ws-client/Cargo.toml:17-18 Cargo.lock:1261-1271
Usage Patterns
Basic Connection and Operations
The client follows standard Rust async patterns for initialization and operation:
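The sketch below is hedged: the constructor and method signatures (connect, write, read) follow the diagrams in this section and are assumptions about the client API, not verified signatures.
```rust
// Hedged usage sketch. `connect`, `write`, and `read` mirror the sequence
// diagram above; the real simd-r-drive-ws-client API may differ.

use simd_r_drive_ws_client::DataStoreWsClient;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical constructor; the diagram shows `connect(url)`.
    let client = DataStoreWsClient::connect("ws://127.0.0.1:8080/ws").await?;

    // Async methods mirroring DataStoreWriter / DataStoreReader.
    client.write(b"greeting", b"hello").await?;
    if let Some(payload) = client.read(b"greeting").await? {
        println!("{}", String::from_utf8_lossy(&payload));
    }
    Ok(())
}
```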
Concurrent Operations
The client supports concurrent operations through standard Tokio concurrency primitives:
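A minimal example of issuing requests concurrently with tokio::join! is shown below; the fetch helper is a placeholder for any async client call, not an actual client API:
```rust
// Concurrency sketch: both futures run concurrently over the same multiplexed
// connection. `fetch` is a placeholder standing in for a client call.

async fn fetch(id: u32) -> Option<Vec<u8>> {
    // ... a client.read(...) call would go here ...
    let _ = id;
    None
}

#[tokio::main]
async fn main() {
    let (a, b) = tokio::join!(fetch(1), fetch(2));
    println!("got {:?} and {:?}", a.map(|v| v.len()), b.map(|v| v.len()));
}
```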
Sources: experiments/simd-r-drive-ws-client/Cargo.toml:19-21
graph TB
subgraph "Service Definition"
Methods["Method Definitions\nread, write, delete, etc."]
Types["Request/Response Types\nbitcode derive macros"]
MethodHash["Method ID Hashing\nXXH3 of method names"]
end
subgraph "Client Usage"
ClientImpl["Client Implementation\nUses defined methods"]
TypeSafety["Type Safety\nCompile-time checking"]
end
subgraph "Server Usage"
ServerImpl["Server Implementation\nHandles defined methods"]
Routing["Request Routing\nHash-based dispatch"]
end
Methods --> ClientImpl
Methods --> ServerImpl
Types --> ClientImpl
Types --> ServerImpl
MethodHash --> Routing
ClientImpl --> TypeSafety
style Methods fill:#f9f9f9,stroke:#333,stroke-width:2px
style Types fill:#f9f9f9,stroke:#333,stroke-width:2px
Integration with Service Definition
The client relies on the shared service definition crate for type-safe RPC communication.
Shared Contract Benefits:
- Type Safety : Compile-time verification of request/response types
- Version Compatibility : Client and server must use compatible service definitions
- Method Resolution : XXH3 hash-based method identification
- Serialization Schema : Consistent bitcode encoding across client and server
Sources: experiments/simd-r-drive-ws-client/Cargo.toml15 experiments/simd-r-drive-muxio-service-definition/Cargo.toml:1-16
Performance Considerations
The native Rust client provides several performance advantages over alternative approaches:
Performance Characteristics:
| Aspect | Implementation | Benefit |
|---|---|---|
| Serialization | bitcode binary encoding | Minimal overhead, faster than JSON/MessagePack |
| Connection | Persistent WebSocket | Avoids HTTP handshake overhead |
| Async I/O | Tokio zero-copy operations | Efficient memory usage |
| Type Safety | Compile-time generics | Zero runtime type checking cost |
| Multiplexing | Muxio request pipelining | Multiple concurrent requests per connection |
Memory Efficiency:
- Zero-copy where possible through bitcode and WebSocket frames
- Efficient buffer reuse in Tokio's I/O layer
- Minimal allocation overhead compared to HTTP-based protocols
Throughput:
- Supports request pipelining for high-throughput workloads
- Concurrent operations through Tokio's work-stealing scheduler
- Batch operations reduce round-trip overhead
Sources: Cargo.lock:392-402 Cargo.lock:1302-1318
Comparison with Direct Access
The WebSocket client provides remote access with different tradeoffs compared to direct DataStore usage:
| Feature | Direct DataStore | WebSocket Client |
|---|---|---|
| Access Pattern | Local file I/O | Network I/O over WebSocket |
| Zero-Copy Reads | Yes (via mmap) | No (serialized over network) |
| Latency | Microseconds | Milliseconds (network dependent) |
| Concurrency | Multi-process safe | Network-limited |
| Deployment | Single machine | Distributed architecture |
| Security | File system permissions | Network authentication |
When to Use the Client:
- Remote access to centralized storage
- Microservice architectures requiring shared state
- Language interoperability (via Python bindings)
- Isolation of storage from compute workloads
When to Use Direct Access:
- Single-machine deployments
- Latency-critical applications
- Maximum throughput requirements
- Zero-copy read performance needs
Sources: experiments/simd-r-drive-ws-client/Cargo.toml14
Logging and Debugging
The client uses the tracing crate for structured logging and diagnostics.
Logging Levels:
- TRACE : Detailed RPC message contents and serialization
- DEBUG : Connection state changes, request/response flow
- INFO : Connection establishment, disconnection events
- WARN : Recoverable errors, retry attempts
- ERROR : Unrecoverable errors, connection failures
Diagnostic Information:
- Request/response timing
- Serialized message sizes
- Connection state transitions
- Error context and stack traces
Sources: experiments/simd-r-drive-ws-client/Cargo.toml20 Cargo.lock:279-287
Python Integration
Relevant source files
- experiments/bindings/python-ws-client/Cargo.lock
- experiments/bindings/python-ws-client/README.md
- experiments/bindings/python-ws-client/extract_readme_tests.py
- experiments/bindings/python-ws-client/integration_test.sh
- experiments/bindings/python-ws-client/pyproject.toml
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/init.py
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi
- experiments/bindings/python-ws-client/uv.lock
This page documents the Python bindings for SIMD R Drive, which provide a high-level Python interface to the storage engine via WebSocket RPC. The bindings are implemented in Rust using PyO3 and packaged as Python wheels using Maturin.
For details on the Python WebSocket Client API specifically, see Python WebSocket Client API. For information on building and distributing the Python package, see Building Python Bindings. For testing infrastructure, see Integration Testing.
Architecture Overview
The Python integration uses a multi-layer architecture that bridges Python code to the native Rust WebSocket client. The bindings are not pure Python—they are Rust code compiled to native extensions that Python can import.
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/init.py:1-14 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:1-63
graph TB
subgraph "Python User Code"
UserScript["User Application\n*.py files"]
Imports["from simd_r_drive_ws_client import DataStoreWsClient"]
end
subgraph "Python Package Layer"
InitPy["__init__.py\nPackage Exports"]
DataStoreWsClientPy["data_store_ws_client.py\nDataStoreWsClient class"]
TypeStubs["data_store_ws_client.pyi\nType Annotations"]
end
subgraph "PyO3 Binding Layer"
RustModule["simd_r_drive_ws_client_py\nRust Binary Module\n.so / .pyd"]
BaseClass["BaseDataStoreWsClient\n#[pyclass]"]
NamespaceHasherClass["NamespaceHasher\n#[pyclass]"]
end
subgraph "Native Rust Implementation"
WsClient["simd-r-drive-ws-client\nNative WebSocket Client"]
MuxioRPC["muxio-tokio-rpc-client\nRPC Client Runtime"]
end
UserScript --> Imports
Imports --> InitPy
InitPy --> DataStoreWsClientPy
DataStoreWsClientPy --> BaseClass
DataStoreWsClientPy -.type hints.-> TypeStubs
BaseClass --> WsClient
NamespaceHasherClass --> WsClient
RustModule --> BaseClass
RustModule --> NamespaceHasherClass
WsClient --> MuxioRPC
The architecture consists of four distinct layers:
| Layer | Technology | Purpose |
|---|---|---|
| Python User Code | Pure Python | Application-level logic using the client |
| Python Package | Pure Python wrapper | Convenience methods and type annotations |
| PyO3 Bindings | Rust compiled to native extension | FFI bridge exposing Rust functionality |
| Native Implementation | Rust (simd-r-drive-ws-client) | WebSocket RPC client implementation |
Python API Surface
The Python package exposes two primary classes that users interact with: DataStoreWsClient and NamespaceHasher. The API is defined through a combination of Rust PyO3 bindings and Python wrapper code.
graph TB
subgraph "Python Space"
DSWsClient["DataStoreWsClient\nexperiments/.../data_store_ws_client.py"]
NSHasher["NamespaceHasher\nRust #[pyclass]"]
end
subgraph "Rust PyO3 Bindings"
BaseClient["BaseDataStoreWsClient\nRust #[pyclass]\nsrc/lib.rs"]
NSHasherRust["NamespaceHasher\nRust Implementation\nsrc/lib.rs"]
end
subgraph "Method Categories"
WriteOps["write()\nbatch_write()\ndelete()"]
ReadOps["read()\nbatch_read()\nexists()"]
MetaOps["__len__()\n__contains__()\nis_empty()\nfile_size()"]
PyOnlyOps["batch_read_structured()\nPure Python Logic"]
end
DSWsClient -->|inherits| BaseClient
NSHasher -.exposed as.-> NSHasherRust
BaseClient --> WriteOps
BaseClient --> ReadOps
BaseClient --> MetaOps
DSWsClient --> PyOnlyOps
Class Hierarchy and Implementation
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:11-63 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:8-219
Core Operations
The DataStoreWsClient provides the following operation categories:
| Operation Type | Methods | Implementation Location |
|---|---|---|
| Write Operations | write(), batch_write(), delete() | Rust (BaseDataStoreWsClient) |
| Read Operations | read(), batch_read(), exists() | Rust (BaseDataStoreWsClient) |
| Metadata Operations | __len__(), __contains__(), is_empty(), file_size() | Rust (BaseDataStoreWsClient) |
| Structured Reads | batch_read_structured() | Python (DataStoreWsClient) |
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:27-168
Python-Rust Method Mapping
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:12-62 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:27-129
The batch_read_structured() method is implemented entirely in Python as a convenience wrapper. It traverses dictionaries or lists of dictionaries to extract a flat list of keys, calls the fast Rust batch_read() method, and then rebuilds the original structure with the fetched values.
PyO3 Binding Architecture
PyO3 provides the Foreign Function Interface (FFI) that allows Python to call Rust code. The bindings use PyO3 macros to expose Rust structs and methods as Python classes and functions.
graph TB
subgraph "Rust Source"
StructDef["#[pyclass]\npub struct BaseDataStoreWsClient"]
MethodsDef["#[pymethods]\nimpl BaseDataStoreWsClient"]
NSStruct["#[pyclass]\npub struct NamespaceHasher"]
NSMethods["#[pymethods]\nimpl NamespaceHasher"]
end
subgraph "PyO3 Macro Expansion"
PyClassMacro["PyClass trait implementation\nType conversion\nReference counting"]
PyMethodsMacro["Method wrappers\nArgument extraction\nReturn value conversion"]
end
subgraph "Python Module"
PythonClass["class BaseDataStoreWsClient:\n def write(...)\n def read(...)"]
PythonNS["class NamespaceHasher:\n def __init__(...)\n def namespace(...)"]
end
StructDef --> PyClassMacro
MethodsDef --> PyMethodsMacro
NSStruct --> PyClassMacro
NSMethods --> PyMethodsMacro
PyClassMacro --> PythonClass
PyMethodsMacro --> PythonClass
PyClassMacro --> PythonNS
PyMethodsMacro --> PythonNS
PyO3 Class Definitions
Sources: experiments/bindings/python-ws-client/Cargo.lock:832-846 experiments/bindings/python-ws-client/Cargo.lock:1096-1108
Async Runtime Bridge
The Python bindings use pyo3-async-runtimes to bridge Python's async/await with Rust's Tokio runtime. This allows Python code to use standard async/await syntax while the underlying operations are handled by Tokio.
Sources: experiments/bindings/python-ws-client/Cargo.lock:849-860
graph TB
subgraph "Python Async"
PyAsyncCall["await client.write(key, data)"]
PyEventLoop["asyncio.run()
or uvloop"]
end
subgraph "pyo3-async-runtimes Bridge"
Bridge["pyo3_async_runtimes::tokio\nFuture conversion"]
Runtime["LocalSet spawning\nBlock on future"]
end
subgraph "Rust Tokio"
TokioFuture["async fn write()\nTokio Future"]
TokioRuntime["Tokio Runtime\nThread Pool"]
end
PyAsyncCall --> PyEventLoop
PyEventLoop --> Bridge
Bridge --> Runtime
Runtime --> TokioFuture
TokioFuture --> TokioRuntime
The pyo3-async-runtimes crate handles the complexity of converting between Python's async protocol and Rust's Tokio futures, ensuring that async operations work seamlessly across the FFI boundary.
Build and Distribution System
The Python bindings are built using Maturin, which compiles the Rust code and packages it into Python wheels. The build system is configured through pyproject.toml and Cargo.toml.
graph LR
subgraph "Configuration Files"
PyProject["pyproject.toml\n[build-system]\nrequires = ['maturin>=1.5']"]
CargoToml["Cargo.toml\n[lib]\ncrate-type = ['cdylib']"]
end
subgraph "Maturin Build Process"
RustCompile["rustc compilation\n--crate-type=cdylib\nTarget: cpython extension"]
LinkPyO3["Link PyO3 runtime\nPython ABI"]
CreateWheel["Package .so/.pyd\nAdd metadata\nCreate .whl"]
end
subgraph "Distribution Artifacts"
Wheel["simd_r_drive_ws_client-*.whl\nPlatform-specific binary"]
PyPI["PyPI Registry\npip install simd-r-drive-ws-client"]
end
PyProject --> RustCompile
CargoToml --> RustCompile
RustCompile --> LinkPyO3
LinkPyO3 --> CreateWheel
CreateWheel --> Wheel
Wheel --> PyPI
Maturin Build Pipeline
Sources: experiments/bindings/python-ws-client/pyproject.toml:29-35
Build System Configuration
The build system is configured through several key sections in pyproject.toml:
| Configuration | Location | Purpose |
|---|---|---|
[build-system] | pyproject.toml:29-31 | Specifies Maturin as build backend |
[tool.maturin] | pyproject.toml:33-35 | Maturin-specific settings (bindings, Python version) |
[project] | pyproject.toml:1-27 | Package metadata for PyPI |
[dependency-groups] | pyproject.toml:37-46 | Development dependencies |
Sources: experiments/bindings/python-ws-client/pyproject.toml:1-47
Supported Python Versions and Platforms
The package supports Python 3.10 through 3.13 on multiple platforms:
Sources: experiments/bindings/python-ws-client/pyproject.toml:7-27 experiments/bindings/python-ws-client/README.md:18-23
graph TB
subgraph "Runtime Dependencies"
PyO3Runtime["PyO3 Runtime\nEmbedded in .whl"]
RustDeps["Rust Dependencies\nsimd-r-drive-ws-client\ntokio, muxio-*"]
end
subgraph "Development Dependencies"
Maturin["maturin>=1.8.7\nBuild backend"]
MyPy["mypy>=1.16.1\nType checking"]
Pytest["pytest>=8.4.1\nTesting framework"]
NumPy["numpy>=2.2.6\nTesting utilities"]
Other["puccinialin, pytest-benchmark, pytest-order"]
end
subgraph "Lock Files"
UvLock["uv.lock\nPython dependency tree"]
CargoLock["Cargo.lock\nRust dependency tree"]
end
Maturin --> RustDeps
RustDeps --> CargoLock
MyPy --> UvLock
Pytest --> UvLock
NumPy --> UvLock
Other --> UvLock
Dependency Management
The Python bindings use uv for fast dependency resolution and management. Dependencies are split into runtime (minimal) and development dependencies.
Dependency Structure
Sources: experiments/bindings/python-ws-client/pyproject.toml:37-46 experiments/bindings/python-ws-client/Cargo.lock:1-1380 experiments/bindings/python-ws-client/uv.lock:1-299
The runtime has minimal Python dependencies (essentially none beyond the Python interpreter), as all dependencies are statically compiled into the binary wheel. Development dependencies include testing, type checking, and build tools.
Type Stubs and IDE Support
The package includes comprehensive type stubs (.pyi files) that provide full type information for IDEs and type checkers like MyPy.
graph TB
subgraph "Type Stub File"
StubImports["from typing import Optional, Union, Dict, Any, List\nfrom .simd_r_drive_ws_client import BaseDataStoreWsClient"]
ClientStub["@final\nclass DataStoreWsClient(BaseDataStoreWsClient):\n def __init__(self, host: str, port: int) -> None\n def write(self, key: bytes, data: bytes) -> None\n ..."]
NSStub["@final\nclass NamespaceHasher:\n def __init__(self, prefix: bytes) -> None\n def namespace(self, key: bytes) -> bytes"]
end
subgraph "IDE Features"
Autocomplete["Auto-completion\nMethod signatures"]
TypeCheck["Type checking\nmypy validation"]
Docstrings["Inline documentation\nMethod descriptions"]
end
StubImports --> ClientStub
StubImports --> NSStub
ClientStub --> Autocomplete
ClientStub --> TypeCheck
ClientStub --> Docstrings
NSStub --> Autocomplete
NSStub --> TypeCheck
NSStub --> Docstrings
Type Stub Structure
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219
The type stubs include:
- Full method signatures with type annotations
- Comprehensive docstrings explaining each method
- Generic type support for structured operations
- `@final` decorators to prevent subclassing
graph LR
subgraph "Namespace Creation"
Prefix["prefix = b'users'"]
PrefixHash["XXH3(prefix)\n→ 8 bytes"]
end
subgraph "Key Hashing"
Key["key = b'user123'"]
KeyHash["XXH3(key)\n→ 8 bytes"]
end
subgraph "Namespaced Key"
Combined["prefix_hash // key_hash\n16 bytes total"]
end
Prefix --> PrefixHash
Key --> KeyHash
PrefixHash --> Combined
KeyHash --> Combined
NamespaceHasher Utility
The NamespaceHasher class provides deterministic key namespacing using XXH3 hashing. It ensures keys are scoped to specific namespaces, preventing collisions across logical domains.
Namespacing Mechanism
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:170-219
Usage Pattern
The NamespaceHasher is typically used as follows:
| Step | Code | Result |
|---|---|---|
| 1. Create hasher | hasher = NamespaceHasher(b"users") | Hasher scoped to "users" namespace |
| 2. Generate key | key = hasher.namespace(b"user123") | 16-byte namespaced key |
| 3. Store data | client.write(key, data) | Data stored under namespaced key |
| 4. Read data | client.read(key) | Data retrieved from namespaced key |
This pattern ensures that keys like b"settings" in the b"users" namespace don't collide with b"settings" in the b"system" namespace.
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:182-218
graph TB
subgraph "Test Script: integration_test.sh"
Setup["1. Navigate to experiments/\n2. Build server if needed"]
StartServer["3. cargo run --package simd-r-drive-ws-server\nBackground process\nPID captured"]
SetupPython["4. uv venv\n5. uv pip install pytest maturin\n6. uv pip install -e . --group dev"]
ExtractTests["7. extract_readme_tests.py\nExtract code blocks from README.md"]
RunPytest["8. pytest -v -s\nTEST_SERVER_HOST=$SERVER_HOST\nTEST_SERVER_PORT=$SERVER_PORT"]
Cleanup["9. kill -9 $SERVER_PID\n10. rm /tmp/simd-r-drive-pytest-storage.bin"]
end
Setup --> StartServer
StartServer --> SetupPython
SetupPython --> ExtractTests
ExtractTests --> RunPytest
RunPytest --> Cleanup
Integration Test Infrastructure
The Python bindings include a comprehensive integration test suite that validates the entire stack from Python user code down to the WebSocket server.
Test Workflow
Sources: experiments/bindings/python-ws-client/integration_test.sh:1-91
Test Categories
The test infrastructure includes multiple test sources:
| Test Source | Purpose | Generated By |
|---|---|---|
tests/test_readme_blocks.py | Validates README examples | extract_readme_tests.py |
| Other test files | Unit and integration tests | Manual test authoring |
| Pytest fixtures | Setup/teardown infrastructure | Pytest framework |
Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:1-46
graph LR
subgraph "Input"
README["README.md\n```python code blocks"]
end
subgraph "Extraction Process"
Regex["Regex: r'```python\\n(.*?)```'"]
Parse["Extract all Python blocks"]
Wrap["Wrap each block in\ndef test_readme_block_N():\n ..."]
end
subgraph "Output"
TestFile["tests/test_readme_blocks.py\ntest_readme_block_0()\ntest_readme_block_1()\n..."]
end
README --> Regex
Regex --> Parse
Parse --> Wrap
Wrap --> TestFile
README Test Extraction
The extract_readme_tests.py script automatically converts Python code blocks from the README into pytest test functions:
Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:14-45
This ensures that all code examples in the README are automatically tested, preventing documentation drift from the actual API behavior.
graph TB
subgraph "Internal Modules"
RustBinary["simd_r_drive_ws_client_py.so/.pyd\nBinary compiled module"]
RustSymbols["BaseDataStoreWsClient\nNamespaceHasher\nsetup_logging\ntest_rust_logging"]
PythonWrapper["data_store_ws_client.py\nDataStoreWsClient"]
end
subgraph "Package __init__.py"
ImportRust["from .simd_r_drive_ws_client import\n setup_logging, test_rust_logging"]
ImportPython["from .data_store_ws_client import\n DataStoreWsClient, NamespaceHasher"]
AllList["__all__ = [\n 'DataStoreWsClient',\n 'NamespaceHasher',\n 'setup_logging',\n 'test_rust_logging'\n]"]
end
subgraph "Public API"
UserCode["from simd_r_drive_ws_client import DataStoreWsClient"]
end
RustBinary --> RustSymbols
RustSymbols --> ImportRust
PythonWrapper --> ImportPython
ImportRust --> AllList
ImportPython --> AllList
AllList --> UserCode
Package Exports and Public API
The package's public API is defined through the __init__.py file, which controls what symbols are available when users import the package.
Export Structure
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-14
The __all__ list explicitly defines the public API surface, preventing internal implementation details from being accidentally imported by users. This follows Python best practices for package design.
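As a sketch, the export structure shown in the diagram above corresponds to an `__init__.py` of roughly this shape (reconstructed from the diagram, not copied from the source file):

```python
# simd_r_drive_ws_client/__init__.py -- reconstructed from the export diagram above
from .simd_r_drive_ws_client import setup_logging, test_rust_logging
from .data_store_ws_client import DataStoreWsClient, NamespaceHasher

__all__ = [
    "DataStoreWsClient",
    "NamespaceHasher",
    "setup_logging",
    "test_rust_logging",
]
```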
Python WebSocket Client API
Relevant source files
- experiments/bindings/python-ws-client/Cargo.lock
- experiments/bindings/python-ws-client/pyproject.toml
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi
Purpose and Scope
This document describes the Python WebSocket client API for remote access to SIMD R Drive storage. The API provides idiomatic Python interfaces backed by high-performance Rust implementations via PyO3 bindings. This page covers the DataStoreWsClient class, NamespaceHasher utility, and their usage patterns.
For information about building and installing the Python bindings, see Building Python Bindings. For details about the underlying native Rust WebSocket client, see Native Rust Client. For server-side configuration and deployment, see WebSocket Server.
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-14 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219
Architecture Overview
The Python WebSocket client uses a multi-layer architecture that bridges Python's async/await with Rust's Tokio runtime while maintaining idiomatic Python APIs.
graph TB
UserCode["Python User Code\nimport simd_r_drive_ws_client"]
DataStoreWsClient["DataStoreWsClient\nPython wrapper class"]
BaseDataStoreWsClient["BaseDataStoreWsClient\nPyO3 #[pyclass]"]
PyO3FFI["PyO3 FFI Layer\npyo3-async-runtimes"]
RustClient["simd-r-drive-ws-client\nNative Rust implementation"]
MuxioRPC["muxio-tokio-rpc-client\nWebSocket + RPC"]
Server["simd-r-drive-ws-server\nRemote DataStore"]
UserCode --> DataStoreWsClient
DataStoreWsClient --> BaseDataStoreWsClient
BaseDataStoreWsClient --> PyO3FFI
PyO3FFI --> RustClient
RustClient --> MuxioRPC
MuxioRPC --> Server
Python Integration Stack
Sources: experiments/bindings/python-ws-client/Cargo.lock:1096-1108 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:1-10
Class Hierarchy
The BaseDataStoreWsClient class is implemented in Rust and exposes core storage operations through PyO3. The DataStoreWsClient Python class extends it with additional convenience methods implemented in pure Python.
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-10 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:11-62
DataStoreWsClient Class
The DataStoreWsClient class provides the primary interface for interacting with a remote SIMD R Drive storage engine over WebSocket connections.
Connection Initialization
| Constructor | Description |
|---|---|
__init__(host: str, port: int) | Establishes WebSocket connection to the specified server |
The constructor creates a WebSocket connection to the remote storage server. Connection establishment is synchronous and will raise an exception if the server is unreachable.
Example:
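A minimal sketch, assuming a server is reachable on the test defaults (`127.0.0.1:34129`) used elsewhere in this documentation:

```python
from simd_r_drive_ws_client import DataStoreWsClient

# Connect to a running simd-r-drive-ws-server; raises if the server is unreachable.
client = DataStoreWsClient("127.0.0.1", 34129)
```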
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:17-25
Write Operations
| Method | Parameters | Description |
|---|---|---|
write(key, data) | key: bytes, data: bytes | Appends single key/value pair |
batch_write(items) | items: list[tuple[bytes, bytes]] | Writes multiple pairs in one operation |
Write operations are append-only and atomic. If a key already exists, writing to it creates a new version while the old data remains on disk (marked as superseded via the index).
Example:
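A sketch of both write paths, assuming `client` is the connected `DataStoreWsClient` from above:

```python
# Single write: appends one key/value pair (keys and values are raw bytes).
client.write(b"user:1", b'{"name": "Alice"}')

# Batch write: appends several pairs in one operation.
client.batch_write([
    (b"user:2", b'{"name": "Bob"}'),
    (b"user:3", b'{"name": "Carol"}'),
])
```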
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:27-51
Read Operations
| Method | Parameters | Return Type | Copy Behavior |
|---|---|---|---|
read(key) | key: bytes | Optional[bytes] | Performs memory copy |
batch_read(keys) | keys: list[bytes] | list[Optional[bytes]] | Performs memory copy |
batch_read_structured(data) | data: dict or list[dict] | Same structure with values | Python-side wrapper |
The read and batch_read methods perform memory copies when returning data. For zero-copy access patterns, the native Rust client provides read_entry methods that return memory-mapped views.
The batch_read_structured method is a Python convenience wrapper that:
- Accepts dictionaries or lists of dictionaries where values are datastore keys
- Flattens the structure into a single key list
- Calls `batch_read` for efficient parallel fetching
- Reconstructs the original structure with fetched values
Example:
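A sketch of the three read paths, assuming the keys written above (`batch_read_structured` treats dictionary values as datastore keys, per the description above):

```python
# Single read returns Optional[bytes]; batch_read returns list[Optional[bytes]].
value = client.read(b"user:1")
values = client.batch_read([b"user:1", b"user:2", b"missing-key"])  # last item -> None

# Structured read: the same dict shape comes back with keys replaced by payloads.
result = client.batch_read_structured({"a": b"user:1", "b": b"user:2"})
```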
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:79-129 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:12-62
Deletion and Existence Checks
| Method | Parameters | Return Type | Description |
|---|---|---|---|
delete(key) | key: bytes | None | Marks key as deleted (tombstone) |
exists(key) | key: bytes | bool | Checks if key is active |
__contains__(key) | key: bytes | bool | Python in operator support |
Deletion is logical, not physical. The delete method appends a tombstone entry to the storage file. The physical data remains on disk but is no longer accessible through reads.
Example:
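A sketch of deletion and existence checks, continuing from the writes above:

```python
client.write(b"temp", b"scratch data")
assert b"temp" in client            # __contains__
assert client.exists(b"temp")       # explicit existence check

client.delete(b"temp")              # appends a tombstone; old data stays on disk
assert not client.exists(b"temp")
assert client.read(b"temp") is None
```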
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:53-77 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:131-141
Utility Methods
| Method | Return Type | Description |
|---|---|---|
__len__() | int | Returns count of active entries |
is_empty() | bool | Checks if store has any active keys |
file_size() | int | Returns physical file size on disk |
Example:
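A sketch using the same `client`:

```python
print(len(client))         # count of active (non-deleted) entries
print(client.is_empty())   # True only if no active keys remain
print(client.file_size())  # physical size of the backing storage file, in bytes
```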
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:143-168
NamespaceHasher Utility
The NamespaceHasher class provides deterministic key namespacing using XXH3 hashing to prevent key collisions across logical domains.
graph LR
Input1["Namespace prefix\ne.g., b'users'"]
Input2["Key\ne.g., b'user123'"]
Hash1["XXH3 hash\n8 bytes"]
Hash2["XXH3 hash\n8 bytes"]
Output["Namespaced key\n16 bytes total"]
Input1 -->|hash once at init| Hash1
Input2 -->|hash per call| Hash2
Hash1 --> Output
Hash2 --> Output
Architecture
Usage Pattern
| Method | Parameters | Return Type | Description |
|---|---|---|---|
__init__(prefix) | prefix: bytes | N/A | Initializes hasher with namespace |
namespace(key) | key: bytes | bytes | Returns 16-byte namespaced key |
The output key structure is:
- Bytes 0-7: XXH3 hash of namespace prefix
- Bytes 8-15: XXH3 hash of input key
This design ensures:
- Deterministic key generation (same input → same output)
- Collision isolation between namespaces
- Fixed-length keys regardless of input size
Example:
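A sketch showing namespace isolation, assuming `client` is a connected `DataStoreWsClient`:

```python
from simd_r_drive_ws_client import NamespaceHasher

users = NamespaceHasher(b"users")
system = NamespaceHasher(b"system")

# Same logical key, different namespaces -> different 16-byte storage keys.
users_settings = users.namespace(b"settings")
system_settings = system.namespace(b"settings")
assert users_settings != system_settings
assert len(users_settings) == 16

client.write(users_settings, b"per-user settings blob")
```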
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:170-219
graph TB
Stub["data_store_ws_client.pyi\nType definitions"]
Impl["data_store_ws_client.py\nImplementation"]
Base["simd_r_drive_ws_client\nCompiled Rust module"]
Stub -.->|describes| Impl
Impl -->|imports from| Base
Stub -.->|describes| Base
Type Stubs and IDE Support
The package provides complete type stubs for IDE integration and static type checking.
Type Stub Structure
The type stubs (experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219) provide:
- Full method signatures with type annotations
- Return type information (`Optional[bytes]`, `list[Optional[bytes]]`, etc.)
- Docstrings for IDE hover documentation
- `@final` decorators indicating classes cannot be subclassed
Type Checking Example
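A sketch of how the stubs drive static checking (the error shown in the comment is what MyPy reports, not a runtime failure):

```python
from typing import Optional

from simd_r_drive_ws_client import DataStoreWsClient

client = DataStoreWsClient("127.0.0.1", 34129)

# The stub declares read() -> Optional[bytes], so MyPy requires a None check.
value: Optional[bytes] = client.read(b"user:1")
if value is not None:
    print(len(value))

# client.write("user:1", "text")  # MyPy: expected "bytes", got "str"
```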
Python Version Support:
The package targets Python 3.10-3.13 as specified in experiments/bindings/python-ws-client/pyproject.toml7 and experiments/bindings/python-ws-client/pyproject.toml:21-24
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219 experiments/bindings/python-ws-client/pyproject.toml7 experiments/bindings/python-ws-client/pyproject.toml:19-27
graph TB
PythonMain["Python Main Thread\nSynchronous API calls"]
PyO3["PyO3 Bridge\npyo3-async-runtimes"]
TokioRT["Tokio Runtime\nAsync event loop"]
WSClient["WebSocket Client\ntokio-tungstenite"]
PythonMain -->|sync call| PyO3
PyO3 -->|spawn + block_on| TokioRT
TokioRT --> WSClient
WSClient -.->|result| TokioRT
TokioRT -.->|return| PyO3
PyO3 -.->|return| PythonMain
Async Runtime Bridging
The client uses pyo3-async-runtimes to bridge Python's async/await with Rust's Tokio runtime. This allows the underlying Rust WebSocket client to use native async I/O while exposing synchronous APIs to Python.
Runtime Architecture
The pyo3-async-runtimes crate (experiments/bindings/python-ws-client/Cargo.lock:849-860) provides:
- Runtime spawning: Manages Tokio runtime lifecycle
- Future blocking: Converts Rust async operations to Python-blocking calls
- Thread safety: Ensures proper synchronization between Python GIL and Rust runtime
This design allows Python code to use simple synchronous APIs while benefiting from Rust's high-performance async networking under the hood.
Sources: experiments/bindings/python-ws-client/Cargo.lock:849-860 experiments/bindings/python-ws-client/Cargo.lock:1096-1108
API Summary
Complete Method Reference
| Category | Method | Parameters | Return | Description |
|---|---|---|---|---|
| Connection | __init__ | host: str, port: int | N/A | Establish WebSocket connection |
| Write | write | key: bytes, data: bytes | None | Write single entry |
| Write | batch_write | items: list[tuple[bytes, bytes]] | None | Write multiple entries |
| Read | read | key: bytes | Optional[bytes] | Read single entry (copies) |
| Read | batch_read | keys: list[bytes] | list[Optional[bytes]] | Read multiple entries |
| Read | batch_read_structured | data: dict or list[dict] | Same structure | Read with structure preservation |
| Delete | delete | key: bytes | None | Mark key as deleted |
| Query | exists | key: bytes | bool | Check key existence |
| Query | __contains__ | key: bytes | bool | Python in operator |
| Info | __len__ | N/A | int | Active entry count |
| Info | is_empty | N/A | bool | Check if empty |
| Info | file_size | N/A | int | Physical file size |
NamespaceHasher Reference
| Method | Parameters | Return | Description |
|---|---|---|---|
__init__ | prefix: bytes | N/A | Initialize namespace |
namespace | key: bytes | bytes | Generate 16-byte namespaced key |
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219
Building Python Bindings
Relevant source files
- experiments/bindings/python-ws-client/README.md
- experiments/bindings/python-ws-client/extract_readme_tests.py
- experiments/bindings/python-ws-client/integration_test.sh
- experiments/bindings/python-ws-client/pyproject.toml
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi
- experiments/bindings/python-ws-client/uv.lock
Purpose and Scope
This page documents the build system and tooling required to compile the Python bindings for SIMD R Drive. It covers PyO3 integration, the Maturin build backend, installation procedures, dependency management with uv, and supported Python versions. For information about using the Python client API, see Python WebSocket Client API. For integration testing procedures, see Integration Testing.
Overview of the Build Architecture
The Python bindings are implemented in Rust using PyO3, which provides Foreign Function Interface (FFI) between Rust and Python. The build process transforms Rust code into platform-specific Python wheels using Maturin, a specialized build backend designed for Rust-Python projects.
graph TB
subgraph "Source Code"
RustSrc["Rust Implementation\nBaseDataStoreWsClient"]
PyO3Attrs["PyO3 Attributes\n#[pyclass], #[pymethods]"]
CargoToml["Cargo.toml\ncrate-type = cdylib\npyo3 dependency"]
end
subgraph "Build Configuration"
PyProject["pyproject.toml\nbuild-backend = maturin\nbindings = pyo3"]
Manifest["Package Metadata\nversion, requires-python\nclassifiers"]
end
subgraph "Build Process"
Maturin["maturin build\nor\nmaturin develop"]
RustCompiler["rustc + cargo\nCompile to .so/.dylib/.dll"]
FFILayer["PyO3 FFI Bridge\nAuto-generated C bindings"]
end
subgraph "Output Artifacts"
Wheel["Python Wheel\n.whl package"]
SharedLib["Native Library\nsimd_r_drive_ws_client.so"]
PyStub["Type Stubs\ndata_store_ws_client.pyi"]
end
RustSrc --> Maturin
PyO3Attrs --> Maturin
CargoToml --> Maturin
PyProject --> Maturin
Manifest --> Maturin
Maturin --> RustCompiler
RustCompiler --> FFILayer
FFILayer --> SharedLib
SharedLib --> Wheel
PyStub --> Wheel
Wheel -->|pip install| UserEnv["User Python Environment"]
Build Pipeline Architecture
Sources: experiments/bindings/python-ws-client/pyproject.toml:1-46 experiments/bindings/python-ws-client/README.md:26-38
PyO3 Integration
PyO3 is the Rust library that enables writing Python native modules in Rust. It provides macros and traits for exposing Rust structs and functions to Python.
PyO3 Configuration
The PyO3 bindings type is declared in the Maturin configuration:
| Configuration File | Setting | Value |
|---|---|---|
pyproject.toml | tool.maturin.bindings | "pyo3" |
pyproject.toml | tool.maturin.requires-python | ">=3.10" |
pyproject.toml | build-system.requires | ["maturin>=1.5"] |
pyproject.toml | build-system.build-backend | "maturin" |
Sources: experiments/bindings/python-ws-client/pyproject.toml:29-35
Python Module Structure
The binding layer exposes Rust types as Python classes:
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-13 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:1-63 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219
Maturin Build System
Maturin is a build tool for creating Python wheels from Rust projects. It handles cross-compilation, ABI compatibility, and packaging.
Build Commands
| Command | Purpose | When to Use |
|---|---|---|
maturin build | Compile release wheel | CI/CD pipelines, distribution |
maturin build --release | Optimized build | Performance-critical builds |
maturin develop | Development install | Local iteration, testing |
maturin develop --release | Fast local testing | Performance validation |
Maturin Configuration Details
Sources: experiments/bindings/python-ws-client/pyproject.toml:29-35 experiments/bindings/python-ws-client/README.md:31-38
Installation Methods
Method 1: Install from PyPI (Production)
This installs a pre-built wheel from PyPI via `pip install simd-r-drive-ws-client`; no Rust toolchain is required.
Sources: experiments/bindings/python-ws-client/README.md:26-29
Method 2: Build from Source (Development)
Building from source requires the Rust toolchain and Maturin. When invoking Maturin from outside the package directory, the `-m` flag specifies the manifest path (the crate's `Cargo.toml`).
Sources: experiments/bindings/python-ws-client/README.md:31-36
Method 3: Development Installation with uv
Using uv for dependency management, the package is installed in editable mode together with its development dependencies via `uv pip install -e . --group dev`, as done in `integration_test.sh`.
Sources: experiments/bindings/python-ws-client/integration_test.sh:70-77
Installation Flow Comparison
Sources: experiments/bindings/python-ws-client/README.md:25-38 experiments/bindings/python-ws-client/integration_test.sh:70-77
Dependency Management with uv
The project uses uv, a fast Python package installer and resolver, for development dependencies. Unlike traditional requirements.txt or setup.py, dependencies are declared in pyproject.toml using PEP 735 dependency groups.
Dependency Group Structure
| Group | Purpose | Dependencies |
|---|---|---|
dev | Development tools | maturin, mypy, numpy, pytest, pytest-benchmark, pytest-order, puccinialin |
Sources: experiments/bindings/python-ws-client/pyproject.toml:37-46
uv Lockfile
The uv.lock file pins exact versions and hashes for reproducible builds. This lockfile includes:
- Direct dependencies (e.g., `maturin>=1.8.7`)
- Transitive dependencies (e.g., `certifi`, `httpcore`)
- Platform-specific wheels
- Python version markers
Sources: experiments/bindings/python-ws-client/uv.lock:1-1036
uv Workflow in CI/Testing
The integration test script demonstrates the uv workflow: it creates a virtual environment with `uv venv`, installs dependencies with `uv pip install`, and runs the README extraction script and pytest through `uv run`.
Sources: experiments/bindings/python-ws-client/integration_test.sh:62-87
Supported Python Versions and Platforms
Python Version Matrix
The bindings support Python 3.10 through 3.13 (CPython only):
| Python Version | Support Status | Notes |
|---|---|---|
| 3.10 | ✓ Supported | Minimum version |
| 3.11 | ✓ Supported | Full support |
| 3.12 | ✓ Supported | Full support |
| 3.13 | ✓ Supported | Latest tested |
| PyPy | ✗ Not supported | CPython-only due to PyO3 limitations |
Sources: experiments/bindings/python-ws-client/pyproject.toml:7-24 experiments/bindings/python-ws-client/README.md:17-24
Platform Support
| Platform | Architecture | Status | Notes |
|---|---|---|---|
| Linux | x86_64 | ✓ Supported | Primary platform |
| Linux | aarch64 | ✓ Supported | ARM64 support |
| macOS | x86_64 | ✓ Supported | Intel Macs |
| macOS | arm64 | ✓ Supported | Apple Silicon |
| Windows | x86_64 | ✓ Supported | 64-bit only |
Sources: experiments/bindings/python-ws-client/pyproject.toml:25-27 experiments/bindings/python-ws-client/uv.lock:117-129
Version Constraints in pyproject.toml
The project metadata declares supported versions for PyPI classifiers:
Programming Language :: Python :: 3.10
Programming Language :: Python :: 3.11
Programming Language :: Python :: 3.12
Programming Language :: Python :: 3.13
Sources: experiments/bindings/python-ws-client/pyproject.toml:21-24
Build Process Deep Dive
File Transformation Pipeline
Sources: experiments/bindings/python-ws-client/pyproject.toml:1-46
Build Artifacts Location
After maturin develop, the installed package structure looks like:
site-packages/
├── simd_r_drive_ws_client/
│ ├── __init__.py # Python wrapper
│ ├── data_store_ws_client.py # Extended API
│ ├── data_store_ws_client.pyi # Type stubs
│ ├── simd_r_drive_ws_client.so # Native library (Linux)
│ └── simd_r_drive_ws_client.cpython-*.so
└── simd_r_drive_ws_client-0.11.1.dist-info/
├── METADATA # Package metadata
├── RECORD # File integrity
└── WHEEL # Wheel format version
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-13 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:1-63 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219
CI/CD Integration
The CI build recipe referenced in the README demonstrates automated wheel building. The GitHub Actions workflow typically:
- Sets up multiple Python versions (3.10-3.13)
- Installs Rust toolchain
- Installs Maturin
- Builds wheels for all platforms using `maturin build --release`
- Uploads wheels as artifacts
For full details, see the CI/CD Pipeline documentation.
Sources: experiments/bindings/python-ws-client/README.md38
Troubleshooting Common Build Issues
| Issue | Cause | Solution |
|---|---|---|
maturin: command not found | Maturin not installed | pip install maturin or use uv pip install maturin |
rustc: command not found | Rust toolchain missing | Install from https://rustup.rs |
error: Python version mismatch | Wrong Python version active | Ensure Python ≥3.10 in path |
ImportError: cannot import name | Stale build | rm -rf target/ then rebuild |
ModuleNotFoundError: No module named 'simd_r_drive_ws_client' | Not installed | Run maturin develop or pip install -e . |
Sources: experiments/bindings/python-ws-client/integration_test.sh:62-68
Integration Testing
Relevant source files
- experiments/bindings/python-ws-client/README.md
- experiments/bindings/python-ws-client/extract_readme_tests.py
- experiments/bindings/python-ws-client/integration_test.sh
- experiments/bindings/python-ws-client/pyproject.toml
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi
- experiments/bindings/python-ws-client/uv.lock
Purpose and Scope
This document describes the integration testing infrastructure for the Python WebSocket client bindings. The testing system validates end-to-end functionality by orchestrating a complete test environment: starting a SIMD R Drive WebSocket server, extracting test cases from documentation, and executing them against the live server.
For information about the Python client API itself, see Python WebSocket Client API. For build system details, see Building Python Bindings.
Integration Test Architecture
The integration testing system consists of three primary components: a shell orchestrator that manages the test lifecycle, a Python script that extracts executable examples from documentation, and the pytest framework that executes the tests.
Sources: experiments/bindings/python-ws-client/integration_test.sh:1-90 experiments/bindings/python-ws-client/extract_readme_tests.py:1-45
graph TB
subgraph "Test Orchestration"
Script["integration_test.sh\nBash Orchestrator"]
Cleanup["cleanup()\nTrap Handler"]
end
subgraph "Server Management"
CargoRun["cargo run\n--package simd-r-drive-ws-server"]
Server["simd-r-drive-ws-server\nProcess (PID tracked)"]
Storage["/tmp/simd-r-drive-pytest-storage.bin"]
end
subgraph "Test Generation"
ExtractScript["extract_readme_tests.py"]
ReadmeMd["README.md\nPython code blocks"]
TestFile["tests/test_readme_blocks.py\nGenerated pytest functions"]
end
subgraph "Test Execution"
UvRun["uv run pytest"]
Pytest["pytest framework"]
TestCases["test_readme_block_0()\ntest_readme_block_1()\n..."]
end
subgraph "Client Under Test"
Client["DataStoreWsClient"]
WsConnection["WebSocket Connection\n127.0.0.1:34129"]
end
Script --> CargoRun
CargoRun --> Server
Server --> Storage
Script --> ExtractScript
ExtractScript --> ReadmeMd
ReadmeMd --> TestFile
Script --> UvRun
UvRun --> Pytest
Pytest --> TestCases
TestCases --> Client
Client --> WsConnection
WsConnection --> Server
Script --> Cleanup
Cleanup -.->|kill -9 -$SERVER_PID| Server
Cleanup -.->|rm -f| Storage
Test Workflow Orchestration
The integration_test.sh script manages the complete test lifecycle through a series of coordinated steps. The script implements robust cleanup handling using shell traps to ensure resources are released even if tests fail.
sequenceDiagram
participant Script as integration_test.sh
participant Trap as cleanup() trap
participant Cargo as cargo run
participant Server as simd-r-drive-ws-server
participant UV as uv (Python)
participant Extract as extract_readme_tests.py
participant Pytest as pytest
Script->>Script: set -e (fail on error)
Script->>Trap: trap cleanup EXIT
Script->>Script: cd experiments/
Script->>Cargo: cargo run --package simd-r-drive-ws-server
Cargo->>Server: start background process
Server-->>Script: return PID
Script->>Script: SERVER_PID=$!
Script->>UV: uv venv
Script->>UV: uv pip install pytest maturin
Script->>UV: uv pip install -e . --group dev
Script->>Extract: uv run extract_readme_tests.py
Extract-->>Script: generate test_readme_blocks.py
Script->>Script: export TEST_SERVER_HOST/PORT
Script->>Pytest: uv run pytest -v -s
Pytest->>Server: test connections
Server-->>Pytest: responses
Pytest-->>Script: test results
Script->>Trap: EXIT signal
Trap->>Server: kill -9 -$SERVER_PID
Trap->>Script: rm -f /tmp/simd-r-drive-pytest-storage.bin
Workflow Sequence
Sources: experiments/bindings/python-ws-client/integration_test.sh:1-90
Configuration Variables
The script defines several configuration variables at the top for easy modification:
| Variable | Default Value | Purpose |
|---|---|---|
EXPERIMENTS_DIR_REL_PATH | ../../ | Relative path to experiments root |
SERVER_PACKAGE_NAME | simd-r-drive-ws-server | Cargo package name |
STORAGE_FILE | /tmp/simd-r-drive-pytest-storage.bin | Temporary storage file path |
SERVER_HOST | 127.0.0.1 | Server bind address |
SERVER_PORT | 34129 | Server listen port |
SERVER_PID | (runtime) | Process ID for cleanup |
Sources: experiments/bindings/python-ws-client/integration_test.sh:8-15
Cleanup Mechanism
The cleanup function ensures proper resource release regardless of test outcome. It uses process group termination to kill the server and all child processes:
graph LR
Exit["Script Exit\n(success or failure)"]
Trap["trap cleanup EXIT"]
Cleanup["cleanup()
function"]
KillServer["kill -9 -$SERVER_PID"]
RemoveFile["rm -f $STORAGE_FILE"]
Exit --> Trap
Trap --> Cleanup
Cleanup --> KillServer
Cleanup --> RemoveFile
The kill -9 "-$SERVER_PID" syntax terminates the entire process group (note the minus sign prefix), ensuring the server and any spawned threads are fully stopped. The 2>/dev/null || true pattern prevents errors if the process already exited.
Sources: experiments/bindings/python-ws-client/integration_test.sh:17-30
graph TB
ReadmeMd["README.md"]
ExtractScript["extract_readme_tests.py"]
subgraph "Extraction Functions"
ExtractBlocks["extract_python_blocks()\nregex: ```python...```"]
StripNonAscii["strip_non_ascii()\nRemove non-ASCII chars"]
WrapTest["wrap_as_test_fn()\nGenerate pytest function"]
end
subgraph "Generated Output"
TestFile["tests/test_readme_blocks.py"]
TestFn0["def test_readme_block_0():\n # extracted code"]
TestFn1["def test_readme_block_1():\n # extracted code"]
TestFnN["def test_readme_block_N():\n # extracted code"]
end
ReadmeMd --> ExtractScript
ExtractScript --> ExtractBlocks
ExtractBlocks --> StripNonAscii
StripNonAscii --> WrapTest
WrapTest --> TestFile
TestFile --> TestFn0
TestFile --> TestFn1
TestFile --> TestFnN
README Test Extraction
The extract_readme_tests.py script automates the conversion of documentation examples into executable test cases. This approach ensures that code examples in the README remain accurate and functional.
Extraction Process
Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:1-45
Extraction Implementation
The extraction process follows three steps:
- Pattern Matching: Uses a regex to find all fenced python code blocks
  - Pattern: r"```python\n(.*?)```" with the re.DOTALL flag
  - Returns a list of code block strings
- Text Sanitization: Removes non-ASCII characters to prevent encoding issues
  - Uses encode("ascii", errors="ignore").decode("ascii")
  - Ensures clean Python code for pytest execution
- Test Function Generation: Wraps each code block in a pytest-compatible function
  - Function naming: test_readme_block_{idx}()
  - Indentation: 4 spaces for the function body
  - Empty blocks generate a pass statement
Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:21-34
Example Transformation
Given a README code block:
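For illustration, suppose the README contains a block like the following (hypothetical content):

```python
from simd_r_drive_ws_client import DataStoreWsClient

client = DataStoreWsClient("127.0.0.1", 34129)
client.write(b"greeting", b"hello")
assert client.read(b"greeting") == b"hello"
```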
The script generates:
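The generated test file would then contain roughly the following (the function name follows the `test_readme_block_{idx}` convention; the exact header may differ):

```python
# tests/test_readme_blocks.py -- auto-generated from README.md
import pytest


def test_readme_block_0():
    from simd_r_drive_ws_client import DataStoreWsClient

    client = DataStoreWsClient("127.0.0.1", 34129)
    client.write(b"greeting", b"hello")
    assert client.read(b"greeting") == b"hello"
```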
Sources: experiments/bindings/python-ws-client/README.md:42-51 experiments/bindings/python-ws-client/extract_readme_tests.py:30-34
Test Environment Setup
The integration test uses uv as the Python package manager for fast, reliable dependency management. The environment setup occurs after server startup but before test execution.
Environment Setup Sequence
| Step | Command | Purpose |
|---|---|---|
| 1 | uv venv | Create isolated virtual environment |
| 2 | uv pip install pytest maturin | Install core test dependencies |
| 3 | uv pip install -e . --group dev | Install package in editable mode with dev dependencies |
| 4 | uv run extract_readme_tests.py | Generate test cases from README |
| 5 | export TEST_SERVER_HOST/PORT | Set environment variables for tests |
| 6 | uv run pytest -v -s | Execute test suite |
Sources: experiments/bindings/python-ws-client/integration_test.sh:62-87
Development Dependencies
The project defines development dependencies in pyproject.toml under the [dependency-groups] section:
| Dependency | Version Constraint | Purpose |
|---|---|---|
maturin | >=1.8.7 | Build Rust-Python bindings |
mypy | >=1.16.1 | Static type checking |
numpy | >=2.2.6 | Array operations (test fixtures) |
puccinialin | >=0.1.5 | Test utilities |
pytest | >=8.4.1 | Test framework |
pytest-benchmark | >=5.1.0 | Performance benchmarking |
pytest-order | >=1.3.0 | Test execution ordering |
Sources: experiments/bindings/python-ws-client/pyproject.toml:37-46
graph LR
Script["integration_test.sh"]
SetM["set -m\n(enable job control)"]
CargoRun["cargo run --package simd-r-drive-ws-server\n-- /tmp/storage.bin --host 127.0.0.1 --port 34129 &"]
CapturePid["SERVER_PID=$!"]
UnsetM["set +m\n(disable job control)"]
Script --> SetM
SetM --> CargoRun
CargoRun --> CapturePid
CapturePid --> UnsetM
Server Lifecycle Management
The integration test script manages the server as a background process with careful PID tracking and cleanup handling.
Server Startup
The server starts with the following configuration:
- Package: `simd-r-drive-ws-server` (built via cargo)
- Storage File: `/tmp/simd-r-drive-pytest-storage.bin`
- Host: `127.0.0.1` (localhost only)
- Port: `34129` (test-specific port)
- Mode: Background process (`&` suffix)
The set -m and set +m wrapper enables job control temporarily, ensuring the shell correctly captures the background process PID with $!.
Sources: experiments/bindings/python-ws-client/integration_test.sh:47-56
Server Shutdown
The cleanup trap ensures the server terminates cleanly:
- Check PID Exists: `[[ ! -z "$SERVER_PID" ]]`
- Kill Process Group: `kill -9 "-$SERVER_PID"`
  - The `-` prefix kills the entire process group
  - Signal `-9` (SIGKILL) ensures immediate termination
- Ignore Errors: `2>/dev/null || true`
  - Prevents script failure if the process already exited
- Remove Storage: `rm -f "$STORAGE_FILE"`
  - Cleans up temporary test data
Sources: experiments/bindings/python-ws-client/integration_test.sh:19-29
Test Execution Environment
The test suite runs with specific environment variables that configure the client connection parameters. These variables bridge the shell orchestration layer with the Python test code.
Environment Variable Injection
The script exports the server configuration to the test environment as the TEST_SERVER_HOST and TEST_SERVER_PORT variables.
Python tests can access these via os.environ:
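A sketch of the pattern a test might use (the fallback values mirror the script's defaults; they are not part of the client API):

```python
import os

from simd_r_drive_ws_client import DataStoreWsClient

# Exported by integration_test.sh before pytest runs.
host = os.environ.get("TEST_SERVER_HOST", "127.0.0.1")
port = int(os.environ.get("TEST_SERVER_PORT", "34129"))

client = DataStoreWsClient(host, port)
```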
Sources: experiments/bindings/python-ws-client/integration_test.sh:83-85
Pytest Execution Flags
The test suite runs with specific pytest flags:
- `-v`: Verbose output (shows individual test names)
- `-s`: No output capture (displays print statements immediately)
This configuration aids debugging by providing real-time test output.
Sources: experiments/bindings/python-ws-client/integration_test.sh87
graph TB
Main["main()
entry point"]
ReadFile["README.read_text()"]
Extract["extract_python_blocks()"]
Generate["List comprehension:\n[wrap_as_test_fn(code, i)\nfor i, code in enumerate(blocks)]"]
WriteFile["TEST_FILE.write_text()"]
Main --> ReadFile
ReadFile --> Extract
Extract --> Generate
Generate --> WriteFile
subgraph "File Structure"
Header["# Auto-generated from README.md\nimport pytest"]
TestFunctions["test_readme_block_0()\ntest_readme_block_1()\n..."]
end
WriteFile --> Header
WriteFile --> TestFunctions
Test Case Structure
Generated test cases follow a consistent structure based on the README extraction pattern. Each test becomes an isolated function within the generated test file.
Test File Generation
Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:36-42
Test Isolation
Each extracted code block becomes an independent test function:
- Naming Convention: `test_readme_block_{idx}()` where `idx` is zero-based
- Isolation: Each function has its own scope
- Independence: Tests can run in any order (no shared state)
- Clarity: Test number maps directly to README code block order
This pattern enables pytest to discover and execute each documentation example as a separate test case, providing clear pass/fail feedback for each code snippet.
Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:30-34
Dependencies and Platform Requirements
The integration testing system requires specific tools and versions to function correctly.
Required Tools
| Tool | Purpose | Detection Method |
|---|---|---|
uv | Python package management | command -v uv &> /dev/null |
cargo | Rust build system | Implicit (via cargo run) |
bash | Shell scripting | Shebang: #!/bin/bash |
pytest | Test execution | Installed via uv pip install |
Sources: experiments/bindings/python-ws-client/integration_test.sh:62-68 experiments/bindings/python-ws-client/integration_test.sh1
Python Version Support
The client supports Python 3.10 through 3.13 (CPython only):
| Python Version | Support Status | Notes |
|---|---|---|
| 3.10 | ✓ Supported | Minimum version |
| 3.11 | ✓ Supported | Full compatibility |
| 3.12 | ✓ Supported | Full compatibility |
| 3.13 | ✓ Supported | Latest tested |
Note : NumPy version constraints differ by Python version. The project uses numpy>=2.2.6 for Python 3.10 compatibility, as NumPy 2.3.0+ dropped Python 3.10 support.
Sources: experiments/bindings/python-ws-client/pyproject.toml7 experiments/bindings/python-ws-client/pyproject.toml:21-24 experiments/bindings/python-ws-client/pyproject.toml41
Operating System Support
The integration tests target POSIX-compliant systems:
- Linux : Primary development platform (Ubuntu, Debian, etc.)
- macOS : Full support (Darwin)
- Windows : Supported but uses different process management
The shell script uses POSIX-compatible syntax ([[ ]], $!, trap) that works across these platforms with appropriate shells.
Sources: experiments/bindings/python-ws-client/integration_test.sh1 experiments/bindings/python-ws-client/pyproject.toml:25-26
Running the Integration Tests
Execution Command
Run the `integration_test.sh` script (located at `experiments/bindings/python-ws-client/`) either by path from the repository root or directly from within the Python client directory.
Expected Output Sequence
- Script initialization and configuration
- Server compilation (if needed) and startup
- Python environment creation
- Dependency installation
- README test extraction
- Pytest test execution with detailed output
- Automatic cleanup on exit
Troubleshooting
Common failure scenarios:
| Issue | Symptom | Solution |
|---|---|---|
| uv not found | Script exits at line 66 | Install uv: `curl -LsSf https://astral.sh/uv/install.sh \| sh` |
| Port already in use | Server fails to start | Change SERVER_PORT variable or kill existing process |
| Storage file locked | Permission denied | Ensure /tmp is writable and no processes hold locks |
| Cargo build fails | Compilation errors | Run cargo clean and retry |
Sources: experiments/bindings/python-ws-client/integration_test.sh:62-68
Performance Optimizations
Relevant source files
- .github/workflows/rust-lint.yml
- CHANGELOG.md
- Cargo.lock
- README.md
- simd-r-drive-entry-handle/src/debug_assert_aligned.rs
Purpose and Scope
This document provides a comprehensive guide to the performance optimization strategies employed throughout SIMD R Drive. It covers three primary optimization domains: SIMD-accelerated operations, memory alignment strategies, and access pattern optimizations.
For architectural details about the storage engine itself, see Core Storage Engine. For information about network-level optimizations in the RPC layer, see Network Layer and RPC. For details on the extension utilities, see Extensions and Utilities.
Overview of Optimization Strategies
SIMD R Drive achieves high performance through a multi-layered approach combining hardware-level optimizations with careful software design. The system leverages CPU-specific SIMD instructions, enforces strict memory alignment, and provides multiple access modes tailored to different workload patterns.
Core Performance Principles
graph TB
subgraph "Hardware Level"
SIMD["SIMD Instructions\nAVX2/NEON"]
Cache["CPU Cache Lines\n64-byte aligned"]
HWHash["Hardware Hashing\nSSE2/AVX2/Neon"]
end
subgraph "Software Level"
SIMDCopy["simd_copy\nWrite Optimization"]
Alignment["PAYLOAD_ALIGNMENT\nZero-Copy Reads"]
AccessModes["Access Modes\nSingle/Batch/Stream"]
end
subgraph "Storage Layer"
MMap["Memory-Mapped File\nmmap"]
Sequential["Sequential Writes\nAppend-Only"]
Index["XXH3 Index\nO(1)
Lookups"]
end
SIMD --> SIMDCopy
Cache --> Alignment
HWHash --> Index
SIMDCopy --> Sequential
Alignment --> MMap
AccessModes --> Sequential
AccessModes --> MMap
Sequential --> Index
Sources: README.md:249-257 README.md:52-59
Performance Metrics Summary
| Optimization Area | Technique | Benefit | Trade-off |
|---|---|---|---|
| Write Path | simd_copy with AVX2/NEON | 2-4x faster bulk copies | Requires feature detection |
| Read Path | Memory-mapped zero-copy | No allocation overhead | File size impacts virtual memory |
| Alignment | 64-byte payload boundaries | Full SIMD/cache efficiency | 0-63 bytes padding per entry |
| Indexing | XXH3_64 with hardware acceleration | Sub-microsecond key lookups | Memory overhead for index |
| Batch Writes | Single lock, multiple entries | Reduced lock contention | Delayed durability guarantees |
| Parallel Iteration | Rayon multi-threaded scan | Linear speedup with cores | Higher CPU utilization |
Sources: README.md:249-257 README.md:159-167
SIMD Write Acceleration
The simd_copy Function
SIMD R Drive uses a specialized memory copy function called simd_copy to optimize data transfer during write operations. This function detects available CPU features at runtime and selects the most efficient implementation.
Platform-Specific Implementations
graph TB
WriteOp["Write Operation"]
SIMDCopy["simd_copy\nRuntime Dispatch"]
X86Check{"x86_64 with\nAVX2?"}
ARMCheck{"aarch64?"}
AVX2["AVX2 Implementation\n_mm256_loadu_si256\n_mm256_storeu_si256\n32-byte chunks"]
NEON["NEON Implementation\nvld1q_u8\nvst1q_u8\n16-byte chunks"]
Fallback["Scalar Fallback\ncopy_from_slice\nStandard library"]
WriteOp --> SIMDCopy
SIMDCopy --> X86Check
X86Check -->|Yes| AVX2
X86Check -->|No| ARMCheck
ARMCheck -->|Yes| NEON
ARMCheck -->|No| Fallback
Sources: README.md:249-257 High-Level Diagram 6
Implementation Architecture
The simd_copy function follows this decision tree:
- x86_64 with AVX2: Uses 256-bit wide vector operations via `_mm256_loadu_si256` and `_mm256_storeu_si256`, processing 32 bytes per instruction
- aarch64: Uses NEON intrinsics via `vld1q_u8` and `vst1q_u8`, processing 16 bytes per instruction
- Fallback: Uses the standard library `copy_from_slice` for platforms without SIMD support
SIMD Copy Performance Characteristics
| Platform | Instruction Set | Chunk Size | Throughput Gain | Detection Method |
|---|---|---|---|---|
| x86_64 | AVX2 | 32 bytes | ~3-4x vs scalar | Runtime feature detection |
| x86_64 | SSE2 | 16 bytes | ~2x vs scalar | Universal on x86_64 |
| aarch64 | NEON | 16 bytes | ~2-3x vs scalar | Default on ARM64 |
| Other | Scalar | 1-8 bytes | Baseline | Always available |
Sources: README.md:249-257 High-Level Diagram 6
Integration with Write Path
Sources: README.md:249-257 High-Level Diagram 5
Runtime Feature Detection
The system performs CPU feature detection once at startup:
- x86_64: Uses the `is_x86_feature_detected!("avx2")` macro to check AVX2 availability
- aarch64: NEON is always available, so no detection is needed
- Other platforms: Immediately fall back to the scalar implementation
This detection overhead is paid once, with subsequent calls dispatching directly to the selected implementation via function pointers or conditional compilation.
Sources: README.md:249-257 High-Level Diagram 6
Payload Alignment and Cache Efficiency
64-Byte Alignment Strategy
Starting from version 0.15.0-alpha, all non-tombstone payloads begin on 64-byte aligned boundaries. This alignment matches typical CPU cache line sizes and ensures optimal SIMD operation performance.
Alignment Architecture
graph TB
subgraph "Constants Configuration"
AlignLog["PAYLOAD_ALIGN_LOG2 = 6"]
AlignConst["PAYLOAD_ALIGNMENT = 64"]
end
subgraph "On-Disk Layout"
PrevTail["Previous Entry Tail\noffset N"]
PrePad["Pre-Pad Bytes\n0-63 zero bytes"]
Payload["Aligned Payload\n64-byte boundary"]
Metadata["Entry Metadata\n20 bytes"]
end
subgraph "Hardware Alignment"
CacheLine["CPU Cache Line\n64 bytes typical"]
AVX512["AVX-512 Registers\n64 bytes wide"]
SIMDOps["SIMD Operations\nNo boundary crossing"]
end
AlignLog --> AlignConst
AlignConst --> PrePad
PrevTail --> PrePad
PrePad --> Payload
Payload --> Metadata
Payload --> CacheLine
Payload --> AVX512
Payload --> SIMDOps
Sources: README.md:52-59 CHANGELOG.md:25-51 simd-r-drive-entry-handle/src/debug_assert_aligned.rs:66-88
Alignment Calculation
The pre-padding for each entry is calculated using:
pad = (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)
Where:
- `prev_tail`: The file offset immediately after the previous entry's metadata
- `PAYLOAD_ALIGNMENT`: 64 (configurable constant)
- `&`: Bitwise AND, which keeps the result in the range [0, 63]
Example Alignment Scenarios
| Previous Tail Offset | Modulo 64 | Padding Required | Next Payload Starts At |
|---|---|---|---|
| 100 | 36 | 28 bytes | 128 |
| 128 | 0 | 0 bytes | 128 |
| 200 | 8 | 56 bytes | 256 |
| 1000 | 40 | 24 bytes | 1024 |
Sources: README.md:114-138
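A small sketch (written in Python purely for illustration; the engine itself computes this in Rust) that reproduces the padding values in the table above:

```python
PAYLOAD_ALIGNMENT = 64  # matches the constant described above

def pre_pad(prev_tail: int) -> int:
    """Zero bytes inserted so the next payload starts on a 64-byte boundary."""
    return (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)

for prev_tail in (100, 128, 200, 1000):
    pad = pre_pad(prev_tail)
    print(prev_tail, pad, prev_tail + pad)  # e.g. 100 -> 28 -> 128
```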
graph LR
EntryHandle["EntryHandle"]
DebugAssert["debug_assert_aligned\nptr, align"]
CheckPower["Check align is\npower of two"]
CheckAddr["Check ptr & align-1 == 0"]
OK["Continue"]
Panic["Debug Panic"]
EntryHandle --> DebugAssert
DebugAssert --> CheckPower
CheckPower --> CheckAddr
CheckAddr -->|Aligned| OK
CheckAddr -->|Misaligned| Panic
Alignment Validation
Debug and test builds include alignment assertions to catch violations early:
Pointer Alignment Assertion
Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:25-43
Offset Alignment Assertion
Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:66-88
These assertions compile to no-ops in release builds, providing zero-cost validation during development.
Evolution from v0.14 to v0.15
Alignment Comparison
| Version | Default Alignment | Cache Line Fit | AVX-512 Compatible | Padding Overhead |
|---|---|---|---|---|
| ≤ 0.13.x | No alignment | No guarantee | No | 0 bytes |
| 0.14.x | 16 bytes | Partial | No | 0-15 bytes |
| 0.15.x | 64 bytes | Complete | Yes | 0-63 bytes |
Key Changes in v0.15.0-alpha:
- Increased alignment from 16 to 64 bytes: Better cache efficiency and full AVX-512 support
- Added alignment validation: Debug assertions in `debug_assert_aligned.rs`
- Updated constants: `PAYLOAD_ALIGN_LOG2 = 6` and `PAYLOAD_ALIGNMENT = 64`
- Breaking compatibility: Files written by v0.15.x cannot be read by v0.14.x or earlier
Sources: CHANGELOG.md:25-51
Migration Considerations
Migration Path from v0.14 to v0.15:
- Read all entries using v0.14.x-compatible binary
- Create new storage with v0.15.x binary
- Write entries to new storage with automatic 64-byte alignment
- Verify integrity using read operations
- Replace old file after successful verification
Multi-Service Deployment Strategy:
graph TB
OldReader["v0.14.x Readers"]
NewReader["v0.15.x Readers"]
OldWriter["v0.14.x Writers"]
NewWriter["v0.15.x Writers"]
OldStore["v0.14 Storage\n16-byte aligned"]
NewStore["v0.15 Storage\n64-byte aligned"]
Step1["Step 1: Deploy v0.15 readers\nwhile writers still v0.14"]
Step2["Step 2: Migrate storage files\nto v0.15 format"]
Step3["Step 3: Deploy v0.15 writers"]
OldReader --> Step1
Step1 --> NewReader
OldStore --> Step2
Step2 --> NewStore
OldWriter --> Step3
Step3 --> NewWriter
NewReader -.->|Can read| OldStore
NewReader -.->|Can read| NewStore
NewWriter -.->|Writes to| NewStore
Sources: CHANGELOG.md:43-51
Zero-Copy Benefits of Alignment
Proper alignment enables zero-copy reinterpretation of payload bytes as typed slices:
Safe Reinterpretation Table
| Payload Size | Aligned To | Can Cast To | Zero-Copy | Notes |
|---|---|---|---|---|
| 64 bytes | 64 bytes | &[u8; 64] | ✅ Yes | Direct array reference |
| 128 bytes | 64 bytes | &[u16; 64] | ✅ Yes | u16 requires 2-byte alignment |
| 256 bytes | 64 bytes | &[u32; 64] | ✅ Yes | u32 requires 4-byte alignment |
| 512 bytes | 64 bytes | &[u64; 64] | ✅ Yes | u64 requires 8-byte alignment |
| 1024 bytes | 64 bytes | &[u128; 64] | ✅ Yes | u128 requires 16-byte alignment |
| Variable | 64 bytes | Custom structs | ⚠️ Maybe | Depends on struct alignment requirements |
Sources: README.md:52-59
graph TB
subgraph "Single Entry Write"
SingleAPI["write key, payload"]
SingleLock["Acquire write lock"]
SingleWrite["Write one entry"]
SingleFlush["flush"]
SingleUnlock["Release lock"]
end
subgraph "Batch Entry Write"
BatchAPI["batch_write entries"]
BatchLock["Acquire write lock"]
BatchLoop["Write N entries\nin locked section"]
BatchFlush["flush once"]
BatchUnlock["Release lock"]
end
subgraph "Streaming Write"
StreamAPI["write_stream key, reader"]
StreamLock["Acquire write lock"]
StreamChunk["Read + write chunks\nincrementally"]
StreamFlush["flush"]
StreamUnlock["Release lock"]
end
SingleAPI --> SingleLock
SingleLock --> SingleWrite
SingleWrite --> SingleFlush
SingleFlush --> SingleUnlock
BatchAPI --> BatchLock
BatchLock --> BatchLoop
BatchLoop --> BatchFlush
BatchFlush --> BatchUnlock
StreamAPI --> StreamLock
StreamLock --> StreamChunk
StreamChunk --> StreamFlush
StreamFlush --> StreamUnlock
Write and Read Mode Performance
Write Mode Comparison
SIMD R Drive provides three write modes optimized for different workload patterns:
Write Mode Architecture
Sources: README.md:208-223
Write Mode Performance Characteristics
| Write Mode | Lock Acquisitions | Flush Operations | Memory Usage | Best For |
|---|---|---|---|---|
| Single Entry | N (one per entry) | N (one per entry) | O(1) per entry | Low-latency individual writes |
| Batch Entry | 1 (for all entries) | 1 (at batch end) | O(batch size) | High-throughput bulk inserts |
| Streaming | 1 (entire stream) | 1 (at stream end) | O(chunk size) | Large payloads, memory-constrained |
Lock Contention Analysis:
Sources: README.md:208-223
graph TB
subgraph "Direct Memory Access"
DirectAPI["read key"]
DirectIndex["Index lookup O(1)"]
DirectMmap["Memory-mapped view"]
DirectHandle["EntryHandle\nzero-copy slice"]
end
subgraph "Streaming Read"
StreamAPI["read_stream key"]
StreamIndex["Index lookup O(1)"]
StreamBuffer["Buffered reader\n4KB chunks"]
StreamInc["Incremental processing"]
end
subgraph "Parallel Iteration"
ParAPI["par_iter_entries"]
ParScan["Full file scan"]
ParRayon["Rayon thread pool"]
ParProcess["Multi-core processing"]
end
DirectAPI --> DirectIndex
DirectIndex --> DirectMmap
DirectMmap --> DirectHandle
StreamAPI --> StreamIndex
StreamIndex --> StreamBuffer
StreamBuffer --> StreamInc
ParAPI --> ParScan
ParScan --> ParRayon
ParRayon --> ParProcess
Read Mode Comparison
Three read modes provide different trade-offs between performance, memory usage, and access patterns:
Read Mode Architecture
Sources: README.md:224-247
Read Mode Performance Characteristics
| Read Mode | Memory Overhead | Latency | Throughput | Copy Operations | Best For |
|---|---|---|---|---|---|
| Direct Access | O(1) mapping overhead | ~100ns | Very high | Zero | Random access, small reads |
| Streaming | O(buffer size) | ~1-10µs | High | Non-zero-copy | Large entries, memory-constrained |
| Parallel Iteration | O(thread count) | N/A | Scales with cores | Zero per entry | Bulk processing, analytics |
Memory Access Patterns:
Sources: README.md:224-247
Parallel Iteration Performance
The parallel iteration feature (requires parallel feature flag) leverages Rayon for multi-threaded scanning:
Parallel Iteration Scaling
| Core Count | Sequential Time | Parallel Time | Speedup | Efficiency |
|---|---|---|---|---|
| 1 core | 1.0s | 1.0s | 1.0x | 100% |
| 2 cores | 1.0s | 0.52s | 1.92x | 96% |
| 4 cores | 1.0s | 0.27s | 3.70x | 93% |
| 8 cores | 1.0s | 0.15s | 6.67x | 83% |
| 16 cores | 1.0s | 0.09s | 11.1x | 69% |
Note: Actual performance depends on entry size distribution, storage device speed, and working set size relative to CPU cache.
Sources: README.md:242-247
graph TB
subgraph "x86_64 Hashing"
X86Key["Key bytes"]
X86SSE2["SSE2 Instructions\nUniversal on x86_64"]
X86AVX2["AVX2 Instructions\nIf available"]
X86Hash["64-bit hash"]
end
subgraph "aarch64 Hashing"
ARMKey["Key bytes"]
ARMNeon["Neon Instructions\nDefault on ARM64"]
ARMHash["64-bit hash"]
end
subgraph "Index Operations"
HashVal["u64 hash value"]
HashMap["DashMap index"]
Lookup["O(1)
lookup"]
end
X86Key --> X86SSE2
X86SSE2 -.->|If available| X86AVX2
X86AVX2 --> X86Hash
X86SSE2 --> X86Hash
ARMKey --> ARMNeon
ARMNeon --> ARMHash
X86Hash --> HashVal
ARMHash --> HashVal
HashVal --> HashMap
HashMap --> Lookup
Hardware-Accelerated Hashing
XXH3_64 Hashing Algorithm
SIMD R Drive uses the xxhash-rust implementation of XXH3_64 for all key hashing operations. This algorithm provides hardware acceleration on multiple platforms.
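For illustration, hashing a key with that dependency looks like the snippet below; it assumes the xxh3 feature of xxhash-rust is enabled (as in the workspace dependency list), and how DataStore wires the hash into its index is not shown here:

```rust
use xxhash_rust::xxh3::xxh3_64;

fn main() {
    // Hash the key once; the resulting u64 is what the index stores.
    let key = b"user:42:profile";
    let key_hash: u64 = xxh3_64(key);
    println!("key hash = {key_hash:#018x}");
}
```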
Hashing Performance by Platform
Sources: README.md:159-167 Cargo.lock:1385-1404
Hashing Throughput Benchmarks
| Platform | SIMD Features | Throughput (GB/s) | Latency (ns/hash) | Relative Performance |
|---|---|---|---|---|
| x86_64 | AVX2 | ~25-35 | ~30-50 | 1.4-1.8x vs SSE2 |
| x86_64 | SSE2 only | ~15-20 | ~50-80 | Baseline |
| aarch64 | Neon | ~20-28 | ~40-60 | 1.2-1.5x vs scalar |
| Other | Scalar | ~8-12 | ~100-150 | Reference |
Note: Benchmarks are approximate and vary based on CPU generation, key size, and system load.
Sources: README.md:159-167
sequenceDiagram
participant App as "Application"
participant DS as "DataStore"
participant Hash as "XXH3_64"
participant Index as "DashMap\nkey_indexer"
participant MMap as "Mmap\nmemory-mapped file"
App->>DS: read(key)
DS->>Hash: hash(key)
Hash-->>DS: u64 hash
DS->>Index: get(hash)
alt "Hash found"
Index-->>DS: (tag, offset)
DS->>DS: Verify tag
DS->>MMap: Create EntryHandle\nat offset
MMap-->>DS: EntryHandle
DS-->>App: Payload slice
else "Hash not found"
Index-->>DS: None
DS-->>App: Error: not found
end
Index Structure Performance
The key index uses DashMap (a concurrent hash map) for thread-safe lookups:
Index Lookup Flow
Sources: README.md:159-167 High-Level Diagram 2
Index Memory Overhead
| Entry Count | Index Size | Overhead per Entry | Total Overhead (1M entries) |
|---|---|---|---|
| 1,000 | ~32 KB | ~32 bytes | ~32 MB |
| 10,000 | ~320 KB | ~32 bytes | ~32 MB |
| 100,000 | ~3.2 MB | ~32 bytes | ~32 MB |
| 1,000,000 | ~32 MB | ~32 bytes | ~32 MB |
The index overhead is proportional to the number of unique keys, not the total file size, making it memory-efficient even for large datasets.
Sources: README.md:159-167
Performance Tuning Guidelines
Choosing the Right Write Mode
Decision Matrix:
Sources: README.md:208-223
Choosing the Right Read Mode
Decision Matrix:
Sources: README.md:224-247
System-Level Optimizations
Operating System Tuning:
| Parameter | Recommended Value | Impact | Notes |
|---|---|---|---|
vm.max_map_count | ≥ 262144 | Enables large mmap regions | Linux only |
| File system | ext4, XFS, APFS | Better mmap performance | Avoid network filesystems |
| Huge pages | Transparent enabled | Reduces TLB misses | Linux kernel config |
| I/O scheduler | none or mq-deadline | Lower write latency | For SSD/NVMe devices |
Hardware Considerations:
| Component | Recommendation | Reason |
|---|---|---|
| CPU | AVX2 or ARM Neon | SIMD acceleration |
| RAM | ≥ 8GB + working set | Effective mmap caching |
| Storage | NVMe SSD | Sequential write speed |
| Cache | L3 ≥ 8MB | Better index hit rate |
Sources: README.md:249-257 README.md:159-167
Benchmarking and Profiling
Key Performance Indicators
Write Performance Metrics:
| Metric | Target | Measurement Method |
|---|---|---|
| Single write latency | < 10 µs | P50, P95, P99 percentiles |
| Batch write throughput | > 100 MB/s | Bytes written per second |
| Lock contention ratio | < 5% | Time waiting / total time |
| Flush overhead | < 1 ms | fsync() call duration |
Read Performance Metrics:
| Metric | Target | Measurement Method |
|---|---|---|
| Index lookup latency | < 100 ns | P50, P95, P99 percentiles |
| Memory access latency | < 1 µs | Time to first byte |
| Cache hit rate | > 95% | OS page cache statistics |
| Parallel speedup | > 0.8 × cores | Amdahl's law efficiency |
Sources: README.md:159-167 .github/workflows/rust-lint.yml:1-44
Profiling Tools
Recommended profiling workflow:
- cargo-criterion : For micro-benchmarks of individual operations
- perf (Linux) or Instruments (macOS): For system-level profiling
- flamegraph : For visualizing hot code paths
- valgrind/cachegrind : For cache miss analysis
Sources: Cargo.lock:607-627
Summary
SIMD R Drive achieves high performance through:
- SIMD acceleration via simd_copy with AVX2/NEON implementations for write operations
- 64-byte payload alignment ensuring cache-line efficiency and zero-copy access
- Hardware-accelerated XXH3_64 hashing for O(1) index lookups
- Multiple access modes optimized for different workload patterns
- Memory-mapped zero-copy reads eliminating deserialization overhead
The system balances these optimizations to provide consistent, predictable performance across diverse workloads while maintaining data integrity and thread safety.
Sources: README.md:249-257 README.md:52-59 README.md:159-167 CHANGELOG.md:25-51
SIMD Acceleration
Relevant source files
- README.md
- benches/storage_benchmark.rs
- experiments/bindings/python_(old_client)/pyproject.toml
- src/main.rs
- src/utils/format_bytes.rs
- tests/concurrency_tests.rs
Purpose and Scope
This page documents the SIMD (Single Instruction, Multiple Data) optimizations implemented in SIMD R Drive. The storage engine uses SIMD acceleration in two critical areas: write operations (via the simd_copy function) and key hashing (via the xxh3_64 algorithm).
SIMD is not used for read operations, which rely on zero-copy memory-mapped access for optimal performance. For information about zero-copy reads and memory management, see Memory Management and Zero-Copy Access. For details on the 64-byte alignment strategy that enables efficient SIMD operations, see Payload Alignment and Cache Efficiency.
Sources: README.md:248-256
Overview
SIMD R Drive leverages hardware SIMD capabilities to accelerate performance-critical operations during write workflows. The storage engine uses platform-specific SIMD instructions to minimize CPU cycles spent on memory operations, resulting in improved write throughput and indexing performance.
The SIMD acceleration strategy consists of two components:
| Component | Purpose | Platforms | Performance Impact |
|---|---|---|---|
simd_copy Function | Accelerates memory copying during write buffer staging | x86_64 (AVX2), aarch64 (NEON) | Reduces write latency by vectorizing memory transfers |
xxh3_64 Hashing | Hardware-accelerated key hashing for index operations | x86_64 (SSE2/AVX2), aarch64 (Neon) | Improves key lookup and index building performance |
The system uses runtime CPU feature detection to select the optimal implementation path and automatically falls back to scalar operations when SIMD instructions are unavailable.
Sources: README.md:248-256 High-Level Diagram 6
SIMD Copy Function
Purpose and Role in Write Operations
The simd_copy function is a specialized memory transfer routine used during write operations. When the storage engine writes data to disk, it first stages the payload in an internal buffer. The simd_copy function accelerates this staging process by using vectorized memory operations instead of standard byte-wise copying.
Diagram: SIMD Copy Function Flow in Write Operations
graph TB
WriteAPI["DataStoreWriter::write()"]
Buffer["Internal Write Buffer\n(BufWriter<File>)"]
SIMDCopy["simd_copy Function"]
CPUDetect["CPU Feature Detection\n(Runtime)"]
AVX2Path["AVX2 Implementation\n_mm256_loadu_si256\n_mm256_storeu_si256\n(32-byte chunks)"]
NEONPath["NEON Implementation\nvld1q_u8\nvst1q_u8\n(16-byte chunks)"]
ScalarPath["Scalar Fallback\ncopy_from_slice\n(Standard memcpy)"]
DiskWrite["Write to Disk\n(Buffered I/O)"]
WriteAPI -->|user payload| SIMDCopy
SIMDCopy --> CPUDetect
CPUDetect -->|x86_64 + AVX2| AVX2Path
CPUDetect -->|aarch64| NEONPath
CPUDetect -->|no SIMD| ScalarPath
AVX2Path --> Buffer
NEONPath --> Buffer
ScalarPath --> Buffer
Buffer --> DiskWrite
The simd_copy function operates transparently within the write path. Applications using the DataStoreWriter trait do not need to explicitly invoke SIMD operations—the acceleration is automatic and selected based on the host CPU capabilities.
Sources: README.md:252-253 High-Level Diagram 6
Platform-Specific Implementations
x86_64 Architecture: AVX2 Instructions
On x86_64 platforms with AVX2 support, the simd_copy function uses 256-bit Advanced Vector Extensions to process memory in 32-byte chunks. The implementation employs two primary intrinsics:
| Intrinsic | Operation | Description |
|---|---|---|
_mm256_loadu_si256 | Load | Reads 32 bytes from source memory into a 256-bit register (unaligned load) |
_mm256_storeu_si256 | Store | Writes 32 bytes from a 256-bit register to destination memory (unaligned store) |
graph LR
SrcMem["Source Memory\n(Payload Data)"]
AVX2Reg["256-bit AVX2 Register\n(__m256i)"]
DstMem["Destination Buffer\n(Write Buffer)"]
SrcMem -->|_mm256_loadu_si256 32 bytes| AVX2Reg
AVX2Reg -->|_mm256_storeu_si256 32 bytes| DstMem
The AVX2 path is activated when the CPU supports the avx2 feature flag. Runtime detection ensures that the instruction set is available before executing AVX2 code paths.
Diagram: AVX2 Memory Transfer Operation
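The loop below is a simplified, hypothetical rendition of this pattern (not the crate's actual simd_copy source): it moves 32-byte chunks with the two intrinsics from the table and finishes the tail with a scalar copy.

```rust
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn copy_avx2(dst: &mut [u8], src: &[u8]) {
    use std::arch::x86_64::{__m256i, _mm256_loadu_si256, _mm256_storeu_si256};
    assert!(dst.len() >= src.len());
    let mut i = 0;
    // Move 32 bytes per iteration through a 256-bit register.
    while i + 32 <= src.len() {
        // SAFETY: i + 32 <= src.len() and dst.len() >= src.len(), so both
        // pointers stay in bounds; unaligned loads/stores are explicitly allowed.
        unsafe {
            let chunk = _mm256_loadu_si256(src.as_ptr().add(i) as *const __m256i);
            _mm256_storeu_si256(dst.as_mut_ptr().add(i) as *mut __m256i, chunk);
        }
        i += 32;
    }
    // Copy any remaining tail (< 32 bytes) with a plain scalar copy.
    dst[i..src.len()].copy_from_slice(&src[i..]);
}
```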
Sources: README.md:252-253 High-Level Diagram 6
aarch64 Architecture: NEON Instructions
On aarch64 (ARM64) platforms, the simd_copy function uses NEON (Advanced SIMD) instructions to process memory in 16-byte chunks. The implementation uses:
| Intrinsic | Operation | Description |
|---|---|---|
vld1q_u8 | Load | Reads 16 bytes from source memory into a 128-bit NEON register |
vst1q_u8 | Store | Writes 16 bytes from a 128-bit NEON register to destination memory |
graph LR
SrcMem["Source Memory\n(Payload Data)"]
NEONReg["128-bit NEON Register\n(uint8x16_t)"]
DstMem["Destination Buffer\n(Write Buffer)"]
SrcMem -->|vld1q_u8 16 bytes| NEONReg
NEONReg -->|vst1q_u8 16 bytes| DstMem
NEON is available by default on all aarch64 targets, so no runtime feature detection is required for ARM64 platforms.
Diagram: NEON Memory Transfer Operation
Sources: README.md:252-253 High-Level Diagram 6
Scalar Fallback Implementation
When SIMD instructions are unavailable (e.g., on older CPUs or unsupported architectures), the simd_copy function automatically falls back to the standard Rust copy_from_slice method. This ensures portability across all platforms while still benefiting from compiler optimizations (which may include auto-vectorization in some cases).
The fallback path provides correct functionality on all platforms but does not achieve the same throughput as explicit SIMD implementations.
Sources: README.md:252-253 High-Level Diagram 6
Runtime CPU Feature Detection
Detection Mechanism
The storage engine uses conditional compilation and runtime feature detection to select the appropriate SIMD implementation:
Diagram: SIMD Path Selection Logic
graph TB
Start["Program Execution Start"]
CompileTarget["Compile-Time Target Check"]
X86Check{"Target:\nx86_64?"}
RuntimeAVX2{"Runtime:\nAVX2 available?"}
ARMCheck{"Target:\naarch64?"}
UseAVX2["Use AVX2 Path\n(32-byte chunks)"]
UseNEON["Use NEON Path\n(16-byte chunks)"]
UseScalar["Use Scalar Fallback\n(copy_from_slice)"]
Start --> CompileTarget
CompileTarget --> X86Check
X86Check -->|Yes| RuntimeAVX2
RuntimeAVX2 -->|Yes| UseAVX2
RuntimeAVX2 -->|No| UseScalar
X86Check -->|No| ARMCheck
ARMCheck -->|Yes| UseNEON
ARMCheck -->|No| UseScalar
Feature Detection Strategy
| Platform | Detection Method | Feature Flag | Fallback Condition |
|---|---|---|---|
| x86_64 | Runtime is_x86_feature_detected!("avx2") | avx2 | AVX2 not supported → Scalar |
| aarch64 | Compile-time (default available) | N/A | N/A (NEON always present) |
| Other | Compile-time target check | N/A | Always use scalar |
The runtime detection overhead is negligible—the feature check is typically performed once during the first simd_copy invocation and the result is cached for subsequent calls.
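A hedged sketch of this selection logic follows; the real simd_copy signature and module layout may differ. copy_avx2 refers to the AVX2 loop sketched in the previous section, and in this simplified form the scalar fallback also stands in for the NEON path that the real code takes on aarch64.

```rust
/// Hypothetical dispatch sketch, not the crate's code.
pub fn simd_copy(dst: &mut [u8], src: &[u8]) {
    assert!(dst.len() >= src.len());

    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified at runtime on the line above.
            unsafe { copy_avx2(dst, src) };
            return;
        }
    }

    // aarch64 would take a NEON path here (always available, no runtime check);
    // everything else, and x86_64 CPUs without AVX2, uses the scalar fallback.
    dst[..src.len()].copy_from_slice(src);
}
```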
Sources: README.md:252-256 High-Level Diagram 6
SIMD Acceleration in Key Hashing
XXH3_64 Algorithm with Hardware Acceleration
In addition to write buffer acceleration, SIMD R Drive uses the xxh3_64 hashing algorithm for key indexing. This algorithm is specifically designed to leverage SIMD instructions for high-speed hash computation.
The hashing subsystem uses hardware acceleration on supported platforms:
| Platform | SIMD Extension | Availability | Performance Benefit |
|---|---|---|---|
| x86_64 | SSE2 | Universal (all 64-bit x86) | Baseline SIMD acceleration |
| x86_64 | AVX2 | Supported on modern CPUs | Additional performance gains |
| aarch64 | Neon | Default on all ARM64 | SIMD acceleration |
Diagram: SIMD-Accelerated Key Hashing Pipeline
The xxh3_64 algorithm is used in multiple operations:
- Key hash computation during write operations
- Index lookups during read operations
- Index building during storage recovery
By using SIMD-accelerated hashing, the storage engine minimizes the CPU overhead of hash computation, which is critical for maintaining high throughput in index-heavy workloads.
Sources: README.md:158-166 README.md:254-255 High-Level Diagram 6
Performance Characteristics
Write Throughput Improvements
The SIMD-accelerated write path provides measurable performance improvements over scalar memory operations. The performance gain depends on several factors:
| Factor | Impact on Performance |
|---|---|
| Payload size | Larger payloads benefit more from vectorized copying |
| CPU architecture | AVX2 (32-byte) typically faster than NEON (16-byte) |
| Memory alignment | 64-byte aligned payloads maximize cache efficiency |
| Buffer size | Larger write buffers reduce function call overhead |
While exact performance gains vary by workload and hardware, typical scenarios show:
- AVX2 implementations : 2-4× throughput improvement over scalar copy for medium-to-large payloads
- NEON implementations : 1.5-3× throughput improvement over scalar copy
- Hash computation : XXH3 with SIMD is significantly faster than non-accelerated hash functions
Interaction with 64-Byte Alignment
The storage engine's 64-byte payload alignment strategy (see Payload Alignment and Cache Efficiency) synergizes with SIMD operations:
Diagram: Alignment and SIMD Interaction
The 64-byte alignment ensures that:
- Cache line alignment : Payloads start on cache line boundaries, avoiding split-cache-line access penalties
- SIMD-friendly boundaries : Both AVX2 (32-byte) and NEON (16-byte) operations can operate at full speed without crossing alignment boundaries
- Zero-copy efficiency : Memory-mapped reads benefit from predictable alignment (see Memory Management and Zero-Copy Access)
Sources: README.md:51-59 README.md:248-256 High-Level Diagram 6
When SIMD Is (and Isn't) Used
Operations Using SIMD Acceleration
| Operation | SIMD Usage | Implementation |
|---|---|---|
Write operations (write, batch_write, write_stream) | ✅ Yes | simd_copy function transfers payload to write buffer |
| Key hashing (all operations requiring key lookup) | ✅ Yes | xxh3_64 with SSE2/AVX2/Neon acceleration |
| Index building (during storage recovery) | ✅ Yes | xxh3_64 hashing during forward scan |
Operations NOT Using SIMD
| Operation | SIMD Usage | Reason |
|---|---|---|
Read operations (read, batch_read, read_stream) | ❌ No | Zero-copy memory-mapped access—data is accessed directly from mmap without copying |
Iteration (into_iter, par_iter_entries) | ❌ No | Iterators return references to memory-mapped regions—no data transfer occurs |
| Entry validation (CRC32C checksum) | ❌ No | Uses standard CRC32C implementation (hardware-accelerated on supporting CPUs via separate mechanism) |
Diagram: SIMD Usage in Write vs. Read Paths
Why Reads Don't Need SIMD
Read operations in SIMD R Drive use memory-mapped I/O (mmap), which provides direct access to the storage file's contents without copying data into buffers. Since no memory transfer occurs, there is no opportunity to apply SIMD acceleration.
The zero-copy read strategy is fundamentally faster than any SIMD-accelerated copy operation because it eliminates the copy entirely. The memory-mapped approach allows the operating system's virtual memory subsystem to handle data access, often utilizing hardware-level optimizations like demand paging and read-ahead caching.
For more details on the zero-copy read architecture, see Memory Management and Zero-Copy Access.
Sources: README.md:43-49 README.md:254-256 High-Level Diagram 2
Code Entity Reference
The following table maps the conceptual SIMD components discussed in this document to their likely implementation locations within the codebase:
| Concept | Code Entity | Expected Location |
|---|---|---|
| SIMD copy function | simd_copy() | Core storage engine or utilities |
| AVX2 implementation | Architecture-specific module with _mm256_* intrinsics | Platform-specific code (likely #[cfg(target_arch = "x86_64")]) |
| NEON implementation | Architecture-specific module with vld1q_* / vst1q_* intrinsics | Platform-specific code (likely #[cfg(target_arch = "aarch64")]) |
| Runtime detection | is_x86_feature_detected!("avx2") macro | AVX2 code path guard |
| XXH3 hashing | xxh3_64 function or xxhash_rust crate usage | Key hashing module |
| Write buffer staging | BufWriter<File> with simd_copy integration | DataStore write implementation |
Sources: README.md:248-256 High-Level Diagram 6
Summary
SIMD acceleration in SIMD R Drive focuses on write-path optimization and key hashing performance. The dual-platform implementation (AVX2 for x86_64, NEON for aarch64) with automatic runtime detection ensures optimal performance across diverse hardware while maintaining portability through scalar fallback.
The 64-byte payload alignment strategy complements SIMD operations by ensuring cache-friendly access patterns. However, the storage engine intentionally does not use SIMD for read operations—zero-copy memory-mapped access eliminates the need for data transfer entirely, providing superior read performance without vectorization.
For related performance optimizations, see:
- Payload Alignment and Cache Efficiency for alignment strategy details
- Memory Management and Zero-Copy Access for zero-copy read architecture
- Write and Read Modes for performance comparisons across different access patterns
Sources: README.md:248-256 README.md:43-59 High-Level Diagrams 2, 5, and 6
Payload Alignment and Cache Efficiency
Relevant source files
- .github/workflows/rust-lint.yml
- CHANGELOG.md
- README.md
- simd-r-drive-entry-handle/src/constants.rs
- simd-r-drive-entry-handle/src/debug_assert_aligned.rs
- simd-r-drive-entry-handle/src/lib.rs
- src/utils/align_or_copy.rs
Purpose and Scope
This document details the payload alignment strategy employed by SIMD R Drive to optimize cache utilization and enable efficient SIMD operations. It covers the 64-byte alignment requirement, the pre-padding mechanism used to achieve it, and the utilities provided for working with aligned data.
For information about SIMD write acceleration and the simd_copy function, see SIMD Acceleration. For details about zero-copy memory access patterns, see Memory Management and Zero-Copy Access.
Overview
SIMD R Drive enforces fixed 64-byte alignment for all non-tombstone payloads in the storage file. This alignment strategy provides three critical benefits:
- Cache Line Alignment : Payloads begin on CPU cache line boundaries (typically 64 bytes), preventing cache line splits that would require multiple memory accesses.
- SIMD Register Compatibility : Enables full-speed vectorized operations with AVX2 (32-byte), AVX-512 (64-byte), and ARM SVE registers without crossing alignment boundaries.
- Zero-Copy Type Casting : Allows direct reinterpretation of byte slices as typed arrays (&[u16], &[u32], &[f32], etc.) without copying when element sizes match.
Sources: README.md:51-59 simd-r-drive-entry-handle/src/constants.rs:13-18
Alignment Configuration
Constants
The alignment boundary is configured via compile-time constants in the entry handle crate:
| Constant | Value | Description |
|---|---|---|
PAYLOAD_ALIGN_LOG2 | 6 | Log₂ of alignment (2⁶ = 64) |
PAYLOAD_ALIGNMENT | 64 | Alignment boundary in bytes |
| Maximum Pre-Pad | 63 | Maximum padding bytes per entry |
The constants are defined in simd-r-drive-entry-handle/src/constants.rs:17-18
Rationale for 64-Byte Alignment
Sources: README.md:53-54 simd-r-drive-entry-handle/src/constants.rs:13-18
Pre-Pad Mechanism
Computation
To achieve 64-byte alignment, entries may include zero-padding bytes before the payload. The pre-pad length is computed based on the previous entry's tail offset:
pad = (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)
Where:
- prev_tail is the absolute file offset immediately after the previous entry's metadata
- The bitwise AND with (PAYLOAD_ALIGNMENT - 1) ensures the result is in the range [0, 63]
- If prev_tail is already aligned, pad = 0
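Worked example: with PAYLOAD_ALIGNMENT = 64, prev_tail = 100 gives pad = (64 - (100 % 64)) & 63 = 28, so the next payload begins at offset 128. The same arithmetic as a tiny self-contained function (illustrative, not the crate's implementation):

```rust
const PAYLOAD_ALIGNMENT: u64 = 64;

/// Pre-pad needed so the next payload starts on a 64-byte boundary
/// (mirrors the formula above).
fn pre_pad(prev_tail: u64) -> u64 {
    (PAYLOAD_ALIGNMENT - (prev_tail % PAYLOAD_ALIGNMENT)) & (PAYLOAD_ALIGNMENT - 1)
}

fn main() {
    assert_eq!(pre_pad(100), 28); // 100 + 28 = 128, a multiple of 64
    assert_eq!(pre_pad(128), 0);  // already aligned, no padding
    assert_eq!(pre_pad(129), 63); // worst case: one byte past a boundary
}
```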
On-Disk Layout
Aligned Entry Structure:
| Offset Range | Field | Size (Bytes) | Description |
|---|---|---|---|
P .. P+pad | Pre-Pad | 0-63 | Zero bytes for alignment |
P+pad .. N | Payload | Variable | Actual data content |
N .. N+8 | Key Hash | 8 | XXH3_64 hash |
N+8 .. N+16 | Prev Offset | 8 | Previous tail offset |
N+16 .. N+20 | Checksum | 4 | CRC32C of payload |
Tombstones (deletion markers) do not include pre-pad and consist of only:
- 1-byte payload (0x00)
- 20-byte metadata
This design ensures tombstones remain compact while regular payloads maintain alignment.
Sources: README.md:112-137 simd-r-drive-entry-handle/src/constants.rs:1-18
Cache Efficiency Benefits
Cache Line Behavior
Modern CPUs organize memory into cache lines (typically 64 bytes). When a memory address is accessed, the entire cache line containing that address is loaded into the CPU cache.
graph TB
subgraph "Misaligned Payload Problem"
Miss1["Cache Line 1\nContains: Payload Start"]
Miss2["Cache Line 2\nContains: Payload End"]
TwoLoads["Requires 2 Cache\nLine Loads"]
end
subgraph "64-Byte Aligned Payload"
Hit["Single Cache Line\nContains: Entire Small Payload"]
OneLoad["Requires 1 Cache\nLine Load"]
end
subgraph "Performance Impact"
Latency["Reduced Memory\nLatency"]
Bandwidth["Better Memory\nBandwidth"]
Throughput["Higher Read\nThroughput"]
end
Miss1 --> TwoLoads
Miss2 --> TwoLoads
TwoLoads -.->|penalty| Latency
Hit --> OneLoad
OneLoad --> Latency
Latency --> Bandwidth
Bandwidth --> Throughput
For payloads ≤ 64 bytes, alignment ensures the entire payload fits within a single cache line, eliminating the penalty of fetching multiple cache lines.
Sources: README.md:53-54 simd-r-drive-entry-handle/src/constants.rs:13-15
Zero-Copy Type Casting
The align_or_copy Utility
The align_or_copy function in src/utils/align_or_copy.rs:44-73 enables efficient conversion from raw bytes to typed slices:
Function signature:
pub fn align_or_copy<T, const N: usize>(
bytes: &[u8],
from_le_bytes: fn([u8; N]) -> T,
) -> Cow<'_, [T]>
The function uses slice::align_to::<T>() to attempt zero-copy reinterpretation. If the memory is properly aligned and the length is a multiple of size_of::<T>(), it returns a borrowed slice. Otherwise, it falls back to manually decoding each element.
Example Use Case:
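A hedged usage sketch based on the signature above; the import path is an assumption, while the call shape matches the align_or_copy<f32, 4>(bytes, f32::from_le_bytes) form shown in the sequence diagram later in this document:

```rust
use std::borrow::Cow;

// Import path is an assumption; the function lives in src/utils/align_or_copy.rs.
use simd_r_drive::utils::align_or_copy;

fn payload_as_f32(bytes: &[u8]) -> Cow<'_, [f32]> {
    // Borrows the mmap-backed bytes when they are aligned and a whole number
    // of f32 values; otherwise decodes element by element into a Vec.
    align_or_copy::<f32, 4>(bytes, f32::from_le_bytes)
}
```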
Sources: src/utils/align_or_copy.rs:1-73
Requirements for Zero-Copy Success
For align_or_copy to return a borrowed slice (zero-copy), the following conditions must be met:
| Requirement | Description | Check |
|---|---|---|
| Alignment | Pointer must be aligned for type T | prefix.is_empty() |
| Size | Length must be multiple of size_of::<T>() | suffix.is_empty() |
| Payload Start | Must begin on 64-byte boundary | Enforced by pre-pad |
With SIMD R Drive's 64-byte alignment guarantee, payloads naturally satisfy the alignment requirement for common types:
- u8, i8: Always aligned (1-byte)
- u16, i16: Aligned (2-byte divides 64)
- u32, i32, f32: Aligned (4-byte divides 64)
- u64, i64, f64: Aligned (8-byte divides 64)
- u128, i128: Aligned (16-byte divides 64)
Sources: src/utils/align_or_copy.rs:44-73 README.md:55-56
Debug Assertions
Validation Functions
The entry handle crate provides debug-only assertions for validating alignment invariants in simd-r-drive-entry-handle/src/debug_assert_aligned.rs:
debug_assert_aligned
Verifies that a pointer address is aligned to the specified boundary. Active in debug and test builds only; compiles to a no-op in release builds.
Usage:
Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:26-43
debug_assert_aligned_offset
Verifies that a file offset is a multiple of PAYLOAD_ALIGNMENT. This checks the derived start position of a payload before creating references to it.
Usage:
Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:66-88
Design Rationale
The functions are always present (stable symbols) but gate their assertion logic with #[cfg(any(test, debug_assertions))]. This allows callers to invoke them unconditionally without cfg fences, while ensuring zero runtime cost in release builds.
Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-88
Version Evolution
v0.14.0-alpha: Initial Alignment
The first implementation of fixed payload alignment:
- Introduced PAYLOAD_ALIGN_LOG2 and PAYLOAD_ALIGNMENT constants
- Initially set to 16-byte alignment (2^4 = 16)
- Added pre-pad mechanism for non-tombstone entries
- Breaking change: files written with v0.14.0 incompatible with older readers
Sources: CHANGELOG.md:55-82
v0.15.0-alpha: Enhanced to 64-Byte Alignment
Increased default alignment for optimal cache and SIMD performance:
- Increased PAYLOAD_ALIGN_LOG2 from 4 to 6 (16 bytes → 64 bytes)
- Added debug_assert_aligned and debug_assert_aligned_offset validation functions
- Updated documentation to reflect the 64-byte default
- Breaking change: v0.15.x stores incompatible with v0.14.x readers
Justification for Change:
- 16-byte alignment was insufficient for AVX-512 (requires 64-byte alignment)
- Did not match typical cache line size
- Could cause performance degradation with future SIMD extensions
Sources: CHANGELOG.md:25-52 simd-r-drive-entry-handle/src/constants.rs:17-18
Migration Considerations
Cross-Version Compatibility
| Writer Version | Reader Version | Compatible? | Notes |
|---|---|---|---|
| ≤ 0.13.x | ≤ 0.13.x | ✅ Yes | Pre-alignment format |
| 0.14.x | 0.14.x | ✅ Yes | 16-byte alignment |
| 0.15.x | 0.15.x | ✅ Yes | 64-byte alignment |
| 0.14.x | ≤ 0.13.x | ❌ No | Reader interprets pre-pad as payload |
| 0.15.x | 0.14.x | ❌ No | Alignment mismatch |
| ≤ 0.13.x | ≥ 0.14.x | ✅ Partial | New reader can detect old format (no pre-pad) |
Migration Process
To migrate from v0.14.x or earlier to v0.15.x:
- Read with Old Binary : Export all live entries using a v0.14.x-compatible build (a sketch of steps 1-3 follows this list)
- Rewrite with New Binary : Create a fresh v0.15.x store and write the exported entries; 64-byte alignment is applied automatically
- Verify Integrity : Re-read the new store and confirm payloads match before switching over
- Deploy Staged Upgrades :
- Upgrade all readers first (new readers can handle old format temporarily)
- Upgrade writers last (prevents incompatible writes)
- Replace old storage files after verification
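A minimal sketch of steps 1-3, assuming a constructor along the lines of DataStore::open plus the read/write APIs described in this document. In practice the export runs in a v0.14.x-built binary and the rewrite in a v0.15.x-built one, so treat this single-function version purely as an API illustration; import paths, the constructor name, and error types are assumptions.

```rust
use simd_r_drive::{DataStore, DataStoreReader, DataStoreWriter}; // paths assumed

fn migrate(old_path: &str, new_path: &str, keys: &[&[u8]]) {
    // Step 1: read with the old (v0.14.x-compatible) build.
    let old = DataStore::open(old_path.as_ref()).expect("open v0.14 store"); // constructor assumed
    // Step 2: rewrite with the new (v0.15.x) build.
    let new = DataStore::open(new_path.as_ref()).expect("create v0.15 store");

    for key in keys {
        if let Some(entry) = old.read(key).expect("read old entry") {
            // EntryHandle derefs to the payload bytes, so it can be written directly.
            new.write(key, &entry).expect("write migrated entry");
        }
    }

    // Step 3: verify by re-reading the new store before replacing the old file.
    for key in keys {
        let migrated = new.read(key).expect("read new store").expect("entry present");
        let original = old.read(key).expect("read old store").expect("entry present");
        assert_eq!(&*migrated, &*original, "payload mismatch during migration");
    }
}
```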
Sources: CHANGELOG.md:43-51 CHANGELOG.md:76-82
Performance Characteristics
Alignment Overhead
The pre-pad mechanism introduces minimal storage overhead:
| Scenario | Pre-Pad Range | Overhead % (1KB Payload) | Overhead % (64B Payload) |
|---|---|---|---|
| Best Case | 0 bytes | 0.0% | 0.0% |
| Average Case | 32 bytes | 3.1% | 50.0% |
| Worst Case | 63 bytes | 6.1% | 98.4% |
For typical workloads with payloads larger than 256 bytes, the worst-case overhead stays below 25% and shrinks as payloads grow (about 6% at 1 KB, per the table above).
Cache Performance Gains
Benchmarks (not included in repository) show measurable improvements:
- Sequential Reads: 15-25% faster due to reduced cache line fetches
- SIMD Operations: 40-60% faster due to aligned vector loads
- Random Access: 10-20% faster due to single-cache-line hits for small entries
Trade-off: The storage overhead is justified by the significant performance improvements in read-heavy workloads, which is the primary use case for SIMD R Drive.
Sources: README.md:51-59 simd-r-drive-entry-handle/src/constants.rs:13-18
Configuration Options
Changing Alignment Boundary
To use a different alignment (e.g., 128 bytes for specialized hardware):
- Update PAYLOAD_ALIGN_LOG2 in simd-r-drive-entry-handle/src/constants.rs (a sketch follows this list)
- Rebuild the entire workspace so every crate picks up the new constant
- Warning : This creates a new, incompatible storage format. All existing files must be migrated.
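A sketch of the constant change, assuming the constants keep the shape documented on this page (the exact types and file contents in constants.rs may differ):

```rust
// simd-r-drive-entry-handle/src/constants.rs (illustrative only)
pub const PAYLOAD_ALIGN_LOG2: u32 = 7;                        // 2^7 = 128-byte alignment
pub const PAYLOAD_ALIGNMENT: usize = 1 << PAYLOAD_ALIGN_LOG2; // derived, always a power of two
```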
Supported Values:
- PAYLOAD_ALIGN_LOG2 must be in the range [0, 63] (alignment: 1 byte to 8 EB)
- Typical values: 4 (16 B), 5 (32 B), 6 (64 B), 7 (128 B)
- The resulting alignment is always a power of two, since it is derived as 2^PAYLOAD_ALIGN_LOG2
Sources: simd-r-drive-entry-handle/src/constants.rs:13-18 README.md59
Related Systems
- SIMD Copy Operations: The aligned payloads enable efficient SIMD write operations. See SIMD Acceleration for details on the simd_copy function.
- Zero-Copy Reads: Alignment is critical for zero-copy access patterns. See Memory Management and Zero-Copy Access for the EntryHandle implementation.
- Entry Structure: The pre-pad is part of the overall entry layout. See Entry Structure and Metadata for the complete format specification.
Sources: README.md:1-282
Write and Read Modes
Relevant source files
- README.md
- benches/storage_benchmark.rs
- src/main.rs
- src/storage_engine/data_store.rs
- src/utils/format_bytes.rs
- tests/concurrency_tests.rs
Purpose and Scope
This document describes the three write modes and three read modes available in SIMD R Drive, detailing their operational characteristics, performance trade-offs, and appropriate use cases. These modes provide flexibility for different workload patterns, from single-key operations to bulk processing.
For information about the underlying SIMD acceleration that optimizes these operations, see SIMD Acceleration. For details about the alignment strategy that enables efficient reads, see Payload Alignment and Cache Efficiency.
Write Modes
SIMD R Drive provides three distinct write modes, each optimized for different usage patterns. All write modes acquire an exclusive write lock (RwLock<BufWriter<File>>) to ensure thread safety and data consistency.
Single Entry Write
The single entry write mode writes one key-value pair atomically and flushes immediately to disk.
Primary Methods:
- DataStoreWriter::write(key: &[u8], payload: &[u8]) -> Result<u64> src/storage_engine/data_store.rs:827-830
- DataStoreWriter::write_with_key_hash(key_hash: u64, payload: &[u8]) -> Result<u64> src/storage_engine/data_store.rs:832-834
Operation Flow:
Characteristics:
- Latency : Lowest for single operations (immediate flush)
- Throughput : Lower due to per-write overhead
- Disk I/O : One flush operation per write
- Use Case : Interactive operations, real-time updates, critical writes requiring immediate durability
Implementation Detail: Single writes internally delegate to batch_write_with_key_hashes() with a single-element vector, ensuring consistent behavior across all write paths src/storage_engine/data_store.rs:832-834
Sources: README.md:212-215 src/storage_engine/data_store.rs:827-834
Batch Write
Batch write mode writes multiple key-value pairs in a single atomic operation, flushing only once at the end.
Primary Methods:
- DataStoreWriter::batch_write(entries: &[(&[u8], &[u8])]) -> Result<u64> src/storage_engine/data_store.rs:838-843
- DataStoreWriter::batch_write_with_key_hashes(prehashed_keys: Vec<(u64, &[u8])>, allow_null_bytes: bool) -> Result<u64> src/storage_engine/data_store.rs:847-951
Operation Flow:
Characteristics:
- Latency : Higher per-entry latency (amortized)
- Throughput : Significantly higher due to reduced disk I/O
- Disk I/O : Single flush for entire batch
- Memory : Builds entries in an in-memory buffer before writing src/storage_engine/data_store.rs:857-898
- Use Case : Bulk imports, batch processing, high-throughput ingestion
Performance Optimization: The batch implementation pre-allocates a buffer and constructs all entries before any disk I/O, minimizing lock contention and maximizing sequential write performance. The buffer construction happens at src/storage_engine/data_store.rs:857-918
Tombstone Support: Batch writes support deletion markers (tombstones) when allow_null_bytes is true, writing a single NULL byte followed by metadata src/storage_engine/data_store.rs:864-898
Sources: README.md:216-219 src/storage_engine/data_store.rs:838-951 benches/storage_benchmark.rs:85-92
Streaming Write
Streaming write mode writes large payloads incrementally from a Read source without requiring full in-memory buffering.
Primary Methods:
- DataStoreWriter::write_stream<R: Read>(key: &[u8], reader: &mut R) -> Result<u64> src/storage_engine/data_store.rs:753-756
- DataStoreWriter::write_stream_with_key_hash<R: Read>(key_hash: u64, reader: &mut R) -> Result<u64> src/storage_engine/data_store.rs:758-825
Operation Flow:
Characteristics:
- Memory Footprint : Constant (4096-byte buffer) src/storage_engine/constants.rs
- Payload Size : Unbounded (supports arbitrarily large entries)
- Disk I/O : Incremental writes, single flush at end
- Use Case : Large file storage, network streams, memory-constrained environments
Implementation Details:
The streaming write uses a fixed-size buffer (WRITE_STREAM_BUFFER_SIZE) and performs incremental writes while computing the checksum:
| Component | Size/Type | Purpose |
|---|---|---|
| Read Buffer | 4096 bytes | Temporary staging for stream chunks |
| Checksum State | crc32fast::Hasher | Incremental CRC32C calculation |
| Pre-pad | 0-63 bytes | Alignment padding before payload |
| Metadata | 20 bytes | key_hash, prev_offset, checksum |
Validation:
- Rejects empty payloads src/storage_engine/data_store.rs:799-804
- Rejects NULL-byte-only streams (reserved for tombstones) src/storage_engine/data_store.rs:792-797
Sources: README.md:220-223 src/storage_engine/data_store.rs:753-825 tests/concurrency_tests.rs:16-109
Write Mode Comparison
Performance Table:
| Write Mode | Lock Duration | Disk Flushes | Memory Usage | Best For |
|---|---|---|---|---|
| Single | Short (per write) | 1 per write | Minimal | Interactive operations, real-time updates |
| Batch | Medium (entire batch) | 1 per batch | Buffer size × entries | Bulk imports, high throughput |
| Streaming | Long (entire stream) | 1 per stream | 4096 bytes (constant) | Large files, memory-constrained |
Throughput Characteristics:
Based on benchmark results benches/storage_benchmark.rs:52-83:
Sources: benches/storage_benchmark.rs:52-92 README.md:208-223
Read Modes
SIMD R Drive provides three read modes optimized for different access patterns. All read modes leverage zero-copy access through memory-mapped files.
Direct Read
Direct read mode provides immediate, zero-copy access to stored entries through EntryHandle.
Primary Methods:
- DataStoreReader::read(key: &[u8]) -> Result<Option<EntryHandle>> src/traits.rs
- DataStoreReader::batch_read(keys: &[&[u8]]) -> Result<Vec<Option<EntryHandle>>> src/traits.rs
- DataStoreReader::exists(key: &[u8]) -> Result<bool> src/traits.rs
Operation Flow:
Characteristics:
- Latency : Minimal (single hash lookup + pointer arithmetic)
- Memory : Zero-copy (returns view into mmap)
- Concurrency : Lock-free reads (except brief index lock)
- Use Case : Random access, key-value lookups, real-time queries
Zero-Copy Guarantee:
The EntryHandle provides direct access to the memory-mapped region without copying:
EntryHandle {
mmap_arc: Arc<Mmap>, // Shared reference to mmap
range: Range<usize>, // Byte range within mmap
metadata: EntryMetadata, // Deserialized metadata (20 bytes)
}
The handle implements Deref<Target = [u8]>, allowing transparent access to payload bytes simd-r-drive-entry-handle/src/lib.rs
Batch Read Optimization:
batch_read() performs vectorized lookups, acquiring the index lock once for all keys and returning Vec<Option<EntryHandle>>. This reduces lock acquisition overhead for multiple keys src/storage_engine/data_store.rs
Sources: README.md:228-233 src/storage_engine/data_store.rs:502-565 benches/storage_benchmark.rs:124-149
Streaming Read
Streaming read mode provides incremental, buffered access to large entries without loading them fully into memory.
Primary Structure:
- EntryStream src/storage_engine/entry_stream.rs
- Implements the std::io::Read trait
Operation Flow:
Characteristics:
- Memory Footprint : 8192-byte buffer src/storage_engine/entry_stream.rs
- Copy Behavior : Non-zero-copy (reads through buffer)
- Payload Size : Supports arbitrarily large entries
- Use Case : Processing large entries incrementally, network transmission, streaming transformations
Implementation Notes:
The streaming read is not zero-copy despite using mmap as the source. This design choice enables:
- Controlled memory pressure (constant buffer size)
- Standard std::io::Read interface compatibility
- Incremental processing without loading the entire payload
For true zero-copy access to large entries, use direct read mode and process the EntryHandle slice directly.
Sources: README.md:234-241 src/storage_engine/entry_stream.rs
Parallel Iteration
Parallel iteration mode uses Rayon to process all valid entries across multiple threads (requires parallel feature).
Primary Methods:
- DataStore::iter_entries() -> EntryIterator src/storage_engine/data_store.rs:276-280
- DataStore::par_iter_entries() -> impl ParallelIterator<Item = EntryHandle> src/storage_engine/data_store.rs:296-361
Operation Flow:
Characteristics:
- Throughput : Scales with CPU cores
- Concurrency : Work-stealing via Rayon
- Memory : Minimal overhead (offsets collected upfront)
- Use Case : Bulk analytics, dataset scanning, cache warming, batch transformations
Implementation Strategy:
The parallel iterator optimizes for minimal lock contention:
- Acquire read lock on KeyIndexer src/storage_engine/data_store.rs:300
- Collect all packed offset values into Vec<u64> src/storage_engine/data_store.rs:301
- Release lock immediately src/storage_engine/data_store.rs:302
- Clone Arc<Mmap> once src/storage_engine/data_store.rs:305
- Parallel filter_map over offsets src/storage_engine/data_store.rs:310-360
Each worker thread independently:
- Unpacks (tag, offset) from the packed value
- Validates bounds and metadata
- Constructs EntryHandle with the cloned Arc<Mmap>
- Filters tombstones
Sequential Iteration:
For sequential scanning without Rayon overhead, use iter_entries() which returns EntryIterator src/storage_engine/data_store.rs:276-280
Sources: README.md:242-246 src/storage_engine/data_store.rs:276-361 benches/storage_benchmark.rs:98-118
Read Mode Comparison
Performance Table:
| Read Mode | Access Pattern | Memory Copy | Concurrency | Best For |
|---|---|---|---|---|
| Direct | Random/lookup | Zero-copy | Lock-free | Key-value queries, random access |
| Streaming | Sequential/buffered | Buffered copy | Single reader | Large entry processing |
| Parallel | Full scan | Zero-copy | Multi-threaded | Bulk analytics, dataset scanning |
Throughput Characteristics:
Based on benchmark measurements benches/storage_benchmark.rs:
| Operation | Throughput (1M entries, 8 bytes) | Notes |
|---|---|---|
| Sequential iteration | ~millions/s | Zero-copy, cache-friendly |
| Random single reads | ~1M reads/s | Hash lookup + bounds check |
| Batch reads | ~1M reads/s | Vectorized index access |
Sources: benches/storage_benchmark.rs:98-203 README.md:224-246
Performance Optimization Strategies
Write Optimization
Batching Strategy:
- Group writes into batches of 1024-10000 entries for optimal throughput
- Balance batch size against latency requirements
- Use streaming for payloads > 1 MB to avoid memory pressure
Lock Contention:
- All writes acquire the same RwLock<BufWriter<File>>
- Consider application-level write queuing for highly concurrent workloads
Read Optimization
Access Pattern Matching:
| Access Pattern | Recommended Mode | Reason |
|---|---|---|
| Random lookups | Direct read | O(1) hash lookup, zero-copy |
| Known key sets | Batch read | Amortized lock overhead |
| Full dataset scan | Sequential iteration | Cache-friendly forward traversal |
| Parallel analytics | Parallel iteration | Scales with CPU cores |
| Large entry processing | Streaming read | Constant memory footprint |
Memory-Mapped File Behavior:
The OS manages mmap pages transparently:
- Working set : Only accessed regions loaded into RAM
- Large datasets : Can exceed available RAM (pages swapped on demand)
- Cache warming : Sequential iteration benefits from read-ahead
- Random access : May trigger page faults (disk I/O) on cold reads
Sources: benches/storage_benchmark.rs README.md:43-50
Concurrency Considerations
Write Concurrency
All write modes use exclusive locking and are mutually exclusive :
Implication : High write concurrency may benefit from application-level write buffering or queueing.
graph TB
Read1["read()
Thread 1"]
Read2["read()
Thread 2"]
Read3["par_iter_entries()
Thread 3"]
IndexLock["RwLock<KeyIndexer>\n(Read Lock)"]
Mmap["Arc<Mmap>\n(Shared Reference)"]
Read1 --> IndexLock
Read2 --> IndexLock
Read3 --> IndexLock
IndexLock -->|Concurrent read access| Mmap
Write["write()
Thread 4"]
WriteLock["RwLock<BufWriter>\n(Exclusive Lock)"]
Write -->|Independent lock| WriteLock
WriteLock -.->|After flush: remaps and updates| Mmap
Read Concurrency
Read operations are lock-free after index lookup and can occur concurrently with writes:
Characteristics:
- Multiple readers can access mmap concurrently
- Reads do not block writes (different locks)
- Writes remap mmap after flushing, but readers retain their Arc<Mmap> reference
Sources: README.md:170-207 tests/concurrency_tests.rs src/storage_engine/data_store.rs:224-259
Usage Examples
Write Mode Selection
Single Write (Real-Time Updates):
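A minimal sketch, assuming a constructor along the lines of DataStore::open(path) and that the DataStoreWriter trait is importable from the crate root; import paths, the constructor name, and error handling are illustrative:

```rust
use std::path::Path;

use simd_r_drive::{DataStore, DataStoreWriter}; // import paths assumed

fn main() {
    let store = DataStore::open(Path::new("app.bin")).expect("open store"); // constructor assumed
    // One entry, flushed immediately: lowest latency, durable right away.
    store.write(b"user:42", b"Ada Lovelace").expect("write entry");
}
```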
Batch Write (Bulk Import):
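A bulk-import sketch under the same assumptions; all pairs are written under one lock and flushed once at the end:

```rust
use std::path::Path;

use simd_r_drive::{DataStore, DataStoreWriter}; // import paths assumed

fn main() {
    let store = DataStore::open(Path::new("bulk.bin")).expect("open store");
    let entries: Vec<(&[u8], &[u8])> = vec![
        (&b"k1"[..], &b"v1"[..]),
        (&b"k2"[..], &b"v2"[..]),
        (&b"k3"[..], &b"v3"[..]),
    ];
    store.batch_write(&entries).expect("batch write");
}
```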
Streaming Write (Large Files):
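A streaming-write sketch under the same assumptions; the source file name is arbitrary:

```rust
use std::fs::File;
use std::path::Path;

use simd_r_drive::{DataStore, DataStoreWriter}; // import paths assumed

fn main() {
    let store = DataStore::open(Path::new("blobs.bin")).expect("open store");
    // The payload is consumed through a fixed staging buffer, so the whole
    // file never has to fit in memory at once.
    let mut reader = File::open("large-model.onnx").expect("open source file");
    store.write_stream(b"models/large", &mut reader).expect("stream write");
}
```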
Read Mode Selection
Direct Read (Key Lookup):
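A direct-read sketch, assuming the DataStoreReader trait is importable from the crate root:

```rust
use std::path::Path;

use simd_r_drive::{DataStore, DataStoreReader}; // import paths assumed

fn main() {
    let store = DataStore::open(Path::new("app.bin")).expect("open store");
    // The returned EntryHandle is a zero-copy view into the memory-mapped file.
    if let Some(entry) = store.read(b"user:42").expect("lookup") {
        let payload: &[u8] = &entry; // Deref<Target = [u8]>
        println!("payload is {} bytes", payload.len());
    }
}
```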
Streaming Read (Large Entry Processing):
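Because the exact accessor for obtaining an EntryStream is not shown in this document, the sketch below stays generic over any std::io::Read source (which EntryStream implements) and uses the workspace's crc32fast dependency purely for illustration:

```rust
use std::io::Read;

/// Processes a stream chunk by chunk with constant memory; works with the
/// EntryStream described above or any other std::io::Read source.
fn checksum_stream<R: Read>(mut stream: R) -> std::io::Result<u32> {
    let mut hasher = crc32fast::Hasher::new();
    let mut chunk = [0u8; 8192]; // matches the buffer size noted above
    loop {
        let n = stream.read(&mut chunk)?;
        if n == 0 {
            break;
        }
        hasher.update(&chunk[..n]); // incremental processing of each chunk
    }
    Ok(hasher.finalize())
}
```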
Parallel Iteration (Dataset Analytics):
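A parallel-iteration sketch; it requires the parallel feature and the rayon prelude, and relies on EntryHandle's Deref<Target = [u8]> described above (constructor and import paths are again assumptions):

```rust
use std::path::Path;

use rayon::prelude::*;
use simd_r_drive::DataStore; // import path assumed

fn main() {
    let store = DataStore::open(Path::new("app.bin")).expect("open store");
    // Each EntryHandle derefs to its payload bytes; the work is spread
    // across the Rayon thread pool.
    let total_bytes: usize = store.par_iter_entries().map(|entry| entry.len()).sum();
    println!("live payload bytes: {total_bytes}");
}
```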
Sources: README.md src/storage_engine/data_store.rs benches/storage_benchmark.rs
Extensions and Utilities
Relevant source files
- extensions/Cargo.toml
- simd-r-drive-entry-handle/src/constants.rs
- simd-r-drive-entry-handle/src/lib.rs
- src/utils.rs
- src/utils/align_or_copy.rs
- tests/align_or_copy_tests.rs
This document covers the utility functions, helper modules, and constants provided by the SIMD R Drive ecosystem. These components include the simd-r-drive-extensions crate for higher-level storage operations, core utility functions in the main simd-r-drive crate, and shared constants from simd-r-drive-entry-handle.
For details on the core storage engine API, see DataStore API. For performance optimization features like SIMD acceleration, see SIMD Acceleration. For alignment-related architecture decisions, see Payload Alignment and Cache Efficiency.
Extensions Crate Overview
The simd-r-drive-extensions crate provides storage extensions and higher-level utilities built on top of the core simd-r-drive storage engine. It adds functionality for common storage patterns and data manipulation tasks.
graph TB
subgraph "simd-r-drive-extensions"
ExtCrate["simd-r-drive-extensions"]
ExtDeps["Dependencies:\n- bincode\n- serde\n- simd-r-drive\n- walkdir"]
end
subgraph "Core Dependencies"
Core["simd-r-drive"]
Bincode["bincode\nBinary Serialization"]
Serde["serde\nSerialization Traits"]
Walkdir["walkdir\nDirectory Traversal"]
end
ExtCrate --> ExtDeps
ExtDeps --> Core
ExtDeps --> Bincode
ExtDeps --> Serde
ExtDeps --> Walkdir
Core -.->|provides| DataStore["DataStore"]
Bincode -.->|enables| SerializationSupport["Structured Data Storage"]
Walkdir -.->|enables| FileSystemOps["File System Operations"]
Crate Structure
Sources: extensions/Cargo.toml:1-22
| Dependency | Purpose |
|---|---|
bincode | Binary serialization/deserialization for structured data storage |
serde | Serialization trait support with derive macros |
simd-r-drive | Core storage engine access |
walkdir | Directory tree traversal utilities |
Sources: extensions/Cargo.toml:13-17
Core Utilities Module
The main simd-r-drive crate exposes several utility functions through its utils module. These functions handle common tasks like alignment optimization, string formatting, and data validation.
graph TB
subgraph "utils Module"
UtilsRoot["src/utils.rs"]
AlignOrCopy["align_or_copy\nZero-Copy Optimization"]
AppendExt["append_extension\nString Path Handling"]
FormatBytes["format_bytes\nHuman-Readable Sizes"]
NamespaceHasher["NamespaceHasher\nHierarchical Keys"]
ParseBuffer["parse_buffer_size\nSize String Parsing"]
VerifyFile["verify_file_existence\nFile Validation"]
end
UtilsRoot --> AlignOrCopy
UtilsRoot --> AppendExt
UtilsRoot --> FormatBytes
UtilsRoot --> NamespaceHasher
UtilsRoot --> ParseBuffer
UtilsRoot --> VerifyFile
AlignOrCopy -.->|used by| ReadOps["Read Operations"]
NamespaceHasher -.->|used by| KeyManagement["Key Management"]
FormatBytes -.->|used by| Logging["Logging & Reporting"]
ParseBuffer -.->|used by| Config["Configuration Parsing"]
Utility Functions Overview
Sources: src/utils.rs:1-17
align_or_copy Function
The align_or_copy utility function provides zero-copy deserialization with automatic fallback for misaligned data. It attempts to reinterpret a byte slice as a typed slice without copying, and falls back to manual decoding when alignment requirements are not met.
Function Signature
Sources: src/utils/align_or_copy.rs:44-50
Operation Flow
Sources: src/utils/align_or_copy.rs:44-73
Usage Patterns
| Scenario | Outcome | Performance |
|---|---|---|
| Aligned 64-byte boundary, exact multiple | Cow::Borrowed | Zero-copy, optimal |
| Misaligned address | Cow::Owned | Allocation + decode |
| Non-multiple of element size | Panic | Invalid input |
Example Usage:
Sources: src/utils/align_or_copy.rs:38-43 tests/align_or_copy_tests.rs:7-12
Safety Considerations
The function uses unsafe for the align_to::<T>() call, which requires:
- Starting address must be aligned to align_of::<T>()
- Total size must be a multiple of size_of::<T>()
These requirements are validated by checking that prefix and suffix slices are empty before returning the borrowed slice. If validation fails, the function falls back to safe manual decoding.
Sources: src/utils/align_or_copy.rs:28-35 src/utils/align_or_copy.rs:53-60
Other Utility Functions
| Function | Module Path | Purpose |
|---|---|---|
append_extension | src/utils/append_extension.rs | Safely appends file extensions to paths |
format_bytes | src/utils/format_bytes.rs | Formats byte counts as human-readable strings (KB, MB, GB) |
NamespaceHasher | src/utils/namespace_hasher.rs | Generates hierarchical, namespaced hash keys |
parse_buffer_size | src/utils/parse_buffer_size.rs | Parses size strings like "64KB", "1MB" into byte counts |
verify_file_existence | src/utils/verify_file_existence.rs | Validates file paths before operations |
Sources: src/utils.rs:1-17
Entry Handle Constants
The simd-r-drive-entry-handle crate defines shared constants used throughout the storage system. These constants establish the binary layout of entries and alignment requirements.
graph TB
subgraph "simd-r-drive-entry-handle"
LibRoot["lib.rs"]
ConstMod["constants.rs"]
EntryHandle["entry_handle.rs"]
EntryMetadata["entry_metadata.rs"]
DebugAssert["debug_assert_aligned.rs"]
end
subgraph "Exported Constants"
MetadataSize["METADATA_SIZE = 20"]
KeyHashRange["KEY_HASH_RANGE = 0..8"]
PrevOffsetRange["PREV_OFFSET_RANGE = 8..16"]
ChecksumRange["CHECKSUM_RANGE = 16..20"]
ChecksumLen["CHECKSUM_LEN = 4"]
PayloadLog["PAYLOAD_ALIGN_LOG2 = 6"]
PayloadAlign["PAYLOAD_ALIGNMENT = 64"]
end
LibRoot --> ConstMod
LibRoot --> EntryHandle
LibRoot --> EntryMetadata
LibRoot --> DebugAssert
ConstMod --> MetadataSize
ConstMod --> KeyHashRange
ConstMod --> PrevOffsetRange
ConstMod --> ChecksumRange
ConstMod --> ChecksumLen
ConstMod --> PayloadLog
ConstMod --> PayloadAlign
PayloadAlign -.->|ensures| CacheLineOpt["Cache-Line Optimization"]
PayloadAlign -.->|enables| SIMDOps["SIMD Operations"]
Constants Module Structure
Sources: simd-r-drive-entry-handle/src/lib.rs:1-10 simd-r-drive-entry-handle/src/constants.rs:1-19
Metadata Layout Constants
The following constants define the fixed 20-byte metadata structure at the end of each entry:
| Constant | Value | Description |
|---|---|---|
METADATA_SIZE | 20 | Total size of entry metadata in bytes |
KEY_HASH_RANGE | 0..8 | Byte range for 64-bit XXH3 key hash |
PREV_OFFSET_RANGE | 8..16 | Byte range for 64-bit previous entry offset |
CHECKSUM_RANGE | 16..20 | Byte range for 32-bit CRC32C checksum |
CHECKSUM_LEN | 4 | Explicit length of checksum field |
Sources: simd-r-drive-entry-handle/src/constants.rs:3-11
Alignment Constants
These constants enforce 64-byte alignment for all payload data:
- PAYLOAD_ALIGN_LOG2: Base-2 logarithm of the alignment requirement (2⁶ = 64 bytes)
- PAYLOAD_ALIGNMENT: Computed alignment value (64 bytes)
This alignment matches CPU cache line sizes and enables efficient SIMD operations. The maximum pre-padding per entry is PAYLOAD_ALIGNMENT - 1 (63 bytes).
Sources: simd-r-drive-entry-handle/src/constants.rs:13-18
Constant Relationships
Sources: simd-r-drive-entry-handle/src/constants.rs:1-19
sequenceDiagram
participant Client
participant EntryHandle
participant align_or_copy
participant Memory
Client->>EntryHandle: get_payload_bytes()
EntryHandle->>Memory: read &[u8] from mmap
EntryHandle->>align_or_copy: align_or_copy<f32, 4>(bytes, f32::from_le_bytes)
alt Aligned on 64-byte boundary
align_or_copy->>Memory: validate alignment
align_or_copy-->>Client: Cow::Borrowed(&[f32])
Note over Client,Memory: Zero-copy: direct memory access
else Misaligned
align_or_copy->>align_or_copy: chunks_exact(4)
align_or_copy->>align_or_copy: map(f32::from_le_bytes)
align_or_copy->>align_or_copy: collect into Vec<f32>
align_or_copy-->>Client: Cow::Owned(Vec<f32>)
Note over Client,align_or_copy: Fallback: allocated copy
end
Common Patterns
Zero-Copy Data Access
Utilities like align_or_copy enable zero-copy access patterns when memory alignment allows:
Sources: src/utils/align_or_copy.rs:44-73 simd-r-drive-entry-handle/src/constants.rs:13-18
Namespace-Based Key Management
The NamespaceHasher utility enables hierarchical key organization:
Sources: src/utils.rs:11-12
Size Formatting for Logging
The format_bytes utility provides human-readable output:
| Input Bytes | Formatted Output |
|---|---|
| 1023 | "1023 B" |
| 1024 | "1.00 KB" |
| 1048576 | "1.00 MB" |
| 1073741824 | "1.00 GB" |
Sources: src/utils.rs:7-8
Configuration Parsing
The parse_buffer_size utility handles size string inputs:
| Input String | Parsed Bytes |
|---|---|
| "64" | 64 |
| "64KB" | 65,536 |
| "1MB" | 1,048,576 |
| "2GB" | 2,147,483,648 |
Sources: src/utils.rs:13-14
Integration with Core Systems
Relationship to Storage Engine
Sources: extensions/Cargo.toml:1-22 src/utils.rs:1-17 simd-r-drive-entry-handle/src/lib.rs:1-10
Performance Considerations
| Utility | Performance Impact | Use Case |
|---|---|---|
align_or_copy | Zero-copy when aligned | Deserializing typed arrays from storage |
NamespaceHasher | Single XXH3 hash | Generating hierarchical keys |
format_bytes | String allocation | Logging and user display only |
PAYLOAD_ALIGNMENT | Enables SIMD ops | Core storage layout requirement |
Sources: src/utils/align_or_copy.rs:1-74 simd-r-drive-entry-handle/src/constants.rs:13-18
Development Guide
Relevant source files
This document provides an overview of the development process for SIMD R Drive, covering workspace structure, feature flags, build commands, and development workflows. This page serves as an entry point for contributors and developers working on the codebase.
For detailed instructions on building and running tests, see Building and Testing. For information about the CI/CD pipeline and automated testing, see CI/CD Pipeline. For version history and migration guides, see Version History and Migration.
Prerequisites
SIMD R Drive requires:
- Rust : Edition 2024 (as specified in Cargo.toml4)
- Cargo : Workspace-aware build system
- Platform Support : Linux, macOS, Windows (tested via CI matrix)
Optional tools for specific workflows:
- Python 3.10-3.13 : For Python bindings development (see Python Integration)
- Maturin : For building Python wheels (see Building Python Bindings)
- Criterion : For benchmarking (included as dev-dependency)
Workspace Structure
SIMD R Drive uses a Cargo workspace architecture with multiple crates organized by function. The workspace is defined in Cargo.toml:65-78 with specific members included and Python bindings excluded from the main workspace.
graph TB
Root["simd-r-drive\n(root crate)"]
EntryHandle["simd-r-drive-entry-handle\n(zero-copy data access)"]
Extensions["extensions\n(utilities and helpers)"]
subgraph "Network Experiments"
WSServer["experiments/simd-r-drive-ws-server\n(WebSocket server)"]
WSClient["experiments/simd-r-drive-ws-client\n(Rust client)"]
ServiceDef["experiments/simd-r-drive-muxio-service-definition\n(RPC contract)"]
end
subgraph "Python Bindings (Excluded)"
PyBindings["experiments/bindings/python\n(PyO3 bindings)"]
PyWSClient["experiments/bindings/python-ws-client\n(Python WebSocket client)"]
end
Root --> EntryHandle
Root --> Extensions
WSServer --> Root
WSServer --> ServiceDef
WSClient --> ServiceDef
PyWSClient --> WSClient
PyBindings --> Root
style Root stroke-width:3px
style PyBindings stroke-dasharray: 5 5
style PyWSClient stroke-dasharray: 5 5
Workspace Members Diagram
Sources: Cargo.toml:65-78
Workspace Member Descriptions
| Crate Path | Purpose | Version Source |
|---|---|---|
. | Core storage engine with CLI | Cargo.toml16 |
simd-r-drive-entry-handle | Zero-copy entry access abstraction | Cargo.toml83 |
extensions | Utility functions and helper modules | Cargo.toml69 |
experiments/simd-r-drive-ws-server | Axum-based WebSocket server | Cargo.toml70 |
experiments/simd-r-drive-ws-client | Native Rust WebSocket client | Cargo.toml71 |
experiments/simd-r-drive-muxio-service-definition | RPC service contract | Cargo.toml72 |
experiments/bindings/python | PyO3 direct bindings (excluded) | Cargo.toml75 |
experiments/bindings/python-ws-client | Python WebSocket client (excluded) | Cargo.toml76 |
Python bindings are excluded from the main workspace because they use Maturin's build system, which conflicts with Cargo's workspace resolver. They must be built separately using maturin build or maturin develop.
Sources: Cargo.toml:74-77
Feature Flags
The core simd-r-drive crate exposes multiple feature flags that enable optional functionality. Features are defined in Cargo.toml:49-55
Feature Dependencies Diagram
Sources: Cargo.toml:49-56
Feature Flag Reference
| Feature | Dependencies | Purpose | Common Use Cases |
|---|---|---|---|
default | None | Minimal feature set | Production builds |
expose-internal-api | None | Exposes internal structs/functions for testing | Test harnesses, benchmarks |
parallel | rayon = "1.10.0" | Enables parallel iteration via Rayon | High-throughput batch processing |
arrow | simd-r-drive-entry-handle/arrow | Proxy feature for Apache Arrow integration | Data analytics pipelines |
The arrow feature is a proxy feature: enabling simd-r-drive/arrow automatically enables simd-r-drive-entry-handle/arrow, allowing users to enable Arrow support without knowing the internal crate structure.
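A minimal sketch of how these features could be declared, based on the tables on this page (feature and dependency names come from this page; the actual Cargo.toml:49-55 may differ in detail):

```toml
[features]
default = []
expose-internal-api = []
parallel = ["dep:rayon"]
# Proxy feature: forwards to the entry-handle crate's own `arrow` feature.
arrow = ["simd-r-drive-entry-handle/arrow"]

[dependencies]
rayon = { version = "1.10.0", optional = true }
```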
Sources: Cargo.toml:49-55 Cargo.toml30 Cargo.toml:54-55
Common Development Commands
Building
| Command | Purpose | Feature Flags Used |
|---|---|---|
cargo build | Build with default features | None |
cargo build --workspace | Build all workspace members | None |
cargo build --all-targets | Build binaries, libs, tests, benches | None |
cargo build --release | Optimized release build | None |
cargo build --features parallel | Build with parallel iteration | parallel |
cargo build --all-features | Build with all features enabled | All |
cargo build --no-default-features | Minimal build (no features) | None |
Testing
| Command | Purpose | Scope |
|---|---|---|
cargo test | Run core crate tests | Current crate only |
cargo test --workspace | Run all workspace tests | All members |
cargo test --all-targets | Test all targets (lib, bins, tests, benches) | Current crate |
cargo test --features expose-internal-api | Test with internal APIs exposed | Feature-specific |
cargo test -- --test-threads=1 | Sequential test execution | Useful for serial_test cases |
Benchmarking
The crate includes two Criterion-based benchmarks defined in Cargo.toml:57-63:
| Benchmark | Command | Purpose |
|---|---|---|
storage_benchmark | cargo bench --bench storage_benchmark | Measure write/read throughput |
contention_benchmark | cargo bench --bench contention_benchmark | Measure concurrent access performance |
To compile benchmarks without running them (useful for CI): cargo bench --no-run
Sources: Cargo.toml:57-63 .github/workflows/rust-tests.yml:60-61
Development Workflow
Standard Development Cycle
Sources: .github/workflows/rust-tests.yml:1-62
Multi-Platform Testing
The CI pipeline tests on three operating systems with multiple feature flag combinations. The test matrix is defined in .github/workflows/rust-tests.yml:14-30:
| Matrix Dimension | Values |
|---|---|
| OS | ubuntu-latest, macos-latest, windows-latest |
| Features | Default, No Default Features, Parallel, Expose Internal API, Parallel + Expose API, All Features |
This results in 18 test jobs (3 OS × 6 feature combinations) per CI run.
Sources: .github/workflows/rust-tests.yml:14-30
Ignored Files and Directories
The .gitignore:1-10 file specifies build artifacts and local files to exclude from version control:
| Pattern | Purpose |
|---|---|
**/target | Cargo build output (all workspace members) |
*.bin | Binary data files (test artifacts) |
/data | Local data directory for debugging |
out.txt | Temporary output file |
.cargo/config.toml | Local Cargo configuration overrides |
The /data directory is explicitly ignored so that local development and experimentation data does not pollute the repository.
Sources: .gitignore:1-10
Dependency Management
Workspace-Level Dependencies
Shared dependencies are defined in Cargo.toml:80-112 to ensure version consistency across workspace members:
Core Dependencies:
xxhash-rust = "0.8.15"withxxh3andconst_xxh3features for hashingmemmap2 = "0.9.5"for memory-mapped file accesscrc32fast = "1.4.2"for checksum validationdashmap = "6.1.0"for concurrent hash maps
Optional Dependencies:
rayon = "1.10.0"(gated byparallelfeature)arrow = "57.0.0"(gated byarrowfeature, default-features disabled)
Development Dependencies:
criterion = "0.6.0"for benchmarkingtempfile = "3.19.0"for test isolationserial_test = "3.2.0"for sequential test executiontokio = "1.45.1"for async testing (not used in core crate)
Sources: Cargo.toml:23-47 Cargo.toml:80-112
Intra-Workspace Dependencies
Workspace crates reference each other using path dependencies with explicit versions:
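A sketch of the pattern, using the entry-handle crate as an example (path and version are taken from the tables on this page; the real entries live in Cargo.toml:81-85):

```toml
[workspace.dependencies]
simd-r-drive-entry-handle = { path = "./simd-r-drive-entry-handle", version = "0.15.5-alpha" }
```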
This ensures workspace members can be published independently to crates.io while maintaining version consistency during development.
Sources: Cargo.toml:81-85
Caching Strategy
The CI pipeline uses GitHub Actions cache to accelerate builds. The caching configuration is defined in .github/workflows/rust-tests.yml:40-51:
Cached Directories:
- `~/.cargo/bin/` - Compiled binaries (cargo tools)
- `~/.cargo/registry/index/` - Crate registry index
- `~/.cargo/registry/cache/` - Downloaded crate archives
- `~/.cargo/git/db/` - Git dependencies
- `target/` - Build artifacts
Cache Key Strategy:
- Primary key: `${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}-${{ matrix.flags }}`
- Fallback key: `${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}-`
This strategy provides OS-specific, feature-flag-specific caching with fallback to shared dependencies when feature flags change.
Sources: .github/workflows/rust-tests.yml:39-51
Publishing Configuration
All workspace crates share publishing configuration through Cargo.toml:1-9:
| Field | Value | Purpose |
|---|---|---|
authors | ["Jeremy Harris <jeremy.harris@zenosmosis.com>"] | Crate metadata |
version | "0.15.5-alpha" | Current release version |
edition | "2024" | Rust edition |
repository | "https://github.com/jzombie/rust-simd-r-drive" | Source location |
license | "Apache-2.0" | Open source license |
categories | ["database-implementations", "data-structures", "filesystem"] | crates.io categories |
keywords | ["storage-engine", "binary-storage", "append-only", "simd", "mmap"] | Search keywords |
publish | true | Enable publishing to crates.io |
Individual crates inherit these values using workspace inheritance: edition.workspace = true.
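For example, a member crate's manifest might contain the following (a sketch; only `edition.workspace = true` is confirmed on this page, the other inherited fields are assumptions):

```toml
[package]
name = "simd-r-drive-entry-handle"
edition.workspace = true
version.workspace = true
license.workspace = true
repository.workspace = true
```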
Sources: Cargo.toml:1-21
Next Steps
- Building and Testing : See Building and Testing for detailed instructions on running unit tests, integration tests, concurrency tests, and benchmarks.
- CI/CD Pipeline : See CI/CD Pipeline for information about GitHub Actions workflows, matrix testing strategy, and quality checks.
- Version History and Migration : See Version History and Migration for breaking changes, migration paths, and storage format evolution.
For information about specific subsystems:
- Storage engine internals: Core Storage Engine
- Network layer: Network Layer and RPC
- Python integration: Python Integration
- Performance features: Performance Optimizations
Building and Testing
Relevant source files
- .github/workflows/rust-tests.yml
- .gitignore
- Cargo.toml
- benches/storage_benchmark.rs
- src/main.rs
- src/utils/format_bytes.rs
- tests/concurrency_tests.rs
This page provides instructions for building SIMD R Drive from source, running the test suite, and executing performance benchmarks. It covers workspace configuration, feature flags, test types, and benchmark utilities. For information about CI/CD pipeline configuration and automated quality checks, see CI/CD Pipeline.
Prerequisites
Required Dependencies:
- Rust toolchain (stable channel)
- Cargo package manager (bundled with Rust)
Optional Dependencies:
- Python 3.10-3.13 (for Python bindings)
- Maturin (for building Python wheels)
The project uses Rust 2024 edition and requires no platform-specific dependencies beyond the standard Rust toolchain.
Sources: Cargo.toml:1-13
Workspace Structure
SIMD R Drive is organized as a Cargo workspace with multiple crates. Understanding the workspace layout is essential for targeted builds and tests.
Workspace Members:
graph TB
subgraph "Workspace Root"
Root["Cargo.toml\nWorkspace Configuration"]
end
subgraph "Core Crates"
Core["simd-r-drive\n(Root Package)"]
EntryHandle["simd-r-drive-entry-handle\n(Zero-Copy API)"]
Extensions["extensions/\n(Utility Functions)"]
end
subgraph "Network Experiments"
WSServer["experiments/simd-r-drive-ws-server\n(WebSocket Server)"]
WSClient["experiments/simd-r-drive-ws-client\n(Rust Client)"]
ServiceDef["experiments/simd-r-drive-muxio-service-definition\n(RPC Contract)"]
end
subgraph "Python Bindings (Excluded)"
PyBindings["experiments/bindings/python\n(PyO3 Direct)"]
PyWSClient["experiments/bindings/python-ws-client\n(WebSocket Client)"]
end
Root --> Core
Root --> EntryHandle
Root --> Extensions
Root --> WSServer
Root --> WSClient
Root --> ServiceDef
PyBindings -.->|excluded from workspace| Root
PyWSClient -.->|excluded from workspace| Root
Core --> EntryHandle
WSServer --> ServiceDef
WSClient --> ServiceDef
| Crate | Path | Purpose |
|---|---|---|
simd-r-drive | . | Core storage engine |
simd-r-drive-entry-handle | ./simd-r-drive-entry-handle | Zero-copy data access API |
simd-r-drive-extensions | ./extensions | Utility functions and helpers |
simd-r-drive-ws-server | ./experiments/simd-r-drive-ws-server | Axum-based WebSocket server |
simd-r-drive-ws-client | ./experiments/simd-r-drive-ws-client | Native Rust WebSocket client |
simd-r-drive-muxio-service-definition | ./experiments/simd-r-drive-muxio-service-definition | RPC service contract |
Excluded from Workspace:
- `experiments/bindings/python` - Python direct bindings (built separately with Maturin)
- `experiments/bindings/python-ws-client` - Python WebSocket client (built separately with Maturin)
Python bindings are excluded because they require a different build system (Maturin) and have incompatible dependency resolution requirements.
Sources: Cargo.toml:65-78
Building the Project
Basic Build Commands
The most common invocations, sketched in the block below, are:
- Build all workspace members
- Build with release optimizations
- Build a specific crate
- Build all targets (binaries, libraries, tests, benchmarks)
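A sketch of these commands, assuming standard Cargo workspace usage (the example crate name comes from the workspace table above):

```sh
# Build all workspace members
cargo build --workspace

# Build with release optimizations
cargo build --workspace --release

# Build a specific crate
cargo build -p simd-r-drive-entry-handle

# Build all targets: binaries, libraries, tests, and benchmarks
cargo build --workspace --all-targets
```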
Sources: .github/workflows/rust-tests.yml54
graph LR
subgraph "Feature Flags"
Default["default\n(empty set)"]
Parallel["parallel\nEnables rayon"]
ExposeAPI["expose-internal-api\nExposes internal symbols"]
Arrow["arrow\nArrow integration"]
AllFeatures["--all-features\nEnable everything"]
end
subgraph "Dependencies Enabled"
Rayon["rayon = 1.10.0\nParallel iteration"]
EntryArrow["simd-r-drive-entry-handle/arrow\nArrow conversions"]
end
Parallel --> Rayon
Arrow --> EntryArrow
AllFeatures --> Parallel
AllFeatures --> ExposeAPI
AllFeatures --> Arrow
Feature Flags
The simd-r-drive crate supports optional feature flags that enable additional functionality:
Feature Flag Reference:
| Feature | Dependencies | Purpose |
|---|---|---|
default | None | Standard storage engine only |
parallel | rayon = "1.10.0" | Parallel iteration support for bulk operations |
expose-internal-api | None | Exposes internal APIs for advanced use cases |
arrow | simd-r-drive-entry-handle/arrow | Enables Apache Arrow integration |
Build Examples:
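For example (a sketch using the feature names from the table above):

```sh
cargo build --no-default-features
cargo build --features parallel
cargo build --features arrow
cargo build --features "parallel,expose-internal-api"
cargo build --all-features
```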
Sources: Cargo.toml:49-55 Cargo.toml30
CLI Binary
The workspace includes a CLI binary for direct interaction with the storage engine:
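A sketch of building and invoking the binary; the binary path follows the build-artifact locations listed later on this page, and the `--help` flag is an assumption (use it, if supported, to list the available subcommands):

```sh
cargo build --release
./target/release/simd-r-drive --help
```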
The CLI entry point is defined at src/main.rs:1-12 and delegates to command execution logic in the cli module.
Sources: src/main.rs:1-12
Running Tests
Test Organization
Test Categories:
- Unit Tests - Embedded in source files, test individual functions and modules
- Integration Tests - Located in the `tests/` directory, test public API interactions
- Concurrency Tests - Multi-threaded scenarios validating thread safety
- Documentation Tests - Code examples in documentation comments
Sources: Cargo.toml:36-47
Running All Tests
Execute the complete test suite:
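Based on the CI configuration referenced below, the full-suite invocation looks like:

```sh
cargo test --workspace --all-targets --verbose
```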
Sources: .github/workflows/rust-tests.yml57
Concurrency Tests
Concurrency tests validate thread-safe operations and are located in tests/concurrency_tests.rs:1-230. These tests use `serial_test` to prevent parallel execution and `tokio` for the async runtime.
Test Scenarios:
graph TB
subgraph "Concurrency Test Structure"
TestAttr["#[tokio::test(flavor=multi_thread)]\n#[serial]"]
TempDir["tempfile::tempdir()\nIsolated Storage"]
DataStore["Arc<DataStore>\nShared Storage"]
Tasks["tokio::spawn\nConcurrent Tasks"]
end
subgraph "Test Scenarios"
StreamTest["concurrent_slow_streamed_write_test\nParallel stream writes"]
WriteTest["concurrent_write_test\n16 threads × 10 writes"]
InterleavedTest["interleaved_read_write_test\nRead/Write coordination"]
end
TestAttr --> TempDir
TempDir --> DataStore
DataStore --> Tasks
Tasks --> StreamTest
Tasks --> WriteTest
Tasks --> InterleavedTest
1. Concurrent Slow Streamed Write Test (tests/concurrency_tests.rs:16-109):
- Simulates slow streaming writes from multiple threads
- Uses a `SlowReader` wrapper to introduce artificial latency
- Validates data integrity after concurrent stream completion
- Configuration: 1MB payloads with 100ms read delays
2. Concurrent Write Test (tests/concurrency_tests.rs:111-161):
- Spawns 16 threads, each performing 10 writes
- Tests high-contention write scenarios
- Validates all 160 writes are retrievable
- Uses 5ms delays to simulate realistic timing
3. Interleaved Read/Write Test (tests/concurrency_tests.rs:163-229):
- Tests read-after-write and write-after-read patterns
- Uses `tokio::sync::Notify` for coordination
- Validates proper synchronization between readers and writers
Run Concurrency Tests:
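A sketch of running just this test file (the integration-test name follows from the file path above; `--test-threads=1` matches the sequential-execution option in the testing table earlier on this page):

```sh
cargo test --test concurrency_tests
# or force fully sequential execution:
cargo test --test concurrency_tests -- --test-threads=1
```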
Key Dependencies:
| Dependency | Purpose | Reference |
|---|---|---|
serial_test = "3.2.0" | Prevents parallel test execution | Cargo.toml44 |
tokio (with rt-multi-thread, macros) | Async runtime for concurrent tests | Cargo.toml47 |
tempfile = "3.19.0" | Temporary file creation for isolated tests | Cargo.toml45 |
Sources: tests/concurrency_tests.rs:1-230 Cargo.toml:44-47
graph LR
subgraph "Benchmark Configuration"
CargoBench["cargo bench\nCriterion Runner"]
StorageBench["storage_benchmark\nharness = false"]
ContentionBench["contention_benchmark\nharness = false"]
end
subgraph "Storage Benchmark Operations"
AppendOp["benchmark_append_entries\n1M writes in batches"]
SeqRead["benchmark_sequential_reads\nZero-copy iteration"]
RandRead["benchmark_random_reads\n1M random lookups"]
BatchRead["benchmark_batch_reads\nVectorized reads"]
end
CargoBench --> StorageBench
CargoBench --> ContentionBench
StorageBench --> AppendOp
StorageBench --> SeqRead
StorageBench --> RandRead
StorageBench --> BatchRead
Running Benchmarks
The workspace includes micro-benchmarks using the Criterion framework to measure storage engine performance.
Benchmark Targets
Benchmark Targets:
- storage_benchmark - Single-process micro-benchmarks (benches/storage_benchmark.rs:1-234)
- contention_benchmark - Multi-threaded contention scenarios (referenced in Cargo.toml:62-63)
Both use harness = false to disable Cargo's default benchmark harness and use custom timing logic.
Sources: Cargo.toml:57-63
Storage Benchmark
The storage benchmark (benches/storage_benchmark.rs:1-234) measures four critical operations:
Configuration Constants:
| Constant | Value | Purpose |
|---|---|---|
NUM_ENTRIES | 1,000,000 | Total entries for write phase |
ENTRY_SIZE | 8 bytes | Fixed payload size |
WRITE_BATCH_SIZE | 1,024 | Entries per batch write |
READ_BATCH_SIZE | 1,024 | Entries per batch read |
NUM_RANDOM_CHECKS | 1,000,000 | Random read operations |
NUM_BATCH_CHECKS | 1,000,000 | Batch read operations |
Benchmark Operations:
1. Append Entries (benches/storage_benchmark.rs:52-83):
- Writes 1M entries in batches using `batch_write`
- Measures write throughput (writes/second)
- Uses fixed 8-byte little-endian payloads
2. Sequential Reads (benches/storage_benchmark.rs:98-118):
- Iterates through all entries using zero-copy iteration
- Validates data integrity during iteration
- Measures sequential read throughput
3. Random Reads (benches/storage_benchmark.rs:124-149):
- Performs 1M random single-key lookups
- Uses `rand::rng().random_range()` for key selection
- Validates retrieved data matches expected values
4. Batch Reads (benches/storage_benchmark.rs:155-181):
- Performs vectorized reads using `batch_read`
- Processes reads in batches of 1,024 keys
- Measures amortized batch read performance
Run Benchmarks:
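For example (commands from the benchmark tables above):

```sh
cargo bench --bench storage_benchmark
cargo bench --bench contention_benchmark

# Compile-only check, as used in CI
cargo bench --workspace --no-run
```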
Benchmark Output Format:
The storage benchmark uses a custom fmt_rate function (benches/storage_benchmark.rs:220-233) to format throughput metrics with thousands separators and three decimal places:
Wrote 1,000,000 entries of 8 bytes in 2.345s (426,439.232 writes/s)
Sequentially read 1,000,000 entries in 1.234s (810,372.771 reads/s)
Randomly read 1,000,000 entries in 3.456s (289,351.852 reads/s)
Batch-read verified 1,000,000 entries in 0.987s (1,013,171.005 reads/s)
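An illustration of this kind of formatting in plain Rust; the real `fmt_rate` helper in benches/storage_benchmark.rs may instead use the `thousands` crate listed in the benchmark dependencies:

```rust
// Illustration only: reproduces the output style shown above.
fn fmt_rate(rate: f64) -> String {
    let s = format!("{:.3}", rate);
    let (int_part, frac_part) = s.split_once('.').expect("always has a decimal point");
    // Insert a comma every three digits, walking the integer part from the right.
    let mut grouped = String::new();
    for (i, c) in int_part.chars().rev().enumerate() {
        if i > 0 && i % 3 == 0 {
            grouped.push(',');
        }
        grouped.push(c);
    }
    let int_grouped: String = grouped.chars().rev().collect();
    format!("{int_grouped}.{frac_part}")
}

fn main() {
    assert_eq!(fmt_rate(426_439.2319), "426,439.232");
    println!("{} writes/s", fmt_rate(426_439.2319));
}
```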
Sources: benches/storage_benchmark.rs:1-234 Cargo.toml:57-63
graph TB
subgraph "CI Test Matrix"
Trigger["Push to main\nor Pull Request"]
end
subgraph "Operating Systems"
Ubuntu["ubuntu-latest"]
MacOS["macos-latest"]
Windows["windows-latest"]
end
subgraph "Feature Configurations"
Default["Default (empty)"]
NoDefault["--no-default-features"]
Parallel["--features parallel"]
ExposeAPI["--features expose-internal-api"]
Combined["--features=parallel,expose-internal-api"]
AllFeatures["--all-features"]
end
subgraph "Build Steps"
Cache["Cache Cargo dependencies"]
Build["cargo build --workspace --all-targets"]
Test["cargo test --workspace --all-targets --verbose"]
BenchCheck["cargo bench --workspace --no-run"]
end
Trigger --> Ubuntu
Trigger --> MacOS
Trigger --> Windows
Ubuntu --> Default
Ubuntu --> NoDefault
Ubuntu --> Parallel
Ubuntu --> ExposeAPI
Ubuntu --> Combined
Ubuntu --> AllFeatures
Default --> Cache
Cache --> Build
Build --> Test
Test --> BenchCheck
CI/CD Test Matrix
The GitHub Actions workflow runs tests across multiple platforms and feature combinations to ensure compatibility.
Test Matrix Configuration:
| OS | Feature Flags |
|---|---|
| Ubuntu, macOS, Windows | Default |
| Ubuntu, macOS, Windows | --no-default-features |
| Ubuntu, macOS, Windows | --features parallel |
| Ubuntu, macOS, Windows | --features expose-internal-api |
| Ubuntu, macOS, Windows | --features=parallel,expose-internal-api |
| Ubuntu, macOS, Windows | --all-features |
Total Test Combinations: 3 OS × 6 feature configs = 18 test jobs
CI Pipeline Steps:
- Cache Dependencies (.github/workflows/rust-tests.yml:40-51) - Caches `~/.cargo` and `target/` directories using the lock file hash
- Build (.github/workflows/rust-tests.yml:53-54) - Compiles all workspace targets with specified feature flags
- Test (.github/workflows/rust-tests.yml:56-57) - Runs complete test suite with verbose output
- Check Benchmarks (.github/workflows/rust-tests.yml:60-61) - Validates benchmark compilation without execution
Caching Strategy:
The workflow uses cache keys based on:
- Operating system (`runner.os`)
- Lock file hash (`hashFiles('**/Cargo.lock')`)
- Feature flags (`matrix.flags`)
This ensures efficient dependency reuse while avoiding cross-contamination between different build configurations.
Sources: .github/workflows/rust-tests.yml:1-62
Development Dependencies
Test Dependencies
| Dependency | Version | Purpose |
|---|---|---|
serial_test | 3.2.0 | Sequential test execution for concurrency tests |
tempfile | 3.19.0 | Temporary file/directory creation |
tokio | 1.45.1 | Async runtime with multi-thread support |
rand | 0.9.0 | Random number generation for benchmarks |
bincode | 1.3.3 | Serialization for test data |
serde | 1.0.219 | Serialization framework |
serde_json | 1.0.140 | JSON serialization for test data |
futures | 0.3.31 | Async utilities |
bytemuck | 1.23.2 | Zero-copy type conversions |
Sources: Cargo.toml:36-47
Benchmark Dependencies
| Dependency | Version | Purpose |
|---|---|---|
criterion | 0.6.0 | Benchmark framework (workspace dependency) |
thousands | 0.2.0 | Number formatting for benchmark output |
rand | 0.9.0 | Random data generation |
Sources: Cargo.toml39 Cargo.toml46 Cargo.toml98 Cargo.toml108
Build Artifacts
Binary Output
Compiled binaries are located in:
- Debug builds: `target/debug/simd-r-drive`
- Release builds: `target/release/simd-r-drive`
Library Output
The workspace produces several library artifacts:
- `libsimd_r_drive.rlib` - Core storage engine
- `libsimd_r_drive_entry_handle.rlib` - Zero-copy API
- `libsimd_r_drive_extensions.rlib` - Utility functions
Ignored Files
The .gitignore configuration excludes build artifacts and data files:
- `**/target` - All Cargo build directories
- `*.bin` - Binary storage files created during testing
- `/data` - Development and experimentation data directory
- `.cargo/config.toml` - Local Cargo overrides
Sources: .gitignore:1-10
Best Practices
Pre-Commit Checklist:
- Run `cargo test --workspace --all-targets` to validate that all tests pass
- Run `cargo bench --workspace --no-run` to ensure benchmarks compile
- Run `cargo build --all-features` to validate all feature combinations
- Check that no test data files (`.bin`, `/data`) are committed
Performance Validation:
- Use `cargo bench --bench storage_benchmark` to establish performance baselines
- Compare throughput metrics before and after optimization changes
- Monitor memory usage during concurrency tests
Feature Flag Testing:
- Always test with `--no-default-features` to ensure minimal builds work
- Test all feature combinations that will be published
- Verify that feature flags are properly gated with `#[cfg(feature = "...")]`
Sources: .github/workflows/rust-tests.yml:1-62 benches/storage_benchmark.rs:1-234
CI/CD Pipeline
Relevant source files
- .github/workflows/rust-lint.yml
- .github/workflows/rust-tests.yml
- .gitignore
- CHANGELOG.md
- simd-r-drive-entry-handle/src/debug_assert_aligned.rs
Purpose and Scope
This document describes the Continuous Integration and Continuous Deployment (CI/CD) infrastructure for the SIMD R Drive project. It covers the GitHub Actions workflows that enforce code quality, run automated tests, and validate security compliance. The CI/CD system ensures that all code changes meet quality standards before being merged.
For information about building and running tests locally, see Building and Testing. For version-specific breaking changes and migration strategies, see Version History and Migration.
CI/CD Architecture Overview
The SIMD R Drive project uses two primary GitHub Actions workflows to validate all code changes:
Diagram: CI/CD Workflow Architecture
graph TB
subgraph "Trigger Events"
Push["Push to main"]
PR["Pull Request"]
Tag["Tag push (v*)"]
end
subgraph "rust-tests.yml"
TestMatrix["Matrix Testing"]
BuildStep["cargo build"]
TestStep["cargo test"]
BenchCheck["cargo bench --no-run"]
Cache["Cargo Cache"]
end
subgraph "rust-lint.yml"
FmtCheck["cargo fmt --check"]
ClippyCheck["cargo clippy"]
DocCheck["cargo doc"]
DenyCheck["cargo-deny check"]
AuditCheck["cargo-audit"]
end
Push --> TestMatrix
Push --> FmtCheck
PR --> TestMatrix
PR --> FmtCheck
Tag --> TestMatrix
TestMatrix --> Cache
Cache --> BuildStep
BuildStep --> TestStep
TestStep --> BenchCheck
FmtCheck --> ClippyCheck
ClippyCheck --> DocCheck
DocCheck --> DenyCheck
DenyCheck --> AuditCheck
style TestMatrix fill:#f9f9f9
style FmtCheck fill:#f9f9f9
The system runs two independent workflows in parallel. The rust-tests.yml workflow performs functional validation through matrix testing across multiple platforms and feature configurations. The rust-lint.yml workflow enforces code quality standards, documentation completeness, and security compliance. All checks must pass before a pull request can be merged.
Sources: .github/workflows/rust-tests.yml:1-62 .github/workflows/rust-lint.yml:1-44
Test Workflow (rust-tests.yml)
Workflow Configuration
The test workflow executes on three trigger events:
| Trigger | Condition | Purpose |
|---|---|---|
push | Branches: main | Validate main branch integrity |
push | Tags: v* | Validate release candidates |
pull_request | Branches: main | Validate proposed changes |
Sources: .github/workflows/rust-tests.yml:3-8
Matrix Testing Strategy
The workflow uses a comprehensive matrix strategy to test across multiple dimensions:
Diagram: Matrix Testing Dimensions
graph TB
subgraph "Operating Systems"
Ubuntu["ubuntu-latest"]
MacOS["macos-latest"]
Windows["windows-latest"]
end
subgraph "Feature Configurations"
Default["Default\nflags: (empty)"]
NoDefault["No Default Features\nflags: --no-default-features"]
Parallel["Parallel\nflags: --features parallel"]
ExposeAPI["Expose Internal API\nflags: --features expose-internal-api"]
ParallelAPI["Parallel + Expose API\nflags: --features=parallel,expose-internal-api"]
AllFeatures["All Features\nflags: --all-features"]
end
subgraph "Test Job Matrix"
Job["test job\nOS x Features\n3 x 6 = 18 configurations"]
end
Ubuntu --> Job
MacOS --> Job
Windows --> Job
Default --> Job
NoDefault --> Job
Parallel --> Job
ExposeAPI --> Job
ParallelAPI --> Job
AllFeatures --> Job
Each job runs the complete build-test-benchmark pipeline with a unique combination of operating system and feature flags, resulting in 18 total test configurations per workflow execution.
Sources: .github/workflows/rust-tests.yml:13-30
Test Job Steps
Each matrix job executes the following steps:
1. Repository Checkout
Sources: .github/workflows/rust-tests.yml:33-34
2. Rust Toolchain Installation
Uses the stable Rust toolchain across all platforms.
Sources: .github/workflows/rust-tests.yml:36-37
3. Dependency Caching
The workflow caches Cargo dependencies to reduce build times:
| Cache Path | Contents |
|---|---|
~/.cargo/bin/ | Installed cargo binaries |
~/.cargo/registry/index/ | Crate registry index |
~/.cargo/registry/cache/ | Downloaded crate archives |
~/.cargo/git/db/ | Git dependencies |
target/ | Build artifacts |
The cache key includes the OS, Cargo.lock hash, and matrix flags to ensure correct cache isolation between configurations.
Sources: .github/workflows/rust-tests.yml:40-52
4. Build Step
Builds all workspace crates with all targets (library, binaries, tests, benches, examples) using the matrix-specified feature flags.
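Per the CI matrix diagram earlier on this page, the build command takes roughly this form, with the job's matrix flags appended (`${{ matrix.flags }}` is the GitHub Actions expression for them):

```sh
cargo build --workspace --all-targets ${{ matrix.flags }}
```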
Sources: .github/workflows/rust-tests.yml:54-55
5. Test Execution
Runs all tests in the workspace with verbose output, including:
- Unit tests
- Integration tests
- Documentation tests
- Example tests
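The test command itself, as quoted in the pipeline diagram earlier on this page:

```sh
cargo test --workspace --all-targets --verbose
```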
Sources: .github/workflows/rust-tests.yml:57-58
6. Benchmark Compilation Check
Validates that all benchmarks compile successfully without executing them. The --no-run flag ensures benchmarks are compiled but not executed in CI.
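The command, as listed in the quality gates summary below:

```sh
cargo bench --workspace --no-run
```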
Sources: .github/workflows/rust-tests.yml:60-62
Matrix Configuration Table
| Matrix Variable | Values | Count |
|---|---|---|
os | ubuntu-latest, macos-latest, windows-latest | 3 |
name | Default, No Default Features, Parallel, Expose Internal API, Parallel + Expose API, All Features | 6 |
flags | "", "--no-default-features", "--features parallel", "--features expose-internal-api", "--features=parallel,expose-internal-api", "--all-features" | 6 |
| Total Jobs | OS × Features | 18 |
Sources: .github/workflows/rust-tests.yml:14-30
Lint Workflow (rust-lint.yml)
Workflow Configuration
The lint workflow runs on all pushes and pull requests without branch restrictions:
Sources: .github/workflows/rust-lint.yml3
Lint Job Architecture
Diagram: Lint Workflow Execution Pipeline
The lint workflow enforces five quality gates in sequence, with each step dependent on the previous step's success.
Sources: .github/workflows/rust-lint.yml:5-44
Quality Gate 1: Code Formatting
Validates that all Rust code adheres to the standard formatting rules defined by rustfmt. The --check flag ensures the command fails if any files require formatting without modifying them.
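As listed in the quality gates summary table later on this page, the check is:

```sh
cargo fmt --check
```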
Sources: .github/workflows/rust-lint.yml:26-27
Quality Gate 2: Clippy Linting
Runs Clippy across the entire workspace with maximum coverage:
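The invocation takes roughly this form, assembled from the flags listed below:

```sh
cargo clippy --workspace --all-targets --all-features -- -D warnings
```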
- `--workspace`: Checks all workspace crates
- `--all-targets`: Includes lib, bins, tests, benches, examples
- `--all-features`: Enables all optional features
- `-D warnings`: Treats all warnings as errors
Sources: .github/workflows/rust-lint.yml:29-31
Quality Gate 3: Documentation Validation
Generates documentation and fails on any documentation warnings:
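A sketch of the invocation, assembled from the flags listed below (additional flags in the actual workflow are possible):

```sh
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --document-private-items
```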
RUSTDOCFLAGS="-D warnings": Treats doc warnings as errors--no-deps: Only documents workspace crates, not dependencies--document-private-items: Includes private items to ensure complete internal documentation
Sources: .github/workflows/rust-lint.yml:33-35
Quality Gate 4: Dependency Security Check
Runs cargo-deny to validate dependency security and licensing:
- Checks for security advisories in dependencies
- Validates license compatibility
- Detects duplicate dependencies
- Enforces allowed/denied crate policies
The tool requires prior installation via cargo install cargo-deny.
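For example:

```sh
cargo install cargo-deny
cargo deny check
```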
Sources: .github/workflows/rust-lint.yml:20-22 .github/workflows/rust-lint.yml:37-39
Quality Gate 5: Vulnerability Audit
Scans Cargo.lock for known security vulnerabilities in dependencies by checking against the RustSec Advisory Database. Requires prior installation via cargo install cargo-audit.
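For example:

```sh
cargo install cargo-audit
cargo audit
```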
Sources: .github/workflows/rust-lint.yml:20-23 .github/workflows/rust-lint.yml:41-43
Component Installation Workaround
The workflow includes a workaround for a GitHub Actions environment issue:
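A sketch of the likely step (the exact invocation in the workflow may differ):

```sh
rustup component add rustfmt clippy
```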
This step explicitly installs rustfmt and clippy components to address a toolchain installation issue where these components may not be available by default.
Sources: .github/workflows/rust-lint.yml:13-18
Quality Gates Summary
The complete CI/CD pipeline enforces the following quality gates:
| Gate | Workflow | Command | Failure Condition |
|---|---|---|---|
| Build | rust-tests.yml | cargo build | Compilation errors |
| Tests | rust-tests.yml | cargo test | Test failures |
| Benchmarks | rust-tests.yml | cargo bench --no-run | Benchmark compilation errors |
| Formatting | rust-lint.yml | cargo fmt --check | Unformatted code |
| Linting | rust-lint.yml | cargo clippy -D warnings | Clippy warnings or errors |
| Documentation | rust-lint.yml | cargo doc -D warnings | Missing or invalid docs |
| Dependencies | rust-lint.yml | cargo deny check | Security advisories, license issues |
| Vulnerabilities | rust-lint.yml | cargo audit | Known CVEs in dependencies |
All quality gates must pass for the CI/CD pipeline to succeed. The matrix testing strategy in rust-tests.yml ensures compatibility across three operating systems and six feature configurations, providing comprehensive validation coverage.
Sources: .github/workflows/rust-tests.yml:54-62 .github/workflows/rust-lint.yml:26-43
Workflow Execution Context
Runner Configuration
Both workflows execute on GitHub-hosted runners:
| Workflow | Runner | OS |
|---|---|---|
| rust-tests.yml | ubuntu-latest, macos-latest, windows-latest | Multi-platform |
| rust-lint.yml | ubuntu-latest | Linux only |
Sources: .github/workflows/rust-tests.yml13 .github/workflows/rust-lint.yml7
Fail-Fast Strategy
The test workflow uses fail-fast: false to allow all matrix jobs to complete even if one fails, providing comprehensive failure information:
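A sketch of the relevant strategy block (only `fail-fast` and the OS list are confirmed on this page; the name/flags combinations are summarized in the matrix configuration table above):

```yaml
strategy:
  fail-fast: false
  matrix:
    os: [ubuntu-latest, macos-latest, windows-latest]
    # name/flags combinations omitted here
```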
This configuration is valuable for identifying platform-specific or feature-specific issues across the entire matrix rather than stopping at the first failure.
Sources: .github/workflows/rust-tests.yml15
graph LR
LocalDev["Local Development"]
Commit["git commit"]
Push["git push"]
PR["Pull Request"]
TestsCI["rust-tests.yml\n18 matrix jobs"]
LintCI["rust-lint.yml\n5 quality gates"]
Merge["Merge to main"]
Tag["Tag release (v*)"]
LocalDev --> Commit
Commit --> Push
Push --> PR
PR --> TestsCI
PR --> LintCI
TestsCI -->|Pass| Merge
LintCI -->|Pass| Merge
Merge --> Tag
Tag --> TestsCI
Integration with Development Workflow
The CI/CD pipeline integrates with the development workflow at multiple points:
Diagram: CI/CD Integration Points
The CI/CD system provides automated validation at multiple stages of the development lifecycle. Pull requests trigger both workflows, ensuring all quality gates pass before code review. Pushes to main and version tags also trigger the test workflow to validate the stability of the main branch and release candidates.
Sources: .github/workflows/rust-tests.yml:3-8 .github/workflows/rust-lint.yml3
Related CI/CD Artifacts
Ignored Paths
The repository excludes certain paths from version control that are relevant to CI/CD execution:
| Path | Purpose |
|---|---|
**/target | Build artifacts generated by cargo |
*.bin | Binary data files used in tests |
/data | Debugging and experimentation data |
.cargo/config.toml | Local cargo configuration overrides |
These exclusions prevent CI cache pollution and ensure consistent builds across environments.
Sources: .gitignore:1-11
Debug Assertions in CI
The codebase includes conditional debug assertions that activate during CI test runs:
The debug_assert_aligned and debug_assert_aligned_offset functions in simd-r-drive-entry-handle/src/debug_assert_aligned.rs:26-88 activate only when cfg(any(test, debug_assertions)) is true. In CI test runs, these assertions validate alignment invariants for memory-mapped payloads and zero-copy access patterns without impacting release builds or benchmarks.
Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-88
Maintenance and Evolution
The CI/CD pipeline configuration is versioned alongside the codebase and evolves with project requirements. Changes to the pipeline typically correspond to:
- New feature flags requiring matrix coverage
- Additional security tools or quality gates
- Platform support changes
- Performance optimization validation
Historical changes to the CI/CD infrastructure can be tracked through the repository's commit history in the .github/workflows/ directory.
For information about breaking changes that may affect CI/CD behavior, see Version History and Migration, particularly the alignment changes in version 0.15.0 that introduced new debug assertions.
Sources: CHANGELOG.md:25-51
Version History and Migration
Relevant source files
- .github/workflows/rust-lint.yml
- CHANGELOG.md
- README.md
- simd-r-drive-entry-handle/src/debug_assert_aligned.rs
This page documents the version history of SIMD R Drive, detailing breaking changes, format evolutions, and migration procedures required when upgrading between versions. The primary focus is on storage format compatibility and the alignment strategy evolution that occurred between versions 0.13.x and 0.15.x.
For information about the current storage architecture and entry structure, see Entry Structure and Metadata. For details on the current alignment strategy and performance characteristics, see Payload Alignment and Cache Efficiency.
Version Overview
SIMD R Drive follows a loosely semantic versioning scheme and is currently in alpha (0.x.x-alpha). Breaking changes in the storage format have occurred as the alignment strategy evolved to optimize zero-copy access and SIMD performance.
| Version | Release Date | Status | Key Changes |
|---|---|---|---|
| 0.15.5-alpha | 2025-10-27 | Current | Apache Arrow 57.0.0, no format changes |
| 0.15.0-alpha | 2025-09-25 | Breaking | Alignment increased to 64 bytes |
| 0.14.0-alpha | 2025-09-08 | Breaking | Introduced 16-byte alignment with pre-pad |
| ≤0.13.x-alpha | Pre-2025-09 | Legacy | No alignment guarantees |
Sources: CHANGELOG.md:1-82
Storage Format Evolution
The most significant changes in SIMD R Drive's history involve the evolution of payload alignment strategy. These changes were driven by the need to optimize zero-copy access, SIMD operations, and CPU cache efficiency.
timeline
title "Storage Format Alignment Evolution"
section Pre-0.14
No Alignment : Payloads written sequentially
: No pre-padding
: Variable starting addresses
section 0.14.0-alpha
16-byte Alignment : Introduced PAYLOAD_ALIGNMENT constant
: Added pre-pad bytes (0-15)
: SSE-friendly boundaries
section 0.15.0-alpha
64-byte Alignment : Increased to cache-line size
: Pre-pad bytes (0-63)
: AVX-512 and cacheline optimized
Timeline: Alignment Strategy Evolution
Format Comparison: Pre-0.14 vs 0.14+ vs 0.15+
Sources: CHANGELOG.md:55-82 CHANGELOG.md:25-52 README.md:51-59
Version Compatibility Matrix
Understanding compatibility between readers and writers is critical when managing SIMD R Drive deployments across multiple services or upgrading existing storage files.
Reader/Writer Compatibility
| Writer Version | ≤0.13.x Reader | 0.14.x Reader | 0.15.x Reader |
|---|---|---|---|
| ≤0.13.x | ✅ Compatible | ❌ Incompatible* | ❌ Incompatible* |
| 0.14.x | ❌ Incompatible | ✅ Compatible | ❌ Incompatible |
| 0.15.x | ❌ Incompatible | ❌ Incompatible | ✅ Compatible |
Legend:
- ✅ Compatible: Reader can correctly parse the storage format
- ❌ Incompatible: Reader will misinterpret pre-pad bytes as payload data
- `*` Newer readers may attempt to parse old formats but will fail validation
Incompatibility Root Cause
The incompatibility arises because:
- Pre-0.14 readers expect `prev_offset` to point directly to the payload start
- 0.14+ formats insert 0-15 or 0-63 pre-pad bytes before the payload
- Old readers interpret these zero bytes as part of the payload, corrupting data
Sources: CHANGELOG.md:55-60 CHANGELOG.md:27-31
Migration Procedures
Migrating from 0.13.x or earlier to 0.14.x+
When upgrading from pre-alignment versions to any aligned version (0.14.x or 0.15.x), you must regenerate storage files.
flowchart TD
Start["Start: Have 0.13.x store"]
ReadOld["Read all entries\nwith 0.13.x binary"]
CreateNew["Create new store\nwith 0.14.x+ binary"]
WriteNew["Write entries to\nnew store"]
Verify["Verify new store\nintegrity"]
Replace["Replace old file\nwith new file"]
Start --> ReadOld
ReadOld --> CreateNew
CreateNew --> WriteNew
WriteNew --> Verify
Verify -->|Success| Replace
Verify -->|Failure| CreateNew
MultiService{"Multi-service\ndeployment?"}
UpgradeReaders["Deploy reader upgrades\nto all services first"]
UpgradeWriters["Deploy writer upgrades\nafter readers"]
Replace --> MultiService
MultiService -->|Yes| UpgradeReaders
MultiService -->|No| Done["Migration Complete"]
UpgradeReaders --> UpgradeWriters
UpgradeWriters --> Done
Migration Workflow
Step-by-Step Migration: 0.13.x → 0.14.x/0.15.x
1. Prepare the old binary.
2. Build the new binary.
3. Extract data from the old store.
4. Create a new store with the new binary.
5. Verify the integrity of the new store.
6. Replace the old file with the new one.
Sources: CHANGELOG.md:75-82 CHANGELOG.md:43-52
Migrating from 0.14.x to 0.15.x
The migration from 0.14.x (16-byte alignment) to 0.15.x (64-byte alignment) follows the same regeneration process, as the alignment constants are incompatible.
Alignment Constant Changes
The key difference between versions:
| Version | Constant File | PAYLOAD_ALIGN_LOG2 | PAYLOAD_ALIGNMENT |
|---|---|---|---|
| 0.14.x | src/storage_engine/constants.rs | 4 | 16 bytes |
| 0.15.x | src/storage_engine/constants.rs | 6 | 64 bytes |
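A sketch of the constants as described in the table (the 0.15.x values; the `usize` types are an assumption, check src/storage_engine/constants.rs for the actual declarations):

```rust
// 0.15.x values; 0.14.x used PAYLOAD_ALIGN_LOG2 = 4 (16-byte alignment).
pub const PAYLOAD_ALIGN_LOG2: usize = 6;
pub const PAYLOAD_ALIGNMENT: usize = 1 << PAYLOAD_ALIGN_LOG2; // 64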
Migration Steps: 0.14.x → 0.15.x
Follow the same procedure as 0.13.x → 0.14.x migration:
- Read entries with 0.14.x binary
- Write to new store with 0.15.x binary
- Verify new store integrity
- Deploy reader upgrades before writer upgrades in multi-service environments
Rationale for 64-byte alignment:
- Matches typical CPU cache line size
- Enables full-speed AVX-512 operations
- Provides safety margin for future SIMD extensions
- Optimizes for zero-copy typed slices (see Payload Alignment and Cache Efficiency)
Sources: CHANGELOG.md:25-52 README.md:51-59 CHANGELOG.md39
Version-Specific Details
Version 0.15.5-alpha (2025-10-27)
Type: Non-breaking maintenance release
Changes:
- Updated Apache Arrow dependency from version 56.x to 57.0.0
- No storage format changes
- No API changes
- No migration required
Compatibility: Fully compatible with 0.15.0-alpha through 0.15.4-alpha
Sources: CHANGELOG.md:19-22
Version 0.15.0-alpha (2025-09-25)
Type: Breaking storage format change
Primary Change: Increased default PAYLOAD_ALIGNMENT from 16 bytes to 64 bytes
Motivation:
- Cache-line alignment for optimal memory access
- Support for wider SIMD instructions (AVX-512)
- Improved zero-copy performance for typed slices
- Future-proofing for emerging CPU architectures
Breaking Changes:
- Storage files created with 0.15.x cannot be read by 0.14.x or earlier
- Pre-pad calculation changed: `pad = (64 - (prev_tail % 64)) & 63` (worked example below)
- Pointer and offset alignment assertions added in debug/test builds
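A worked example of the 0.15.x pre-pad formula; the helper name is hypothetical, only the arithmetic comes from the changelog:

```rust
/// Bytes of pre-padding needed so the next payload starts on a 64-byte
/// boundary. `prev_tail` is the file offset just past the previous entry.
fn pre_pad_64(prev_tail: u64) -> u64 {
    (64 - (prev_tail % 64)) & 63
}

fn main() {
    assert_eq!(pre_pad_64(0), 0);    // already aligned, no padding
    assert_eq!(pre_pad_64(100), 28); // 100 + 28 = 128, a multiple of 64
    assert_eq!(pre_pad_64(128), 0);  // exact boundary, no padding
    // The 0.14.x formula is identical with 16/15 in place of 64/63.
}
```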
New Features:
- Debug-only alignment assertions via the `debug_assert_aligned()` function
- Offset-level alignment validation via `debug_assert_aligned_offset()`
- Enhanced documentation for alignment guarantees
Code Changes:
- simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-89 - New alignment assertion functions
- Configuration via the `PAYLOAD_ALIGNMENT` constant (64 bytes)
Migration Path: See Migrating from 0.14.x to 0.15.x above.
Sources: CHANGELOG.md:25-52 simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-89
Version 0.14.0-alpha (2025-09-08)
Type: Breaking storage format change
Primary Change: Introduced fixed payload alignment with pre-padding
Motivation:
- Enable zero-copy typed views (e.g., `&[u16]`, `&[u32]`, `&[u64]`)
- Guarantee SIMD-safe alignment for AVX2/NEON operations
- Improve CPU cache utilization
Breaking Changes:
- New on-disk layout with optional pre-pad bytes (0-15 bytes)
- Files written by 0.14.0-alpha are incompatible with ≤0.13.x readers
- Pre-pad calculation: `pad = (16 - (prev_tail % 16)) & 15`
New Features:
- Experimental `arrow` feature flag
- `EntryHandle::as_arrow_buffer()` method for Apache Arrow integration
- `EntryHandle::into_arrow_buffer()` method for zero-copy Arrow buffers
- Configurable alignment via the `PAYLOAD_ALIGN_LOG2` and `PAYLOAD_ALIGNMENT` constants
Storage Layout:
[Pre-Pad: 0-15 bytes] [Payload: variable] [Metadata: 20 bytes]
Configuration:
- Alignment controlled in `src/storage_engine/constants.rs`
- `PAYLOAD_ALIGN_LOG2 = 4` (2^4 = 16 bytes)
- `PAYLOAD_ALIGNMENT = 1 << PAYLOAD_ALIGN_LOG2`
Migration Path: See Migrating from 0.13.x or earlier to 0.14.x+ above.
Sources: CHANGELOG.md:55-82 README.md:112-138
graph TB
subgraph Phase1["Phase 1: Reader Upgrades"]
R1["Service A\n(Reader)"]
R2["Service B\n(Reader)"]
R3["Service C\n(Reader)"]
end
subgraph Phase2["Phase 2: Writer Upgrades"]
W1["Service D\n(Writer)"]
W2["Service E\n(Writer)"]
end
subgraph Phase3["Phase 3: Storage Migration"]
M1["Migrate shared\nstorage files"]
M2["Switch to new\nformat"]
end
Start["Start: All services\non old version"]
Start --> Phase1
Phase1 -->|Verify all readers operational| Phase2
Phase2 -->|Verify all writers operational| Phase3
Phase3 --> Complete["Complete: All services\non new version"]
Note1["Readers can handle\nold format during\ntransition"]
Note2["Writers continue\nproducing old format\nuntil migration"]
Phase1 -.-> Note1
Phase2 -.-> Note2
Multi-Service Deployment Strategy
When upgrading SIMD R Drive across multiple services that share storage files or communicate via the WebSocket RPC interface, a staged deployment approach prevents compatibility issues.
Recommended Upgrade Order
Deployment Steps
- Phase 1 - Upgrade all readers:
- Deploy new binaries to services that only read from storage
- Deploy new WebSocket clients (Python or Rust)
- Verify that services can still read existing (old format) data
- Monitor for errors or compatibility issues
- Phase 2 - Upgrade writers (optional hold):
- Deploy new binaries to services that write to storage
- Important: Writers should continue using the old format during this phase
- Verify writer functionality with old format
- Phase 3 - Migrate storage and switch format:
- Use migration procedure to regenerate storage files
- Update writer configuration to use new format
- Monitor write operations for alignment correctness
- Rollback strategy:
- Keep backups of old format files until migration is verified
- Keep old binaries available for emergency rollback
- Test rollback procedure before starting migration
Sources: CHANGELOG.md:43-52 CHANGELOG.md:75-82
graph TB
subgraph DebugMode["Debug/Test Mode"]
Call1["debug_assert_aligned(ptr, align)"]
Call2["debug_assert_aligned_offset(offset)"]
Check1["Verify ptr % align == 0"]
Check2["Verify offset % PAYLOAD_ALIGNMENT == 0"]
Pass1["Assertion passes"]
Fail1["Panic with alignment error"]
Call1 --> Check1
Call2 --> Check2
Check1 -->|Aligned| Pass1
Check1 -->|Misaligned| Fail1
Check2 -->|Aligned| Pass1
Check2 -->|Misaligned| Fail1
end
subgraph ReleaseMode["Release/Bench Mode"]
Call3["debug_assert_aligned(ptr, align)"]
Call4["debug_assert_aligned_offset(offset)"]
NoOp["No-op (zero cost)"]
Call3 --> NoOp
Call4 --> NoOp
end
Alignment Validation and Debugging
Starting with version 0.15.0, SIMD R Drive includes debug-time alignment assertions to catch alignment violations early in development.
Debug Assertion Functions
Implementation Details
The assertion functions are exported from simd-r-drive-entry-handle and can be called from any crate:
- `debug_assert_aligned(ptr: *const u8, align: usize)` - Validates pointer alignment in debug/test builds
- Used for verifying Arrow buffer compatibility
- Zero cost in release builds
- `debug_assert_aligned_offset(offset: u64)` - Validates file offset alignment in debug/test builds
- Checks against the `PAYLOAD_ALIGNMENT` constant
- Zero cost in release builds
Implementation approach:
- Functions always exist (stable symbol for cross-crate calls)
- Body is gated by `#[cfg(any(test, debug_assertions))]`
- Release/bench builds compile to no-ops with no runtime cost
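A sketch of that pattern, using the pointer-checking variant (illustration only, not the actual contents of debug_assert_aligned.rs):

```rust
/// The function always exists so other crates can link against it, but the
/// check itself is compiled only into test/debug builds.
#[inline]
pub fn debug_assert_aligned(ptr: *const u8, align: usize) {
    #[cfg(any(test, debug_assertions))]
    assert_eq!(
        (ptr as usize) % align,
        0,
        "pointer is not aligned to {align} bytes"
    );

    // In release/bench builds the parameters are simply unused (no-op).
    #[cfg(not(any(test, debug_assertions)))]
    let _ = (ptr, align);
}
```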
Sources: simd-r-drive-entry-handle/src/debug_assert_aligned.rs:1-89 CHANGELOG.md:33-42
Future Format Stability
Current Status
SIMD R Drive is in alpha (0.x.x-alpha) and storage format stability is not guaranteed. Each alignment change requires storage regeneration.
Path to Stability
Future work may include:
- Format version headers - Store format version in file header for automatic detection
- Multi-format readers - Support reading multiple alignment versions in a single binary
- In-place migration - Tools to migrate storage without full regeneration
- Stable 1.0 format - Commitment to format stability after 1.0 release
Current Recommendations
Until format stability is achieved:
- Plan for regeneration: Assume storage files will need regeneration on major updates
- Version pinning: Pin specific SIMD R Drive versions in production deployments
- Test thoroughly: Validate migration procedures in staging environments
- Keep backups: Maintain backups of storage files before migrations
- Monitor changelogs: Review CHANGELOG.md for breaking changes before upgrading
Sources: CHANGELOG.md:1-82 README.md:1-10
Summary: Key Takeaways
| Topic | Key Points |
|---|---|
| Format Compatibility | Each major version (0.13, 0.14, 0.15) has incompatible storage formats |
| Migration Required | Upgrading between major versions requires regenerating storage files |
| Alignment Evolution | Pre-0.14: none → 0.14.x: 16 bytes → 0.15.x: 64 bytes |
| Deployment Order | Upgrade readers first, then writers, then migrate storage |
| Debug Tools | Use debug_assert_aligned() functions to validate alignment in tests |
| Production Strategy | Pin versions, plan for regeneration, test migrations thoroughly |
For specific migration procedures, refer to the Migration Procedures section above. For understanding the current storage format, see Entry Structure and Metadata.
Sources: CHANGELOG.md:1-82 README.md:1-282