Write and Read Modes
Purpose and Scope
This document describes the three write modes and three read modes available in SIMD R Drive, implemented through the DataStoreWriter and DataStoreReader traits. Each mode provides specific operational characteristics, performance trade-offs, and concurrency behaviors suited to different workload patterns.
Write Modes (via DataStoreWriter trait):
- Single Entry : Atomic single key-value write with immediate flush
- Batch Entry : Multiple key-value pairs written in one atomic operation
- Streaming : Incremental write from a Read source for large payloads
Read Modes (via DataStoreReader trait):
- Direct Access : Zero-copy memory-mapped lookups returning EntryHandle
- Streaming : Buffered incremental reads via EntryStream implementing std::io::Read
- Parallel Iteration : Rayon-powered multi-threaded dataset scanning
For information about the underlying SIMD acceleration that optimizes these operations, see page 5.1. For details about the alignment strategy that enables efficient reads, see page 5.2.
Sources: README.md:29-36 README.md:208-246 src/traits.rs src/storage_engine/data_store.rs:752-1182
Write Modes
SIMD R Drive provides three distinct write modes, each optimized for different usage patterns. All write modes acquire an exclusive write lock (RwLock<BufWriter<File>>) to ensure thread safety and data consistency.
Single Entry Write
The single entry write mode writes one key-value pair atomically and flushes immediately to disk.
Trait: DataStoreWriter src/traits.rs
Primary Methods:
- write(key: &[u8], payload: &[u8]) -> Result<u64> src/storage_engine/data_store.rs:827-830
- write_with_key_hash(key_hash: u64, payload: &[u8]) -> Result<u64> src/storage_engine/data_store.rs:832-834
Both methods internally delegate to batch_write_with_key_hashes() with a single-element vector src/storage_engine/data_store.rs:833
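A sketch of that delegation, using the signatures documented on this page (the body is inferred rather than copied from the source, and Result is assumed to be std::io::Result):

```rust
impl DataStore {
    // Illustrative only: a single write is expressed as a one-element batch,
    // reusing the batch code path (the real code is at
    // src/storage_engine/data_store.rs:832-834).
    pub fn write_with_key_hash(&self, key_hash: u64, payload: &[u8]) -> std::io::Result<u64> {
        // allow_null_bytes = false: a plain write must not look like a tombstone.
        self.batch_write_with_key_hashes(vec![(key_hash, payload)], false)
    }
}
```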
Operation Flow:
Title: Single Entry Write Call Chain
Characteristics:
- Latency : Lowest for single operations (immediate flush)
- Throughput : Lower due to per-write overhead
- Disk I/O : One flush operation per write
- Use Case : Interactive operations, real-time updates, critical writes requiring immediate durability
Key Implementation Details:
| Component | Code Entity | Location |
|---|---|---|
| Hash computation | compute_hash(key) | src/storage_engine/digest.rs |
| Alignment calculation | DataStore::prepad_len(offset: u64) | src/storage_engine/data_store.rs:670-673 |
| Checksum | compute_checksum(payload) using CRC32C | src/storage_engine/digest.rs |
| Memory copy | simd_copy(dest, src) | src/storage_engine/mod.rs |
| Metadata | EntryMetadata::serialize() | simd-r-drive-entry-handle/src/lib.rs |
| Reindex | DataStore::reindex() | src/storage_engine/data_store.rs:224-259 |
Sources: README.md:212-215 src/storage_engine/data_store.rs:827-834 src/storage_engine/data_store.rs:847-939
Batch Write
Batch write mode writes multiple key-value pairs in a single atomic operation, flushing only once at the end.
Trait: DataStoreWriter src/traits.rs
Primary Methods:
- batch_write(entries: &[(&[u8], &[u8])]) -> Result<u64> src/storage_engine/data_store.rs:838-843
- batch_write_with_key_hashes(prehashed_keys: Vec<(u64, &[u8])>, allow_null_bytes: bool) -> Result<u64> src/storage_engine/data_store.rs:847-951
The batch_write() method computes hashes using compute_hash_batch() and delegates to batch_write_with_key_hashes() src/storage_engine/data_store.rs:839-842
Operation Flow:
Title: Batch Write Buffer Construction and Flush
Characteristics:
- Latency : Higher per-entry latency (amortized)
- Throughput : Significantly higher due to reduced disk I/O
- Disk I/O : Single flush for entire batch
- Memory : Builds entries in an in-memory buffer before writing src/storage_engine/data_store.rs:857-898
- Use Case : Bulk imports, batch processing, high-throughput ingestion
Performance Optimization:
The batch implementation constructs all entries in a single Vec<u8> buffer before any disk I/O src/storage_engine/data_store.rs:857-858, minimizing lock contention and maximizing sequential write performance.
Key optimization points:
- Single lock acquisition for entire batch src/storage_engine/data_store.rs:852-855
- Single write_all() call for all entries src/storage_engine/data_store.rs:933
- Single flush() call at end src/storage_engine/data_store.rs:934
- Single reindex() call updating all offsets atomically src/storage_engine/data_store.rs:936
- SIMD-accelerated copy via simd_copy() src/storage_engine/data_store.rs:925
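As a rough illustration of this single-buffer pattern (a sketch only; the real entry encoding, alignment pre-padding, checksums, and SIMD copy are elided):

```rust
use std::io::Write;

// Stage every entry in one Vec<u8>, then hit the disk once. The per-entry
// encoding below is a placeholder, not the actual on-disk format.
fn flush_batch<W: Write>(file: &mut W, entries: &[(u64, &[u8])]) -> std::io::Result<()> {
    let mut buf: Vec<u8> = Vec::new();
    for (key_hash, payload) in entries {
        buf.extend_from_slice(payload);                 // payload bytes
        buf.extend_from_slice(&key_hash.to_le_bytes()); // stand-in metadata
    }
    file.write_all(&buf)?; // single write_all() for the whole batch
    file.flush()           // single flush at the end
}
```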
Tombstone Support:
Batch writes support deletion markers when allow_null_bytes is true src/storage_engine/data_store.rs:850:
- Tombstones write a single NULL_BYTE (0x00) src/storage_engine/data_store.rs:889
- No pre-padding applied to tombstones src/storage_engine/data_store.rs:872
- Metadata links to previous tail as usual src/storage_engine/data_store.rs:873-886
- Used internally by delete() and batch_delete() operations
Sources: README.md:216-219 src/storage_engine/data_store.rs:838-951 benches/storage_benchmark.rs:85-92
Streaming Write
Streaming write mode writes large payloads incrementally from a Read source without requiring full in-memory buffering.
Trait: DataStoreWriter src/traits.rs
Primary Methods:
- write_stream<R: Read>(key: &[u8], reader: &mut R) -> Result<u64> src/storage_engine/data_store.rs:753-756
- write_stream_with_key_hash<R: Read>(key_hash: u64, reader: &mut R) -> Result<u64> src/storage_engine/data_store.rs:758-825
The generic R: Read parameter allows streaming from any Read implementation: files, network streams, in-memory buffers, etc.
Operation Flow:
Title: Streaming Write Incremental Read Loop
Characteristics:
- Memory Footprint : Constant (4096-byte buffer) src/storage_engine/constants.rs
- Payload Size : Unbounded (supports arbitrarily large entries)
- Disk I/O : Incremental writes, single flush at end
- Use Case : Large file storage, network streams, memory-constrained environments
Implementation Details:
The streaming write uses a fixed-size buffer and performs incremental writes while computing the checksum:
| Component | Code Entity | Size/Type | Location |
|---|---|---|---|
| Read Buffer | vec![0; WRITE_STREAM_BUFFER_SIZE] | 4096 bytes | src/storage_engine/data_store.rs:773 |
| Checksum State | crc32fast::Hasher | Incremental CRC32C | src/storage_engine/data_store.rs:775 |
| Pre-pad | Written via &pad[..prepad] | 0-63 bytes | src/storage_engine/data_store.rs:769-770 |
| Metadata | EntryMetadata::serialize() | 20 bytes | src/storage_engine/data_store.rs:808-813 |
| Buffer constant | WRITE_STREAM_BUFFER_SIZE | 4096 | src/storage_engine/constants.rs |
Validation:
The implementation validates payloads before writing:
- Empty payload check : Returns error if total_written == 0 src/storage_engine/data_store.rs:799-804
- NULL-byte-only check : Returns error if payload contains only NULL_BYTE (0x00) src/storage_engine/data_store.rs:792-797
- Both checks prevent conflicts with the tombstone format (single NULL byte)
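The core loop can be pictured like this (a simplified sketch; the real routine also handles pre-padding, metadata serialization, and the validation above):

```rust
use std::io::{Read, Write};

// Incremental copy with a fixed 4096-byte buffer and a streaming checksum.
fn stream_in<R: Read, W: Write>(reader: &mut R, file: &mut W) -> std::io::Result<(u64, u32)> {
    let mut buf = vec![0u8; 4096]; // WRITE_STREAM_BUFFER_SIZE
    let mut hasher = crc32fast::Hasher::new();
    let mut total_written = 0u64;
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // source exhausted
        }
        hasher.update(&buf[..n]);   // checksum computed as bytes flow through
        file.write_all(&buf[..n])?; // incremental write, constant memory
        total_written += n as u64;
    }
    Ok((total_written, hasher.finalize()))
}
```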
Sources: README.md:220-223 src/storage_engine/data_store.rs:753-825 src/storage_engine/constants.rs
Write Mode Comparison
Performance Table:
| Write Mode | Lock Duration | Disk Flushes | Memory Usage | Best For |
|---|---|---|---|---|
| Single | Short (per write) | 1 per write | Minimal | Interactive operations, real-time updates |
| Batch | Medium (entire batch) | 1 per batch | Buffer size × entries | Bulk imports, high throughput |
| Streaming | Long (entire stream) | 1 per stream | 4096 bytes (constant) | Large files, memory-constrained |
Throughput Characteristics:
Based on benchmark results benches/storage_benchmark.rs:52-83.
Sources: benches/storage_benchmark.rs:52-92 README.md:208-223
Read Modes
SIMD R Drive provides three read modes optimized for different access patterns. All read modes leverage zero-copy access through memory-mapped files.
Direct Read
Direct read mode provides immediate, zero-copy access to stored entries through EntryHandle.
Trait: DataStoreReader src/traits.rs
Primary Methods:
- read(key: &[u8]) -> Result<Option<EntryHandle>> src/storage_engine/data_store.rs:1040-1049
- read_with_key_hash(prehashed_key: u64) -> Result<Option<EntryHandle>> src/storage_engine/data_store.rs:1051-1059
- batch_read(keys: &[&[u8]]) -> Result<Vec<Option<EntryHandle>>> src/storage_engine/data_store.rs:1105-1109
- batch_read_hashed_keys(prehashed_keys: &[u64], non_hashed_keys: Option<&[&[u8]]>) -> Result<Vec<Option<EntryHandle>>> src/storage_engine/data_store.rs:1111-1158
- exists(key: &[u8]) -> Result<bool> src/storage_engine/data_store.rs:1030-1032
Operation Flow:
Title: Direct Read Index Lookup and EntryHandle Construction
Characteristics:
- Latency : Minimal (single hash lookup + pointer arithmetic)
- Memory : Zero-copy (returns view into mmap)
- Concurrency : Lock-free reads (except brief index lock)
- Use Case : Random access, key-value lookups, real-time queries
Zero-Copy Guarantee:
The EntryHandle struct provides direct memory-mapped access without payload copying:
EntryHandle Structure simd-r-drive-entry-handle/src/lib.rs:
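The authoritative definition lives in that file; the sketch below infers a plausible shape from the fields this page references (field names are not guaranteed):

```rust
use std::ops::{Deref, Range};
use std::sync::Arc;
use memmap2::Mmap;

pub struct EntryMetadata { /* key hash, prev offset, checksum, ... */ }

// Inferred shape; see simd-r-drive-entry-handle/src/lib.rs for the real one.
pub struct EntryHandle {
    mmap_arc: Arc<Mmap>,     // shared view of the mapped file
    range: Range<usize>,     // payload start/end offsets within the mmap
    metadata: EntryMetadata, // deserialized once during construction
}

impl Deref for EntryHandle {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        // Zero-copy: a slice directly into the memory map.
        &self.mmap_arc[self.range.clone()]
    }
}
```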
Key characteristics:
- Implements Deref<Target = [u8]> for direct slice access simd-r-drive-entry-handle/src/lib.rs
- Arc<Mmap> allows concurrent readers with the same data view
- Metadata deserialized once during construction src/storage_engine/data_store.rs:529
- No heap allocation for payload data
Batch Read Optimization:
The batch_read() method optimizes multiple key lookups src/storage_engine/data_store.rs:1105-1158:
- Single lock acquisition for entire batch src/storage_engine/data_store.rs:1117-1120
- Single get_mmap_arc() call shared across all entries src/storage_engine/data_store.rs:1116
- Vectorized hash computation via compute_hash_batch() src/storage_engine/data_store.rs:1106
- Iterator-based result collection src/storage_engine/data_store.rs:1134-1154
Returns Vec<Option<EntryHandle>> in the same order as the input keys.
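For example, a hedged sketch (crate import paths and store setup are assumed; batch_read matches the signature listed above):

```rust
use simd_r_drive::{DataStore, DataStoreReader}; // assumed paths

// Fetch several keys under one lock acquisition; results align with input order.
fn read_many(store: &DataStore) -> std::io::Result<()> {
    let keys: Vec<&[u8]> = vec![b"user:1", b"user:2", b"user:3"];
    for (key, entry) in keys.iter().zip(store.batch_read(&keys)?) {
        match entry {
            Some(handle) => println!("{:?}: {} bytes", key, handle.len()),
            None => println!("{:?}: missing", key),
        }
    }
    Ok(())
}
```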
Sources: README.md:228-233 src/storage_engine/data_store.rs:502-565 src/storage_engine/data_store.rs:1105-1158 simd-r-drive-entry-handle/src/lib.rs
Streaming Read
Streaming read mode provides incremental, buffered access to large entries without loading them fully into memory.
Primary Structure:
- EntryStream struct src/storage_engine/entry_stream.rs
- Implements std::io::Read trait src/storage_engine/entry_stream.rs
- Constructed via EntryStream::from(entry_handle: EntryHandle) src/storage_engine/entry_stream.rs
Operation Flow:
Title: EntryStream Buffered Read Implementation
Characteristics:
- Memory Footprint : 8192-byte internal buffer src/storage_engine/entry_stream.rs
- Copy Behavior : Non-zero-copy (copies from mmap to buffer)
- Payload Size : Supports arbitrarily large entries
- Use Case : Processing large entries incrementally, network transmission, streaming transformations
Implementation Details:
| Component | Code Entity | Size/Type |
|---|---|---|
| Internal buffer | inner_buffer: Vec<u8> | 8192 bytes |
| Position tracker | position: usize | Tracks read progress in mmap |
| Source data | EntryHandle.mmap_arc | Zero-copy reference to mmap |
| Payload bounds | EntryHandle.range | Start/end offsets |
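A simplified sketch of such a buffered Read implementation (the real type at src/storage_engine/entry_stream.rs keeps the 8192-byte inner buffer listed above; this version inlines the copy for brevity, and field names are inferred):

```rust
use std::io::Read;
use std::ops::Range;
use std::sync::Arc;
use memmap2::Mmap;

pub struct EntryStream {
    mmap_arc: Arc<Mmap>, // zero-copy reference to the mapped file
    range: Range<usize>, // payload bounds
    position: usize,     // read progress within the payload
}

impl Read for EntryStream {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        let payload = &self.mmap_arc[self.range.clone()];
        let remaining = payload.len() - self.position;
        let n = remaining.min(buf.len());
        // The buffered, non-zero-copy step: bytes move from mmap to buf.
        buf[..n].copy_from_slice(&payload[self.position..self.position + n]);
        self.position += n;
        Ok(n) // 0 signals end-of-stream once the payload is exhausted
    }
}
```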
Why Non-Zero-Copy:
Despite sourcing from mmap, EntryStream performs buffered copies:
- Controlled memory pressure : Constant 8KB footprint regardless of payload size
- std::io::Read compatibility : Standard interface for streaming I/O
- Incremental processing : Avoids loading multi-GB payloads into memory
- OS page management : Reduces page faults for sequential access patterns
Zero-Copy Alternative:
For true zero-copy access, use direct read mode: storage.read(key)?.map(|handle| handle.as_slice()). This provides &[u8] directly from mmap without intermediate copying.
Sources: README.md:234-241 src/storage_engine/entry_stream.rs
Parallel Iteration
Parallel iteration mode uses Rayon to process all valid entries across multiple threads (requires parallel feature).
Primary Methods:
- DataStore::iter_entries() -> EntryIterator src/storage_engine/data_store.rs:276-280
- DataStore::par_iter_entries() -> impl ParallelIterator<Item = EntryHandle> src/storage_engine/data_store.rs:296-361 (requires parallel feature)
- DataStore::into_iter() -> EntryIterator via IntoIterator impl src/storage_engine/data_store.rs:44-51
Related Structures:
- EntryIterator struct src/storage_engine/entry_iterator.rs
- Rayon's ParallelIterator trait [external dependency]
Operation Flow:
Title: Parallel Iteration Work Distribution
Characteristics:
- Throughput : Scales with CPU cores
- Concurrency : Work-stealing via Rayon
- Memory : Minimal overhead (offsets collected upfront)
- Use Case : Bulk analytics, dataset scanning, cache warming, batch transformations
Implementation Strategy:
The parallel iterator minimizes lock contention via upfront collection src/storage_engine/data_store.rs:296-361:
Phase 1: Preparation (Sequential, Locked)
1. Acquire read lock on RwLock<KeyIndexer> src/storage_engine/data_store.rs:300
2. Collect all packed values into a Vec<u64> src/storage_engine/data_store.rs:301
3. Release lock immediately src/storage_engine/data_store.rs:302
4. Clone Arc<Mmap> once, moved into the iterator src/storage_engine/data_store.rs:305
Phase 2: Processing (Parallel, Lock-Free)
5. Convert to Rayon parallel iterator: into_par_iter() src/storage_engine/data_store.rs:310
6. Each worker thread processes a subset of offsets src/storage_engine/data_store.rs:310-360:
- Unpacks (tag, offset) via KeyIndexer::unpack() src/storage_engine/data_store.rs:311
- Validates bounds: offset + METADATA_SIZE <= mmap_arc.len() src/storage_engine/data_store.rs:317-319
- Deserializes metadata: EntryMetadata::deserialize() src/storage_engine/data_store.rs:322
- Derives entry start from prev_offset + prepad_len() src/storage_engine/data_store.rs:328
- Filters tombstones (single NULL byte) src/storage_engine/data_store.rs:346-348
- Constructs EntryHandle with cloned mmap_arc src/storage_engine/data_store.rs:355-359
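In outline, the two-phase pattern looks like this (a sketch under assumed types; the KeyIndexer, unpacking, and EntryHandle construction are replaced by stand-ins):

```rust
use rayon::prelude::*;
use std::sync::{Arc, RwLock};
use memmap2::Mmap;

// Phase 1: copy packed values out under a brief shared read lock, then drop it.
// Phase 2: process entries in parallel with no locks held.
fn parallel_scan(index: &RwLock<Vec<u64>>, mmap: &Arc<Mmap>) -> usize {
    let packed: Vec<u64> = index.read().unwrap().clone(); // lock dropped here
    let mmap = Arc::clone(mmap); // cloned once, moved into the iterator

    packed
        .into_par_iter() // Rayon work-stealing across CPU cores
        .filter(move |&offset| {
            // Stand-in for the documented bounds check, metadata decode,
            // tombstone filter, and EntryHandle construction.
            (offset as usize) < mmap.len()
        })
        .count()
}
```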
Work Stealing: Rayon’s work-stealing scheduler automatically balances load across available CPU cores. No manual thread management required.
Sequential Iteration:
For single-threaded scanning, use iter_entries(), which returns EntryIterator src/storage_engine/data_store.rs:276-280. The DataStore also implements IntoIterator src/storage_engine/data_store.rs:44-51.
Sources: README.md:242-246 src/storage_engine/data_store.rs:276-361 benches/storage_benchmark.rs:98-118
Read Mode Comparison
Performance Table:
| Read Mode | Access Pattern | Memory Copy | Concurrency | Best For |
|---|---|---|---|---|
| Direct | Random/lookup | Zero-copy | Lock-free | Key-value queries, random access |
| Streaming | Sequential/buffered | Buffered copy | Single reader | Large entry processing |
| Parallel | Full scan | Zero-copy | Multi-threaded | Bulk analytics, dataset scanning |
Throughput Characteristics:
Based on benchmark measurements benches/storage_benchmark.rs:
| Operation | Throughput (1M entries, 8 bytes) | Notes |
|---|---|---|
| Sequential iteration | ~millions/s | Zero-copy, cache-friendly |
| Random single reads | ~1M reads/s | Hash lookup + bounds check |
| Batch reads | ~1M reads/s | Vectorized index access |
Sources: benches/storage_benchmark.rs:98-203 README.md:224-246
Performance Optimization Strategies
Write Optimization
Batching Strategy:
- Group writes into batches of 1024-10000 entries for optimal throughput
- Balance batch size against latency requirements
- Use streaming for payloads > 1 MB to avoid memory pressure
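For example (a hedged sketch; the DataStore handle, import paths, and the 4096-entry chunk size are assumptions):

```rust
use simd_r_drive::{DataStore, DataStoreWriter}; // assumed paths

// Feed a large import through batch_write() in fixed-size chunks:
// one lock acquisition and one flush per chunk instead of per entry.
fn bulk_import(store: &DataStore, entries: &[(&[u8], &[u8])]) -> std::io::Result<()> {
    for chunk in entries.chunks(4096) {
        store.batch_write(chunk)?; // returns the new tail offset
    }
    Ok(())
}
```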
Lock Contention:
- All writes acquire the same RwLock<BufWriter<File>>
- Increase batch size to amortize lock overhead
- Consider application-level write queuing for highly concurrent workloads
Read Optimization
Access Pattern Matching:
| Access Pattern | Recommended Mode | Reason |
|---|---|---|
| Random lookups | Direct read | O(1) hash lookup, zero-copy |
| Known key sets | Batch read | Amortized lock overhead |
| Full dataset scan | Sequential iteration | Cache-friendly forward traversal |
| Parallel analytics | Parallel iteration | Scales with CPU cores |
| Large entry processing | Streaming read | Constant memory footprint |
Memory-Mapped File Behavior:
The OS manages mmap pages transparently:
- Working set : Only accessed regions loaded into RAM
- Large datasets : Can exceed available RAM (pages swapped on demand)
- Cache warming : Sequential iteration benefits from read-ahead
- Random access : May trigger page faults (disk I/O) on cold reads
Sources: benches/storage_benchmark.rs README.md:43-50
Concurrency Considerations
Write Concurrency
All write modes acquire the same exclusive lock and are mutually exclusive:
Title: Write Lock Contention Model
Lock acquisition: self.file.write() src/storage_engine/data_store.rs:759-762 src/storage_engine/data_store.rs:852-855
Implication: High write concurrency benefits from:
- Using batch_write() to amortize lock overhead
- Application-level write queuing/buffering
- Accepting increased per-write latency for higher throughput
Read Concurrency
Read operations acquire brief read locks and access shared Arc<Mmap> references:
Title: Read Lock Independence Model
```mermaid
graph TB
    read_thread1["Thread 1: read()"]
    read_thread2["Thread 2: batch_read()"]
    read_thread3["Thread 3: par_iter_entries()"]
    index_lock["Arc<RwLock<KeyIndexer>>\nself.key_indexer.read() - Shared"]
    get_mmap_arc["self.get_mmap_arc()"]
    mmap_lock["Mutex<Arc<Mmap>>"]
    mmap_arc["Arc<Mmap> (cloned)"]
    read_thread1 --> get_mmap_arc
    read_thread2 --> get_mmap_arc
    read_thread3 --> get_mmap_arc
    get_mmap_arc --> mmap_lock
    mmap_lock -->|brief lock| mmap_arc
    read_thread1 --> index_lock
    read_thread2 --> index_lock
    read_thread3 --> index_lock
    index_lock -->|concurrent read access| lookup["KeyIndexer::get_packed()"]
    subgraph "Lock-Free Phase"
        mmap_access["Direct &[u8] access via Arc<Mmap>"]
        entry_handle["Construct EntryHandle"]
    end
    mmap_arc --> mmap_access
    lookup --> entry_handle
    mmap_access --> entry_handle
    subgraph "Write Thread (Independent)"
        write_thread["Thread 4: write()"]
        file_lock_write["Arc<RwLock<BufWriter<File>>>.write()"]
        reindex_call["reindex()"]
        new_mmap["Creates new Arc<Mmap>"]
    end
    write_thread --> file_lock_write
    file_lock_write --> reindex_call
    reindex_call -.->|Replaces in Mutex| new_mmap
    new_mmap -.->|Old Arc<Mmap> refs remain valid| mmap_arc
```
Key characteristics:
- Multiple readers : Concurrent reads via shared RwLock read locks
- Lock independence : Write lock (RwLock<BufWriter>) separate from read locks (RwLock<KeyIndexer>)
- Mmap stability : Readers hold Arc<Mmap> clones; writes create a new mmap but old references remain valid
- Eventual consistency : New reads see updates after reindex() completes src/storage_engine/data_store.rs:224-259
Sources: README.md:170-207 src/storage_engine/data_store.rs:224-259 src/storage_engine/data_store.rs:658-663
Usage Examples
Write Mode Selection
Single Write (Real-Time Updates):
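A hedged sketch (import paths and DataStore setup are assumed; write() matches the signature documented above, with Result assumed to be std::io::Result):

```rust
use simd_r_drive::{DataStore, DataStoreWriter}; // assumed paths

// One atomic write, flushed before returning: durable as soon as it succeeds.
fn record_event(store: &DataStore, key: &[u8], payload: &[u8]) -> std::io::Result<u64> {
    store.write(key, payload) // returns the new tail offset
}
```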
Batch Write (Bulk Import):
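A sketch under the same assumptions; batch_write() takes all pairs in one atomic operation:

```rust
use simd_r_drive::{DataStore, DataStoreWriter}; // assumed paths

// All pairs share a single lock acquisition and a single flush.
fn import_users(store: &DataStore) -> std::io::Result<u64> {
    let entries: Vec<(&[u8], &[u8])> = vec![
        (&b"user:1"[..], &b"alice"[..]),
        (&b"user:2"[..], &b"bob"[..]),
    ];
    store.batch_write(&entries)
}
```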
Streaming Write (Large Files):
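A sketch under the same assumptions; any std::io::Read source works:

```rust
use simd_r_drive::{DataStore, DataStoreWriter}; // assumed paths
use std::fs::File;

// Stream a large file into the store through the constant 4096-byte buffer.
fn store_file(store: &DataStore, key: &[u8], path: &str) -> std::io::Result<u64> {
    let mut file = File::open(path)?;
    store.write_stream(key, &mut file)
}
```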
Read Mode Selection
Direct Read (Key Lookup):
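A sketch under the same assumptions; the returned handle derefs to the mmap-backed payload:

```rust
use simd_r_drive::{DataStore, DataStoreReader}; // assumed paths

// Zero-copy lookup: no payload bytes are copied out of the memory map.
fn payload_len(store: &DataStore, key: &[u8]) -> std::io::Result<Option<usize>> {
    Ok(store.read(key)?.map(|handle| handle.len()))
}
```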
Streaming Read (Large Entry Processing):
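A sketch under the same assumptions; EntryStream wraps the handle in a buffered std::io::Read:

```rust
use simd_r_drive::{DataStore, DataStoreReader, EntryStream}; // assumed paths
use std::io::Read;

// Process a large entry chunk-by-chunk with constant memory.
fn count_nonzero_bytes(store: &DataStore, key: &[u8]) -> std::io::Result<u64> {
    let Some(handle) = store.read(key)? else { return Ok(0) };
    let mut stream = EntryStream::from(handle); // buffered std::io::Read
    let mut buf = [0u8; 4096];
    let mut count = 0u64;
    loop {
        let n = stream.read(&mut buf)?;
        if n == 0 { break; }
        count += buf[..n].iter().filter(|&&b| b != 0).count() as u64;
    }
    Ok(count)
}
```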
Parallel Iteration (Dataset Analytics):
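A sketch under the same assumptions; requires the parallel feature:

```rust
use simd_r_drive::DataStore; // assumed path
use rayon::prelude::*;

// Scan every valid entry across all CPU cores and sum payload sizes.
fn total_payload_bytes(store: &DataStore) -> usize {
    store
        .par_iter_entries() // offsets collected under a brief lock, then lock-free
        .map(|handle| handle.len()) // EntryHandle derefs to &[u8]
        .sum()
}
```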
Sources: README.md src/storage_engine/data_store.rs benches/storage_benchmark.rs