
Concurrency and Thread Safety

Purpose and Scope

This document describes the concurrency model and thread safety guarantees of the SIMD R Drive storage engine. It covers the synchronization primitives used to enable safe multi-threaded access within a single process, including lock strategies for reads and writes, atomic operations, and memory map management.

For information about the core storage architecture and data structures, see Storage Architecture. For details on memory-mapped file usage, see Memory Management and Zero-Copy Access.

Key Limitation: The concurrency mechanisms described here apply only to single-process, multi-threaded environments. Multiple processes accessing the same storage file simultaneously are not supported and require external file locking mechanisms.


Concurrency Model Overview

The DataStore structure uses a combination of read-write locks, atomic operations, and mutexes to enable safe concurrent access across multiple threads while maintaining data consistency.

Diagram: DataStore Synchronization Architecture

```mermaid
graph TB
    subgraph "DataStore Synchronization Primitives"
        FILE["Arc<RwLock<BufWriter<File>>>\nfile"]
        MMAP["Arc<Mutex<Arc<Mmap>>>\nmmap"]
        TAIL["AtomicU64\ntail_offset"]
        INDEX["Arc<RwLock<KeyIndexer>>\nkey_indexer"]
    end

    subgraph "Write Operations"
        W_STREAM["write_stream"]
        W_SINGLE["write"]
        W_BATCH["batch_write"]
    end

    subgraph "Read Operations"
        R_SINGLE["read"]
        R_BATCH["batch_read"]
        R_ITER["iter_entries"]
    end

    W_STREAM --> FILE
    W_SINGLE --> FILE
    W_BATCH --> FILE

    W_STREAM --> TAIL
    W_SINGLE --> TAIL
    W_BATCH --> TAIL

    W_STREAM -.updates.-> MMAP
    W_SINGLE -.updates.-> MMAP
    W_BATCH -.updates.-> MMAP

    W_STREAM -.updates.-> INDEX
    W_SINGLE -.updates.-> INDEX
    W_BATCH -.updates.-> INDEX

    R_SINGLE --> INDEX
    R_BATCH --> INDEX
    R_ITER --> MMAP

    R_SINGLE --> MMAP
    R_BATCH --> MMAP
```
Sources: src/storage_engine/data_store.rs:27-33 README.md:172-183


Synchronization Primitives

DataStore Field Overview

The DataStore struct contains four primary fields that implement concurrency control:

| Field | Type | Purpose | Lock Type |
| --- | --- | --- | --- |
| file | Arc<RwLock<BufWriter<File>>> | File handle for writes | Read-write lock |
| mmap | Arc<Mutex<Arc<Mmap>>> | Memory-mapped view | Exclusive mutex |
| tail_offset | AtomicU64 | Current file end position | Lock-free atomic |
| key_indexer | Arc<RwLock<KeyIndexer>> | Hash index for lookups | Read-write lock |
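
Put together, the synchronization fields look roughly like the following sketch (a simplification: KeyIndexer is the project's own index type, and the real struct may carry additional fields):

```rust
use std::fs::File;
use std::io::BufWriter;
use std::sync::atomic::AtomicU64;
use std::sync::{Arc, Mutex, RwLock};

use memmap2::Mmap; // memory-mapped file type used by the engine

pub struct DataStore {
    /// Write handle; the RwLock serializes all appends.
    file: Arc<RwLock<BufWriter<File>>>,
    /// Current mapped view; the Mutex guards swaps, while the inner
    /// Arc keeps superseded maps alive for readers still using them.
    mmap: Arc<Mutex<Arc<Mmap>>>,
    /// End of valid data; read and updated lock-free.
    tail_offset: AtomicU64,
    /// Key-hash -> offset index; many readers, one writer.
    key_indexer: Arc<RwLock<KeyIndexer>>,
    // ... other fields elided ...
}
```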

Sources: src/storage_engine/data_store.rs:27-33


RwLock for File Writes

All write operations acquire an exclusive write lock on the file handle to prevent concurrent modifications.

Diagram: Write Lock Serialization

```mermaid
sequenceDiagram
    participant T1 as "Thread 1"
    participant T2 as "Thread 2"
    participant FILE as "RwLock&lt;File&gt;"

    T1->>FILE: write() - acquire
    Note over T1,FILE: Thread 1 holds write lock
    T2->>FILE: write() - blocks
    Note over T2: Thread 2 waits
    T1->>FILE: write data + flush
    T1->>FILE: release lock
    Note over FILE: Lock released
    T2->>FILE: acquire lock
    Note over T2,FILE: Thread 2 now writes
    T2->>FILE: write data + flush
    T2->>FILE: release lock
```

Write Lock Acquisition

Write operations acquire the lock at the start of the write process:
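
A minimal sketch of that acquisition, assuming the standard library RwLock and ignoring lock poisoning:

```rust
// Exclusive access to the file handle: every other writer blocks
// here until the current write completes and the guard is dropped.
let mut file = self.file.write().unwrap();

// ... append pre-padding, payload, and metadata, then flush ...

// Dropping `file` at end of scope releases the lock.
```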

This pattern appears in write_stream, write, and batch_write.

The write lock ensures that only one thread can append data to the file at any given time, preventing:

  • Race conditions on file position
  • Interleaved writes corrupting the append-only chain
  • Inconsistent metadata ordering

Sources: src/storage_engine/data_store.rs:752-825 src/storage_engine/data_store.rs:847-945 README.md:176


AtomicU64 for Tail Offset

The tail_offset field tracks the current end of the valid data in the storage file using atomic operations, enabling lock-free reads of the current file position.

Atomic Operations Used

| Operation | Method | Purpose |
| --- | --- | --- |
| Load | load(Ordering::Acquire) | Read current tail position |
| Store | store(offset, Ordering::Release) | Update tail after write |

Load Operation

Reads use Acquire ordering, which pairs with the writer's Release store so that a reader observing a given tail offset also observes every write made before it:
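
For example, a sketch of the load:

```rust
use std::sync::atomic::Ordering;

// Acquire pairs with the writer's Release store: once this load
// observes a tail offset, all file writes that preceded the store
// are guaranteed to be visible too.
let tail = self.tail_offset.load(Ordering::Acquire);
```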


Store Operation

Writes use Release ordering to ensure that all writes made before the store become visible to readers that subsequently load the offset:
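
A corresponding sketch of the store, with new_tail standing in for the offset computed by the write path:

```rust
use std::sync::atomic::Ordering;

// Release publishes everything written and flushed above; readers
// that load this offset with Acquire will see the new data.
self.tail_offset.store(new_tail, Ordering::Release);
```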

Location: src/storage_engine/data_store.rs:256

This atomic coordination ensures that:

  • Readers always see a consistent tail offset
  • Writers update the tail only after data is flushed
  • No locks are needed for reading the tail position

Sources: src/storage_engine/data_store.rs:30 src/storage_engine/data_store.rs:256 src/storage_engine/data_store.rs:278 README.md:182


Mutex for Memory Map

The memory-mapped file reference is protected by a Mutex<Arc<Mmap>> to prevent concurrent remapping during reads.

Diagram: Memory Map Arc Cloning Pattern

```mermaid
graph LR
    subgraph "Memory Map Management"
        MUTEX["Mutex&lt;Arc&lt;Mmap&gt;&gt;"]
        MMAP1["Arc&lt;Mmap&gt; v1"]
        MMAP2["Arc&lt;Mmap&gt; v2"]
    end

    subgraph "Readers"
        R1["Reader Thread 1"]
        R2["Reader Thread 2"]
    end

    subgraph "Writer"
        W["Writer Thread"]
    end

    R1 -.clones.-> MMAP1
    R2 -.clones.-> MMAP1

    W -->|1. Lock mutex| MUTEX
    W -->|2. Create new| MMAP2
    W -->|3. Replace| MUTEX
    W -->|4. Release| MUTEX

    MMAP1 -.remains valid.-> R1
    MMAP1 -.remains valid.-> R2
```

Accessing the Memory Map

Read operations clone the Arc<Mmap> to obtain a stable reference:
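
A sketch of the clone-under-mutex pattern:

```rust
use std::sync::Arc;

// Hold the mutex only long enough to clone the Arc; the clone bumps
// the reference count, so this map stays valid even if a writer
// swaps in a new one a moment later.
let mmap: Arc<Mmap> = {
    let guard = self.mmap.lock().unwrap();
    Arc::clone(&*guard)
};

// From here on, `mmap` can be indexed without holding any lock.
```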

Source: src/storage_engine/data_store.rs:658-663

This pattern ensures:

  • Readers hold a reference to a specific memory map version
  • Writers can create a new memory map without invalidating existing readers
  • The Arc reference counting prevents premature deallocation
  • The mutex is held only briefly during the clone operation

Remapping After Writes

After writing and flushing data, the reindex function creates a new memory map:
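
A sketch of the remap step, assuming the memmap2 crate and eliding error handling and the index rebuild; `writer` stands in for the file guard the writer already holds:

```rust
use std::sync::Arc;

use memmap2::Mmap;

// The writer already holds the file lock at this point, so the
// underlying File handle is stable while we re-map it.
// SAFETY: the file is append-only; bytes covered by older maps
// never change underneath readers still holding them.
let new_mmap = Arc::new(unsafe { Mmap::map(writer.get_ref())? });

// Swap the shared reference; readers holding the old Arc keep a
// valid (if slightly stale) view until they drop it.
*self.mmap.lock().unwrap() = new_mmap;
```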

Source: src/storage_engine/data_store.rs:231-255

Sources: src/storage_engine/data_store.rs:29 src/storage_engine/data_store.rs:224-259 src/storage_engine/data_store.rs:658-663 README.md:180


RwLock for Key Index

The KeyIndexer is protected by a read-write lock, allowing multiple concurrent readers but exclusive writers.

Read Access Pattern

Multiple threads can acquire read locks simultaneously for lookups:
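
A sketch of a concurrent lookup, assuming the get_packed accessor that appears in the lock-free read diagram later in this document:

```rust
// A read guard does not exclude other readers, so lookups from
// many threads proceed in parallel.
let index = self.key_indexer.read().unwrap();
if let Some(packed) = index.get_packed(&key_hash) {
    // decode the offset from `packed`, drop the guard, then read
    // the entry from the memory map ...
}
```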

Example: src/storage_engine/data_store.rs:509

Write Access Pattern

Index updates require exclusive write access:
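
A sketch of an update, with insert standing in for whatever mutation method KeyIndexer actually exposes:

```rust
// Exclusive guard: blocks all readers and other writers while the
// new key-hash -> offset mapping is installed.
let mut index = self.key_indexer.write().unwrap();
index.insert(key_hash, packed_offset);
// Guard dropped here; readers resume with the updated index.
```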

Source: src/storage_engine/data_store.rs:233-253

Parallel Iterator Lock Strategy

The parallel iterator minimizes lock holding time by collecting offsets first:
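
A sketch of the two-phase pattern, assuming rayon for the parallel side and an iterator over the index entries (both assumptions):

```rust
use rayon::prelude::*;

// Phase 1: hold the read lock just long enough to snapshot offsets.
let offsets: Vec<u64> = {
    let index = self.key_indexer.read().unwrap();
    index.iter().map(|(_key_hash, offset)| offset).collect()
};

// Phase 2: no lock held; each task reads its entry straight from
// the shared memory map.
offsets.par_iter().for_each(|&offset| {
    // ... decode the entry at `offset` ...
});
```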

Source: src/storage_engine/data_store.rs:300-302

Sources: src/storage_engine/data_store.rs:31 src/storage_engine/data_store.rs:233-253 src/storage_engine/data_store.rs:300-302 README.md:178


Lock-Free Read Operations

Read operations achieve lock-free access through memory-mapped files and atomic operations.

Diagram: Concurrent Lock-Free Read Pattern

```mermaid
sequenceDiagram
    participant R1 as "Reader 1"
    participant R2 as "Reader 2"
    participant R3 as "Reader 3"
    participant INDEX as "RwLock&lt;KeyIndexer&gt;"
    participant MMAP as "Arc&lt;Mmap&gt;"

    par Concurrent Reads
        R1->>INDEX: read() - acquire
        R2->>INDEX: read() - acquire
        R3->>INDEX: read() - acquire
    end

    par Index Lookups
        R1->>INDEX: get_packed(key_hash_1)
        R2->>INDEX: get_packed(key_hash_2)
        R3->>INDEX: get_packed(key_hash_3)
    end

    Note over R1,R3: All readers release index lock

    par Zero-Copy Access
        R1->>MMAP: Access offset_1
        R2->>MMAP: Access offset_2
        R3->>MMAP: Access offset_3
    end

    Note over R1,MMAP: No locks during data access
```

Zero-Copy Read Implementation

Once the offset is obtained from the index, data access is lock-free:
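
A sketch, with start and end standing in for the bounds decoded from the index entry:

```rust
// `mmap` is this thread's Arc<Mmap> clone; slicing it borrows the
// mapped bytes directly -- no locks, no copy.
let entry_bytes: &[u8] = &mmap[start as usize..end as usize];
```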

Source: src/storage_engine/data_store.rs:502-565

Benefits of Lock-Free Reads

  1. No Read Contention: Multiple readers access different memory regions simultaneously
  2. Zero-Copy: Data is accessed directly from the memory map without copying
  3. Scalability: Read throughput scales linearly with CPU cores
  4. Low Latency: No lock acquisition overhead after index lookup

Sources: README.md:174 src/storage_engine/data_store.rs:502-565 tests/concurrency_tests.rs:163-229


Write Synchronization

Write operations are fully serialized through the file lock, ensuring consistency.

Diagram: Write Operation Synchronization Flow

```mermaid
graph TB
    subgraph "Write Operation Phases"
        ACQUIRE["1. Acquire RwLock&lt;File&gt;"]
        LOAD["2. Load tail_offset\n(Atomic)"]
        CALC["3. Calculate pre-padding"]
        WRITE["4. Write payload + metadata"]
        FLUSH["5. Flush to disk"]
        REMAP["6. Remap file (Mutex)"]
        UPDATE["7. Update index (RwLock)"]
        STORE["8. Store new tail_offset\n(Atomic)"]
        RELEASE["9. Release file lock"]
    end

    ACQUIRE --> LOAD
    LOAD --> CALC
    CALC --> WRITE
    WRITE --> FLUSH
    FLUSH --> REMAP
    REMAP --> UPDATE
    UPDATE --> STORE
    STORE --> RELEASE
```

Single Write Flow
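
Read as code, the nine phases look roughly like this sketch (pre-padding, metadata framing, the remap, and the index update are elided or simplified):

```rust
use std::io::Write;
use std::sync::atomic::Ordering;

fn write_sketch(&self, payload: &[u8]) -> std::io::Result<u64> {
    // 1. Acquire the exclusive file lock; serializes all writers.
    let mut file = self.file.write().unwrap();

    // 2. Lock-free read of where valid data currently ends.
    let tail = self.tail_offset.load(Ordering::Acquire);

    // 3-4. Calculate pre-padding (elided), then append the entry.
    file.write_all(payload)?;

    // 5. Flush so the bytes are durable before they are published.
    file.flush()?;

    // 6-7. Remap the file and update the key index (see the
    //      earlier sketches for both steps).

    // 8. Publish the new tail; Release pairs with readers' Acquire.
    let new_tail = tail + payload.len() as u64;
    self.tail_offset.store(new_tail, Ordering::Release);

    // 9. The file lock is released when `file` goes out of scope.
    Ok(new_tail)
}
```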

Source: src/storage_engine/data_store.rs:758-825

Batch Write Optimization

Batch writes hold the lock once for multiple entries:
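
A sketch of the amortization, assuming an iterable batch of key/payload pairs:

```rust
use std::io::Write;

// One lock acquisition and one flush for the whole batch.
let mut file = self.file.write().unwrap();
for (_key, payload) in &entries {
    file.write_all(payload)?; // metadata framing elided
}
file.flush()?;
// remap, reindex, and store the new tail once, then drop the lock
```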

Source: src/storage_engine/data_store.rs:847-945

Sources: src/storage_engine/data_store.rs:752-825 src/storage_engine/data_store.rs:847-945 README.md:176


Thread Safety Guarantees

Thread Safety Matrix

The following table summarizes thread safety guarantees for different environments:

| Environment | Reads | Writes | Index Updates | Storage Safety |
| --- | --- | --- | --- | --- |
| Single Process, Single Thread | ✅ Safe | ✅ Safe | ✅ Safe | ✅ Safe |
| Single Process, Multi-Threaded | ✅ Safe (lock-free, zero-copy) | ✅ Safe (RwLock<File>) | ✅ Safe (RwLock<KeyIndexer>) | ✅ Safe (Mutex<Arc<Mmap>>) |
| Multiple Processes, Shared File | ⚠️ Unsafe (no cross-process coordination) | ❌ Unsafe (no external locking) | ❌ Unsafe (separate memory spaces) | ❌ Unsafe (risk of race conditions) |

Source: README.md:196-200

Safe Concurrency Properties

The design ensures the following properties in single-process, multi-threaded environments:

Diagram: Thread Safety Property Dependencies

```mermaid
graph TB
    subgraph "Thread Safety Properties"
        P1["Atomic Tail Updates\nNo torn reads/writes"]
        P2["Serialized File Writes\nNo interleaved data"]
        P3["Consistent Index View\nRwLock guarantees"]
        P4["Valid Memory Maps\nArc prevents premature free"]
        P5["Backward Chain Integrity\nSequential offsets"]
    end

    P1 --> SAFE["Thread-Safe\nMulti-Reader/Single-Writer"]
    P2 --> SAFE
    P3 --> SAFE
    P4 --> SAFE
    P5 --> SAFE
```

  1. Atomicity: All operations on shared state are atomic or properly locked
  2. Visibility: Changes made by one thread are visible to others through Release/Acquire semantics
  3. Ordering: The append-only design ensures writes happen in a strict sequence
  4. Isolation: Readers see a consistent snapshot via Arc<Mmap> cloning

Sources: README.md:172-206 src/storage_engine/data_store.rs:27-33


Single-Process vs Multi-Process

Single-Process Multi-Threaded (Supported)

All synchronization primitives work correctly within a single process:

Diagram: Single-Process Shared State

Example from concurrency tests:
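
A hedged sketch of such a test; DataStore::open and the exact write signature are assumptions here:

```rust
use std::sync::Arc;
use std::thread;

// All threads share one DataStore through an Arc.
let store = Arc::new(DataStore::open("storage.bin")?);

let handles: Vec<_> = (0..8)
    .map(|i| {
        let store = Arc::clone(&store);
        thread::spawn(move || {
            let key = format!("key-{i}");
            // Internally serialized by the file RwLock.
            store.write(key.as_bytes(), b"payload").unwrap();
        })
    })
    .collect();

for handle in handles {
    handle.join().unwrap(); // every write must have succeeded
}
```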

Source: tests/concurrency_tests.rs:117-137

Multi-Process (Not Supported)

Multiple processes have separate address spaces and cannot share the in-memory synchronization primitives:

Diagram: Multi-Process Unsafe Access

```mermaid
graph TB
    subgraph "Process 1"
        P1_DS["DataStore"]
        P1_INDEX["KeyIndexer\n(separate)"]
        P1_MMAP["Mmap\n(separate)"]
    end

    subgraph "Process 2"
        P2_DS["DataStore"]
        P2_INDEX["KeyIndexer\n(separate)"]
        P2_MMAP["Mmap\n(separate)"]
    end

    FILE["storage.bin\n(shared file)"]

    P1_DS --> P1_INDEX
    P1_DS --> P1_MMAP
    P2_DS --> P2_INDEX
    P2_DS --> P2_MMAP

    P1_MMAP -.unsafe.-> FILE
    P2_MMAP -.unsafe.-> FILE
```

Why Multi-Process is Unsafe

  1. Separate Index State: Each process has its own KeyIndexer in memory
  2. Independent Mmap Views: Memory maps are not synchronized across processes
  3. No Lock Coordination: RwLock and Mutex are process-local, not system-wide
  4. Race Conditions: Concurrent writes can corrupt the file structure

Recommendation: Use external file locking (e.g., flock, advisory locks) if multi-process access is required, as sketched below.
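
For instance, a sketch using the fs2 crate's advisory locks (an external crate, not part of SIMD R Drive):

```rust
use std::fs::File;

use fs2::FileExt; // advisory cross-process file locks

let lock_file = File::open("storage.bin")?;
lock_file.lock_exclusive()?; // blocks until no other process holds it
// ... open the DataStore and perform reads/writes ...
lock_file.unlock()?;
```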

Sources: README.md:186-206 README.md:189-191


Testing Concurrency

The test suite validates concurrent access patterns to ensure thread safety guarantees.

Concurrent Write Test

Tests multiple threads writing simultaneously, along the lines of the sketch shown in the single-process section above:

Source: tests/concurrency_tests.rs:111-161

Interleaved Read-Write Test

Tests read-after-write consistency with coordinated threads:
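
A hedged sketch of the coordination, using a channel to order the read after the write (the read/write signatures are assumptions):

```rust
use std::sync::{mpsc, Arc};
use std::thread;

let store = Arc::new(DataStore::open("storage.bin")?);
let (tx, rx) = mpsc::channel();

let writer = {
    let store = Arc::clone(&store);
    thread::spawn(move || {
        store.write(b"key-1", b"value-1").unwrap();
        tx.send(()).unwrap(); // signal: write is flushed and indexed
    })
};

// The reader blocks until the writer signals, then must observe
// the value thanks to the Release/Acquire tail-offset handshake.
rx.recv().unwrap();
let value = store.read(b"key-1").unwrap();
assert_eq!(value.as_slice(), b"value-1");
writer.join().unwrap();
```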

Source: tests/concurrency_tests.rs:163-229

Concurrent Streamed Write Test

Tests slow, streaming writes that hold the lock for extended periods:
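
A hedged sketch, assuming write_stream accepts any std::io::Read source:

```rust
use std::io::Read;
use std::{thread, time::Duration};

// A deliberately slow source: each read() call yields a small chunk
// after a delay, so write_stream holds the file lock for a while.
struct SlowReader {
    remaining: usize,
}

impl Read for SlowReader {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        if self.remaining == 0 {
            return Ok(0); // EOF
        }
        thread::sleep(Duration::from_millis(10));
        let n = buf.len().min(self.remaining).min(64);
        buf[..n].fill(0xAB);
        self.remaining -= n;
        Ok(n)
    }
}

// Writes from other threads queue behind this streamed write,
// preserving the append-only ordering.
let mut source = SlowReader { remaining: 4096 };
store.write_stream(b"streamed-key", &mut source)?;
```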

Source: tests/concurrency_tests.rs:14-109

Sources: tests/concurrency_tests.rs:1-230


Summary

The SIMD R Drive concurrency model provides thread-safe access through a carefully coordinated set of synchronization primitives:

  • RwLock<BufWriter<File>>: serializes file writes so only one thread appends at a time
  • AtomicU64: provides lock-free tail-offset tracking
  • Mutex<Arc<Mmap>>: protects memory-map swaps without invalidating readers holding older views
  • RwLock<KeyIndexer>: enables highly concurrent index reads with exclusive write access

This design achieves:

  • Zero-copy concurrent reads via memory mapping
  • Serialized writes preventing data corruption
  • Linear read scalability across CPU cores
  • Consistent snapshots through atomic operations

However, these guarantees apply only within a single process. Multi-process access requires external coordination mechanisms.

Sources: README.md:170-206 src/storage_engine/data_store.rs:26-33 tests/concurrency_tests.rs:1-230
