
This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Memory Management and Zero-Copy Access

Purpose and Scope

This document describes the memory management strategy used by SIMD R Drive’s core storage engine, focusing on memory-mapped file access and zero-copy read patterns. It covers the memmap2 crate integration, the Arc<Mmap> shared reference architecture, and how EntryHandle provides zero-copy views into stored data.

For details on entry structure and metadata organization, see Entry Structure and Metadata. For concurrency mechanisms that protect memory-mapped access, see Concurrency and Thread Safety.


Memory-Mapped File Architecture

Core mmap Integration

The storage engine uses the memmap2 crate to memory-map the entire storage file, allowing direct access to file contents without explicit read system calls. The memory-mapped region is managed through a layered reference-counting structure:

```
Arc<Mutex<Arc<Mmap>>>
    │
    ├─ Outer Arc: Shared across DataStore clones
    ├─ Mutex: Serializes remapping operations
    └─ Inner Arc<Mmap>: Shared across readers
```

Sources: src/storage_engine/data_store.rs:1-30

DataStore mmap Field Structure

The DataStore struct maintains the memory map using nested Arc wrappers:

| Layer | Type | Purpose |
|-------|------|---------|
| Outer | Arc<Mutex<...>> | Allows shared ownership of the mutex across DataStore instances |
| Mutex | Mutex<...> | Serializes remapping operations during writes |
| Inner | Arc<Mmap> | Enables zero-cost cloning for concurrent readers |
| Core | Mmap | The actual memory-mapped file region from memmap2 |

This structure enables:

  • Multiple readers to hold Arc<Mmap> references simultaneously
  • Safe remapping after writes without invalidating existing reader references
  • Lock-free reads once an Arc<Mmap> is obtained

Sources: src/storage_engine/data_store.rs:26-33 README.md:174-183


Memory Map Initialization and Remapping

Initial Mapping

When a DataStore is opened, the storage file is memory-mapped using unsafe code that delegates to the OS:

```mermaid
graph TB
    Open["DataStore::open()"]
    OpenFile["open_file_in_append_mode()"]
    InitMmap["init_mmap()"]
    UnsafeMap["unsafe memmap2::MmapOptions::new().map()"]
    ArcWrap["Arc::new(mmap)"]

    Open --> OpenFile
    OpenFile --> InitMmap
    InitMmap --> UnsafeMap
    UnsafeMap --> ArcWrap

    OpenFile -.returns.-> File
    UnsafeMap -.returns.-> Mmap
    ArcWrap -.stored in.-> DataStore
```

Diagram: Initial memory map creation flow

The init_mmap function wraps the unsafe memmap2::MmapOptions::new().map() call, which asks the OS to map the file into the process address space. The resulting Mmap is immediately wrapped in an Arc for shared access.

Sources: src/storage_engine/data_store.rs:172-174 src/storage_engine/data_store.rs:84-117

Remapping After Writes

After write operations extend the file, the memory map must be refreshed to make new data visible. The reindex method handles this critical operation:

```mermaid
sequenceDiagram
    participant Writer as "Write Operation"
    participant File as "RwLock<BufWriter<File>>"
    participant Reindex as "reindex()"
    participant MmapMutex as "Mutex<Arc<Mmap>>"
    participant Indexer as "RwLock<KeyIndexer>"

    Writer->>File: Acquire write lock
    Writer->>File: Append data + metadata
    Writer->>File: flush()
    Writer->>Reindex: reindex(&write_guard, offsets, tail)

    Reindex->>File: init_mmap(&write_guard)
    Note over Reindex,File: Create new Mmap from flushed file

    Reindex->>MmapMutex: lock()
    Reindex->>MmapMutex: *guard = Arc::new(new_mmap)
    Note over MmapMutex: Old Arc<Mmap> still valid for readers

    Reindex->>Indexer: write().insert(key_hash, offset)
    Reindex->>Indexer: Release lock

    Reindex->>MmapMutex: Release lock

    Note over Writer: New reads see updated mmap
```

Diagram: Memory map remapping sequence during writes

The reindex method performs three synchronized updates:

  1. Creates a new Mmap from the extended file
  2. Atomically replaces the Arc<Mmap> in the mutex
  3. Updates the key indexer with new offsets

Sources: src/storage_engine/data_store.rs:224-259 src/storage_engine/data_store.rs:176-186


Zero-Copy Read Patterns

EntryHandle Architecture

EntryHandle is the primary abstraction for zero-copy reads. It holds an Arc<Mmap> reference and a byte range, providing direct slice access without copying:

```mermaid
graph LR
    subgraph "DataStore"
        MmapContainer["Mutex<Arc<Mmap>>"]
    end

    subgraph "EntryHandle"
        MmapRef["Arc<Mmap>"]
        Range["range: Range<usize>"]
        Metadata["metadata: EntryMetadata"]
    end

    subgraph "User Code"
        Slice["&[u8] payload slice"]
    end

    MmapContainer -->|get_mmap_arc| MmapRef
    MmapRef -->|&mmap[range]| Slice
    Range -.defines region.-> Slice

    Note1["Zero-copy: slice points\ndirectly into mmap"]
    Slice -.-> Note1
```

Diagram: EntryHandle zero-copy architecture

When EntryHandle::as_slice() is called, it returns &self.mmap_arc[self.range.clone()], which is a direct reference into the memory-mapped region. No data is copied; the slice is a view into the OS page cache.

Sources: simd-r-drive-entry-handle crate src/storage_engine/data_store.rs:560-565

Read Operation Flow

The zero-copy read flow demonstrates how data moves from disk to user code without intermediate buffers:

```mermaid
graph TB
    Read["read(key)"]
    ComputeHash["compute_hash(key)"]
    GetMmap["get_mmap_arc()"]
    LockIndex["key_indexer.read()"]
    ReadContext["read_entry_with_context()"]
    IndexLookup["key_indexer.get_packed(key_hash)"]
    Unpack["KeyIndexer::unpack(packed)"]
    CreateHandle["EntryHandle { mmap_arc, range, metadata }"]
    AsSlice["entry.as_slice()"]
    DirectRef["&mmap[range]"]

    Read --> ComputeHash
    Read --> GetMmap
    Read --> LockIndex
    ComputeHash --> ReadContext
    GetMmap --> ReadContext
    LockIndex --> ReadContext
    ReadContext --> IndexLookup
    IndexLookup --> Unpack
    Unpack --> CreateHandle
    CreateHandle --> AsSlice
    AsSlice --> DirectRef

    DirectRef -.zero-copy.-> OSPageCache["OS Page Cache"]
```

Diagram: Zero-copy read operation flow from key lookup to slice access

Key points:

  • get_mmap_arc() obtains an Arc<Mmap> clone (cheap atomic increment)
  • Index lookup finds the file offset
  • EntryHandle is constructed with the Arc<Mmap> and byte range
  • as_slice() returns a reference directly into the mapped memory

Sources: src/storage_engine/data_store.rs:1040-1049 src/storage_engine/data_store.rs:502-565 src/storage_engine/data_store.rs:658-663


Shared Access with Arc

Thread-Safe Reference Counting

The Arc<Mmap> enables multiple threads to hold references to the same memory-mapped region simultaneously. Each clone increments an atomic reference count:

| Operation | Cost | Thread Safety |
|-----------|------|---------------|
| Arc::clone() | Single atomic increment | Lock-free |
| Holding Arc<Mmap> | No synchronization needed | Fully safe |
| Dropping Arc<Mmap> | Single atomic decrement | Lock-free |
| Last reference drops | Mmap unmapped by OS | Safe |

When a writer remaps the file, it replaces the Arc<Mmap> inside the mutex. Old Arc<Mmap> references remain valid until all readers drop them, at which point the OS automatically unmaps the old region.

Sources: src/storage_engine/data_store.rs:658-663 README.md:174-183

Clone Semantics in Iteration

EntryIterator demonstrates efficient Arc<Mmap> usage. The iterator holds one Arc<Mmap> and clones it for each EntryHandle it yields:

Diagram: Arc cloning pattern in EntryIterator

```mermaid
graph TB
    IterNew["EntryIterator::new(mmap_arc, tail)"]
    IterField["EntryIterator { mmap: Arc<Mmap>, ... }"]
    Next["next() called"]
    CreateHandle["EntryHandle { mmap_arc: Arc::clone(&self.mmap), ... }"]
    UserCode["User processes EntryHandle"]
    Drop["EntryHandle dropped"]

    IterNew --> IterField
    IterField --> Next
    Next --> CreateHandle
    CreateHandle -.cheap clone.-> UserCode
    UserCode --> Drop
    Drop -.atomic decrement.-> RefCount["Reference count"]

    Note["Iterator holds 1 Arc\nEach EntryHandle clones it\nAll point to same Mmap"]
    IterField -.-> Note
```

This design allows the iterator and all yielded handles to coexist safely. The cloning overhead is minimal—just an atomic operation—while providing complete memory safety.

Sources: src/storage_engine/entry_iterator.rs:21-47 src/storage_engine/entry_iterator.rs:121-125


Memory Management Flow

Complete Lifecycle

The following diagram maps the complete lifecycle of memory-mapped access, from initial file open through reads and writes to iterator cleanup:

```mermaid
graph TB
    subgraph "Initialization"
        OpenFile["open_file_in_append_mode()"]
        InitMmap1["init_mmap(&file)"]
        Recovery["recover_valid_chain()"]
        ReinitMmap["Remap if truncation needed"]
        BuildIndex["KeyIndexer::build()"]
        StoreMmap["Store Arc<Mutex<Arc<Mmap>>>"]
    end

    subgraph "Read Path"
        GetArc["get_mmap_arc()"]
        ReadLock["key_indexer.read()"]
        Lookup["Index lookup"]
        ConstructHandle["EntryHandle { Arc::clone(mmap_arc), range, ... }"]
        AsSlice["as_slice() → &mmap[range]"]
    end

    subgraph "Write Path"
        WriteLock["file.write()"]
        AppendData["Append payload + metadata"]
        Flush["flush()"]
        Reindex["reindex()"]
        NewMmap["init_mmap() → new Mmap"]
        SwapMmap["Mutex: *guard = Arc::new(new_mmap)"]
        UpdateIndex["KeyIndexer: insert offsets"]
    end

    subgraph "Iterator Path"
        IterCreate["iter_entries()"]
        CloneMmap["get_mmap_arc()"]
        IterNew["EntryIterator::new(mmap_arc, tail)"]
        IterNext["next() → EntryHandle"]
    end

    OpenFile --> InitMmap1
    InitMmap1 --> Recovery
    Recovery --> ReinitMmap
    ReinitMmap --> BuildIndex
    BuildIndex --> StoreMmap

    StoreMmap -.available for.-> GetArc
    GetArc --> ReadLock
    ReadLock --> Lookup
    Lookup --> ConstructHandle
    ConstructHandle --> AsSlice

    StoreMmap -.available for.-> WriteLock
    WriteLock --> AppendData
    AppendData --> Flush
    Flush --> Reindex
    Reindex --> NewMmap
    NewMmap --> SwapMmap
    SwapMmap --> UpdateIndex

    StoreMmap -.available for.-> IterCreate
    IterCreate --> CloneMmap
    CloneMmap --> IterNew
    IterNew --> IterNext
```

Diagram: Complete memory management lifecycle

Sources: src/storage_engine/data_store.rs:84-117 src/storage_engine/data_store.rs:1040-1049 src/storage_engine/data_store.rs:752-825 src/storage_engine/data_store.rs:276-280

Code Entity Mapping

The following table maps high-level concepts to specific code entities:

| Concept | Code Entity | Location |
|---------|-------------|----------|
| Memory-mapped file | memmap2::Mmap | src/storage_engine/data_store.rs:9 |
| Shared mmap reference | Arc<Mmap> | Throughout codebase |
| Mmap container | Arc<Mutex<Arc<Mmap>>> | src/storage_engine/data_store.rs:29 |
| Mmap initialization | init_mmap(file: &BufWriter<File>) | src/storage_engine/data_store.rs:172-174 |
| Mmap retrieval | get_mmap_arc(&self) | src/storage_engine/data_store.rs:658-663 |
| Remapping operation | reindex(&self, write_guard, offsets, tail, deleted) | src/storage_engine/data_store.rs:224-259 |
| Zero-copy handle | simd_r_drive_entry_handle::EntryHandle | Separate crate |
| Iterator with mmap | EntryIterator { mmap: Arc<Mmap>, ... } | src/storage_engine/entry_iterator.rs:21-25 |
| Raw mmap pointer (testing) | arc_ptr(&self) → *const u8 | src/storage_engine/data_store.rs:653-655 |

Sources: src/storage_engine/data_store.rs:1-33 src/storage_engine/entry_iterator.rs:21-25


Safety Considerations

OS Page Cache Integration

The memory-mapped approach delegates memory management to the OS page cache.

Diagram: OS page cache interaction with memory-mapped region

Key benefits:

  • Pages loaded on-demand (lazy loading)
  • OS handles eviction when memory is tight
  • Multiple processes can share the same page cache entries
  • No explicit memory allocation in application code

Sources: README.md:43-50 README.md:174

Large File Handling

The system is designed to handle datasets larger than available RAM. The memory mapping does not load the entire file into RAM:

| File Size | RAM Usage | Behavior |
|-----------|-----------|----------|
| < Available RAM | Entire file may be cached | Fast access, no swapping |
| ≈ Available RAM | Only accessed pages cached | OS loads pages on-demand |
| > Available RAM | LRU page eviction active | Older pages evicted as needed |

When iterating or reading, only the accessed byte ranges are loaded into physical memory. The OS automatically evicts least-recently-used pages under memory pressure.

Sources: README.md:45-50

Unsafe Code Boundaries

Memory mapping inherently requires unsafe code:

```
DataStore::init_mmap()
    └─> unsafe { memmap2::MmapOptions::new().map(file) }
```

The memmap2 crate wraps this call in a narrow API, but map() itself remains unsafe because the caller must uphold guarantees the compiler cannot verify:

  • The file must remain valid and open for the lifetime of the mapping
  • Accesses must stay within the mapped region's size boundaries
  • The underlying file must not be modified out from under the mapping (for example, truncated by another process) in ways that would invalidate it

SIMD R Drive’s architecture ensures safety by:

  • Only appending to the file (never truncating or shrinking it) while a mapping exists
  • Remapping after writes extend the file
  • Using Arc<Mmap> to prevent use-after-unmap bugs

Sources: src/storage_engine/data_store.rs:172-174 src/lib.rs:123-124

Thread Safety Guarantees

The nested Arc<Mutex<Arc<Mmap>>> structure provides these guarantees:

| Operation | Synchronization | Safety Property |
|-----------|-----------------|-----------------|
| Reading from Arc<Mmap> | None (lock-free) | Safe: immutable data |
| Cloning Arc<Mmap> | Atomic refcount | Safe: no data race |
| Remapping | Mutex held | Safe: serialized with other remaps |
| Old mmap still referenced | Independent Arc | Safe: won't be unmapped |
| Concurrent reads + remap | Separate Arc instances | Safe: readers use old or new mmap |

The key insight is that remapping creates a new Arc<Mmap> without invalidating existing references. Readers holding old Arc<Mmap> instances continue accessing the old mapping until they drop their references.

Sources: src/storage_engine/data_store.rs:26-33 README.md:174-183 README.md:196-206


Memory Pressure and Resource Management

Automatic Resource Cleanup

When memory pressure increases, the OS automatically evicts pages from the page cache. However, the Mmap object itself is small—it only holds file descriptor information and address space pointers. The actual memory is managed by the kernel.

Arc<Mmap> ensures that:

  • The file is not unmapped while any thread holds a reference
  • When the last Arc is dropped, the Mmap destructor unmaps the region
  • The OS then reclaims the virtual address space

Sources: src/storage_engine/data_store.rs:658-663

Testing Hooks

For validation and testing, the system exposes mmap internals in debug builds:

| Method | Purpose | Availability |
|--------|---------|--------------|
| get_mmap_arc_for_testing() | Returns Arc<Mmap> for inspection | #[cfg(any(test, debug_assertions))] |
| arc_ptr() | Returns raw *const u8 pointer | #[cfg(any(test, debug_assertions))] |

These methods allow tests to verify zero-copy behavior by comparing pointer addresses and validating that slices point directly into the mapped region.

Sources: src/storage_engine/data_store.rs:631-656
