This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Memory Management and Zero-Copy Access
Relevant source files
- README.md
- src/lib.rs
- src/storage_engine.rs
- src/storage_engine/data_store.rs
- src/storage_engine/entry_iterator.rs
Purpose and Scope
This document describes the memory management strategy used by SIMD R Drive’s core storage engine, focusing on memory-mapped file access and zero-copy read patterns. It covers the memmap2 crate integration, the Arc<Mmap> shared reference architecture, and how EntryHandle provides zero-copy views into stored data.
For details on entry structure and metadata organization, see Entry Structure and Metadata. For concurrency mechanisms that protect memory-mapped access, see Concurrency and Thread Safety.
Memory-Mapped File Architecture
Core mmap Integration
The storage engine uses the memmap2 crate to memory-map the entire storage file, allowing direct access to file contents without explicit read system calls. The memory-mapped region is managed through a layered reference-counting structure:
Arc<Mutex<Arc<Mmap>>>
│
├─ Outer Arc: Shared across DataStore clones
├─ Mutex: Serializes remapping operations
└─ Inner Arc<Mmap>: Shared across readers
Sources: src/storage_engine/data_store.rs:1-30
DataStore mmap Field Structure
The DataStore struct maintains the memory map using nested Arc wrappers:
| Layer | Type | Purpose |
|---|---|---|
| Outer | Arc<Mutex<...>> | Allows shared ownership of the mutex across DataStore instances |
| Mutex | Mutex<...> | Serializes remapping operations during writes |
| Inner | Arc<Mmap> | Enables zero-cost cloning for concurrent readers |
| Core | Mmap | The actual memory-mapped file region from memmap2 |
This structure enables:
- Multiple readers to hold Arc<Mmap> references simultaneously
- Safe remapping after writes without invalidating existing reader references
- Lock-free reads once an Arc<Mmap> is obtained
Sources: src/storage_engine/data_store.rs:26-33 README.md:174-183
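The layering above can be sketched in plain Rust. This is a minimal model, not the crate's code: a Vec<u8> stands in for memmap2::Mmap, and the Store struct and get_mmap_arc name mirror the pattern described here rather than reproduce the actual DataStore API.

```rust
use std::sync::{Arc, Mutex};

// Stand-in for memmap2::Mmap in this sketch.
type FakeMmap = Vec<u8>;

// Mirrors the layered structure: outer Arc shared across DataStore
// clones, Mutex serializing remaps, inner Arc shared across readers.
struct Store {
    mmap: Arc<Mutex<Arc<FakeMmap>>>,
}

impl Store {
    // Analogous to get_mmap_arc(): briefly lock, clone the inner Arc,
    // drop the lock. Reads then proceed without any lock held.
    fn get_mmap_arc(&self) -> Arc<FakeMmap> {
        Arc::clone(&*self.mmap.lock().unwrap())
    }
}

fn main() {
    let store = Store {
        mmap: Arc::new(Mutex::new(Arc::new(vec![1u8, 2, 3]))),
    };
    let reader = store.get_mmap_arc(); // lock held only for the clone
    assert_eq!(reader.len(), 3);       // lock-free access from here on
}
```

The mutex protects only the pointer swap, not the data, which is why reads stay lock-free once the inner Arc is cloned.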
Memory Map Initialization and Remapping
graph TB
Open["DataStore::open()"]
OpenFile["open_file_in_append_mode()"]
InitMmap["init_mmap()"]
UnsafeMap["unsafe memmap2::MmapOptions::new().map()"]
ArcWrap["Arc::new(mmap)"]
Open --> OpenFile
OpenFile --> InitMmap
InitMmap --> UnsafeMap
UnsafeMap --> ArcWrap
OpenFile -.returns.-> File
UnsafeMap -.returns.-> Mmap
ArcWrap -.stored in.-> DataStore
Initial Mapping
When a DataStore is opened, the storage file is memory-mapped using unsafe code that delegates to the OS:
Diagram: Initial memory map creation flow
The init_mmap function wraps the unsafe memmap2::MmapOptions::new().map() call, which asks the OS to map the file into the process address space. The resulting Mmap is immediately wrapped in an Arc for shared access.
Sources: src/storage_engine/data_store.rs:172-174 src/storage_engine/data_store.rs:84-117
sequenceDiagram
participant Writer as "Write Operation"
participant File as "RwLock<BufWriter<File>>"
participant Reindex as "reindex()"
participant MmapMutex as "Mutex<Arc<Mmap>>"
participant Indexer as "RwLock<KeyIndexer>"
Writer->>File: Acquire write lock
Writer->>File: Append data + metadata
Writer->>File: flush()
Writer->>Reindex: reindex(&write_guard, offsets, tail)
Reindex->>File: init_mmap(&write_guard)
Note over Reindex,File: Create new Mmap from flushed file
Reindex->>MmapMutex: lock()
Reindex->>MmapMutex: *guard = Arc::new(new_mmap)
Note over MmapMutex: Old Arc<Mmap> still valid for readers
Reindex->>Indexer: write().insert(key_hash, offset)
Reindex->>Indexer: Release lock
Reindex->>MmapMutex: Release lock
Note over Writer: New reads see updated mmap
Remapping After Writes
After write operations extend the file, the memory map must be refreshed to make new data visible. The reindex method handles this critical operation:
Diagram: Memory map remapping sequence during writes
The reindex method performs three synchronized updates:
- Creates a new Mmap from the extended file
- Atomically replaces the Arc<Mmap> in the mutex
- Updates the key indexer with new offsets
Sources: src/storage_engine/data_store.rs:224-259 src/storage_engine/data_store.rs:176-186
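The swap step can be illustrated with a small sketch. This is not the crate's reindex implementation: swap_mmap is a hypothetical helper, and a Vec<u8> stands in for the Mmap, but the ownership behavior is the same — replacing the Arc inside the mutex leaves previously cloned references pointing at the old mapping.

```rust
use std::sync::{Arc, Mutex};

type FakeMmap = Vec<u8>;

// Sketch of the swap inside reindex(): publish a new mapping while
// readers keep the old one alive through their own Arc.
fn swap_mmap(slot: &Mutex<Arc<FakeMmap>>, new_mmap: FakeMmap) {
    let mut guard = slot.lock().unwrap();
    *guard = Arc::new(new_mmap); // old Arc drops only when readers do
}

fn main() {
    let slot = Mutex::new(Arc::new(vec![1u8, 2]));
    let old_reader = Arc::clone(&*slot.lock().unwrap()); // reader cloned earlier
    swap_mmap(&slot, vec![1, 2, 3, 4]);                  // file grew; remap
    assert_eq!(old_reader.len(), 2);           // old view still valid
    assert_eq!(slot.lock().unwrap().len(), 4); // new reads see the new map
}
```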
Zero-Copy Read Patterns
graph LR
subgraph "DataStore"
MmapContainer["Mutex<Arc<Mmap>>"]
end
subgraph "EntryHandle"
MmapRef["Arc<Mmap>"]
Range["range: Range<usize>"]
Metadata["metadata: EntryMetadata"]
end
subgraph "User Code"
Slice["&[u8] payload slice"]
end
MmapContainer -->|get_mmap_arc| MmapRef
MmapRef -->|&mmap[range]| Slice
Range -.defines region.-> Slice
Note1["Zero-copy: slice points\ndirectly into mmap"]
Slice -.-> Note1
EntryHandle Architecture
EntryHandle is the primary abstraction for zero-copy reads. It holds an Arc<Mmap> reference and a byte range, providing direct slice access without copying:
Diagram: EntryHandle zero-copy architecture
When EntryHandle::as_slice() is called, it returns &self.mmap_arc[self.range.clone()], which is a direct reference into the memory-mapped region. No data is copied; the slice is a view into the OS page cache.
Sources: simd-r-drive-entry-handle crate, src/storage_engine/data_store.rs:560-565
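A simplified model of this handle shows why the access is zero-copy. This sketch uses an Arc<Vec<u8>> in place of Arc<Mmap> and omits the metadata field; the field names mimic those described above but are not the crate's exact definition.

```rust
use std::ops::Range;
use std::sync::Arc;

// Simplified EntryHandle: shared buffer + byte range.
struct EntryHandle {
    mmap_arc: Arc<Vec<u8>>, // stands in for Arc<Mmap>
    range: Range<usize>,
}

impl EntryHandle {
    // Returns a view into the shared buffer; nothing is copied.
    fn as_slice(&self) -> &[u8] {
        &self.mmap_arc[self.range.clone()]
    }
}

fn main() {
    let mmap = Arc::new(vec![0u8, 10, 20, 30, 40]);
    let entry = EntryHandle { mmap_arc: Arc::clone(&mmap), range: 1..4 };
    assert_eq!(entry.as_slice(), &[10, 20, 30]);
    // Zero-copy check: the slice points into the backing buffer itself.
    assert_eq!(entry.as_slice().as_ptr(), unsafe { mmap.as_ptr().add(1) });
}
```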
graph TB
Read["read(key)"]
ComputeHash["compute_hash(key)"]
GetMmap["get_mmap_arc()"]
LockIndex["key_indexer.read()"]
ReadContext["read_entry_with_context()"]
IndexLookup["key_indexer.get_packed(key_hash)"]
Unpack["KeyIndexer::unpack(packed)"]
CreateHandle["EntryHandle { mmap_arc, range, metadata }"]
AsSlice["entry.as_slice()"]
DirectRef["&mmap[range]"]
Read --> ComputeHash
Read --> GetMmap
Read --> LockIndex
ComputeHash --> ReadContext
GetMmap --> ReadContext
LockIndex --> ReadContext
ReadContext --> IndexLookup
IndexLookup --> Unpack
Unpack --> CreateHandle
CreateHandle --> AsSlice
AsSlice --> DirectRef
DirectRef -.zero-copy.-> OSPageCache["OS Page Cache"]
Read Operation Flow
The zero-copy read flow demonstrates how data moves from disk to user code without intermediate buffers:
Diagram: Zero-copy read operation flow from key lookup to slice access
Key points:
- get_mmap_arc() obtains an Arc<Mmap> clone (cheap atomic increment)
- Index lookup finds the file offset
- EntryHandle is constructed with the Arc<Mmap> and byte range
- as_slice() returns a reference directly into the mapped memory
Sources: src/storage_engine/data_store.rs:1040-1049 src/storage_engine/data_store.rs:502-565 src/storage_engine/data_store.rs:658-663
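The steps above can be condensed into a small sketch. The read function, Handle struct, and plain HashMap index here are illustrative stand-ins, not the crate's API — the real engine uses a packed KeyIndexer and memory-mapped bytes — but the flow is the same: clone the shared buffer reference, look up a range, and hand back a view without copying.

```rust
use std::collections::HashMap;
use std::ops::Range;
use std::sync::{Arc, Mutex};

// Illustrative zero-copy handle; Vec<u8> stands in for Mmap.
struct Handle {
    mmap_arc: Arc<Vec<u8>>,
    range: Range<usize>,
}

impl Handle {
    fn as_slice(&self) -> &[u8] {
        &self.mmap_arc[self.range.clone()]
    }
}

// Sketch of the read path: mmap clone -> index lookup -> handle.
fn read(
    mmap_slot: &Mutex<Arc<Vec<u8>>>,
    index: &HashMap<u64, Range<usize>>,
    key_hash: u64,
) -> Option<Handle> {
    let mmap_arc = Arc::clone(&*mmap_slot.lock().unwrap()); // cheap clone
    let range = index.get(&key_hash)?.clone();              // offset lookup
    Some(Handle { mmap_arc, range })                        // no bytes copied
}

fn main() {
    let mmap_slot = Mutex::new(Arc::new(b"hello world".to_vec()));
    let mut index = HashMap::new();
    index.insert(42u64, 6..11); // hypothetical hash -> byte range
    let entry = read(&mmap_slot, &index, 42).unwrap();
    assert_eq!(entry.as_slice(), b"world");
    assert!(read(&mmap_slot, &index, 7).is_none()); // missing key
}
```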
Shared Access with Arc
Thread-Safe Reference Counting
The Arc<Mmap> enables multiple threads to hold references to the same memory-mapped region simultaneously. Each clone increments an atomic reference count:
| Operation | Cost | Thread Safety |
|---|---|---|
| Arc::clone() | Single atomic increment | Lock-free |
| Holding Arc<Mmap> | No synchronization needed | Fully safe |
| Dropping Arc<Mmap> | Single atomic decrement | Lock-free |
| Last reference drops | Mmap unmapped by OS | Safe |
When a writer remaps the file, it replaces the Arc<Mmap> inside the mutex. Old Arc<Mmap> references remain valid until all readers drop them, at which point the OS automatically unmaps the old region.
Sources: src/storage_engine/data_store.rs:658-663 README.md:174-183
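The reference-counting behavior in the table is observable directly through the standard library. This minimal example uses a Vec<u8> in place of an Mmap; the counting semantics are identical.

```rust
use std::sync::Arc;

fn main() {
    let mmap = Arc::new(vec![0u8; 16]); // stand-in for Arc<Mmap>
    assert_eq!(Arc::strong_count(&mmap), 1);

    let reader = Arc::clone(&mmap); // one atomic increment
    assert_eq!(Arc::strong_count(&mmap), 2);

    drop(reader); // one atomic decrement
    assert_eq!(Arc::strong_count(&mmap), 1);
    // When the last Arc drops, the backing allocation is freed —
    // in the real engine, the Mmap destructor unmaps the region.
}
```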
Clone Semantics in Iteration
EntryIterator demonstrates efficient Arc<Mmap> usage. The iterator holds one Arc<Mmap> and clones it for each EntryHandle it yields:
Diagram: Arc cloning pattern in EntryIterator
graph TB
IterNew["EntryIterator::new(mmap_arc, tail)"]
IterField["EntryIterator { mmap: Arc<Mmap>, ... }"]
Next["next() called"]
CreateHandle["EntryHandle { mmap_arc: Arc::clone(&self.mmap), ... }"]
UserCode["User processes EntryHandle"]
Drop["EntryHandle dropped"]
IterNew --> IterField
IterField --> Next
Next --> CreateHandle
CreateHandle -.cheap clone.-> UserCode
UserCode --> Drop
Drop -.atomic decrement.-> RefCount["Reference count"]
Note["Iterator holds 1 Arc\nEach EntryHandle clones it\nAll point to same Mmap"]
IterField -.-> Note
This design allows the iterator and all yielded handles to coexist safely. The cloning overhead is minimal—just an atomic operation—while providing complete memory safety.
Sources: src/storage_engine/entry_iterator.rs:21-47 src/storage_engine/entry_iterator.rs:121-125
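The pattern can be sketched with a toy iterator. This is not EntryIterator itself — entry framing is omitted and each "entry" is a fixed-size chunk, with the yielded item reduced to the shared Arc — but it shows the one-Arc-per-iterator, one-clone-per-item structure.

```rust
use std::sync::Arc;

// Sketch: the iterator holds one Arc and clones it per yielded item.
struct Iter {
    mmap: Arc<Vec<u8>>, // stands in for Arc<Mmap>
    pos: usize,
}

impl Iterator for Iter {
    type Item = Arc<Vec<u8>>; // the real iterator yields EntryHandle
    fn next(&mut self) -> Option<Self::Item> {
        if self.pos >= self.mmap.len() {
            return None;
        }
        self.pos += 2; // pretend each entry is 2 bytes
        Some(Arc::clone(&self.mmap)) // cheap: atomic increment only
    }
}

fn main() {
    let mmap = Arc::new(vec![0u8; 6]);
    let handles: Vec<_> = Iter { mmap: Arc::clone(&mmap), pos: 0 }.collect();
    assert_eq!(handles.len(), 3);
    // Every yielded handle shares the same backing buffer.
    assert!(handles.iter().all(|h| Arc::ptr_eq(h, &mmap)));
}
```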
Memory Management Flow
graph TB
subgraph "Initialization"
OpenFile["open_file_in_append_mode()"]
InitMmap1["init_mmap(&file)"]
Recovery["recover_valid_chain()"]
ReinitMmap["Remap if truncation needed"]
BuildIndex["KeyIndexer::build()"]
StoreMmap["Store Arc<Mutex<Arc<Mmap>>>"]
end
subgraph "Read Path"
GetArc["get_mmap_arc()"]
ReadLock["key_indexer.read()"]
Lookup["Index lookup"]
ConstructHandle["EntryHandle { Arc::clone(mmap_arc), range, ... }"]
AsSlice["as_slice() → &mmap[range]"]
end
subgraph "Write Path"
WriteLock["file.write()"]
AppendData["Append payload + metadata"]
Flush["flush()"]
Reindex["reindex()"]
NewMmap["init_mmap() → new Mmap"]
SwapMmap["Mutex: *guard = Arc::new(new_mmap)"]
UpdateIndex["KeyIndexer: insert offsets"]
end
subgraph "Iterator Path"
IterCreate["iter_entries()"]
CloneMmap["get_mmap_arc()"]
IterNew["EntryIterator::new(mmap_arc, tail)"]
IterNext["next() → EntryHandle"]
end
OpenFile --> InitMmap1
InitMmap1 --> Recovery
Recovery --> ReinitMmap
ReinitMmap --> BuildIndex
BuildIndex --> StoreMmap
StoreMmap -.available for.-> GetArc
GetArc --> ReadLock
ReadLock --> Lookup
Lookup --> ConstructHandle
ConstructHandle --> AsSlice
StoreMmap -.available for.-> WriteLock
WriteLock --> AppendData
AppendData --> Flush
Flush --> Reindex
Reindex --> NewMmap
NewMmap --> SwapMmap
SwapMmap --> UpdateIndex
StoreMmap -.available for.-> IterCreate
IterCreate --> CloneMmap
CloneMmap --> IterNew
IterNew --> IterNext
Complete Lifecycle
The following diagram maps the complete lifecycle of memory-mapped access, from initial file open through reads and writes to iterator cleanup:
Diagram: Complete memory management lifecycle
Sources: src/storage_engine/data_store.rs:84-117 src/storage_engine/data_store.rs:1040-1049 src/storage_engine/data_store.rs:752-825 src/storage_engine/data_store.rs:276-280
Code Entity Mapping
The following table maps high-level concepts to specific code entities:
| Concept | Code Entity | Location |
|---|---|---|
| Memory-mapped file | memmap2::Mmap | src/storage_engine/data_store.rs:9 |
| Shared mmap reference | Arc<Mmap> | Throughout codebase |
| Mmap container | Arc<Mutex<Arc<Mmap>>> | src/storage_engine/data_store.rs:29 |
| Mmap initialization | init_mmap(file: &BufWriter<File>) | src/storage_engine/data_store.rs:172-174 |
| Mmap retrieval | get_mmap_arc(&self) | src/storage_engine/data_store.rs:658-663 |
| Remapping operation | reindex(&self, write_guard, offsets, tail, deleted) | src/storage_engine/data_store.rs:224-259 |
| Zero-copy handle | simd_r_drive_entry_handle::EntryHandle | Separate crate |
| Iterator with mmap | EntryIterator { mmap: Arc<Mmap>, ... } | src/storage_engine/entry_iterator.rs:21-25 |
| Raw mmap pointer (testing) | arc_ptr(&self) → *const u8 | src/storage_engine/data_store.rs:653-655 |
Sources: src/storage_engine/data_store.rs:1-33 src/storage_engine/entry_iterator.rs:21-25
Safety Considerations
OS Page Cache Integration
The memory-mapped approach delegates memory management to the OS page cache:
Diagram: OS page cache interaction with memory-mapped region
Key benefits:
- Pages loaded on-demand (lazy loading)
- OS handles eviction when memory is tight
- Multiple processes can share the same page cache entries
- No explicit memory allocation in application code
Sources: README.md:43-50 README.md:174
Large File Handling
The system is designed to handle datasets larger than available RAM. The memory mapping does not load the entire file into RAM:
| File Size | RAM Usage | Behavior |
|---|---|---|
| < Available RAM | Entire file may be cached | Fast access, no swapping |
| ≈ Available RAM | Only accessed pages cached | OS loads pages on-demand |
| > Available RAM | LRU page eviction active | Older pages evicted as needed |
When iterating or reading, only the accessed byte ranges are loaded into physical memory. The OS automatically evicts least-recently-used pages under memory pressure.
Sources: README.md:45-50
Unsafe Code Boundaries
Memory mapping inherently requires unsafe code:
DataStore::init_mmap()
└─> unsafe { memmap2::MmapOptions::new().map(file) }
The memmap2 crate provides safe abstractions over this unsafe operation, ensuring:
- The file descriptor remains valid while mapped
- The mapped region respects file size boundaries
- Concurrent modifications to the file (outside the mmap) are handled correctly
SIMD R Drive’s architecture ensures safety by:
- Never resizing the file while an mmap exists
- Remapping after writes extend the file
- Using Arc<Mmap> to prevent use-after-unmap bugs
Sources: src/storage_engine/data_store.rs:172-174 src/lib.rs:123-124
Thread Safety Guarantees
The nested Arc<Mutex<Arc<Mmap>>> structure provides these guarantees:
| Operation | Synchronization | Safety Property |
|---|---|---|
| Reading from Arc<Mmap> | None (lock-free) | Safe: immutable data |
| Cloning Arc<Mmap> | Atomic refcount | Safe: no data race |
| Remapping | Mutex held | Safe: serialized with other remaps |
| Old mmap still referenced | Independent Arc | Safe: won’t be unmapped |
| Concurrent reads + remap | Separate Arc instances | Safe: readers use old or new mmap |
The key insight is that remapping creates a new Arc<Mmap> without invalidating existing references. Readers holding old Arc<Mmap> instances continue accessing the old mapping until they drop their references.
Sources: src/storage_engine/data_store.rs:26-33 README.md:174-183 README.md:196-206
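This guarantee can be exercised with threads. The sketch below models the mmap slot as a Mutex<Arc<Vec<u8>>> (a stand-in for Mutex<Arc<Mmap>>): a reader that cloned the inner Arc before a remap keeps reading the old data while the writer swaps in the new one.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Vec<u8> stands in for Mmap; the slot mirrors Mutex<Arc<Mmap>>.
    let slot = Arc::new(Mutex::new(Arc::new(vec![1u8, 2])));

    // A reader grabs its own Arc before the remap.
    let old = Arc::clone(&*slot.lock().unwrap());
    let reader = thread::spawn(move || old.len()); // reads the old mapping

    // The writer swaps in a new mapping concurrently.
    *slot.lock().unwrap() = Arc::new(vec![1u8, 2, 3, 4]);

    assert_eq!(reader.join().unwrap(), 2);     // old view unaffected
    assert_eq!(slot.lock().unwrap().len(), 4); // new readers see the remap
}
```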
Memory Pressure and Resource Management
Automatic Resource Cleanup
When memory pressure increases, the OS automatically evicts pages from the page cache. However, the Mmap object itself is small—it only holds file descriptor information and address space pointers. The actual memory is managed by the kernel.
Arc<Mmap> ensures that:
- The file is not unmapped while any thread holds a reference
- When the last Arc is dropped, the Mmap destructor unmaps the region
- The OS then reclaims the virtual address space
Sources: src/storage_engine/data_store.rs:658-663
Testing Hooks
For validation and testing, the system exposes mmap internals in debug builds:
| Method | Purpose | Availability |
|---|---|---|
| get_mmap_arc_for_testing() | Returns Arc<Mmap> for inspection | #[cfg(any(test, debug_assertions))] |
| arc_ptr() | Returns raw *const u8 pointer | #[cfg(any(test, debug_assertions))] |
These methods allow tests to verify zero-copy behavior by comparing pointer addresses and validating that slices point directly into the mapped region.
Sources: src/storage_engine/data_store.rs:631-656