Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

GitHub

This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Compaction and Maintenance

Loading…

Compaction and Maintenance

Relevant source files

Purpose and Scope

This page documents the maintenance operations available in the DataStore, focusing on compaction for space reclamation and automatic file recovery mechanisms. These operations ensure the storage engine remains efficient and resilient despite its append-only architecture.

For information about the underlying append-only storage model, see Storage Architecture. For details on entry structure that affects compaction, see Entry Structure and Metadata.

Sources: src/storage_engine/data_store.rs:1-1183


Compaction Process

Overview

The compaction process eliminates space waste caused by outdated entry versions. In the append-only model, updating a key creates a new entry while leaving the old version in the file. Compaction creates a new file containing only the latest version of each key, then atomically swaps it with the original.

graph TB
    subgraph "Original DataStore"
        DS["DataStore\n(self)"]
FILE["file: RwLock<BufWriter<File>>"]
MMAP["mmap: Mutex<Arc<Mmap>>"]
IDX["key_indexer: RwLock<KeyIndexer>"]
PATH["path: PathBuf"]
end
    
    subgraph "Compaction Process"
        COMPACT["compact()"]
ITER["iter_entries()"]
COPY["copy_handle()"]
end
    
    subgraph "Temporary DataStore"
        TEMP_DS["DataStore\n(compacted_storage)"]
TEMP_FILE["file (path + .bk)"]
TEMP_PATH["compacted_path"]
end
    
    subgraph "Final Operation"
        RENAME["std::fs::rename()"]
SWAP["Atomic File Swap"]
end
    
 
   DS --> COMPACT
 
   COMPACT --> TEMP_PATH
 
   TEMP_PATH --> TEMP_DS
 
   DS --> ITER
 
   ITER --> COPY
 
   COPY --> TEMP_DS
 
   TEMP_DS --> TEMP_FILE
 
   TEMP_FILE --> RENAME
 
   RENAME --> SWAP
 
   SWAP --> PATH
    
    style COMPACT fill:#f9f9f9
    style TEMP_DS fill:#f9f9f9
    style SWAP fill:#f9f9f9

Architecture

Sources: src/storage_engine/data_store.rs:706-749

Implementation Details

The compact() method at src/storage_engine/data_store.rs:706-749 performs the following sequence:

Sources: src/storage_engine/data_store.rs:706-749 src/storage_engine/data_store.rs:587-590 src/storage_engine/data_store.rs:269-280

Key Implementation Characteristics

AspectImplementationLocation
Temporary FileAppends .bk extension to original pathsrc/storage_engine/data_store.rs707
Entry SelectionUses iter_entries() which returns only latest versionssrc/storage_engine/data_store.rs714
Copy Mechanismcopy_handle()EntryStreamwrite_stream_with_key_hash()src/storage_engine/data_store.rs:587-590
Atomic Swapstd::fs::rename() provides atomic replacementsrc/storage_engine/data_store.rs746
Space OptimizationSkips static index if it wouldn’t save spacesrc/storage_engine/data_store.rs:727-741

Sources: src/storage_engine/data_store.rs:706-749

Thread Safety Considerations

The compact() method has critical thread safety limitations documented at src/storage_engine/data_store.rs:681-693:

Important Constraints:

  • Requires&mut self: Prevents concurrent mutations but does NOT prevent concurrent reads
  • Arc-wrapped risk : If DataStore is wrapped in Arc<DataStore>, other threads may hold read references during compaction
  • Recommendation : Only compact when you have exclusive access (single thread or external synchronization)
  • No automatic locking : External synchronization must be enforced by the caller

Sources: src/storage_engine/data_store.rs:679-705

EntryHandle Copying Mechanism

The copy_handle() method at src/storage_engine/data_store.rs:587-590 creates an EntryStream from the entry’s memory-mapped data and writes it to the target store using the original key hash. This preserves the entry’s identity while creating a new physical copy.

Sources: src/storage_engine/data_store.rs:567-590


Space Reclamation

Estimating Compaction Savings

The estimate_compaction_savings() method at src/storage_engine/data_store.rs:605-616 calculates potential space savings without performing actual compaction:

Algorithm:

  1. Get total file size via file_size()
  2. Iterate through all valid entries (latest versions only)
  3. Track seen keys using HashSet<u64> with Xxh3BuildHasher
  4. Sum the file_size() of each unique entry
  5. Return total_size - unique_entry_size

Note: The iter_entries() method already filters to latest versions, so this calculation represents the minimum achievable size through compaction.

Sources: src/storage_engine/data_store.rs:592-616

When to Compact

Compaction should be considered when:

ConditionDescriptionDetection Method
High Waste RatioSignificant difference between total and unique entry sizesestimate_compaction_savings() / file_size() > threshold
Frequent UpdatesMany keys updated multiple timesApplication-level tracking
Long-Running StorageFile has been in use for extended periodsTime-based policy
Before BackupMinimize backup sizePre-backup maintenance

The compaction process at src/storage_engine/data_store.rs:727-741 includes logic to skip static index generation if it wouldn’t save space, indicating an awareness of space optimization trade-offs.

Sources: src/storage_engine/data_store.rs:605-616 src/storage_engine/data_store.rs:727-741


File Recovery

Chain Validation

The recover_valid_chain() method at src/storage_engine/data_store.rs:383-482 ensures data integrity by validating the backward-linked chain of entries:

Sources: src/storage_engine/data_store.rs:363-482

Validation Algorithm Details

The recovery process validates multiple constraints:

ValidationCheckPurpose
Metadata Sizecursor >= METADATA_SIZEEnsure enough space for metadata read
Prev Offset Boundsprev_tail < metadata_offsetPrevent circular references
Entry Startentry_start < metadata_offsetEnsure valid entry range
Pre-paddingprepad_len(prev_tail)Account for 64-byte alignment
Tombstone DetectionSingle NULL byte checkHandle deletion markers
Chain DepthWalk to offset 0Ensure complete chain
Total Sizetotal_size <= file_lenPrevent overflow

Sources: src/storage_engine/data_store.rs:383-482

Corruption Detection and Handling

When DataStore::open() is called at src/storage_engine/data_store.rs:84-117 it automatically invokes recovery:

Sources: src/storage_engine/data_store.rs:84-117

Automatic Recovery Steps

The recovery process at src/storage_engine/data_store.rs:91-103:

  1. Detection : recover_valid_chain() returns final_len < file_len
  2. Warning : Logs truncation message with offsets
  3. Cleanup : Drops mmap and file handles
  4. Truncation : Re-opens file and calls set_len(final_len)
  5. Sync : Forces sync_all() to persist truncation
  6. Retry : Recursively calls open() with clean file

This ensures corrupted tail data is removed and the file is left in a valid state.

Sources: src/storage_engine/data_store.rs:89-103


Maintenance Operations

Re-indexing After Writes

The reindex() method at src/storage_engine/data_store.rs:224-259 updates internal structures after writes:

Key Operations:

  1. Re-map file : Creates new Mmap reflecting written data
  2. Update index : Inserts or removes key-hash-to-offset mappings
  3. Collision check : insert() returns error on hash collision (tag mismatch)
  4. Atomic update : Stores new tail_offset with Release ordering
  5. Visibility : New mmap makes written data visible to readers

Sources: src/storage_engine/data_store.rs:176-259

File Path and Metadata Access

MethodReturn TypePurposeLocation
get_path()PathBufReturns storage file pathsrc/storage_engine/data_store.rs:265-267
file_size()Result<u64>Returns current file sizesrc/storage_engine/data_store.rs:1179-1181
len()Result<usize>Returns count of unique keyssrc/storage_engine/data_store.rs:1164-1171
is_empty()Result<bool>Checks if storage has no entriessrc/storage_engine/data_store.rs:1173-1177

These methods provide essential metadata for maintenance decision-making.

Sources: src/storage_engine/data_store.rs:261-267 src/storage_engine/data_store.rs:1164-1181

Iterators for Maintenance

The storage provides multiple iteration mechanisms useful for maintenance:

Iteration Characteristics:

Sources: src/storage_engine/data_store.rs:269-361


Summary

The maintenance system provides:

ComponentPurposeKey Method
CompactionReclaim space from outdated entriescompact()
Space EstimationCalculate potential savingsestimate_compaction_savings()
RecoveryValidate and repair corrupted filesrecover_valid_chain()
Re-indexingUpdate structures after writesreindex()
IterationScan for maintenance operationsiter_entries(), par_iter_entries()

All operations maintain the append-only guarantee while ensuring data integrity and efficient space utilization.

Sources: src/storage_engine/data_store.rs:1-1183

Dismiss

Refresh this wiki

Enter email to refresh