This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Compaction and Maintenance
Purpose and Scope
This page documents the maintenance operations available in the DataStore, focusing on compaction for space reclamation and automatic file recovery mechanisms. These operations ensure the storage engine remains efficient and resilient despite its append-only architecture.
For information about the underlying append-only storage model, see Storage Architecture. For details on entry structure that affects compaction, see Entry Structure and Metadata.
Sources: src/storage_engine/data_store.rs:1-1183
Compaction Process
Overview
The compaction process eliminates space waste caused by outdated entry versions. In the append-only model, updating a key creates a new entry while leaving the old version in the file. Compaction creates a new file containing only the latest version of each key, then atomically swaps it with the original.
```mermaid
graph TB
    subgraph "Original DataStore"
        DS["DataStore\n(self)"]
        FILE["file: RwLock<BufWriter<File>>"]
        MMAP["mmap: Mutex<Arc<Mmap>>"]
        IDX["key_indexer: RwLock<KeyIndexer>"]
        PATH["path: PathBuf"]
    end
    subgraph "Compaction Process"
        COMPACT["compact()"]
        ITER["iter_entries()"]
        COPY["copy_handle()"]
    end
    subgraph "Temporary DataStore"
        TEMP_DS["DataStore\n(compacted_storage)"]
        TEMP_FILE["file (path + .bk)"]
        TEMP_PATH["compacted_path"]
    end
    subgraph "Final Operation"
        RENAME["std::fs::rename()"]
        SWAP["Atomic File Swap"]
    end
    DS --> COMPACT
    COMPACT --> TEMP_PATH
    TEMP_PATH --> TEMP_DS
    DS --> ITER
    ITER --> COPY
    COPY --> TEMP_DS
    TEMP_DS --> TEMP_FILE
    TEMP_FILE --> RENAME
    RENAME --> SWAP
    SWAP --> PATH
    style COMPACT fill:#f9f9f9
    style TEMP_DS fill:#f9f9f9
    style SWAP fill:#f9f9f9
```
Compaction process architecture
Sources: src/storage_engine/data_store.rs:706-749
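A minimal usage sketch of this workflow is shown below. It assumes a write(key, value) method and that compact() returns std::io::Result<()>; neither signature is specified on this page.

```rust
// Hypothetical usage: repeated writes to the same key leave stale versions in
// the append-only file, and compact() rewrites the file with only the latest
// version of each key before atomically swapping it into place.
// Assumes: a write(key, value) method and io::Result return types
// (not confirmed by this page).
fn update_then_reclaim(store: &mut DataStore) -> std::io::Result<()> {
    store.write(b"config", b"v1")?; // first version
    store.write(b"config", b"v2")?; // supersedes v1, but v1 still occupies space
    store.write(b"config", b"v3")?; // supersedes v2

    store.compact()?; // only the "v3" entry survives in the rewritten file
    Ok(())
}
```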
Implementation Details
The compact() method at src/storage_engine/data_store.rs:706-749 creates a temporary store alongside the original file, copies the latest version of each entry into it, and atomically swaps the result into place. Its key characteristics are summarized in the table below.
Sources: src/storage_engine/data_store.rs:706-749 src/storage_engine/data_store.rs:587-590 src/storage_engine/data_store.rs:269-280
Key Implementation Characteristics
| Aspect | Implementation | Location |
|---|---|---|
| Temporary File | Appends .bk extension to original path | src/storage_engine/data_store.rs:707 |
| Entry Selection | Uses iter_entries() which returns only latest versions | src/storage_engine/data_store.rs:714 |
| Copy Mechanism | copy_handle() → EntryStream → write_stream_with_key_hash() | src/storage_engine/data_store.rs:587-590 |
| Atomic Swap | std::fs::rename() provides atomic replacement | src/storage_engine/data_store.rs:746 |
| Space Optimization | Skips static index if it wouldn’t save space | src/storage_engine/data_store.rs:727-741 |
Sources: src/storage_engine/data_store.rs:706-749
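The sequence can be pictured with the following sketch. It is illustrative only: the exact signatures of DataStore::open() and copy_handle(), and the way the .bk path is derived, are assumptions rather than the crate's actual API.

```rust
use std::path::PathBuf;

// Illustrative outline of the compaction sequence (not the real code). The
// real compact() takes &mut self to block concurrent writers; copy_handle()
// and DataStore::open() signatures are assumed here.
fn compact_sketch(store: &DataStore) -> std::io::Result<()> {
    // 1. Derive the temporary path by appending ".bk" to the original path.
    let original_path = store.get_path();
    let mut temp = original_path.clone().into_os_string();
    temp.push(".bk");
    let compacted_path = PathBuf::from(temp);

    // 2. Open a temporary DataStore at that path.
    let mut compacted = DataStore::open(&compacted_path)?;

    // 3. Copy only the latest version of each key into the temporary store,
    //    preserving each entry's original key hash.
    for entry in store.iter_entries() {
        store.copy_handle(&entry, &mut compacted)?;
    }

    // 4. Atomically swap the compacted file over the original path.
    std::fs::rename(&compacted_path, &original_path)?;
    Ok(())
}
```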
Thread Safety Considerations
The compact() method has critical thread safety limitations documented at src/storage_engine/data_store.rs:681-693:
Important Constraints:
- Requires &mut self: Prevents concurrent mutations but does NOT prevent concurrent reads
- Arc-wrapped risk: If the DataStore is wrapped in Arc<DataStore>, other threads may hold read references during compaction
- Recommendation: Only compact when you have exclusive access (single thread or external synchronization)
- No automatic locking: External synchronization must be enforced by the caller
Sources: src/storage_engine/data_store.rs:679-705
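One way to honor these constraints is to share the store behind a lock rather than a bare Arc<DataStore>. The sketch below assumes compact() returns std::io::Result<()>.

```rust
use std::sync::{Arc, RwLock};

// Sketch: gate compaction behind a write lock so no other thread can hold a
// read reference to the DataStore while the file is rewritten and swapped.
fn compact_exclusively(shared: &Arc<RwLock<DataStore>>) -> std::io::Result<()> {
    let mut store = shared.write().expect("lock poisoned");
    store.compact() // &mut access is guaranteed by the exclusive write guard
}
```

Callers that own the DataStore outright on a single thread can simply call compact() directly.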
EntryHandle Copying Mechanism
The copy_handle() method at src/storage_engine/data_store.rs:587-590 creates an EntryStream from the entry’s memory-mapped data and writes it to the target store using the original key hash. This preserves the entry’s identity while creating a new physical copy.
Sources: src/storage_engine/data_store.rs:567-590
Space Reclamation
Estimating Compaction Savings
The estimate_compaction_savings() method at src/storage_engine/data_store.rs:605-616 calculates potential space savings without performing actual compaction:
Algorithm:
- Get total file size via file_size()
- Iterate through all valid entries (latest versions only)
- Track seen keys using HashSet<u64> with Xxh3BuildHasher
- Sum the file_size() of each unique entry
- Return total_size - unique_entry_size
Note: The iter_entries() method already filters to latest versions, so this calculation represents the minimum achievable size through compaction.
Sources: src/storage_engine/data_store.rs:592-616
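The same algorithm can be sketched directly against the iterator API. The entry-handle accessors key_hash() and file_size() are assumed names here, and a default hasher stands in for Xxh3BuildHasher.

```rust
use std::collections::HashSet;

// Re-derivation of the estimate: total file size minus the bytes occupied by
// the unique, latest entries equals the space compaction could reclaim.
fn estimate_savings_sketch(store: &DataStore) -> std::io::Result<u64> {
    let total_size = store.file_size()?;

    let mut seen: HashSet<u64> = HashSet::new(); // real code uses Xxh3BuildHasher
    let mut unique_entry_size: u64 = 0;

    for entry in store.iter_entries() {
        // iter_entries() already yields only the latest version of each key.
        if seen.insert(entry.key_hash()) {
            unique_entry_size += entry.file_size();
        }
    }

    Ok(total_size.saturating_sub(unique_entry_size))
}
```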
When to Compact
Compaction should be considered when:
| Condition | Description | Detection Method |
|---|---|---|
| High Waste Ratio | Significant difference between total and unique entry sizes | estimate_compaction_savings() / file_size() > threshold |
| Frequent Updates | Many keys updated multiple times | Application-level tracking |
| Long-Running Storage | File has been in use for extended periods | Time-based policy |
| Before Backup | Minimize backup size | Pre-backup maintenance |
The compaction process at src/storage_engine/data_store.rs:727-741 includes logic to skip static index generation if it wouldn’t save space, indicating an awareness of space optimization trade-offs.
Sources: src/storage_engine/data_store.rs:605-616 src/storage_engine/data_store.rs:727-741
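A hypothetical policy built on the first detection method might look like the sketch below; the threshold value and the Result<u64> return type of estimate_compaction_savings() are assumptions.

```rust
// Compact only when the estimated reclaimable fraction of the file exceeds a
// caller-supplied threshold (e.g. 0.30 for 30% waste).
fn maybe_compact(store: &mut DataStore, waste_threshold: f64) -> std::io::Result<bool> {
    let total = store.file_size()? as f64;
    let reclaimable = store.estimate_compaction_savings()? as f64;

    if total > 0.0 && reclaimable / total > waste_threshold {
        store.compact()?;
        return Ok(true);
    }
    Ok(false)
}
```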
File Recovery
Chain Validation
The recover_valid_chain() method at src/storage_engine/data_store.rs:383-482 ensures data integrity by validating the backward-linked chain of entries:
Sources: src/storage_engine/data_store.rs:363-482
Validation Algorithm Details
The recovery process validates multiple constraints:
| Validation | Check | Purpose |
|---|---|---|
| Metadata Size | cursor >= METADATA_SIZE | Ensure enough space for metadata read |
| Prev Offset Bounds | prev_tail < metadata_offset | Prevent circular references |
| Entry Start | entry_start < metadata_offset | Ensure valid entry range |
| Pre-padding | prepad_len(prev_tail) | Account for 64-byte alignment |
| Tombstone Detection | Single NULL byte check | Handle deletion markers |
| Chain Depth | Walk to offset 0 | Ensure complete chain |
| Total Size | total_size <= file_len | Prevent overflow |
Sources: src/storage_engine/data_store.rs:383-482
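Conceptually, the chain walk enforces these checks while following prev-tail links back to offset 0. The sketch below is a simplification: METADATA_SIZE is a placeholder and the metadata reader is abstracted as a closure rather than the real entry format.

```rust
// Validate one backward-linked chain starting from a candidate tail offset.
// `read_prev_tail(metadata_offset)` stands in for parsing the entry metadata.
fn chain_is_valid(candidate_tail: u64, read_prev_tail: impl Fn(u64) -> Option<u64>) -> bool {
    const METADATA_SIZE: u64 = 20; // placeholder, not the real constant

    let mut cursor = candidate_tail;
    while cursor != 0 {
        // Metadata size check: leave room to read a full metadata block.
        if cursor < METADATA_SIZE {
            return false;
        }
        let metadata_offset = cursor - METADATA_SIZE;

        // Prev-offset bounds check: a link pointing at or past its own
        // metadata would create a circular reference.
        match read_prev_tail(metadata_offset) {
            Some(prev_tail) if prev_tail < metadata_offset => cursor = prev_tail,
            _ => return false,
        }
    }
    // Chain depth check satisfied: we walked all the way back to offset 0.
    true
}
```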
Corruption Detection and Handling
When DataStore::open() is called at src/storage_engine/data_store.rs:84-117, it automatically invokes recovery.
Sources: src/storage_engine/data_store.rs:84-117
Automatic Recovery Steps
The recovery process at src/storage_engine/data_store.rs:91-103:
- Detection: recover_valid_chain() returns final_len < file_len
- Warning: Logs truncation message with offsets
- Cleanup: Drops mmap and file handles
- Truncation: Re-opens file and calls set_len(final_len)
- Sync: Forces sync_all() to persist truncation
- Retry: Recursively calls open() with clean file
This ensures corrupted tail data is removed and the file is left in a valid state.
Sources: src/storage_engine/data_store.rs:89-103
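A standalone sketch of the truncate-and-retry step using std::fs directly is shown below; DataStore::open(path) is assumed to return std::io::Result<DataStore>.

```rust
use std::fs::OpenOptions;
use std::path::Path;

// Drop a corrupted tail by shrinking the file to the last valid offset,
// syncing the truncation to disk, and retrying open() on the clean file.
fn truncate_and_reopen(path: &Path, final_len: u64) -> std::io::Result<DataStore> {
    eprintln!(
        "warning: truncating {} to {} bytes to discard a corrupted tail",
        path.display(),
        final_len
    );

    let file = OpenOptions::new().write(true).open(path)?;
    file.set_len(final_len)?;
    file.sync_all()?; // persist the truncation before retrying

    DataStore::open(path)
}
```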
Maintenance Operations
Re-indexing After Writes
The reindex() method at src/storage_engine/data_store.rs:224-259 updates internal structures after writes:
Key Operations:
- Re-map file: Creates new Mmap reflecting written data
- Update index: Inserts or removes key-hash-to-offset mappings
- Collision check: insert() returns error on hash collision (tag mismatch)
- Atomic update: Stores new tail_offset with Release ordering
- Visibility: New mmap makes written data visible to readers
Sources: src/storage_engine/data_store.rs:176-259
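The shape of this flow can be sketched with simplified fields. The struct below is illustrative only: it is not the real DataStore layout, and the mmap re-mapping step is omitted.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

// Simplified stand-in for the DataStore's index and tail bookkeeping.
struct ReindexSketch {
    key_indexer: RwLock<HashMap<u64, u64>>, // key hash -> entry offset
    tail_offset: AtomicU64,
}

impl ReindexSketch {
    // `written` holds (key_hash, offset) pairs for new entries; `deleted`
    // holds key hashes that were tombstoned. Re-mapping the file is omitted.
    fn reindex(&self, written: &[(u64, u64)], deleted: &[u64], new_tail: u64) {
        {
            let mut index = self.key_indexer.write().expect("lock poisoned");
            for &(key_hash, offset) in written {
                // The real insert() can fail on a hash collision (tag mismatch).
                index.insert(key_hash, offset);
            }
            for key_hash in deleted {
                index.remove(key_hash);
            }
        }
        // Publish the new tail with Release ordering so readers that load it
        // with Acquire also observe the updated index and newly mapped data.
        self.tail_offset.store(new_tail, Ordering::Release);
    }
}
```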
File Path and Metadata Access
| Method | Return Type | Purpose | Location |
|---|---|---|---|
| get_path() | PathBuf | Returns storage file path | src/storage_engine/data_store.rs:265-267 |
| file_size() | Result<u64> | Returns current file size | src/storage_engine/data_store.rs:1179-1181 |
| len() | Result<usize> | Returns count of unique keys | src/storage_engine/data_store.rs:1164-1171 |
| is_empty() | Result<bool> | Checks if storage has no entries | src/storage_engine/data_store.rs:1173-1177 |
These methods provide essential metadata for maintenance decision-making.
Sources: src/storage_engine/data_store.rs:261-267 src/storage_engine/data_store.rs:1164-1181
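A small reporting helper shows how these accessors combine, assuming the Result types in the table map onto std::io::Result.

```rust
// Print the basic health metrics used for maintenance decisions.
fn report_store_stats(store: &DataStore) -> std::io::Result<()> {
    println!("path:        {}", store.get_path().display());
    println!("file size:   {} bytes", store.file_size()?);
    println!("unique keys: {}", store.len()?);
    println!("empty:       {}", store.is_empty()?);
    Ok(())
}
```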
Iterators for Maintenance
The storage provides multiple iteration mechanisms useful for maintenance:
Iteration Characteristics:
- Sequential: iter_entries() at src/storage_engine/data_store.rs:269-280
- Parallel: par_iter_entries() at src/storage_engine/data_store.rs:283-361 (requires the parallel feature)
- Latest Only: Both return only the most recent version of each key
- Zero-Copy: Return EntryHandle with memory-mapped data references
Sources: src/storage_engine/data_store.rs:269-361
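A sequential maintenance scan might look like the sketch below; the file_size() accessor on EntryHandle is an assumed name.

```rust
// Sum the on-disk size of the latest version of every key using the
// zero-copy entry handles returned by iter_entries().
fn total_live_bytes(store: &DataStore) -> u64 {
    store.iter_entries().map(|entry| entry.file_size()).sum()
}
```

With the parallel feature enabled, the same scan could be expressed with par_iter_entries() instead.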
Summary
The maintenance system provides:
| Component | Purpose | Key Method |
|---|---|---|
| Compaction | Reclaim space from outdated entries | compact() |
| Space Estimation | Calculate potential savings | estimate_compaction_savings() |
| Recovery | Validate and repair corrupted files | recover_valid_chain() |
| Re-indexing | Update structures after writes | reindex() |
| Iteration | Scan for maintenance operations | iter_entries(), par_iter_entries() |
All operations maintain the append-only guarantee while ensuring data integrity and efficient space utilization.
Sources: src/storage_engine/data_store.rs:1-1183