This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Extensions and Utilities
Relevant source files
- extensions/Cargo.toml
- simd-r-drive-entry-handle/src/constants.rs
- simd-r-drive-entry-handle/src/lib.rs
- src/utils.rs
- src/utils/align_or_copy.rs
- src/utils/verify_file_existence.rs
- tests/align_or_copy_tests.rs
This document covers the utility functions, helper modules, and constants provided by the SIMD R Drive ecosystem. These components include the simd-r-drive-extensions crate for higher-level storage operations, core utility functions in the main simd-r-drive crate, and shared constants from simd-r-drive-entry-handle.
For details on the core storage engine API, see DataStore API. For performance optimization features like SIMD acceleration, see SIMD Acceleration. For alignment-related architecture decisions, see Payload Alignment and Cache Efficiency.
Extensions Crate Overview
The simd-r-drive-extensions crate provides storage extensions and higher-level utilities built on top of the core simd-r-drive storage engine. It adds functionality for common storage patterns and data manipulation tasks.
graph TB
subgraph "simd-r-drive-extensions"
ExtCrate["simd-r-drive-extensions"]
ExtDeps["Dependencies:\n- bincode\n- serde\n- simd-r-drive\n- walkdir"]
end
subgraph "Core Dependencies"
Core["simd-r-drive"]
Bincode["bincode\nBinary Serialization"]
Serde["serde\nSerialization Traits"]
Walkdir["walkdir\nDirectory Traversal"]
end
ExtCrate --> ExtDeps
ExtDeps --> Core
ExtDeps --> Bincode
ExtDeps --> Serde
ExtDeps --> Walkdir
Core -.->|provides| DataStore["DataStore"]
Bincode -.->|enables| SerializationSupport["Structured Data Storage"]
Walkdir -.->|enables| FileSystemOps["File System Operations"]
Crate Structure
Sources: extensions/Cargo.toml:1-22
| Dependency | Purpose |
|---|---|
| bincode | Binary serialization/deserialization for structured data storage |
| serde | Serialization trait support with derive macros |
| simd-r-drive | Core storage engine access |
| walkdir | Directory tree traversal utilities |
Sources: extensions/Cargo.toml:13-17
Core Utilities Module
The main simd-r-drive crate exposes several utility functions through its utils module. These functions handle common tasks like alignment optimization, string formatting, and data validation.
graph TB
subgraph "utils Module"
UtilsRoot["src/utils.rs"]
AlignOrCopy["align_or_copy\nZero-Copy Optimization"]
AppendExt["append_extension\nString Path Handling"]
FormatBytes["format_bytes\nHuman-Readable Sizes"]
NamespaceHasher["NamespaceHasher\nHierarchical Keys"]
ParseBuffer["parse_buffer_size\nSize String Parsing"]
VerifyFile["verify_file_existence\nFile Validation"]
end
UtilsRoot --> AlignOrCopy
UtilsRoot --> AppendExt
UtilsRoot --> FormatBytes
UtilsRoot --> NamespaceHasher
UtilsRoot --> ParseBuffer
UtilsRoot --> VerifyFile
AlignOrCopy -.->|used by| ReadOps["Read Operations"]
NamespaceHasher -.->|used by| KeyManagement["Key Management"]
FormatBytes -.->|used by| Logging["Logging & Reporting"]
ParseBuffer -.->|used by| Config["Configuration Parsing"]
Utility Functions Overview
Sources: src/utils.rs:1-17
align_or_copy Function
The align_or_copy utility function provides zero-copy deserialization with automatic fallback for misaligned data. It attempts to reinterpret a byte slice as a typed slice without copying, and falls back to manual decoding when alignment requirements are not met.
Function Signature
Sources: src/utils/align_or_copy.rs:44-50
Operation Flow
Sources: src/utils/align_or_copy.rs:44-73
Usage Patterns
| Scenario | Outcome | Performance |
|---|---|---|
| Aligned 64-byte boundary, exact multiple | Cow::Borrowed | Zero-copy, optimal |
| Misaligned address | Cow::Owned | Allocation + decode |
| Non-multiple of element size | Panic | Invalid input |
Example Usage:
Sources: src/utils/align_or_copy.rs:38-43 tests/align_or_copy_tests.rs:7-12
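The original snippet for this section did not survive extraction, so the following is a minimal sketch of the technique rather than the crate's actual code. The name align_or_copy_sketch, the `T: Copy` bound, and the exact parameter types are assumptions; only the overall shape (a byte slice in, a decoder function, a Cow out) is taken from the description above.

```rust
use std::borrow::Cow;
use std::mem::size_of;

/// Hypothetical re-implementation of the align-or-copy idea.
/// `N` is expected to equal `size_of::<T>()`; `from_le_bytes` decodes one element.
pub fn align_or_copy_sketch<T, const N: usize>(
    bytes: &[u8],
    from_le_bytes: fn([u8; N]) -> T,
) -> Cow<'_, [T]>
where
    T: Copy,
{
    assert_eq!(N, size_of::<T>(), "N must equal size_of::<T>()");

    // SAFETY (assumption of this sketch): the caller only uses plain numeric
    // types such as f32 or u32, for which every bit pattern is a valid value.
    let (prefix, body, suffix) = unsafe { bytes.align_to::<T>() };

    // Empty prefix and suffix mean the whole slice is aligned and an exact
    // multiple of the element size, so it can be borrowed without copying.
    // Note: this path reinterprets bytes in native order, so it matches
    // `from_le_bytes` only on little-endian targets.
    if prefix.is_empty() && suffix.is_empty() {
        return Cow::Borrowed(body);
    }

    // Fallback: decode element by element into an owned Vec.
    assert!(
        bytes.len() % N == 0,
        "input length must be a multiple of the element size"
    );
    Cow::Owned(
        bytes
            .chunks_exact(N)
            .map(|chunk| {
                let mut buf = [0u8; N];
                buf.copy_from_slice(chunk);
                from_le_bytes(buf)
            })
            .collect(),
    )
}
```

Under these assumptions, a call such as `align_or_copy_sketch::<f32, 4>(bytes, f32::from_le_bytes)` borrows when `bytes` starts on a suitably aligned address and allocates otherwise; callers can treat both results uniformly as `&[f32]` through the Cow.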
Safety Considerations
The function uses unsafe for the align_to::<T>() call, which requires:
- Starting address must be aligned to align_of::<T>()
- Total size must be a multiple of size_of::<T>()
These requirements are validated by checking that prefix and suffix slices are empty before returning the borrowed slice. If validation fails, the function falls back to safe manual decoding.
Sources: src/utils/align_or_copy.rs:28-35 src/utils/align_or_copy.rs:53-60
Other Utility Functions
| Function | Module Path | Purpose |
|---|---|---|
| append_extension | src/utils/append_extension.rs | Safely appends file extensions to paths |
| format_bytes | src/utils/format_bytes.rs | Formats byte counts as human-readable strings (KB, MB, GB) |
| NamespaceHasher | src/utils/namespace_hasher.rs | Generates hierarchical, namespaced hash keys |
| parse_buffer_size | src/utils/parse_buffer_size.rs | Parses size strings like “64KB”, “1MB” into byte counts |
| verify_file_existence | src/utils/verify_file_existence.rs | Validates file paths before operations |
Sources: src/utils.rs:1-17
Entry Handle Constants
The simd-r-drive-entry-handle crate defines shared constants used throughout the storage system. These constants establish the binary layout of entries and alignment requirements.
graph TB
subgraph "simd-r-drive-entry-handle"
LibRoot["lib.rs"]
ConstMod["constants.rs"]
EntryHandle["entry_handle.rs"]
EntryMetadata["entry_metadata.rs"]
DebugAssert["debug_assert_aligned.rs"]
end
subgraph "Exported Constants"
MetadataSize["METADATA_SIZE = 20"]
KeyHashRange["KEY_HASH_RANGE = 0..8"]
PrevOffsetRange["PREV_OFFSET_RANGE = 8..16"]
ChecksumRange["CHECKSUM_RANGE = 16..20"]
ChecksumLen["CHECKSUM_LEN = 4"]
PayloadLog["PAYLOAD_ALIGN_LOG2 = 6"]
PayloadAlign["PAYLOAD_ALIGNMENT = 64"]
end
LibRoot --> ConstMod
LibRoot --> EntryHandle
LibRoot --> EntryMetadata
LibRoot --> DebugAssert
ConstMod --> MetadataSize
ConstMod --> KeyHashRange
ConstMod --> PrevOffsetRange
ConstMod --> ChecksumRange
ConstMod --> ChecksumLen
ConstMod --> PayloadLog
ConstMod --> PayloadAlign
PayloadAlign -.->|ensures| CacheLineOpt["Cache-Line Optimization"]
PayloadAlign -.->|enables| SIMDOps["SIMD Operations"]
Constants Module Structure
Sources: simd-r-drive-entry-handle/src/lib.rs:1-10 simd-r-drive-entry-handle/src/constants.rs:1-19
Metadata Layout Constants
The following constants define the fixed 20-byte metadata structure at the end of each entry:
| Constant | Value | Description |
|---|---|---|
| METADATA_SIZE | 20 | Total size of entry metadata in bytes |
| KEY_HASH_RANGE | 0..8 | Byte range for 64-bit XXH3 key hash |
| PREV_OFFSET_RANGE | 8..16 | Byte range for 64-bit previous entry offset |
| CHECKSUM_RANGE | 16..20 | Byte range for 32-bit CRC32C checksum |
| CHECKSUM_LEN | 4 | Explicit length of checksum field |
Sources: simd-r-drive-entry-handle/src/constants.rs:3-11
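To make the layout concrete, here is a small sketch that slices a 20-byte metadata block using the ranges from the table. The constant values come from the table; the struct MetadataSketch, the function parse_metadata_sketch, and the little-endian byte order of the two 64-bit fields are assumptions made for illustration, not the crate's confirmed definitions.

```rust
use std::ops::Range;

// Values as listed in the table above.
const METADATA_SIZE: usize = 20;
const KEY_HASH_RANGE: Range<usize> = 0..8;
const PREV_OFFSET_RANGE: Range<usize> = 8..16;
const CHECKSUM_RANGE: Range<usize> = 16..20;

/// Hypothetical decoded view of one metadata block.
#[derive(Debug, PartialEq)]
pub struct MetadataSketch {
    pub key_hash: u64,
    pub prev_offset: u64,
    pub checksum: [u8; 4],
}

/// Splits a metadata block into its three fields.
/// Returns None unless exactly METADATA_SIZE bytes are supplied.
pub fn parse_metadata_sketch(bytes: &[u8]) -> Option<MetadataSketch> {
    if bytes.len() != METADATA_SIZE {
        return None;
    }
    let mut key_hash = [0u8; 8];
    key_hash.copy_from_slice(&bytes[KEY_HASH_RANGE]);
    let mut prev_offset = [0u8; 8];
    prev_offset.copy_from_slice(&bytes[PREV_OFFSET_RANGE]);
    let mut checksum = [0u8; 4];
    checksum.copy_from_slice(&bytes[CHECKSUM_RANGE]);

    Some(MetadataSketch {
        // Assumption: integers are stored little-endian.
        key_hash: u64::from_le_bytes(key_hash),
        prev_offset: u64::from_le_bytes(prev_offset),
        checksum,
    })
}
```

The three ranges are contiguous and together cover bytes 0 through 19, which is why METADATA_SIZE equals 8 + 8 + CHECKSUM_LEN.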
Alignment Constants
These constants enforce 64-byte alignment for all payload data:
- PAYLOAD_ALIGN_LOG2: Base-2 logarithm of alignment requirement (6 = 64 bytes)
- PAYLOAD_ALIGNMENT: Computed alignment value (64 bytes)
This alignment matches the 64-byte cache line size common on current CPUs and enables efficient SIMD operations. Each entry needs at most PAYLOAD_ALIGNMENT - 1 (63) bytes of pre-padding to reach the next boundary.
Sources: simd-r-drive-entry-handle/src/constants.rs:13-18
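The relationship between the two constants and the 63-byte padding bound can be sketched as follows. The constant values are from the text above; their integer types and the helper pre_pad_len are assumptions introduced for this example.

```rust
// Values from the section above; the integer types are assumed.
const PAYLOAD_ALIGN_LOG2: u8 = 6;
const PAYLOAD_ALIGNMENT: u64 = 1 << PAYLOAD_ALIGN_LOG2; // 2^6 = 64

/// Hypothetical helper: number of padding bytes needed so that a payload
/// starting at `offset` begins on the next PAYLOAD_ALIGNMENT boundary.
pub fn pre_pad_len(offset: u64) -> u64 {
    let mask = PAYLOAD_ALIGNMENT - 1;
    // (A - (offset mod A)) mod A, written with bit masks because A is a power of two.
    (PAYLOAD_ALIGNMENT - (offset & mask)) & mask
}
```

An offset that is already a multiple of 64 needs no padding, while an offset one byte past a boundary needs the maximum of 63 bytes, which is where the PAYLOAD_ALIGNMENT - 1 bound comes from.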
Constant Relationships
Sources: simd-r-drive-entry-handle/src/constants.rs:1-19
sequenceDiagram
participant Client
participant EntryHandle
participant align_or_copy
participant Memory
Client->>EntryHandle: get_payload_bytes()
EntryHandle->>Memory: read &[u8] from mmap
EntryHandle->>align_or_copy: align_or_copy<f32, 4>(bytes, f32::from_le_bytes)
alt Aligned on 64-byte boundary
align_or_copy->>Memory: validate alignment
align_or_copy-->>Client: Cow::Borrowed(&[f32])
Note over Client,Memory: Zero-copy: direct memory access
else Misaligned
align_or_copy->>align_or_copy: chunks_exact(4)
align_or_copy->>align_or_copy: map(f32::from_le_bytes)
align_or_copy->>align_or_copy: collect into Vec<f32>
align_or_copy-->>Client: Cow::Owned(Vec<f32>)
Note over Client,align_or_copy: Fallback: allocated copy
end
Common Patterns
Zero-Copy Data Access
Utilities like align_or_copy enable zero-copy access patterns when memory alignment allows:
Sources: src/utils/align_or_copy.rs:44-73 simd-r-drive-entry-handle/src/constants.rs:13-18
Namespace-Based Key Management
The NamespaceHasher utility enables hierarchical key organization:
Sources: src/utils.rs:11-12
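The page does not show the NamespaceHasher API, so the sketch below only illustrates the general idea of namespaced keys. Everything in it is an assumption: the type NamespaceHasherSketch, its methods, the 16-byte key layout, and the use of the standard library's DefaultHasher as a stand-in for XXH3 so that the example has no external dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in 64-bit hash. The real utility is described as using XXH3;
/// DefaultHasher is used here only to keep the sketch dependency-free.
fn hash_bytes(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

/// Hypothetical namespaced key builder.
pub struct NamespaceHasherSketch {
    prefix_hash: u64,
}

impl NamespaceHasherSketch {
    /// Hashes the namespace once so it can be reused for many keys.
    pub fn new(namespace: &[u8]) -> Self {
        Self {
            prefix_hash: hash_bytes(namespace),
        }
    }

    /// Builds a fixed-size key: namespace hash followed by key hash.
    /// This layout is an assumption of the sketch.
    pub fn namespaced_key(&self, key: &[u8]) -> Vec<u8> {
        let mut out = Vec::with_capacity(16);
        out.extend_from_slice(&self.prefix_hash.to_le_bytes());
        out.extend_from_slice(&hash_bytes(key).to_le_bytes());
        out
    }
}
```

With a scheme like this, the same logical key (for example `b"settings"`) stored under two different namespaces produces two distinct storage keys, so unrelated components can share one store without coordinating key names.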
Size Formatting for Logging
The format_bytes utility provides human-readable output:
| Input Bytes | Formatted Output |
|---|---|
| 1023 | “1023 B” |
| 1024 | “1.00 KB” |
| 1048576 | “1.00 MB” |
| 1073741824 | “1.00 GB” |
Sources: src/utils.rs:7-8
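A formatter that reproduces the table above could look like the sketch below. The function name format_bytes_sketch, the `u64` input type, and the choice of 1024-based units with two decimal places are inferred from the table rather than taken from the crate's source.

```rust
/// Hypothetical formatter matching the table above:
/// plain bytes below 1024, then two decimals in KB, MB, or GB.
pub fn format_bytes_sketch(bytes: u64) -> String {
    const KB: f64 = 1024.0;
    const MB: f64 = KB * 1024.0;
    const GB: f64 = MB * 1024.0;

    let b = bytes as f64;
    if b >= GB {
        format!("{:.2} GB", b / GB)
    } else if b >= MB {
        format!("{:.2} MB", b / MB)
    } else if b >= KB {
        format!("{:.2} KB", b / KB)
    } else {
        format!("{} B", bytes)
    }
}
```

Because each call allocates a new String, a helper like this fits logging and display code better than hot paths.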
Configuration Parsing
The parse_buffer_size utility handles size string inputs:
| Input String | Parsed Bytes |
|---|---|
| “64” | 64 |
| “64KB” | 65,536 |
| “1MB” | 1,048,576 |
| “2GB” | 2,147,483,648 |
Sources: src/utils.rs:13-14
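The parsing rules implied by the table can be sketched as shown here. The name parse_buffer_size_sketch, the `Result<u64, String>` return type, the case-insensitive suffix matching, and the overflow check are assumptions; the real function's signature and error type are not shown on this page.

```rust
/// Hypothetical parser for size strings such as "64", "64KB", "1MB", "2GB".
/// Suffixes are treated as powers of 1024, matching the table above.
pub fn parse_buffer_size_sketch(input: &str) -> Result<u64, String> {
    // Assumption: surrounding whitespace and letter case are ignored.
    let s = input.trim().to_ascii_uppercase();

    let (digits, multiplier): (&str, u64) = if let Some(n) = s.strip_suffix("KB") {
        (n, 1 << 10)
    } else if let Some(n) = s.strip_suffix("MB") {
        (n, 1 << 20)
    } else if let Some(n) = s.strip_suffix("GB") {
        (n, 1 << 30)
    } else {
        // No recognized suffix: interpret the value as a byte count.
        (s.as_str(), 1)
    };

    let value = digits
        .trim()
        .parse::<u64>()
        .map_err(|e| format!("invalid size {:?}: {}", input, e))?;

    // Reject values that do not fit in u64 instead of wrapping silently.
    value
        .checked_mul(multiplier)
        .ok_or_else(|| format!("size {:?} overflows u64", input))
}
```

Returning an error for malformed input, rather than a default, lets configuration code report the offending string to the user.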
Integration with Core Systems
Relationship to Storage Engine
Sources: extensions/Cargo.toml:1-22 src/utils.rs:1-17 simd-r-drive-entry-handle/src/lib.rs:1-10
Performance Considerations
| Utility | Performance Impact | Use Case |
|---|---|---|
| align_or_copy | Zero-copy when aligned | Deserializing typed arrays from storage |
| NamespaceHasher | Single XXH3 hash | Generating hierarchical keys |
| format_bytes | String allocation | Logging and user display only |
| PAYLOAD_ALIGNMENT | Enables SIMD ops | Core storage layout requirement |
Sources: src/utils/align_or_copy.rs:1-74 simd-r-drive-entry-handle/src/constants.rs:13-18