This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Repository Structure
Purpose and Scope
This document describes the organization of the SIMD R Drive repository as a Cargo workspace, detailing the individual packages (crates) that comprise the system, their purposes, and their inter-dependencies. The repository is structured as a monorepo containing a core storage engine, supporting libraries, experimental network components, and language bindings.
For information about the core storage engine architecture and on-disk format, see Storage Architecture. For details on building and testing the codebase, see Building and Testing.
Workspace Organization
The SIMD R Drive repository is organized as a Cargo workspace defined in Cargo.toml:65-78. The workspace uses Cargo’s resolver version 2 and manages multiple interdependent packages with shared versioning and dependencies.
Sources: Cargo.toml:65-78
Workspace Configuration
The workspace defines common package metadata that all member crates inherit:
| Metadata Field | Value |
|---|---|
| Version | 0.15.5-alpha |
| Edition | 2024 |
| Repository | https://github.com/jzombie/rust-simd-r-drive |
| License | Apache-2.0 |
| Categories | database-implementations, data-structures, filesystem |
| Keywords | storage-engine, binary-storage, append-only, simd, mmap |
Sources: Cargo.toml:1-9
Package Structure Overview
Workspace Members
The workspace includes six member crates defined in Cargo.toml:66-73:
- "." - The root simd-r-drive package
- "simd-r-drive-entry-handle" - Entry abstraction library
- "extensions" - Utility extensions
- "experiments/simd-r-drive-ws-server" - WebSocket server
- "experiments/simd-r-drive-ws-client" - WebSocket client
- "experiments/simd-r-drive-muxio-service-definition" - RPC service contract
Excluded Members
Two Python binding packages are excluded from the workspace Cargo.toml:74-77 because they use maturin with separate build systems:
- "experiments/bindings/python" - Direct Rust-Python bindings
- "experiments/bindings/python-ws-client" - Python WebSocket client bindings
Sources: Cargo.toml:65-78
Core Packages
simd-r-drive (Root Package)
Location: Root directory
Cargo Name: simd-r-drive
Description: “SIMD-optimized append-only schema-less storage engine. Key-based binary storage in a single-file storage container.”
This is the main storage engine package providing the DataStore API for append-only key-value storage with SIMD acceleration and memory-mapped file access.
Key Exports:
- DataStore - Main storage interface
- DataStoreReader / DataStoreWriter - Trait-based access patterns
- KeyIndexer - Hash-based key indexing with xxh3_64
Dependencies:
- simd-r-drive-entry-handle (workspace)
- dashmap - Concurrent hash map (sharded locking)
- memmap2 - Memory-mapped file access
- xxhash-rust - Fast hashing with SIMD support
- rayon (optional, with the parallel feature)
Features:
- default - No features enabled by default
- expose-internal-api - Exposes internal APIs for testing/extensions
- parallel - Enables parallel iteration with rayon
- arrow - Proxies to simd-r-drive-entry-handle/arrow
Sources: Cargo.toml:11-56
simd-r-drive-entry-handle
Location: simd-r-drive-entry-handle/
Cargo Name: simd-r-drive-entry-handle
Provides the EntryHandle abstraction for zero-copy access to stored entries via memory-mapped files. This package is separated to allow optional Apache Arrow integration without requiring arrow dependencies in the core package.
Key Exports:
- EntryHandle - Zero-copy entry accessor
- EntryMetadata - Entry metadata structure (key_hash, prev_offset, crc32)
Dependencies:
- memmap2 - Memory-mapped file access
- crc32fast - CRC32 checksum validation
- arrow (optional, with the arrow feature) - Apache Arrow buffer integration
Features:
- arrow - Enables zero-copy integration with Apache Arrow buffers
Sources: Cargo.toml:83 Cargo.lock:1823-1829
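To make the EntryMetadata fields listed above concrete, the following is a minimal sketch of how three such fields could be packed into and read back from a fixed-size byte record. The struct name `EntryMetadataSketch`, the field widths (two 64-bit values and one 32-bit checksum), and the little-endian layout are assumptions made for illustration; they are not taken from the crate's source and should be checked against the Storage Architecture page.

```rust
use std::convert::TryInto;

/// Illustrative stand-in for the metadata fields named on this page.
/// Field widths and byte order are ASSUMPTIONS, not the crate's definition.
#[derive(Debug, Clone, Copy, PartialEq)]
struct EntryMetadataSketch {
    key_hash: u64,
    prev_offset: u64,
    crc32: u32,
}

impl EntryMetadataSketch {
    /// Total encoded size under the assumed widths: 8 + 8 + 4 bytes.
    const SIZE: usize = 20;

    /// Encodes the fields back to back in little-endian order (assumed).
    fn to_bytes(&self) -> [u8; 20] {
        let mut buf = [0u8; 20];
        buf[0..8].copy_from_slice(&self.key_hash.to_le_bytes());
        buf[8..16].copy_from_slice(&self.prev_offset.to_le_bytes());
        buf[16..20].copy_from_slice(&self.crc32.to_le_bytes());
        buf
    }

    /// Decodes a record, returning None if the slice has the wrong length.
    fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != Self::SIZE {
            return None;
        }
        Some(Self {
            key_hash: u64::from_le_bytes(bytes[0..8].try_into().ok()?),
            prev_offset: u64::from_le_bytes(bytes[8..16].try_into().ok()?),
            crc32: u32::from_le_bytes(bytes[16..20].try_into().ok()?),
        })
    }
}

fn main() {
    let meta = EntryMetadataSketch { key_hash: 0xABCD, prev_offset: 128, crc32: 7 };
    let encoded = meta.to_bytes();
    println!("{} bytes, round trip: {:?}", encoded.len(), EntryMetadataSketch::from_bytes(&encoded));
}
```

A fixed-size record like this can be read directly out of a memory-mapped region by slicing, which is the kind of access pattern a zero-copy handle relies on.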
simd-r-drive-extensions
Location: extensions/
Cargo Name: simd-r-drive-extensions
Utility functions and helpers built on top of the core storage engine, including alignment utilities, formatting helpers, and namespace hashing.
Key Exports:
- align_or_copy - Memory alignment utilities
- format_bytes - Human-readable byte formatting
- NamespaceHasher - Namespace-based key hashing
- File verification utilities
Dependencies:
- simd-r-drive (workspace)
- bincode - Serialization support
- serde - Serialization framework
Sources: Cargo.toml:66-73 Cargo.lock:1832-1841
Experimental Network Components
simd-r-drive-muxio-service-definition
Location: experiments/simd-r-drive-muxio-service-definition/
Cargo Name: simd-r-drive-muxio-service-definition
Defines the RPC service contract (interface definition) for remote access to the storage engine. This serves as the shared contract between WebSocket clients and servers, ensuring type-safe communication.
Key Exports:
- Service trait definitions for RPC operations
- Request/response message types
- Bitcode serialization schemas
Dependencies:
- bitcode - Compact binary serialization
- muxio-rpc-service - RPC service framework
Sources: Cargo.toml:66-73 Cargo.lock:1844-1849
simd-r-drive-ws-server
Location: experiments/simd-r-drive-ws-server/
Cargo Name: simd-r-drive-ws-server
WebSocket server implementation providing remote RPC access to a DataStore instance via the muxio framework.
Key Exports:
- WebSocket server with RPC endpoint
- Service implementation for simd-r-drive-muxio-service-definition
Dependencies:
- simd-r-drive (workspace)
- simd-r-drive-muxio-service-definition (workspace)
- muxio-tokio-rpc-server - RPC server implementation
- tokio - Async runtime
- clap - CLI argument parsing
Sources: Cargo.toml:66-73 Cargo.lock:1866-1878
simd-r-drive-ws-client
Location: experiments/simd-r-drive-ws-client/
Cargo Name: simd-r-drive-ws-client
Rust WebSocket client for connecting to simd-r-drive-ws-server instances. Provides a native Rust client API matching the core DataStore interface but operating over the network.
Key Exports:
- WsClient - WebSocket client implementation
- Async methods mirroring the DataStore API
Dependencies:
- simd-r-drive (workspace)
- simd-r-drive-muxio-service-definition (workspace)
- muxio-tokio-rpc-client - RPC client implementation
- tokio - Async runtime
Sources: Cargo.toml:66-73 Cargo.lock:1852-1863
Python Bindings (External Build System)
experiments/bindings/python
Location: experiments/bindings/python/
Build System: Maturin + PyO3
Direct Python bindings to the core simd-r-drive package using PyO3. Provides a Python API for local (in-process) access to the storage engine. This package is excluded from the Cargo workspace because it uses a separate pyproject.toml build configuration with maturin.
Key Exports:
- Python DataStore class wrapping the Rust implementation
- Type stubs (.pyi files) for IDE support
Sources: Cargo.toml:74-77
experiments/bindings/python-ws-client
Location: experiments/bindings/python-ws-client/
Build System: Maturin + PyO3
Python bindings for the simd-r-drive-ws-client, enabling remote access to storage servers from Python via asyncio. Uses pyo3-async-runtimes to bridge Python’s asyncio with Rust’s tokio.
Key Exports:
- DataStoreWsClient - Python async WebSocket client
- Asyncio-compatible API
- Type stubs for Python type checkers
Sources: Cargo.toml:74-77
Dependency Relationships
Sources: Cargo.toml:23-34 Cargo.toml:80-112 Cargo.lock:1795-1878
Workspace Dependency Management
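The workspace defines shared dependencies in the [workspace.dependencies] section Cargo.toml:80-112 to ensure version consistency across all member crates. A member crate can inherit such an entry by declaring the dependency with workspace = true instead of repeating the version.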
Intra-Workspace Dependencies
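The intra-workspace edges listed in the package sections above can be collected into a small graph. The sketch below encodes those edges by hand and derives an order in which each crate comes after the workspace crates it depends on. The helper names `workspace_deps` and `build_order` are hypothetical, and the edge list is transcribed from this page rather than read from Cargo.toml, so treat it as an illustration rather than a replacement for `cargo metadata` or `cargo tree`.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Intra-workspace dependency edges, transcribed from the package
/// sections of this page (crate name -> workspace crates it depends on).
fn workspace_deps() -> BTreeMap<&'static str, Vec<&'static str>> {
    BTreeMap::from([
        ("simd-r-drive-entry-handle", vec![]),
        ("simd-r-drive", vec!["simd-r-drive-entry-handle"]),
        ("simd-r-drive-extensions", vec!["simd-r-drive"]),
        ("simd-r-drive-muxio-service-definition", vec![]),
        (
            "simd-r-drive-ws-server",
            vec!["simd-r-drive", "simd-r-drive-muxio-service-definition"],
        ),
        (
            "simd-r-drive-ws-client",
            vec!["simd-r-drive", "simd-r-drive-muxio-service-definition"],
        ),
    ])
}

/// Returns an order in which every crate appears after its dependencies,
/// or None if the graph contains a cycle or an unknown dependency.
fn build_order(deps: &BTreeMap<&'static str, Vec<&'static str>>) -> Option<Vec<&'static str>> {
    let mut order = Vec::new();
    let mut done: BTreeSet<&'static str> = BTreeSet::new();
    while done.len() < deps.len() {
        // Collect every unplaced crate whose dependencies are all placed.
        let ready: Vec<&'static str> = deps
            .iter()
            .filter(|(name, ds)| !done.contains(*name) && ds.iter().all(|d| done.contains(d)))
            .map(|(name, _)| *name)
            .collect();
        if ready.is_empty() {
            return None; // No progress possible: cycle or missing crate.
        }
        for name in ready {
            done.insert(name);
            order.push(name);
        }
    }
    Some(order)
}

fn main() {
    match build_order(&workspace_deps()) {
        Some(order) => println!("{}", order.join(" -> ")),
        None => println!("dependency cycle detected"),
    }
}
```

The two leaf crates (the entry handle and the service definition) have no workspace dependencies, so they can be built first; the core crate follows, and the extensions, server, and client crates come last.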
Key External Dependencies
| Dependency | Version | Purpose |
|---|---|---|
| memmap2 | 0.9.5 | Memory-mapped file access |
| dashmap | 6.1.0 | Concurrent hash map (sharded locking) |
| xxhash-rust | 0.8.15 | Fast non-cryptographic hashing |
| crc32fast | 1.4.2 | CRC32 checksum validation |
| arrow | 57.0.0 | Apache Arrow integration (optional) |
| tokio | 1.45.1 | Async runtime (experimental features only) |
| bitcode | 0.6.6 | Compact binary serialization |
| rayon | 1.10.0 | Parallel iteration (optional) |
Sources: Cargo.toml:80-112
Feature Flags
The root simd-r-drive package defines an empty default feature set and three opt-in feature flags Cargo.toml:49-55:
default
No features enabled by default. This keeps the core storage engine lightweight with minimal dependencies.
expose-internal-api
Exposes internal APIs that are normally private. Used for extension development and integration testing. Not intended for general use.
parallel
Enables parallel iteration capabilities using the rayon crate. When enabled, operations like iter_entries() can leverage multi-core parallelism for improved throughput on large datasets.
arrow
A proxy feature that enables simd-r-drive-entry-handle/arrow. This provides zero-copy integration with Apache Arrow buffers, allowing EntryHandle instances to be viewed as Arrow arrays without data copying.
Sources: Cargo.toml:49-55
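Cargo features such as these are typically consumed in Rust code through conditional compilation. The sketch below shows the general mechanism with the feature names from this section; the functions `enabled_features` and `iteration_mode` are hypothetical helpers written for illustration and are not part of the simd-r-drive API.

```rust
/// Lists which of the optional features named on this page were enabled
/// at compile time. Hypothetical helper, not part of the crate.
fn enabled_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    // `cfg!` evaluates to a compile-time boolean for the given feature.
    if cfg!(feature = "expose-internal-api") {
        enabled.push("expose-internal-api");
    }
    if cfg!(feature = "parallel") {
        enabled.push("parallel");
    }
    if cfg!(feature = "arrow") {
        enabled.push("arrow");
    }
    enabled
}

/// Compiled only when the `parallel` feature is enabled.
#[cfg(feature = "parallel")]
fn iteration_mode() -> &'static str {
    "parallel"
}

/// Fallback compiled when the `parallel` feature is disabled.
#[cfg(not(feature = "parallel"))]
fn iteration_mode() -> &'static str {
    "sequential"
}

fn main() {
    println!("enabled features: {:?}", enabled_features());
    println!("iteration mode: {}", iteration_mode());
}
```

Because the default feature set is empty, a plain build takes the fallback path; a consumer opts in with, for example, `cargo build --features parallel`.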
Benchmarks
The root package defines two benchmark suites using Criterion.rs Cargo.toml:57-63:
storage_benchmark
Measures write throughput, read throughput, batch operations, and streaming performance for the core storage engine.
contention_benchmark
Measures performance under concurrent access patterns, testing the effectiveness of the concurrent index and the scalability of concurrent reads.
Both benchmarks use harness = false to integrate with Criterion’s custom benchmark harness.
Sources: Cargo.toml:57-63
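With harness = false, Cargo does not link the default test harness into the bench target, so the target has to provide its own entry point; Criterion generates one through its macros. The following std-only sketch shows the bare idea of such an entry point using a simple timing loop. It is not Criterion and performs no statistical analysis; the function `time_iterations` and the workload are assumptions made for illustration.

```rust
use std::time::{Duration, Instant};

/// Runs `work` the given number of times and returns the total elapsed time.
/// Hypothetical helper: a real suite would delegate this to Criterion.
fn time_iterations<F: FnMut()>(iterations: u32, mut work: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        work();
    }
    start.elapsed()
}

// A bench target built with `harness = false` supplies its own `main`.
fn main() {
    let mut buffer: Vec<u8> = Vec::new();
    let iterations = 1_000;
    // Stand-in workload: append a small payload to an in-memory buffer.
    let elapsed = time_iterations(iterations, || buffer.extend_from_slice(b"payload"));
    println!(
        "{} appends in {:?} ({} bytes written)",
        iterations,
        elapsed,
        buffer.len()
    );
}
```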
Version Management
All workspace members share a common version number, 0.15.5-alpha, managed through Cargo.toml:3. The -alpha suffix indicates pre-release software under active development. The workspace uses semantic versioning, where:
- Major version (0): Pre-1.0 indicating API instability
- Minor version (15): Feature releases and API changes
- Patch version (5): Bug fixes and minor improvements
- Suffix (-alpha): Pre-release stability indicator
Sources: Cargo.toml:3
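The breakdown above can be checked mechanically by splitting the version string into its parts. This is a minimal sketch that handles only the `major.minor.patch[-pre]` shape used here; the `Version` struct and `parse_version` function are hypothetical, and build metadata (`+...`) and the full SemVer grammar are deliberately not handled.

```rust
/// Components of a version string such as `0.15.5-alpha`.
#[derive(Debug, PartialEq)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
    pre_release: Option<String>,
}

/// Parses `major.minor.patch` with an optional `-suffix`.
/// Returns None for anything that does not match that shape.
fn parse_version(input: &str) -> Option<Version> {
    // Split off the optional pre-release suffix at the first '-'.
    let (core, pre_release) = match input.split_once('-') {
        Some((core, pre)) if !pre.is_empty() => (core, Some(pre.to_string())),
        Some(_) => return None, // Trailing '-' with no suffix.
        None => (input, None),
    };
    let mut parts = core.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // More than three numeric components.
    }
    Some(Version { major, minor, patch, pre_release })
}

fn main() {
    let version = parse_version("0.15.5-alpha").expect("valid version");
    println!(
        "major={} minor={} patch={} pre={:?}",
        version.major, version.minor, version.patch, version.pre_release
    );
}
```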
File System Layout
The physical repository structure mirrors the logical package organization:
rust-simd-r-drive/
├── Cargo.toml                      # Workspace root
├── Cargo.lock                      # Dependency lock file
├── src/                            # simd-r-drive source
├── benches/                        # Benchmark suites
├── simd-r-drive-entry-handle/      # Entry handle crate
│   ├── Cargo.toml
│   └── src/
├── extensions/                     # Extensions crate
│   ├── Cargo.toml
│   └── src/
└── experiments/
    ├── simd-r-drive-muxio-service-definition/
    │   ├── Cargo.toml
    │   └── src/
    ├── simd-r-drive-ws-server/
    │   ├── Cargo.toml
    │   └── src/
    ├── simd-r-drive-ws-client/
    │   ├── Cargo.toml
    │   └── src/
    └── bindings/
        ├── python/                 # Excluded from workspace
        │   ├── pyproject.toml
        │   └── src/
        └── python-ws-client/       # Excluded from workspace
            ├── pyproject.toml
            └── src/
Sources: Cargo.toml:65-78
Summary Table: All Packages
| Package Name | Location | Type | Dependencies | Purpose |
|---|---|---|---|---|
| simd-r-drive | . | Core | simd-r-drive-entry-handle, dashmap, memmap2, xxhash-rust | Main storage engine |
| simd-r-drive-entry-handle | simd-r-drive-entry-handle/ | Library | memmap2, crc32fast, arrow (opt) | Entry abstraction |
| simd-r-drive-extensions | extensions/ | Library | simd-r-drive, bincode | Utility functions |
| simd-r-drive-muxio-service-definition | experiments/... | Library | bitcode, muxio-rpc-service | RPC contract |
| simd-r-drive-ws-server | experiments/... | Binary | Core + service-def + muxio-server | WebSocket server |
| simd-r-drive-ws-client | experiments/... | Library | Core + service-def + muxio-client | WebSocket client |
| Python bindings | experiments/bindings/python | PyO3 | simd-r-drive, pyo3 | Direct Python access |
| Python WS client | experiments/bindings/python-ws-client | PyO3 | simd-r-drive-ws-client, pyo3 | Remote Python access |
Sources: Cargo.toml:1-112 Cargo.lock:1795-1878