This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Repository Structure

Purpose and Scope

This document describes the organization of the SIMD R Drive repository as a Cargo workspace, detailing the individual packages (crates) that comprise the system, their purposes, and their inter-dependencies. The repository is structured as a monorepo containing a core storage engine, supporting libraries, experimental network components, and language bindings.

For information about the core storage engine architecture and on-disk format, see Storage Architecture. For details on building and testing the codebase, see Building and Testing.


Workspace Organization

The SIMD R Drive repository is organized as a Cargo workspace defined in Cargo.toml:65-78. The workspace uses Cargo’s resolver version 2 and manages multiple interdependent packages with shared versioning and dependencies.

Sources: Cargo.toml:65-78


Workspace Configuration

The workspace defines common package metadata that all member crates inherit:

| Metadata Field | Value |
|---|---|
| Version | 0.15.5-alpha |
| Edition | 2024 |
| Repository | https://github.com/jzombie/rust-simd-r-drive |
| License | Apache-2.0 |
| Categories | database-implementations, data-structures, filesystem |
| Keywords | storage-engine, binary-storage, append-only, simd, mmap |

Sources: Cargo.toml:1-9


Package Structure Overview

Workspace Members

The workspace includes six member crates defined in Cargo.toml:66-73:

  1. "." - The root simd-r-drive package
  2. "simd-r-drive-entry-handle" - Entry abstraction library
  3. "extensions" - Utility extensions
  4. "experiments/simd-r-drive-ws-server" - WebSocket server
  5. "experiments/simd-r-drive-ws-client" - WebSocket client
  6. "experiments/simd-r-drive-muxio-service-definition" - RPC service contract

Excluded Members

Two Python binding packages are excluded from the workspace (Cargo.toml:74-77) because they are built with maturin through their own, separate build configuration:

  • "experiments/bindings/python" - Direct Rust-Python bindings
  • "experiments/bindings/python-ws-client" - Python WebSocket client bindings

Sources: Cargo.toml:65-78


Core Packages

simd-r-drive (Root Package)

Location: Root directory
Cargo Name: simd-r-drive
Description: “SIMD-optimized append-only schema-less storage engine. Key-based binary storage in a single-file storage container.”

This is the main storage engine package providing the DataStore API for append-only key-value storage with SIMD acceleration and memory-mapped file access.

Key Exports:

  • DataStore - Main storage interface
  • DataStoreReader / DataStoreWriter - Trait-based access patterns
  • KeyIndexer - Hash-based key indexing with xxh3_64

Dependencies:

  • simd-r-drive-entry-handle (workspace)
  • dashmap - Concurrent hash map (sharded internally)
  • memmap2 - Memory-mapped file access
  • xxhash-rust - Fast hashing with SIMD support
  • rayon (optional, with parallel feature)

Features:

  • default - No features enabled by default
  • expose-internal-api - Exposes internal APIs for testing/extensions
  • parallel - Enables parallel iteration with rayon
  • arrow - Proxies to simd-r-drive-entry-handle/arrow

Sources: Cargo.toml:11-56
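To make the idea of hash-based key indexing more concrete, here is a minimal, dependency-free sketch. It is not the crate's KeyIndexer: the names `hash_key` and `IndexSketch` are hypothetical, and the standard library's `DefaultHasher` stands in for xxh3_64 purely so the example compiles without external crates. It assumes that, in an append-only store, the index only needs to remember the offset of the most recent entry per key hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Stand-in for xxh3_64 (assumption): std's DefaultHasher keeps this
/// sketch free of external dependencies.
pub fn hash_key(key: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(key);
    hasher.finish()
}

/// Hypothetical index: key hash -> offset of the most recent entry.
#[derive(Default)]
pub struct IndexSketch {
    offsets: HashMap<u64, u64>,
}

impl IndexSketch {
    /// Records a newly appended entry and returns the offset it supersedes,
    /// if the key was written before.
    pub fn record(&mut self, key: &[u8], offset: u64) -> Option<u64> {
        self.offsets.insert(hash_key(key), offset)
    }

    /// Looks up the offset of the latest entry for `key`.
    pub fn lookup(&self, key: &[u8]) -> Option<u64> {
        self.offsets.get(&hash_key(key)).copied()
    }
}
```

Because only the 64-bit hash is stored, a real index also has to decide how to handle hash collisions; this sketch ignores that concern.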


simd-r-drive-entry-handle

Location: simd-r-drive-entry-handle/
Cargo Name: simd-r-drive-entry-handle

Provides the EntryHandle abstraction for zero-copy access to stored entries via memory-mapped files. This package is separated to allow optional Apache Arrow integration without requiring arrow dependencies in the core package.

Key Exports:

  • EntryHandle - Zero-copy entry accessor
  • EntryMetadata - Entry metadata structure (key_hash, prev_offset, crc32)

Dependencies:

  • memmap2 - Memory-mapped file access
  • crc32fast - CRC32 checksum validation
  • arrow (optional, with arrow feature) - Apache Arrow buffer integration

Features:

  • arrow - Enables zero-copy integration with Apache Arrow buffers

Sources: Cargo.toml:83 Cargo.lock:1823-1829
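As a rough illustration of the metadata triple listed under Key Exports (key_hash, prev_offset, crc32), the sketch below round-trips such a record through a fixed-size byte buffer. The type name `EntryMetadataSketch`, the field widths (8 + 8 + 4 bytes), and the little-endian byte order are assumptions made for this example; the actual on-disk layout is described in Storage Architecture and may differ.

```rust
/// Hypothetical mirror of the metadata fields named above.
/// Field widths and byte order are assumptions, not the crate's layout.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EntryMetadataSketch {
    pub key_hash: u64,
    pub prev_offset: u64,
    pub crc32: u32,
}

impl EntryMetadataSketch {
    /// Serialized size under the assumed widths.
    pub const SIZE: usize = 8 + 8 + 4;

    /// Encodes the fields back to back in little-endian order.
    pub fn to_bytes(&self) -> [u8; Self::SIZE] {
        let mut buf = [0u8; Self::SIZE];
        buf[0..8].copy_from_slice(&self.key_hash.to_le_bytes());
        buf[8..16].copy_from_slice(&self.prev_offset.to_le_bytes());
        buf[16..20].copy_from_slice(&self.crc32.to_le_bytes());
        buf
    }

    /// Decodes a buffer produced by `to_bytes`; rejects any other length.
    pub fn from_bytes(buf: &[u8]) -> Option<Self> {
        if buf.len() != Self::SIZE {
            return None;
        }
        let (mut hash, mut prev, mut crc) = ([0u8; 8], [0u8; 8], [0u8; 4]);
        hash.copy_from_slice(&buf[0..8]);
        prev.copy_from_slice(&buf[8..16]);
        crc.copy_from_slice(&buf[16..20]);
        Some(Self {
            key_hash: u64::from_le_bytes(hash),
            prev_offset: u64::from_le_bytes(prev),
            crc32: u32::from_le_bytes(crc),
        })
    }
}
```

A fixed-size record like this can be read directly out of a memory-mapped region without allocation, which is in the spirit of the zero-copy access the package is described as providing.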


simd-r-drive-extensions

Location: extensions/
Cargo Name: simd-r-drive-extensions

Utility functions and helpers built on top of the core storage engine, including alignment utilities, formatting helpers, and namespace hashing.

Key Exports:

  • align_or_copy - Memory alignment utilities
  • format_bytes - Human-readable byte formatting
  • NamespaceHasher - Namespace-based key hashing
  • File verification utilities

Dependencies:

  • simd-r-drive (workspace)
  • bincode - Serialization support
  • serde - Serialization framework

Sources: Cargo.toml:66-73 Cargo.lock:1832-1841


Experimental Network Components

simd-r-drive-muxio-service-definition

Location: experiments/simd-r-drive-muxio-service-definition/
Cargo Name: simd-r-drive-muxio-service-definition

Defines the RPC service contract (interface definition) for remote access to the storage engine. This serves as the shared contract between WebSocket clients and servers, ensuring type-safe communication.

Key Exports:

  • Service trait definitions for RPC operations
  • Request/response message types
  • Bitcode serialization schemas

Dependencies:

  • bitcode - Compact binary serialization
  • muxio-rpc-service - RPC service framework

Sources: Cargo.toml:66-73 Cargo.lock:1844-1849


simd-r-drive-ws-server

Location: experiments/simd-r-drive-ws-server/
Cargo Name: simd-r-drive-ws-server

WebSocket server implementation providing remote RPC access to a DataStore instance via the muxio framework.

Key Exports:

  • WebSocket server with RPC endpoint
  • Service implementation for simd-r-drive-muxio-service-definition

Dependencies:

  • simd-r-drive (workspace)
  • simd-r-drive-muxio-service-definition (workspace)
  • muxio-tokio-rpc-server - RPC server implementation
  • tokio - Async runtime
  • clap - CLI argument parsing

Sources: Cargo.toml:66-73 Cargo.lock:1866-1878


simd-r-drive-ws-client

Location: experiments/simd-r-drive-ws-client/
Cargo Name: simd-r-drive-ws-client

Rust WebSocket client for connecting to simd-r-drive-ws-server instances. Provides a native Rust client API matching the core DataStore interface but operating over the network.

Key Exports:

  • WsClient - WebSocket client implementation
  • Async methods mirroring DataStore API

Dependencies:

  • simd-r-drive (workspace)
  • simd-r-drive-muxio-service-definition (workspace)
  • muxio-tokio-rpc-client - RPC client implementation
  • tokio - Async runtime

Sources: Cargo.toml:66-73 Cargo.lock:1852-1863


Python Bindings (External Build System)

experiments/bindings/python

Location: experiments/bindings/python/
Build System: Maturin + PyO3

Direct Python bindings to the core simd-r-drive package using PyO3. Provides a Python API for local (in-process) access to the storage engine. This package is excluded from the Cargo workspace because it uses a separate pyproject.toml build configuration with maturin.

Key Exports:

  • Python DataStore class wrapping Rust implementation
  • Type stubs (.pyi files) for IDE support

Sources: Cargo.toml:74-77


experiments/bindings/python-ws-client

Location: experiments/bindings/python-ws-client/
Build System: Maturin + PyO3

Python bindings for the simd-r-drive-ws-client, enabling remote access to storage servers from Python via asyncio. Uses pyo3-async-runtimes to bridge Python’s asyncio with Rust’s tokio.

Key Exports:

  • DataStoreWsClient - Python async WebSocket client
  • Asyncio-compatible API
  • Type stubs for Python type checkers

Sources: Cargo.toml:74-77


Dependency Relationships

Sources: Cargo.toml:23-34 Cargo.toml:80-112 Cargo.lock:1795-1878
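The intra-workspace edges can be read off the Dependencies lists in the package sections above. As a minimal sketch, the following models those edges as an adjacency map and derives one valid build order. The function names `workspace_edges` and `build_order` are hypothetical, external crates are omitted, and the excluded Python bindings are left out because they are not workspace members.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Intra-workspace dependency edges, transcribed from the "Dependencies"
/// lists in this document (crate -> workspace crates it depends on).
pub fn workspace_edges() -> BTreeMap<&'static str, Vec<&'static str>> {
    BTreeMap::from([
        ("simd-r-drive-entry-handle", vec![]),
        ("simd-r-drive", vec!["simd-r-drive-entry-handle"]),
        ("simd-r-drive-extensions", vec!["simd-r-drive"]),
        ("simd-r-drive-muxio-service-definition", vec![]),
        (
            "simd-r-drive-ws-server",
            vec!["simd-r-drive", "simd-r-drive-muxio-service-definition"],
        ),
        (
            "simd-r-drive-ws-client",
            vec!["simd-r-drive", "simd-r-drive-muxio-service-definition"],
        ),
    ])
}

/// Returns the crates in an order where each crate follows all of its
/// dependencies, or None if the graph has a cycle or an unknown dependency.
pub fn build_order(
    edges: &BTreeMap<&'static str, Vec<&'static str>>,
) -> Option<Vec<&'static str>> {
    let mut done: BTreeSet<&'static str> = BTreeSet::new();
    let mut order = Vec::new();
    while order.len() < edges.len() {
        // Collect every unplaced crate whose dependencies are all placed.
        let ready: Vec<&'static str> = edges
            .iter()
            .filter(|(name, deps)| {
                !done.contains(*name) && deps.iter().all(|d| done.contains(d))
            })
            .map(|(name, _)| *name)
            .collect();
        if ready.is_empty() {
            return None;
        }
        for name in ready {
            done.insert(name);
            order.push(name);
        }
    }
    Some(order)
}
```

Under these edges, simd-r-drive-entry-handle and simd-r-drive-muxio-service-definition have no workspace dependencies, the root simd-r-drive crate comes next, and the extensions, server, and client crates build last.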


Workspace Dependency Management

The workspace defines shared dependencies in the [workspace.dependencies] section (Cargo.toml:80-112) to ensure version consistency across all member crates.

Intra-Workspace Dependencies

Key External Dependencies

| Dependency | Version | Purpose |
|---|---|---|
| memmap2 | 0.9.5 | Memory-mapped file access |
| dashmap | 6.1.0 | Concurrent hash map (sharded internally) |
| xxhash-rust | 0.8.15 | Fast non-cryptographic hashing |
| crc32fast | 1.4.2 | CRC32 checksum validation |
| arrow | 57.0.0 | Apache Arrow integration (optional) |
| tokio | 1.45.1 | Async runtime (experimental features only) |
| bitcode | 0.6.6 | Compact binary serialization |
| rayon | 1.10.0 | Parallel iteration (optional) |

Sources: Cargo.toml:80-112


Feature Flags

The root simd-r-drive package defines three feature flags in addition to the empty default set (Cargo.toml:49-55):

default

No features enabled by default. This keeps the core storage engine lightweight with minimal dependencies.

expose-internal-api

Exposes internal APIs that are normally private. Used for extension development and integration testing. Not intended for general use.

parallel

Enables parallel iteration capabilities using the rayon crate. When enabled, operations like iter_entries() can leverage multi-core parallelism for improved throughput on large datasets.

arrow

A proxy feature that enables simd-r-drive-entry-handle/arrow. This provides zero-copy integration with Apache Arrow buffers, allowing EntryHandle instances to be viewed as Arrow arrays without data copying.

Sources: Cargo.toml:49-55
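The sketch below shows the general pattern by which an optional Cargo feature such as parallel can switch between a rayon-backed and a sequential implementation at compile time. The helper `total_payload_len` is hypothetical and is not part of the crate's API; it only illustrates `#[cfg(feature = "...")]` gating. With the feature disabled (the default described above), only the sequential version is compiled and rayon is not required.

```rust
/// Hypothetical helper: total payload size across a batch of entries.
/// Compiled only when the `parallel` feature is enabled; uses rayon.
#[cfg(feature = "parallel")]
pub fn total_payload_len(payloads: &[Vec<u8>]) -> usize {
    use rayon::prelude::*;
    payloads.par_iter().map(|p| p.len()).sum()
}

/// Sequential fallback, compiled when `parallel` is disabled (the default).
#[cfg(not(feature = "parallel"))]
pub fn total_payload_len(payloads: &[Vec<u8>]) -> usize {
    payloads.iter().map(|p| p.len()).sum()
}

/// Reports at runtime which variant was compiled in.
pub fn parallel_enabled() -> bool {
    cfg!(feature = "parallel")
}
```

Callers use the same function name in both configurations, so enabling the feature (for example with `cargo build --features parallel`) changes the implementation without changing call sites.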


Benchmarks

The root package defines two benchmark suites using Criterion.rs (Cargo.toml:57-63):

storage_benchmark

Measures write throughput, read throughput, batch operations, and streaming performance for the core storage engine.

contention_benchmark

Measures performance under concurrent access patterns, testing the effectiveness of the lock-free index and concurrent read scalability.

Both benchmarks use harness = false to integrate with Criterion’s custom benchmark harness.

Sources: Cargo.toml:57-63


Version Management

All workspace members share a common version number, 0.15.5-alpha, managed through Cargo.toml:3. The -alpha suffix indicates pre-release software under active development. The workspace uses semantic versioning, where:

  • Major version (0): Pre-1.0 indicating API instability
  • Minor version (15): Feature releases and API changes
  • Patch version (5): Bug fixes and minor improvements
  • Suffix (-alpha): Pre-release stability indicator

Sources: Cargo.toml:3
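To illustrate how the version string breaks down into the components listed above, here is a small, simplified parser. The `Version` struct and `parse_version` function are hypothetical names for this example; the sketch handles only the MAJOR.MINOR.PATCH[-PRERELEASE] shape and does not attempt full SemVer validation or build metadata.

```rust
/// Components of a `MAJOR.MINOR.PATCH[-PRERELEASE]` version string.
#[derive(Debug, PartialEq, Eq)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
    pub pre_release: Option<String>,
}

/// Parses a version such as "0.15.5-alpha". Returns None on malformed input.
/// Simplified on purpose: no build metadata, no identifier validation.
pub fn parse_version(s: &str) -> Option<Version> {
    // Split off the optional pre-release suffix at the first '-'.
    let (core, pre_release) = match s.find('-') {
        Some(i) => (&s[..i], Some(s[i + 1..].to_string())),
        None => (s, None),
    };
    let mut parts = core.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three numeric components
    }
    Some(Version { major, minor, patch, pre_release })
}
```

For the workspace version, this yields major 0, minor 15, patch 5, and a pre-release tag of "alpha". Under semantic versioning, a version with a pre-release tag has lower precedence than the same version without one.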


File System Layout

The physical repository structure mirrors the logical package organization:

rust-simd-r-drive/
├── Cargo.toml                      # Workspace root
├── Cargo.lock                      # Dependency lock file
├── src/                            # simd-r-drive source
├── benches/                        # Benchmark suites
├── simd-r-drive-entry-handle/      # Entry handle crate
│   ├── Cargo.toml
│   └── src/
├── extensions/                     # Extensions crate
│   ├── Cargo.toml
│   └── src/
└── experiments/
    ├── simd-r-drive-muxio-service-definition/
    │   ├── Cargo.toml
    │   └── src/
    ├── simd-r-drive-ws-server/
    │   ├── Cargo.toml
    │   └── src/
    ├── simd-r-drive-ws-client/
    │   ├── Cargo.toml
    │   └── src/
    └── bindings/
        ├── python/                 # Excluded from workspace
        │   ├── pyproject.toml
        │   └── src/
        └── python-ws-client/       # Excluded from workspace
            ├── pyproject.toml
            └── src/

Sources: Cargo.toml:65-78


Summary Table: All Packages

| Package Name | Location | Type | Dependencies | Purpose |
|---|---|---|---|---|
| simd-r-drive | . | Core | simd-r-drive-entry-handle, dashmap, memmap2, xxhash-rust | Main storage engine |
| simd-r-drive-entry-handle | simd-r-drive-entry-handle/ | Library | memmap2, crc32fast, arrow (opt) | Entry abstraction |
| simd-r-drive-extensions | extensions/ | Library | simd-r-drive, bincode | Utility functions |
| simd-r-drive-muxio-service-definition | experiments/... | Library | bitcode, muxio-rpc-service | RPC contract |
| simd-r-drive-ws-server | experiments/... | Binary | Core + service-def + muxio-server | WebSocket server |
| simd-r-drive-ws-client | experiments/... | Library | Core + service-def + muxio-client | WebSocket client |
| Python bindings | experiments/bindings/python | PyO3 | simd-r-drive, pyo3 | Direct Python access |
| Python WS client | experiments/bindings/python-ws-client | PyO3 | simd-r-drive-ws-client, pyo3 | Remote Python access |

Sources: Cargo.toml:1-112 Cargo.lock:1795-1878
