This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Python Integration

This page provides an overview of the Python bindings for SIMD R Drive. The system offers two approaches for Python integration:

  1. Modern WebSocket Client (simd-r-drive-ws-client-py): Communicates with a remote simd-r-drive-ws-server via WebSocket RPC. This is the primary, recommended approach documented in this section.
  2. Legacy Direct Bindings (simd-r-drive-py): Directly embeds the Rust storage engine into Python. This approach is deprecated and not covered in detail here.

The WebSocket client bindings are implemented in Rust using PyO3, compiled to native Python extension modules (.so on Linux/macOS, .pyd on Windows), and packaged as platform-specific wheels with Maturin. The package is published as simd-r-drive-ws-client on PyPI.

Sources: experiments/bindings/python-ws-client/README.md:1-60 experiments/bindings/python-ws-client/pyproject.toml:1-6

Architecture Overview

The WebSocket client bindings use a layered architecture that bridges Python user code to the native Rust WebSocket client implementation. The package consists of pure Python wrapper code, PyO3-compiled Rust bindings, and the underlying simd-r-drive-ws-client Rust crate.

Diagram: Python Binding Architecture with Code Entities

graph TB
    subgraph "Python_Layer"
        UserCode["user_script.py"]
        Import["from simd_r_drive_ws_client import DataStoreWsClient"]
    end

    subgraph "Package_simd_r_drive_ws_client"
        InitPy["__init__.py"]
        DataStoreWsClientPy["data_store_ws_client.py::DataStoreWsClient"]
        TypeStubs["data_store_ws_client.pyi"]
    end

    subgraph "PyO3_Native_Extension"
        BinaryModule["simd_r_drive_ws_client.so / .pyd"]
        BaseDataStoreWsClient["BaseDataStoreWsClient"]
        NamespaceHasher["NamespaceHasher"]
    end

    subgraph "Rust_Dependencies"
        WsClient["simd-r-drive-ws-client crate"]
        MuxioClient["muxio-tokio-rpc-client"]
        ServiceDef["simd-r-drive-muxio-service-definition"]
    end

    UserCode --> Import
    Import --> InitPy
    InitPy --> DataStoreWsClientPy
    DataStoreWsClientPy -.inherits.-> BaseDataStoreWsClient
    DataStoreWsClientPy -.types.-> TypeStubs

    BaseDataStoreWsClient --> WsClient
    NamespaceHasher --> WsClient
    BinaryModule --> BaseDataStoreWsClient
    BinaryModule --> NamespaceHasher

    WsClient --> MuxioClient
    WsClient --> ServiceDef

Architecture Layers

| Layer | Components | Technology | Location |
|---|---|---|---|
| Python User Code | Application scripts | Pure Python | User-provided |
| Python Package | DataStoreWsClient, __init__.py | Pure Python | experiments/bindings/python-ws-client/simd_r_drive_ws_client/ |
| PyO3 Bindings | BaseDataStoreWsClient, NamespaceHasher | Rust → compiled .so/.pyd | experiments/bindings/python-ws-client/src/lib.rs |
| Rust Implementation | simd-r-drive-ws-client, muxio-* | Native Rust crates | experiments/ws-client/ |

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-14 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:1-63 experiments/bindings/python-ws-client/README.md:12-15

Python API Surface

The simd-r-drive-ws-client package exposes two primary classes:

  1. DataStoreWsClient - Main client for read/write operations
  2. NamespaceHasher - Utility for generating collision-resistant namespaced keys

The API is implemented through a combination of Rust PyO3 bindings (BaseDataStoreWsClient) and Python wrapper code that adds convenience methods.

Diagram: Class Hierarchy and Method Implementation

graph TB
    subgraph "Python_Wrapper"
        DSWsClient["data_store_ws_client.py::DataStoreWsClient"]
        NSHasher["NamespaceHasher"]
    end

    subgraph "PyO3_Bindings"
        BaseClient["BaseDataStoreWsClient"]
        NSHasherImpl["NamespaceHasher_impl"]
    end

    subgraph "Method_Sources"
        RustMethods["write()\nbatch_write()\ndelete()\nread()\nbatch_read()\nexists()\n__len__()\n__contains__()\nis_empty()\nfile_size()"]
        PythonMethods["batch_read_structured()"]
    end

    DSWsClient -->|inherits| BaseClient
    NSHasher -->|exposed via PyO3| NSHasherImpl

    BaseClient --> RustMethods
    DSWsClient --> PythonMethods

    RustMethods -.implemented in.-> WsClientCrate["simd-r-drive-ws-client"]

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:11-63 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:8-219

Core Operations

DataStoreWsClient provides operations organized by implementation layer:

| Operation Type | Methods | Implementation | File Reference |
|---|---|---|---|
| Write Operations | write(), batch_write(), delete() | Rust (BaseDataStoreWsClient) | data_store_ws_client.pyi:27-141 |
| Read Operations | read(), batch_read(), exists() | Rust (BaseDataStoreWsClient) | data_store_ws_client.pyi:53-107 |
| Metadata Operations | __len__(), __contains__(), is_empty(), file_size() | Rust (BaseDataStoreWsClient) | data_store_ws_client.pyi:143-168 |
| Structured Reads | batch_read_structured() | Python wrapper | data_store_ws_client.py:12-62 |

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:27-168 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:11-63

Python-Rust Method Mapping

Diagram: Method Call Flow from Python to Rust

The batch_read_structured() method demonstrates the hybrid approach:

| Step | Layer | Action |
|---|---|---|
| 1. Decompile | Python | Extract flat list of keys from nested dict/list structure |
| 2. Batch read | Rust | Call fast batch_read() via PyO3 |
| 3. Rebuild | Python | Reconstruct original structure with fetched values |

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:12-62 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:109-129
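
The three steps in the table can be sketched in pure Python. This is a minimal illustration rather than the package's implementation: batch_read_structured_sketch, the batch_read callable, and the in-memory store are hypothetical stand-ins. It assumes that the leaves of the nested structure are store keys and that batch_read returns values in the same order as the keys it receives (None for a missing key).

```python
from typing import Any, Callable, List, Optional

# Assumed shape of a batched read: keys in, values (or None) out, same order.
BatchRead = Callable[[List[bytes]], List[Optional[bytes]]]


def batch_read_structured_sketch(data: Any, batch_read: BatchRead) -> Any:
    """Replace every leaf (a store key) with the value fetched for it."""
    flat_keys: List[bytes] = []

    # Step 1: walk the structure and collect leaf keys in traversal order.
    def collect(node: Any) -> None:
        if isinstance(node, dict):
            for value in node.values():
                collect(value)
        elif isinstance(node, list):
            for item in node:
                collect(item)
        else:
            flat_keys.append(node)

    collect(data)

    # Step 2: a single batched lookup for all collected keys.
    values = iter(batch_read(flat_keys))

    # Step 3: rebuild the same shape, consuming values in the same order.
    def rebuild(node: Any) -> Any:
        if isinstance(node, dict):
            return {name: rebuild(value) for name, value in node.items()}
        if isinstance(node, list):
            return [rebuild(item) for item in node]
        return next(values)

    return rebuild(data)


# In-memory stand-in for the real client, for illustration only.
store = {b"user:1:name": b"Ada", b"user:1:email": b"ada@example.com"}
fake_batch_read = lambda keys: [store.get(k) for k in keys]

result = batch_read_structured_sketch(
    {"name": b"user:1:name", "email": b"user:1:email", "missing": b"nope"},
    fake_batch_read,
)
```

Because both traversals visit the structure in the same order, one flat list of values is enough to rebuild any nesting depth, and the store is queried only once per call.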

PyO3 Binding Architecture

PyO3 provides the FFI layer that exposes Rust structs and methods as Python classes. The binding layer uses PyO3 procedural macros (#[pyclass], #[pymethods]) to generate Python-compatible wrappers around Rust types.

Diagram: PyO3 Macro Transformation Pipeline

graph TB
    subgraph "Rust_Source_Code"
        StructDef["#[pyclass]\nstruct BaseDataStoreWsClient"]
        MethodsDef["#[pymethods]\nimpl BaseDataStoreWsClient"]
        NSStruct["#[pyclass]\nstruct NamespaceHasher"]
        NSMethods["#[pymethods]\nimpl NamespaceHasher"]
    end

    subgraph "PyO3_Macro_Expansion"
        PyClassTrait["PyClass trait\nPyTypeInfo\nPyObjectProtocol"]
        PyMethodsWrap["Method wrappers\nPyArg extraction\nResult conversion"]
    end

    subgraph "Python_Extension_Module"
        PythonClass["BaseDataStoreWsClient\nwrite()\nread()\nbatch_write()"]
        PythonNS["NamespaceHasher\n__init__()\nnamespace()"]
    end

    StructDef --> PyClassTrait
    MethodsDef --> PyMethodsWrap
    NSStruct --> PyClassTrait
    NSMethods --> PyMethodsWrap

    PyClassTrait --> PythonClass
    PyMethodsWrap --> PythonClass
    PyClassTrait --> PythonNS
    PyMethodsWrap --> PythonNS

PyO3 Macro Functions

| Macro | Purpose | Generated Code |
|---|---|---|
| #[pyclass] | Mark Rust struct as Python class | Implements PyTypeInfo, PyClass, reference counting |
| #[pymethods] | Expose Rust methods to Python | Generates wrapper functions with argument extraction and error handling |
| #[pyfunction] | Expose standalone Rust functions | Module-level function bindings |

Sources: experiments/bindings/python-ws-client/Cargo.lock:832-846 experiments/bindings/python-ws-client/Cargo.lock:1096-1108

Async Runtime Bridge

The Python bindings use pyo3-async-runtimes to bridge Python’s async/await model with Rust’s Tokio runtime. This enables Python code to call async Rust methods transparently.

Diagram: Python-Tokio Async Bridge

graph TB
    subgraph "Python_Async_Layer"
        PyAsyncCall["await client.write(key, data)"]
        PyEventLoop["asyncio event loop"]
    end

    subgraph "pyo3_async_runtimes_Bridge"
        Bridge["pyo3_async_runtimes::tokio"]
        FutureConv["Future<Output=T> → PyObject"]
        LocalSet["LocalSet spawning"]
    end

    subgraph "Tokio_Runtime"
        TokioFuture["async fn write() → Future"]
        TokioExecutor["Tokio thread pool"]
    end

    PyAsyncCall --> PyEventLoop
    PyEventLoop --> Bridge
    Bridge --> FutureConv
    FutureConv --> LocalSet
    LocalSet --> TokioFuture
    TokioFuture --> TokioExecutor

Runtime Bridge Components

| Component | Cargo.lock Reference | Function |
|---|---|---|
| pyo3-async-runtimes | Cargo.lock:849-860 | Async bridge between Python and Tokio |
| tokio | Cargo.lock:1287-1308 | Rust async runtime |
| PyO3 | Cargo.lock:832-846 | FFI layer for Python-Rust interop |

The bridge automatically converts Rust Future<Output=T> values to Python awaitables, handling the differences in execution models between Python’s single-threaded async and Tokio’s work-stealing scheduler.

Sources: experiments/bindings/python-ws-client/Cargo.lock:849-860 experiments/bindings/python-ws-client/Cargo.lock:1287-1308
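
The hand-off between a runtime thread and the asyncio event loop can be illustrated in pure Python. This is an analogy only, not pyo3-async-runtimes code: a plain thread stands in for a Tokio worker, fake_rust_write is a hypothetical name, and the sketch assumes the awaitable call style that this section describes.

```python
import asyncio
import threading


def fake_rust_write(key: bytes, data: bytes) -> "asyncio.Future[int]":
    """Start work on another thread and return an awaitable for its result."""
    loop = asyncio.get_running_loop()
    future: "asyncio.Future[int]" = loop.create_future()

    def worker() -> None:
        written = len(key) + len(data)  # stands in for the native write
        # asyncio futures are not thread-safe, so the result is handed
        # back to the loop thread instead of being set directly here.
        loop.call_soon_threadsafe(future.set_result, written)

    threading.Thread(target=worker, daemon=True).start()
    return future


async def main() -> int:
    # The caller just awaits; it never sees the worker thread.
    return await fake_rust_write(b"key", b"payload")


bytes_written = asyncio.run(main())
```

The point of the sketch is the direction of the hand-off: work runs off the event loop thread, and completion is always delivered back on the loop thread so the awaiting coroutine can resume safely.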

Build and Distribution System

The Python package is built using Maturin, which compiles Rust code to native extensions and packages them as platform-specific wheels. The build process produces binary wheels containing the compiled .so (Linux/macOS) or .pyd (Windows) extension module.

Diagram: Maturin Build and Distribution Pipeline

graph LR
    subgraph "Configuration"
        PyProject["pyproject.toml\n[build-system]\nbuild-backend = maturin"]
        CargoToml["Cargo.toml\n[lib]\ncrate-type = ['cdylib']"]
    end

    subgraph "Build_Process"
        RustcCompile["rustc\n--crate-type=cdylib\nPyO3 linking"]
        CreateExtension["simd_r_drive_ws_client.so\nor .pyd"]
        PackageWheel["maturin build\nAdd Python files\nAdd metadata"]
    end

    subgraph "Artifacts"
        Wheel["simd_r_drive_ws_client-0.11.1-cp310-linux_x86_64.whl"]
        PyPI["PyPI\npip install simd-r-drive-ws-client"]
    end

    PyProject --> RustcCompile
    CargoToml --> RustcCompile
    RustcCompile --> CreateExtension
    CreateExtension --> PackageWheel
    PackageWheel --> Wheel
    Wheel --> PyPI

Sources: experiments/bindings/python-ws-client/pyproject.toml:29-35 experiments/bindings/python-ws-client/README.md:25-38

Build Configuration

pyproject.toml configures the build system and package metadata:

| Section | Lines | Configuration |
|---|---|---|
| [project] | pyproject.toml:1-27 | Package name, version, description, PyPI classifiers |
| [build-system] | pyproject.toml:29-31 | requires = ["maturin>=1.5"], build-backend = "maturin" |
| [tool.maturin] | pyproject.toml:33-35 | bindings = "pyo3", requires-python = ">=3.10" |
| [dependency-groups] | pyproject.toml:37-46 | Development dependencies: maturin, pytest, mypy, numpy |

Build Commands

Sources: experiments/bindings/python-ws-client/pyproject.toml:1-47 experiments/bindings/python-ws-client/README.md:31-36

Platform and Python Version Support

The package is distributed as pre-compiled wheels for multiple Python versions and platforms.

Supported Configurations

| Component | Supported Versions/Platforms |
|---|---|
| Python | 3.10, 3.11, 3.12, 3.13 (CPython only) |
| Operating Systems | Linux (x86_64, aarch64), macOS (x86_64, arm64), Windows (x86_64) |
| Architectures | 64-bit only |

Wheel Naming Convention

simd_r_drive_ws_client-{version}-{python_tag}-{abi_tag}-{platform_tag}.whl

Examples:
- simd_r_drive_ws_client-0.11.1-cp310-cp310-manylinux_2_17_x86_64.whl
- simd_r_drive_ws_client-0.11.1-cp312-cp312-macosx_11_0_arm64.whl
- simd_r_drive_ws_client-0.11.1-cp313-cp313-win_amd64.whl

Sources: experiments/bindings/python-ws-client/pyproject.toml:19-27 experiments/bindings/python-ws-client/README.md:18-23
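
The example filenames above follow the standard wheel layout, in which the version is followed by a Python tag, an ABI tag, and a platform tag. A small parser makes the fields explicit. parse_wheel_filename and WheelTags are hypothetical helpers written for this page, and the sketch assumes filenames without the optional build tag.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename that has no optional build tag."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    # Hyphens only separate fields; underscores are used inside each field.
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected number of fields: {filename}")
    return WheelTags(*parts)


tags = parse_wheel_filename(
    "simd_r_drive_ws_client-0.11.1-cp312-cp312-macosx_11_0_arm64.whl"
)
```

Installers compare these tags against the running interpreter and platform to decide which wheel to download, which is why one release ships many files.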

Dependency Management

The package uses uv for Python dependency management and cargo for Rust dependencies. It has no runtime Python dependencies, because all Rust dependencies are statically compiled into the extension module shipped in the wheel.

Dependency Categories

| Category | Tools | Lock File | Purpose |
|---|---|---|---|
| Python Development | uv pip, pytest, mypy | uv.lock:1-299 | Testing, type checking, benchmarking |
| Rust Dependencies | cargo | Cargo.lock:1-1380 | Core functionality, WebSocket RPC, serialization |
| Build Tools | maturin | Both lock files | Compiles Rust → Python extension |
Key Development Dependencies

Key Rust Dependencies

| Crate | Version Reference | Purpose |
|---|---|---|
| pyo3 | Cargo.lock:832-846 | Python FFI |
| pyo3-async-runtimes | Cargo.lock:849-860 | Async bridge |
| tokio | Cargo.lock:1287-1308 | Async runtime |
| simd-r-drive-ws-client | (workspace) | WebSocket RPC client |

Sources: experiments/bindings/python-ws-client/pyproject.toml:37-46 experiments/bindings/python-ws-client/uv.lock:1-299 experiments/bindings/python-ws-client/Cargo.lock:1-1380

Type Stubs and IDE Support

The package includes .pyi type stub files that provide complete type information for IDEs and static type checkers like mypy.

Type Stub File: data_store_ws_client.pyi

Type Stub Features

| Feature | Description | Example |
|---|---|---|
| Full signatures | Complete method signatures with types | def write(self, key: bytes, data: bytes) -> None |
| Docstrings | Comprehensive documentation | data_store_ws_client.pyi:27-94 |
| Generic types | Support for complex types | Union[Dict[Any, bytes], List[Dict[Any, bytes]]] |
| Final classes | Prevent subclassing | @final class DataStoreWsClient |

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219
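
The "Final classes" row is worth a short illustration, because @final is enforced by type checkers and not by the interpreter. The class below is a hypothetical stand-in and not the real DataStoreWsClient. Its write signature follows the table above, while the read signature is an assumption made for the sketch.

```python
from typing import Dict, Optional, final


@final
class SketchClient:
    """Stand-in for a stub-declared class, backed by a plain dict."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def write(self, key: bytes, data: bytes) -> None:
        self._data[key] = data

    def read(self, key: bytes) -> Optional[bytes]:  # assumed signature
        return self._data.get(key)


# A type checker such as mypy reports an error for this subclass,
# but Python itself does not raise at class-creation time.
class Subclass(SketchClient):  # type: ignore[misc]
    pass


client = Subclass()
client.write(b"k", b"v")
value = client.read(b"k")
```

In practice this means the stubs communicate intent to IDEs and mypy, while runtime behavior is unchanged.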

NamespaceHasher Utility

NamespaceHasher provides deterministic key namespacing using XXH3 hashing to prevent key collisions across logical domains.

Diagram: Namespace Key Derivation

graph LR
    Input1["Namespace prefix\ne.g., b'users'"]
    Input2["Key\ne.g., b'user123'"]
    Hash1["XXH3 hash\n8 bytes"]
    Hash2["XXH3 hash\n8 bytes"]
    Output["Namespaced key\n16 bytes total"]

    Input1 -->|hash once at init| Hash1
    Input2 -->|hash per call| Hash2
    Hash1 --> Output
    Hash2 --> Output

graph LR
    subgraph "Input"
        Prefix["prefix: bytes\ne.g. b'users'"]
        Key["key: bytes\ne.g. b'user123'"]
    end

    subgraph "Hashing"
        XXH3_Prefix["XXH3(prefix)"]
        XXH3_Key["XXH3(key)"]
    end

    subgraph "Output"
        PrefixHash["8 bytes\nprefix_hash"]
        KeyHash["8 bytes\nkey_hash"]
        Combined["16 bytes total\nprefix_hash // key_hash"]
    end

    Prefix --> XXH3_Prefix
    Key --> XXH3_Key
    XXH3_Prefix --> PrefixHash
    XXH3_Key --> KeyHash
    PrefixHash --> Combined
    KeyHash --> Combined

Usage Example

Key Properties

| Property | Value | Description |
|---|---|---|
| Output length | 16 bytes | Fixed-size namespaced key |
| Hash function | XXH3 | Fast, high-quality 64-bit hash |
| Collision resistance | High | XXH3 provides strong distribution |
| Deterministic | Yes | Same input always produces same output |

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:170-219 src/storage_engine/key_indexer.rs:64-72
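
The derivation described in this section (hash the prefix once, hash each key per call, then join the two 8-byte digests) can be sketched with the standard library. XXH3 is not available in the standard library, so this sketch swaps in BLAKE2b truncated to 8 bytes. NamespaceHasherSketch is a hypothetical class whose output bytes will not match those of the real NamespaceHasher.

```python
import hashlib


def _hash8(data: bytes) -> bytes:
    # Stand-in for XXH3-64: BLAKE2b with an 8-byte digest (stdlib only).
    return hashlib.blake2b(data, digest_size=8).digest()


class NamespaceHasherSketch:
    """Illustrative only; does not reproduce the real hasher's bytes."""

    def __init__(self, prefix: bytes) -> None:
        # The prefix is hashed once, when the hasher is constructed.
        self._prefix_hash = _hash8(prefix)

    def namespace(self, key: bytes) -> bytes:
        # The key is hashed on every call; 8 + 8 bytes = 16-byte result.
        return self._prefix_hash + _hash8(key)


users = NamespaceHasherSketch(b"users")
orders = NamespaceHasherSketch(b"orders")
users_key = users.namespace(b"user123")
orders_key = orders.namespace(b"user123")
```

The same key under two prefixes shares its last 8 bytes but differs in the first 8, which is what keeps logical domains apart in a single store.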

Integration Test Infrastructure

The Python bindings include comprehensive integration tests that validate the entire stack, from Python client code to the WebSocket server and storage engine.

Diagram: Integration Test Workflow

graph TB
    subgraph "integration_test.sh"
        Setup["cd experiments/\nBuild if needed"]
        StartServer["cargo run --package\nsimd-r-drive-ws-server\n/tmp/simd-r-drive-pytest-storage.bin\n--host 127.0.0.1 --port 34129"]
        SetupPython["uv venv\nuv pip install pytest maturin\nuv pip install -e . --group dev"]
        ExtractTests["python extract_readme_tests.py"]
        RunPytest["pytest -v -s\nTEST_SERVER_HOST=127.0.0.1\nTEST_SERVER_PORT=34129"]
        Cleanup["kill -9 $SERVER_PID\nrm /tmp/simd-r-drive-pytest-storage.bin"]
    end

    Setup --> StartServer
    StartServer --> SetupPython
    SetupPython --> ExtractTests
    ExtractTests --> RunPytest
    RunPytest --> Cleanup

Sources: experiments/bindings/python-ws-client/integration_test.sh:1-91

Test Components

The test infrastructure consists of multiple components working together:

| Component | File | Purpose |
|---|---|---|
| Integration script | integration_test.sh:1-91 | Orchestrates full-stack test execution |
| README test extractor | extract_readme_tests.py:1-46 | Converts README code blocks to pytest functions |
| Generated tests | tests/test_readme_blocks.py | Executable tests from README examples |
| Manual tests | tests/test_*.py | Hand-written unit and integration tests |

Sources: experiments/bindings/python-ws-client/integration_test.sh:1-91 experiments/bindings/python-ws-client/extract_readme_tests.py:1-46
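
The workflow passes the server location to pytest through the TEST_SERVER_HOST and TEST_SERVER_PORT environment variables. A test module might read them with a helper like the one below. server_address is a hypothetical function, and falling back to the host and port used by the script is an assumption, not documented behavior.

```python
import os
from typing import Mapping, Tuple


def server_address(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Read the test server location from environment-style variables."""
    # Defaults mirror the values shown in the workflow diagram (assumed).
    host = env.get("TEST_SERVER_HOST", "127.0.0.1")
    port = int(env.get("TEST_SERVER_PORT", "34129"))
    return host, port


# Passing a plain dict keeps the example independent of the real environment.
host, port = server_address(
    {"TEST_SERVER_HOST": "localhost", "TEST_SERVER_PORT": "9000"}
)
```

Taking the mapping as a parameter keeps the helper easy to test without mutating os.environ.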

README Test Extraction

extract_readme_tests.py automatically extracts Python code blocks from the README and generates pytest test functions, ensuring documentation examples remain accurate.

Diagram: README to Pytest Pipeline

graph LR
    subgraph "Input_File"
        README["README.md"]
        CodeBlocks["```python\ncode\n```"]
    end

    subgraph "Extraction_Logic"
        Regex["re.compile(r'```python\\n(.*?)```', re.DOTALL)"]
        Extract["pattern.findall(text)"]
        Wrap["def test_readme_block_{i}():\n {indented_code}"]
    end

    subgraph "Output_File"
        TestFile["tests/test_readme_blocks.py"]
        TestFunctions["test_readme_block_0()\ntest_readme_block_1()\ntest_readme_block_N()"]
    end

    README --> CodeBlocks
    CodeBlocks --> Regex
    Regex --> Extract
    Extract --> Wrap
    Wrap --> TestFile
    TestFile --> TestFunctions

Extraction Process

| Step | Function | Action |
|---|---|---|
| 1. Read | README.read_text() | Load README.md as string |
| 2. Extract | re.findall(r'```python\n(.*?)```', text, re.DOTALL) | Find all Python code blocks |
| 3. Wrap | wrap_as_test_fn(code, idx) | Convert each block to test_readme_block_N() |
| 4. Write | TEST_FILE.write_text() | Write tests/test_readme_blocks.py |

This ensures documentation examples are automatically validated on every test run, preventing drift between documentation and implementation.

Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:14-45


Package Exports and Public API

The package’s public API is defined through the __init__.py file, which controls what symbols are available when users import the package.

Export Structure

graph TB
    subgraph "Internal Modules"
        RustBinary["simd_r_drive_ws_client_py.so/.pyd\nBinary compiled module"]
        RustSymbols["BaseDataStoreWsClient\nNamespaceHasher\nsetup_logging\ntest_rust_logging"]
        PythonWrapper["data_store_ws_client.py\nDataStoreWsClient"]
    end

    subgraph "Package __init__.py"
        ImportRust["from .simd_r_drive_ws_client import\n setup_logging, test_rust_logging"]
        ImportPython["from .data_store_ws_client import\n DataStoreWsClient, NamespaceHasher"]
        AllList["__all__ = [\n 'DataStoreWsClient',\n 'NamespaceHasher',\n 'setup_logging',\n 'test_rust_logging'\n]"]
    end

    subgraph "Public API"
        UserCode["from simd_r_drive_ws_client import DataStoreWsClient"]
    end

    RustBinary --> RustSymbols
    RustSymbols --> ImportRust
    PythonWrapper --> ImportPython

    ImportRust --> AllList
    ImportPython --> AllList
    AllList --> UserCode

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-14

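The __all__ list explicitly defines the public API surface. It determines which names a wildcard import (from simd_r_drive_ws_client import *) exposes and signals to users and tooling which symbols are supported, although other names remain reachable through explicit imports. This follows Python best practices for package design.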
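
The effect of __all__ can be demonstrated with a throwaway module built in memory. Everything in this sketch is hypothetical (demo_pkg, PublicClient, _InternalHelper, undocumented) and none of it is part of simd_r_drive_ws_client. It only shows standard Python import behavior.

```python
import sys
import types

# Build a small module in memory and register it so it can be imported.
demo = types.ModuleType("demo_pkg")
exec(
    "__all__ = ['PublicClient']\n"
    "class PublicClient: pass\n"
    "class _InternalHelper: pass\n"
    "def undocumented(): return 42\n",
    demo.__dict__,
)
sys.modules["demo_pkg"] = demo

# A wildcard import copies only the names listed in __all__.
star_namespace: dict = {}
exec("from demo_pkg import *", star_namespace)
exported = sorted(n for n in star_namespace if not n.startswith("__"))

# __all__ does not block explicit imports of other names.
from demo_pkg import undocumented

answer = undocumented()
```

Here exported contains only PublicClient, while undocumented is still importable by name, so __all__ is a declaration of intent and not an access control mechanism.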
