Python Integration

This page documents the Python bindings for SIMD R Drive, which provide a high-level Python interface to the storage engine via WebSocket RPC. The bindings are implemented in Rust using PyO3 and packaged as Python wheels using Maturin.

For details on the Python WebSocket Client API specifically, see Python WebSocket Client API. For information on building and distributing the Python package, see Building Python Bindings. For testing infrastructure, see Integration Testing.


Architecture Overview

The Python integration uses a multi-layer architecture that bridges Python code to the native Rust WebSocket client. The bindings are not pure Python—they are Rust code compiled to native extensions that Python can import.

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-14 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:1-63

graph TB
    subgraph "Python User Code"
        UserScript["User Application\n*.py files"]
Imports["from simd_r_drive_ws_client import DataStoreWsClient"]
end
    
    subgraph "Python Package Layer"
        InitPy["__init__.py\nPackage Exports"]
DataStoreWsClientPy["data_store_ws_client.py\nDataStoreWsClient class"]
TypeStubs["data_store_ws_client.pyi\nType Annotations"]
end
    
    subgraph "PyO3 Binding Layer"
        RustModule["simd_r_drive_ws_client_py\nRust Binary Module\n.so / .pyd"]
BaseClass["BaseDataStoreWsClient\n#[pyclass]"]
NamespaceHasherClass["NamespaceHasher\n#[pyclass]"]
end
    
    subgraph "Native Rust Implementation"
        WsClient["simd-r-drive-ws-client\nNative WebSocket Client"]
MuxioRPC["muxio-tokio-rpc-client\nRPC Client Runtime"]
end
    
 
   UserScript --> Imports
 
   Imports --> InitPy
 
   InitPy --> DataStoreWsClientPy
 
   DataStoreWsClientPy --> BaseClass
    DataStoreWsClientPy -.type hints.-> TypeStubs
    
 
   BaseClass --> WsClient
 
   NamespaceHasherClass --> WsClient
    
 
   RustModule --> BaseClass
 
   RustModule --> NamespaceHasherClass
    
 
   WsClient --> MuxioRPC

The architecture consists of four distinct layers:

| Layer | Technology | Purpose |
|---|---|---|
| Python User Code | Pure Python | Application-level logic using the client |
| Python Package | Pure Python wrapper | Convenience methods and type annotations |
| PyO3 Bindings | Rust compiled to native extension | FFI bridge exposing Rust functionality |
| Native Implementation | Rust (simd-r-drive-ws-client) | WebSocket RPC client implementation |

Python API Surface

The Python package exposes two primary classes that users interact with: DataStoreWsClient and NamespaceHasher. The API is defined through a combination of Rust PyO3 bindings and Python wrapper code.

graph TB
    subgraph "Python Space"
        DSWsClient["DataStoreWsClient\nexperiments/.../data_store_ws_client.py"]
NSHasher["NamespaceHasher\nRust #[pyclass]"]
end
    
    subgraph "Rust PyO3 Bindings"
        BaseClient["BaseDataStoreWsClient\nRust #[pyclass]\nsrc/lib.rs"]
NSHasherRust["NamespaceHasher\nRust Implementation\nsrc/lib.rs"]
end
    
    subgraph "Method Categories"
        WriteOps["write()\nbatch_write()\ndelete()"]
ReadOps["read()\nbatch_read()\nexists()"]
MetaOps["__len__()\n__contains__()\nis_empty()\nfile_size()"]
PyOnlyOps["batch_read_structured()\nPure Python Logic"]
end
    
 
   DSWsClient -->|inherits| BaseClient
    NSHasher -.exposed as.-> NSHasherRust
    
 
   BaseClient --> WriteOps
 
   BaseClient --> ReadOps
 
   BaseClient --> MetaOps
 
   DSWsClient --> PyOnlyOps

Class Hierarchy and Implementation

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:11-63 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:8-219

Core Operations

The DataStoreWsClient provides the following operation categories:

| Operation Type | Methods | Implementation Location |
|---|---|---|
| Write Operations | write(), batch_write(), delete() | Rust (BaseDataStoreWsClient) |
| Read Operations | read(), batch_read(), exists() | Rust (BaseDataStoreWsClient) |
| Metadata Operations | __len__(), __contains__(), is_empty(), file_size() | Rust (BaseDataStoreWsClient) |
| Structured Reads | batch_read_structured() | Python (DataStoreWsClient) |

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:27-168
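As a rough illustration of how these operation categories fit together, the sketch below uses a hypothetical in-memory stand-in (FakeDataStoreClient) instead of the real client, so it runs without a server. The method names follow the table above; the argument shapes (for example, batch_write() taking a list of (key, payload) tuples) and the choice to return None for a missing key are assumptions that should be checked against the type stubs. file_size() is omitted because it has no meaningful in-memory equivalent.

```python
from typing import Dict, List, Optional, Tuple


class FakeDataStoreClient:
    """Hypothetical in-memory stand-in; NOT the real DataStoreWsClient."""

    def __init__(self) -> None:
        self._entries: Dict[bytes, bytes] = {}

    # --- Write operations ---
    def write(self, key: bytes, data: bytes) -> None:
        self._entries[key] = data

    def batch_write(self, items: List[Tuple[bytes, bytes]]) -> None:
        # Assumed shape: a list of (key, payload) pairs.
        for key, data in items:
            self._entries[key] = data

    def delete(self, key: bytes) -> None:
        self._entries.pop(key, None)

    # --- Read operations ---
    def read(self, key: bytes) -> Optional[bytes]:
        # Assumption: a missing key yields None.
        return self._entries.get(key)

    def batch_read(self, keys: List[bytes]) -> List[Optional[bytes]]:
        return [self._entries.get(k) for k in keys]

    def exists(self, key: bytes) -> bool:
        return key in self._entries

    # --- Metadata operations ---
    def __len__(self) -> int:
        return len(self._entries)

    def __contains__(self, key: bytes) -> bool:
        return key in self._entries

    def is_empty(self) -> bool:
        return not self._entries


client = FakeDataStoreClient()
client.write(b"a", b"1")
client.batch_write([(b"b", b"2"), (b"c", b"3")])
client.delete(b"c")
values = client.batch_read([b"a", b"b", b"c"])
```

Because __len__() and __contains__() are part of the listed API, the usual len(client) and key in client idioms apply to the stand-in in the same way they are described for the real class.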

Python-Rust Method Mapping

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:12-62 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:27-129

The batch_read_structured() method is implemented entirely in Python as a convenience wrapper. It decomposes a dictionary or a list of dictionaries into a flat list of keys, calls the fast Rust batch_read() method, and then rebuilds the original structure with the fetched values.
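The flatten, fetch, and rebuild steps described above can be sketched in a few lines. This is not the package's implementation: the function name batch_read_structured_sketch, the injected batch_read callable, and the assumption that the dictionary values are the store keys to look up are all illustrative.

```python
from typing import Any, Callable, Dict, List, Optional, Union

Structure = Union[Dict[Any, bytes], List[Dict[Any, bytes]]]
BatchRead = Callable[[List[bytes]], List[Optional[bytes]]]


def batch_read_structured_sketch(data: Structure, batch_read: BatchRead) -> Any:
    """Return `data` with every store key (dict value) replaced by its payload."""
    dicts = [data] if isinstance(data, dict) else list(data)

    # 1. Flatten: collect every store key in a stable order.
    flat_keys: List[bytes] = [key for d in dicts for key in d.values()]

    # 2. Fetch: one batched lookup for all keys.
    fetched = iter(batch_read(flat_keys))

    # 3. Rebuild: same shape, consuming results in the order the keys were collected.
    rebuilt = [{field: next(fetched) for field in d} for d in dicts]
    return rebuilt[0] if isinstance(data, dict) else rebuilt


# A plain dict plays the role of the remote store (hypothetical data).
store = {b"k1": b"alice", b"k2": b"bob"}


def fake_batch_read(keys: List[bytes]) -> List[Optional[bytes]]:
    return [store.get(k) for k in keys]


single = batch_read_structured_sketch(
    {"first": b"k1", "second": b"k2", "missing": b"k3"}, fake_batch_read
)
many = batch_read_structured_sketch([{"a": b"k1"}, {"b": b"k2"}], fake_batch_read)
```

The point of the design is that the structure handling stays in Python, where it is easy to express, while the network round trip is a single batched call.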


PyO3 Binding Architecture

PyO3 provides the Foreign Function Interface (FFI) that allows Python to call Rust code. The bindings use PyO3 macros to expose Rust structs and methods as Python classes and functions.

graph TB
    subgraph "Rust Source"
        StructDef["#[pyclass]\npub struct BaseDataStoreWsClient"]
MethodsDef["#[pymethods]\nimpl BaseDataStoreWsClient"]
NSStruct["#[pyclass]\npub struct NamespaceHasher"]
NSMethods["#[pymethods]\nimpl NamespaceHasher"]
end
    
    subgraph "PyO3 Macro Expansion"
        PyClassMacro["PyClass trait implementation\nType conversion\nReference counting"]
PyMethodsMacro["Method wrappers\nArgument extraction\nReturn value conversion"]
end
    
    subgraph "Python Module"
        PythonClass["class BaseDataStoreWsClient:\n def write(...)\n def read(...)"]
PythonNS["class NamespaceHasher:\n def __init__(...)\n def namespace(...)"]
end
    
 
   StructDef --> PyClassMacro
 
   MethodsDef --> PyMethodsMacro
 
   NSStruct --> PyClassMacro
 
   NSMethods --> PyMethodsMacro
    
 
   PyClassMacro --> PythonClass
 
   PyMethodsMacro --> PythonClass
 
   PyClassMacro --> PythonNS
 
   PyMethodsMacro --> PythonNS

PyO3 Class Definitions

Sources: experiments/bindings/python-ws-client/Cargo.lock:832-846 experiments/bindings/python-ws-client/Cargo.lock:1096-1108

Async Runtime Bridge

The Python bindings use pyo3-async-runtimes to bridge Python's async/await with Rust's Tokio runtime. This allows Python code to use standard async/await syntax while the underlying operations are handled by Tokio.

Sources: experiments/bindings/python-ws-client/Cargo.lock:849-860

graph TB
    subgraph "Python Async"
        PyAsyncCall["await client.write(key, data)"]
PyEventLoop["asyncio.run()\nor uvloop"]
end
    
    subgraph "pyo3-async-runtimes Bridge"
        Bridge["pyo3_async_runtimes::tokio\nFuture conversion"]
Runtime["LocalSet spawning\nBlock on future"]
end
    
    subgraph "Rust Tokio"
        TokioFuture["async fn write()\nTokio Future"]
TokioRuntime["Tokio Runtime\nThread Pool"]
end
    
 
   PyAsyncCall --> PyEventLoop
 
   PyEventLoop --> Bridge
 
   Bridge --> Runtime
 
   Runtime --> TokioFuture
 
   TokioFuture --> TokioRuntime

The pyo3-async-runtimes crate handles the complexity of converting between Python's async protocol and Rust's Tokio futures, ensuring that async operations work seamlessly across the FFI boundary.


Build and Distribution System

The Python bindings are built using Maturin, which compiles the Rust code and packages it into Python wheels. The build system is configured through pyproject.toml and Cargo.toml.

graph LR
    subgraph "Configuration Files"
        PyProject["pyproject.toml\n[build-system]\nrequires = ['maturin>=1.5']"]
CargoToml["Cargo.toml\n[lib]\ncrate-type = ['cdylib']"]
end
    
    subgraph "Maturin Build Process"
        RustCompile["rustc compilation\n--crate-type=cdylib\nTarget: cpython extension"]
LinkPyO3["Link PyO3 runtime\nPython ABI"]
CreateWheel["Package .so/.pyd\nAdd metadata\nCreate .whl"]
end
    
    subgraph "Distribution Artifacts"
        Wheel["simd_r_drive_ws_client-*.whl\nPlatform-specific binary"]
PyPI["PyPI Registry\npip install simd-r-drive-ws-client"]
end
    
 
   PyProject --> RustCompile
 
   CargoToml --> RustCompile
 
   RustCompile --> LinkPyO3
 
   LinkPyO3 --> CreateWheel
 
   CreateWheel --> Wheel
 
   Wheel --> PyPI

Maturin Build Pipeline

Sources: experiments/bindings/python-ws-client/pyproject.toml:29-35

Build System Configuration

The build system is configured through several key sections in pyproject.toml:

| Configuration | Location | Purpose |
|---|---|---|
| [build-system] | pyproject.toml:29-31 | Specifies Maturin as build backend |
| [tool.maturin] | pyproject.toml:33-35 | Maturin-specific settings (bindings, Python version) |
| [project] | pyproject.toml:1-27 | Package metadata for PyPI |
| [dependency-groups] | pyproject.toml:37-46 | Development dependencies |

Sources: experiments/bindings/python-ws-client/pyproject.toml:1-47

Supported Python Versions and Platforms

The package supports Python 3.10 through 3.13 on multiple platforms.

Sources: experiments/bindings/python-ws-client/pyproject.toml:7-27 experiments/bindings/python-ws-client/README.md:18-23


graph TB
    subgraph "Runtime Dependencies"
        PyO3Runtime["PyO3 Runtime\nEmbedded in .whl"]
RustDeps["Rust Dependencies\nsimd-r-drive-ws-client\ntokio, muxio-*"]
end
    
    subgraph "Development Dependencies"
        Maturin["maturin>=1.8.7\nBuild backend"]
MyPy["mypy>=1.16.1\nType checking"]
Pytest["pytest>=8.4.1\nTesting framework"]
NumPy["numpy>=2.2.6\nTesting utilities"]
Other["puccinialin, pytest-benchmark, pytest-order"]
end
    
    subgraph "Lock Files"
        UvLock["uv.lock\nPython dependency tree"]
CargoLock["Cargo.lock\nRust dependency tree"]
end
    
 
   Maturin --> RustDeps
 
   RustDeps --> CargoLock
    
 
   MyPy --> UvLock
 
   Pytest --> UvLock
 
   NumPy --> UvLock
 
   Other --> UvLock

Dependency Management

The Python bindings use uv for fast dependency resolution and management. Dependencies are split into runtime (minimal) and development dependencies.

Dependency Structure

Sources: experiments/bindings/python-ws-client/pyproject.toml:37-46 experiments/bindings/python-ws-client/Cargo.lock:1-1380 experiments/bindings/python-ws-client/uv.lock:1-299

The installed package has essentially no Python runtime dependencies beyond the interpreter itself, because the Rust dependencies are compiled into the extension module shipped in the wheel. Development dependencies cover testing, type checking, and build tools.


Type Stubs and IDE Support

The package includes comprehensive type stubs (.pyi files) that provide full type information for IDEs and type checkers like MyPy.

graph TB
    subgraph "Type Stub File"
        StubImports["from typing import Optional, Union, Dict, Any, List\nfrom .simd_r_drive_ws_client import BaseDataStoreWsClient"]
ClientStub["@final\nclass DataStoreWsClient(BaseDataStoreWsClient):\n def __init__(self, host: str, port: int) -> None\n def write(self, key: bytes, data: bytes) -> None\n ..."]
NSStub["@final\nclass NamespaceHasher:\n def __init__(self, prefix: bytes) -> None\n def namespace(self, key: bytes) -> bytes"]
end
    
    subgraph "IDE Features"
        Autocomplete["Auto-completion\nMethod signatures"]
TypeCheck["Type checking\nmypy validation"]
Docstrings["Inline documentation\nMethod descriptions"]
end
    
 
   StubImports --> ClientStub
 
   StubImports --> NSStub
    
 
   ClientStub --> Autocomplete
 
   ClientStub --> TypeCheck
 
   ClientStub --> Docstrings
    
 
   NSStub --> Autocomplete
 
   NSStub --> TypeCheck
 
   NSStub --> Docstrings

Type Stub Structure

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219

The type stubs include:

  • Full method signatures with type annotations
  • Comprehensive docstrings explaining each method
  • Generic type support for structured operations
  • @final decorators, which tell type checkers that the classes are not meant to be subclassed (Python does not enforce this at runtime)
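To make the stub conventions listed above concrete, here is a small sketch of an annotated, documented, @final class. HasherStubSketch is a hypothetical name and its method body is placeholder logic only; a real .pyi stub would carry signatures and docstrings with the bodies elided.

```python
from typing import final


@final
class HasherStubSketch:
    """Hypothetical class shaped like the NamespaceHasher stub signatures."""

    def __init__(self, prefix: bytes) -> None:
        self._prefix = prefix

    def namespace(self, key: bytes) -> bytes:
        """Return a namespaced form of `key` (placeholder logic, not real hashing)."""
        return self._prefix + b":" + key


# A type checker such as mypy reports an error for this subclass,
# but the interpreter itself does not block it at runtime.
class Derived(HasherStubSketch):
    pass


result = Derived(b"users").namespace(b"user123")
```

The annotations are what drive auto-completion and type checking in an editor; the decorator is advisory metadata for those tools.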

graph LR
    subgraph "Namespace Creation"
        Prefix["prefix = b'users'"]
PrefixHash["XXH3(prefix)\n→ 8 bytes"]
end
    
    subgraph "Key Hashing"
        Key["key = b'user123'"]
KeyHash["XXH3(key)\n→ 8 bytes"]
end
    
    subgraph "Namespaced Key"
        Combined["prefix_hash // key_hash\n16 bytes total"]
end
    
 
   Prefix --> PrefixHash
 
   Key --> KeyHash
 
   PrefixHash --> Combined
 
   KeyHash --> Combined

NamespaceHasher Utility

The NamespaceHasher class provides deterministic key namespacing using XXH3 hashing. It ensures keys are scoped to specific namespaces, preventing collisions across logical domains.

Namespacing Mechanism

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:170-219

Usage Pattern

The NamespaceHasher is typically used as follows:

| Step | Code | Result |
|---|---|---|
| 1. Create hasher | hasher = NamespaceHasher(b"users") | Hasher scoped to "users" namespace |
| 2. Generate key | key = hasher.namespace(b"user123") | 16-byte namespaced key |
| 3. Store data | client.write(key, data) | Data stored under namespaced key |
| 4. Read data | client.read(key) | Data retrieved from namespaced key |

This pattern ensures that keys like b"settings" in the b"users" namespace don't collide with b"settings" in the b"system" namespace.

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:182-218
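The namespacing scheme (an 8-byte prefix hash followed by an 8-byte key hash) can be imitated with the standard library. The sketch below deliberately swaps XXH3 for BLAKE2b truncated to 8 bytes so that it needs no third-party package; the resulting bytes therefore differ from what the real NamespaceHasher produces, and the class name NamespaceHasherSketch is hypothetical.

```python
import hashlib


def _hash8(data: bytes) -> bytes:
    # Stand-in for XXH3: BLAKE2b with an 8-byte digest (standard library only).
    return hashlib.blake2b(data, digest_size=8).digest()


class NamespaceHasherSketch:
    """Illustrative only; the output is NOT compatible with the real NamespaceHasher."""

    def __init__(self, prefix: bytes) -> None:
        # The prefix is hashed once and reused for every key.
        self._prefix_hash = _hash8(prefix)

    def namespace(self, key: bytes) -> bytes:
        # 8 bytes of prefix hash followed by 8 bytes of key hash.
        return self._prefix_hash + _hash8(key)


users = NamespaceHasherSketch(b"users")
system = NamespaceHasherSketch(b"system")

users_key = users.namespace(b"settings")
system_key = system.namespace(b"settings")
```

Both keys end with the same 8-byte hash of b"settings" but start with different prefix hashes, which is what keeps the two namespaces apart.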


graph TB
    subgraph "Test Script: integration_test.sh"
        Setup["1. Navigate to experiments/\n2. Build server if needed"]
StartServer["3. cargo run --package simd-r-drive-ws-server\nBackground process\nPID captured"]
SetupPython["4. uv venv\n5. uv pip install pytest maturin\n6. uv pip install -e . --group dev"]
ExtractTests["7. extract_readme_tests.py\nExtract code blocks from README.md"]
RunPytest["8. pytest -v -s\nTEST_SERVER_HOST=$SERVER_HOST\nTEST_SERVER_PORT=$SERVER_PORT"]
Cleanup["9. kill -9 $SERVER_PID\n10. rm /tmp/simd-r-drive-pytest-storage.bin"]
end
    
 
   Setup --> StartServer
 
   StartServer --> SetupPython
 
   SetupPython --> ExtractTests
 
   ExtractTests --> RunPytest
 
   RunPytest --> Cleanup

Integration Test Infrastructure

The Python bindings include a comprehensive integration test suite that validates the entire stack from Python user code down to the WebSocket server.

Test Workflow

Sources: experiments/bindings/python-ws-client/integration_test.sh:1-91

Test Categories

The test infrastructure includes multiple test sources:

| Test Source | Purpose | Generated By |
|---|---|---|
| tests/test_readme_blocks.py | Validates README examples | extract_readme_tests.py |
| Other test files | Unit and integration tests | Manual test authoring |
| Pytest fixtures | Setup/teardown infrastructure | Pytest framework |

Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:1-46

graph LR
    subgraph "Input"
        README["README.md\n```python code blocks"]
end
    
    subgraph "Extraction Process"
        Regex["Regex: r'```python\\n(.*?)```'"]
Parse["Extract all Python blocks"]
Wrap["Wrap each block in\ndef test_readme_block_N():\n ..."]
end
    
    subgraph "Output"
        TestFile["tests/test_readme_blocks.py\ntest_readme_block_0()\ntest_readme_block_1()\n..."]
end
    
 
   README --> Regex
 
   Regex --> Parse
 
   Parse --> Wrap
 
   Wrap --> TestFile

README Test Extraction

The extract_readme_tests.py script automatically converts Python code blocks from the README into pytest test functions:

Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:14-45

This ensures that all code examples in the README are automatically tested, preventing documentation drift from the actual API behavior.


graph TB
    subgraph "Internal Modules"
        RustBinary["simd_r_drive_ws_client_py.so/.pyd\nBinary compiled module"]
RustSymbols["BaseDataStoreWsClient\nNamespaceHasher\nsetup_logging\ntest_rust_logging"]
PythonWrapper["data_store_ws_client.py\nDataStoreWsClient"]
end
    
    subgraph "Package __init__.py"
        ImportRust["from .simd_r_drive_ws_client import\n setup_logging, test_rust_logging"]
ImportPython["from .data_store_ws_client import\n DataStoreWsClient, NamespaceHasher"]
AllList["__all__ = [\n 'DataStoreWsClient',\n 'NamespaceHasher',\n 'setup_logging',\n 'test_rust_logging'\n]"]
end
    
    subgraph "Public API"
        UserCode["from simd_r_drive_ws_client import DataStoreWsClient"]
end
    
 
   RustBinary --> RustSymbols
 
   RustSymbols --> ImportRust
 
   PythonWrapper --> ImportPython
    
 
   ImportRust --> AllList
 
   ImportPython --> AllList
 
   AllList --> UserCode

Package Exports and Public API

The package's public API is defined through the __init__.py file, which controls what symbols are available when users import the package.

Export Structure

Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-14

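The __all__ list explicitly declares the public API surface: it determines which names a wildcard import (from simd_r_drive_ws_client import *) brings in and signals to readers and tools which symbols are meant for external use. Names outside __all__ can still be imported explicitly, so the list is a convention rather than an access control. This follows Python best practices for package design.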
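The effect of __all__ can be demonstrated with a throwaway module. The sketch below builds a hypothetical module named demo_pkg in memory, with placeholder definitions standing in for the real exports plus one extra internal_helper that is not in the diagram, and then performs a wildcard import from it.

```python
import sys
import types

# Build a throwaway module (hypothetical name) that mimics a package __init__.py.
demo = types.ModuleType("demo_pkg")
exec(
    "class DataStoreWsClient: ...\n"
    "class NamespaceHasher: ...\n"
    "def setup_logging(): ...\n"
    "def test_rust_logging(): ...\n"
    "def internal_helper(): ...\n"
    "__all__ = ['DataStoreWsClient', 'NamespaceHasher',\n"
    "           'setup_logging', 'test_rust_logging']\n",
    demo.__dict__,
)
sys.modules["demo_pkg"] = demo

# A wildcard import copies only the names listed in __all__.
namespace = {}
exec("from demo_pkg import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
```

Here internal_helper stays out of the wildcard import even though it remains reachable as demo_pkg.internal_helper.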