Python Integration
Relevant source files
- experiments/bindings/python-ws-client/Cargo.lock
- experiments/bindings/python-ws-client/README.md
- experiments/bindings/python-ws-client/extract_readme_tests.py
- experiments/bindings/python-ws-client/integration_test.sh
- experiments/bindings/python-ws-client/pyproject.toml
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py
- experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi
- experiments/bindings/python-ws-client/uv.lock
This page documents the Python bindings for SIMD R Drive, which provide a high-level Python interface to the storage engine via WebSocket RPC. The bindings are implemented in Rust using PyO3 and packaged as Python wheels using Maturin.
For details on the Python WebSocket Client API specifically, see Python WebSocket Client API. For information on building and distributing the Python package, see Building Python Bindings. For testing infrastructure, see Integration Testing.
Architecture Overview
The Python integration uses a multi-layer architecture that bridges Python code to the native Rust WebSocket client. The bindings are not pure Python—they are Rust code compiled to native extensions that Python can import.
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-14 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:1-63
graph TB
subgraph "Python User Code"
UserScript["User Application\n*.py files"]
Imports["from simd_r_drive_ws_client import DataStoreWsClient"]
end
subgraph "Python Package Layer"
InitPy["__init__.py\nPackage Exports"]
DataStoreWsClientPy["data_store_ws_client.py\nDataStoreWsClient class"]
TypeStubs["data_store_ws_client.pyi\nType Annotations"]
end
subgraph "PyO3 Binding Layer"
RustModule["simd_r_drive_ws_client_py\nRust Binary Module\n.so / .pyd"]
BaseClass["BaseDataStoreWsClient\n#[pyclass]"]
NamespaceHasherClass["NamespaceHasher\n#[pyclass]"]
end
subgraph "Native Rust Implementation"
WsClient["simd-r-drive-ws-client\nNative WebSocket Client"]
MuxioRPC["muxio-tokio-rpc-client\nRPC Client Runtime"]
end
UserScript --> Imports
Imports --> InitPy
InitPy --> DataStoreWsClientPy
DataStoreWsClientPy --> BaseClass
DataStoreWsClientPy -.type hints.-> TypeStubs
BaseClass --> WsClient
NamespaceHasherClass --> WsClient
RustModule --> BaseClass
RustModule --> NamespaceHasherClass
WsClient --> MuxioRPC
The architecture consists of four distinct layers:
| Layer | Technology | Purpose |
|---|---|---|
| Python User Code | Pure Python | Application-level logic using the client |
| Python Package | Pure Python wrapper | Convenience methods and type annotations |
| PyO3 Bindings | Rust compiled to native extension | FFI bridge exposing Rust functionality |
| Native Implementation | Rust (simd-r-drive-ws-client) | WebSocket RPC client implementation |
Python API Surface
The Python package exposes two primary classes that users interact with: DataStoreWsClient and NamespaceHasher. The API is defined through a combination of Rust PyO3 bindings and Python wrapper code.
graph TB
subgraph "Python Space"
DSWsClient["DataStoreWsClient\nexperiments/.../data_store_ws_client.py"]
NSHasher["NamespaceHasher\nRust #[pyclass]"]
end
subgraph "Rust PyO3 Bindings"
BaseClient["BaseDataStoreWsClient\nRust #[pyclass]\nsrc/lib.rs"]
NSHasherRust["NamespaceHasher\nRust Implementation\nsrc/lib.rs"]
end
subgraph "Method Categories"
WriteOps["write()\nbatch_write()\ndelete()"]
ReadOps["read()\nbatch_read()\nexists()"]
MetaOps["__len__()\n__contains__()\nis_empty()\nfile_size()"]
PyOnlyOps["batch_read_structured()\nPure Python Logic"]
end
DSWsClient -->|inherits| BaseClient
NSHasher -.exposed as.-> NSHasherRust
BaseClient --> WriteOps
BaseClient --> ReadOps
BaseClient --> MetaOps
DSWsClient --> PyOnlyOps
Class Hierarchy and Implementation
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:11-63 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:8-219
Core Operations
The DataStoreWsClient provides the following operation categories:
| Operation Type | Methods | Implementation Location |
|---|---|---|
| Write Operations | write(), batch_write(), delete() | Rust (BaseDataStoreWsClient) |
| Read Operations | read(), batch_read(), exists() | Rust (BaseDataStoreWsClient) |
| Metadata Operations | __len__(), __contains__(), is_empty(), file_size() | Rust (BaseDataStoreWsClient) |
| Structured Reads | batch_read_structured() | Python (DataStoreWsClient) |
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:27-168
Python-Rust Method Mapping
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.py:12-62 experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:27-129
The batch_read_structured() method is implemented entirely in Python as a convenience wrapper. It flattens a dictionary or a list of dictionaries into a single list of keys, calls the fast Rust batch_read() method once, and then rebuilds the original structure with the fetched values.
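The flatten, fetch, and rebuild steps can be illustrated with a minimal sketch. It is not the package's implementation: the function name `batch_read_structured_sketch`, the in-memory `fake_batch_read` stand-in, and the assumption that each dictionary value is a store key (replaced in the result by the fetched bytes, or `None` when missing) are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Union

# Assumed input shape: field name -> store key, either one dict or a list of dicts.
Structure = Union[Dict[str, bytes], List[Dict[str, bytes]]]
BatchRead = Callable[[List[bytes]], List[Optional[bytes]]]


def batch_read_structured_sketch(data: Structure, batch_read: BatchRead):
    """Flatten store keys, fetch them in one batch, rebuild the original shape."""
    is_single = isinstance(data, dict)
    items = [data] if is_single else list(data)

    # 1. Flatten: collect every store key in a stable order.
    flat_keys = [key for item in items for key in item.values()]

    # 2. Fetch: a single call to the batch read (Rust-backed in the real client).
    values = iter(batch_read(flat_keys))

    # 3. Rebuild: same field names, store keys replaced by fetched values.
    rebuilt = [{field: next(values) for field in item} for item in items]
    return rebuilt[0] if is_single else rebuilt


# Hypothetical in-memory stand-in for the server-backed batch_read().
_store = {b"k1": b"alice", b"k2": b"bob"}


def fake_batch_read(keys: List[bytes]) -> List[Optional[bytes]]:
    return [_store.get(key) for key in keys]


single = batch_read_structured_sketch({"name": b"k1", "missing": b"zz"}, fake_batch_read)
many = batch_read_structured_sketch([{"a": b"k1"}, {"b": b"k2"}], fake_batch_read)
```

Issuing one batched call rather than one read per field is the point of the wrapper: the structure handling stays in Python while the network round trip happens once.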
PyO3 Binding Architecture
PyO3 provides the Foreign Function Interface (FFI) that allows Python to call Rust code. The bindings use PyO3 macros to expose Rust structs and methods as Python classes and functions.
graph TB
subgraph "Rust Source"
StructDef["#[pyclass]\npub struct BaseDataStoreWsClient"]
MethodsDef["#[pymethods]\nimpl BaseDataStoreWsClient"]
NSStruct["#[pyclass]\npub struct NamespaceHasher"]
NSMethods["#[pymethods]\nimpl NamespaceHasher"]
end
subgraph "PyO3 Macro Expansion"
PyClassMacro["PyClass trait implementation\nType conversion\nReference counting"]
PyMethodsMacro["Method wrappers\nArgument extraction\nReturn value conversion"]
end
subgraph "Python Module"
PythonClass["class BaseDataStoreWsClient:\n def write(...)\n def read(...)"]
PythonNS["class NamespaceHasher:\n def __init__(...)\n def namespace(...)"]
end
StructDef --> PyClassMacro
MethodsDef --> PyMethodsMacro
NSStruct --> PyClassMacro
NSMethods --> PyMethodsMacro
PyClassMacro --> PythonClass
PyMethodsMacro --> PythonClass
PyClassMacro --> PythonNS
PyMethodsMacro --> PythonNS
PyO3 Class Definitions
Sources: experiments/bindings/python-ws-client/Cargo.lock:832-846 experiments/bindings/python-ws-client/Cargo.lock:1096-1108
Async Runtime Bridge
The Python bindings use pyo3-async-runtimes to bridge Python's async/await with Rust's Tokio runtime. This allows Python code to use standard async/await syntax while the underlying operations are handled by Tokio.
Sources: experiments/bindings/python-ws-client/Cargo.lock:849-860
graph TB
subgraph "Python Async"
PyAsyncCall["await client.write(key, data)"]
PyEventLoop["asyncio.run()\nor uvloop"]
end
subgraph "pyo3-async-runtimes Bridge"
Bridge["pyo3_async_runtimes::tokio\nFuture conversion"]
Runtime["LocalSet spawning\nBlock on future"]
end
subgraph "Rust Tokio"
TokioFuture["async fn write()\nTokio Future"]
TokioRuntime["Tokio Runtime\nThread Pool"]
end
PyAsyncCall --> PyEventLoop
PyEventLoop --> Bridge
Bridge --> Runtime
Runtime --> TokioFuture
TokioFuture --> TokioRuntime
The pyo3-async-runtimes crate handles the complexity of converting between Python's async protocol and Rust's Tokio futures, ensuring that async operations work seamlessly across the FFI boundary.
Build and Distribution System
The Python bindings are built using Maturin, which compiles the Rust code and packages it into Python wheels. The build system is configured through pyproject.toml and Cargo.toml.
graph LR
subgraph "Configuration Files"
PyProject["pyproject.toml\n[build-system]\nrequires = ['maturin>=1.5']"]
CargoToml["Cargo.toml\n[lib]\ncrate-type = ['cdylib']"]
end
subgraph "Maturin Build Process"
RustCompile["rustc compilation\n--crate-type=cdylib\nTarget: cpython extension"]
LinkPyO3["Link PyO3 runtime\nPython ABI"]
CreateWheel["Package .so/.pyd\nAdd metadata\nCreate .whl"]
end
subgraph "Distribution Artifacts"
Wheel["simd_r_drive_ws_client-*.whl\nPlatform-specific binary"]
PyPI["PyPI Registry\npip install simd-r-drive-ws-client"]
end
PyProject --> RustCompile
CargoToml --> RustCompile
RustCompile --> LinkPyO3
LinkPyO3 --> CreateWheel
CreateWheel --> Wheel
Wheel --> PyPI
Maturin Build Pipeline
Sources: experiments/bindings/python-ws-client/pyproject.toml:29-35
Build System Configuration
The build system is configured through several key sections in pyproject.toml:
| Configuration | Location | Purpose |
|---|---|---|
| [build-system] | pyproject.toml:29-31 | Specifies Maturin as build backend |
| [tool.maturin] | pyproject.toml:33-35 | Maturin-specific settings (bindings, Python version) |
| [project] | pyproject.toml:1-27 | Package metadata for PyPI |
| [dependency-groups] | pyproject.toml:37-46 | Development dependencies |
Sources: experiments/bindings/python-ws-client/pyproject.toml:1-47
Supported Python Versions and Platforms
The package supports Python 3.10 through 3.13 on multiple platforms.
Sources: experiments/bindings/python-ws-client/pyproject.toml:7-27 experiments/bindings/python-ws-client/README.md:18-23
graph TB
subgraph "Runtime Dependencies"
PyO3Runtime["PyO3 Runtime\nEmbedded in .whl"]
RustDeps["Rust Dependencies\nsimd-r-drive-ws-client\ntokio, muxio-*"]
end
subgraph "Development Dependencies"
Maturin["maturin>=1.8.7\nBuild backend"]
MyPy["mypy>=1.16.1\nType checking"]
Pytest["pytest>=8.4.1\nTesting framework"]
NumPy["numpy>=2.2.6\nTesting utilities"]
Other["puccinialin, pytest-benchmark, pytest-order"]
end
subgraph "Lock Files"
UvLock["uv.lock\nPython dependency tree"]
CargoLock["Cargo.lock\nRust dependency tree"]
end
Maturin --> RustDeps
RustDeps --> CargoLock
MyPy --> UvLock
Pytest --> UvLock
NumPy --> UvLock
Other --> UvLock
Dependency Management
The Python bindings use uv for fast dependency resolution and management. Dependencies are split into runtime (minimal) and development dependencies.
Dependency Structure
Sources: experiments/bindings/python-ws-client/pyproject.toml:37-46 experiments/bindings/python-ws-client/Cargo.lock:1-1380 experiments/bindings/python-ws-client/uv.lock:1-299
The package has essentially no runtime Python dependencies beyond the interpreter itself, because the Rust dependencies are compiled into the binary wheel. Development dependencies cover testing, type checking, and build tooling.
Type Stubs and IDE Support
The package includes comprehensive type stubs (.pyi files) that provide full type information for IDEs and type checkers like MyPy.
graph TB
subgraph "Type Stub File"
StubImports["from typing import Optional, Union, Dict, Any, List\nfrom .simd_r_drive_ws_client import BaseDataStoreWsClient"]
ClientStub["@final\nclass DataStoreWsClient(BaseDataStoreWsClient):\n def __init__(self, host: str, port: int) -> None\n def write(self, key: bytes, data: bytes) -> None\n ..."]
NSStub["@final\nclass NamespaceHasher:\n def __init__(self, prefix: bytes) -> None\n def namespace(self, key: bytes) -> bytes"]
end
subgraph "IDE Features"
Autocomplete["Auto-completion\nMethod signatures"]
TypeCheck["Type checking\nmypy validation"]
Docstrings["Inline documentation\nMethod descriptions"]
end
StubImports --> ClientStub
StubImports --> NSStub
ClientStub --> Autocomplete
ClientStub --> TypeCheck
ClientStub --> Docstrings
NSStub --> Autocomplete
NSStub --> TypeCheck
NSStub --> Docstrings
Type Stub Structure
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:1-219
The type stubs include:
- Full method signatures with type annotations
- Comprehensive docstrings explaining each method
- Generic type support for structured operations
- @final decorators to prevent subclassing
graph LR
subgraph "Namespace Creation"
Prefix["prefix = b'users'"]
PrefixHash["XXH3(prefix)\n→ 8 bytes"]
end
subgraph "Key Hashing"
Key["key = b'user123'"]
KeyHash["XXH3(key)\n→ 8 bytes"]
end
subgraph "Namespaced Key"
Combined["prefix_hash // key_hash\n16 bytes total"]
end
Prefix --> PrefixHash
Key --> KeyHash
PrefixHash --> Combined
KeyHash --> Combined
NamespaceHasher Utility
The NamespaceHasher class provides deterministic key namespacing using XXH3 hashing. It ensures keys are scoped to specific namespaces, preventing collisions across logical domains.
Namespacing Mechanism
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:170-219
Usage Pattern
The NamespaceHasher is typically used as follows:
| Step | Code | Result |
|---|---|---|
| 1. Create hasher | hasher = NamespaceHasher(b"users") | Hasher scoped to "users" namespace |
| 2. Generate key | key = hasher.namespace(b"user123") | 16-byte namespaced key |
| 3. Store data | client.write(key, data) | Data stored under namespaced key |
| 4. Read data | client.read(key) | Data retrieved from namespaced key |
This pattern ensures that keys like b"settings" in the b"users" namespace don't collide with b"settings" in the b"system" namespace.
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/data_store_ws_client.pyi:182-218
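The namespacing scheme (an 8-byte prefix hash followed by an 8-byte key hash) can be sketched in plain Python. XXH3 is not in the standard library, so this sketch substitutes BLAKE2b truncated to 8 bytes. That substitution is an assumption made purely for illustration, and the bytes it produces will not match the real NamespaceHasher. The class name `NamespaceHasherSketch` is likewise hypothetical.

```python
import hashlib


def _hash8(data: bytes) -> bytes:
    # Stand-in for the 8-byte XXH3 hash: BLAKE2b with an 8-byte digest.
    # This is NOT the hash the real bindings use.
    return hashlib.blake2b(data, digest_size=8).digest()


class NamespaceHasherSketch:
    """Illustrative model of prefix-hash + key-hash namespacing."""

    def __init__(self, prefix: bytes) -> None:
        # The prefix is hashed once and reused for every key.
        self._prefix_hash = _hash8(prefix)

    def namespace(self, key: bytes) -> bytes:
        # 8 bytes of prefix hash followed by 8 bytes of key hash.
        return self._prefix_hash + _hash8(key)


users = NamespaceHasherSketch(b"users")
system = NamespaceHasherSketch(b"system")

users_settings = users.namespace(b"settings")
system_settings = system.namespace(b"settings")
```

Because the key-hash half is identical for `b"settings"` in both namespaces, only the prefix-hash half separates the two entries, which is why the prefix must be chosen consistently by every writer and reader.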
graph TB
subgraph "Test Script: integration_test.sh"
Setup["1. Navigate to experiments/\n2. Build server if needed"]
StartServer["3. cargo run --package simd-r-drive-ws-server\nBackground process\nPID captured"]
SetupPython["4. uv venv\n5. uv pip install pytest maturin\n6. uv pip install -e . --group dev"]
ExtractTests["7. extract_readme_tests.py\nExtract code blocks from README.md"]
RunPytest["8. pytest -v -s\nTEST_SERVER_HOST=$SERVER_HOST\nTEST_SERVER_PORT=$SERVER_PORT"]
Cleanup["9. kill -9 $SERVER_PID\n10. rm /tmp/simd-r-drive-pytest-storage.bin"]
end
Setup --> StartServer
StartServer --> SetupPython
SetupPython --> ExtractTests
ExtractTests --> RunPytest
RunPytest --> Cleanup
Integration Test Infrastructure
The Python bindings include a comprehensive integration test suite that validates the entire stack from Python user code down to the WebSocket server.
Test Workflow
Sources: experiments/bindings/python-ws-client/integration_test.sh:1-91
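The start-server, run-tests, clean-up shape of the workflow can be approximated in Python with a context manager. This is only a sketch of the pattern, not a port of integration_test.sh: the `background_server` helper is hypothetical, and the placeholder command is a sleeping Python process standing in for the real `cargo run` invocation.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def background_server(command):
    """Start a background process, yield it, and always stop it afterwards."""
    process = subprocess.Popen(command)
    try:
        yield process
    finally:
        process.kill()  # forceful stop, comparable to the script's kill -9
        process.wait()  # reap the child so no zombie process is left behind


# Placeholder for the server command: a Python process that only sleeps.
placeholder_command = [sys.executable, "-c", "import time; time.sleep(30)"]

with background_server(placeholder_command) as server:
    # The test run (pytest in the real script) would happen here.
    running_during_tests = server.poll() is None

stopped_after_tests = server.poll() is not None
```

Putting the kill in a `finally` block means the server is stopped even when the test step raises, which a shell script would typically achieve with a `trap` handler.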
Test Categories
The test infrastructure includes multiple test sources:
| Test Source | Purpose | Generated By |
|---|---|---|
| tests/test_readme_blocks.py | Validates README examples | extract_readme_tests.py |
| Other test files | Unit and integration tests | Manual test authoring |
| Pytest fixtures | Setup/teardown infrastructure | Pytest framework |
Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:1-46
graph LR
subgraph "Input"
README["README.md\n```python code blocks"]
end
subgraph "Extraction Process"
Regex["Regex: r'```python\\n(.*?)```'"]
Parse["Extract all Python blocks"]
Wrap["Wrap each block in\ndef test_readme_block_N():\n ..."]
end
subgraph "Output"
TestFile["tests/test_readme_blocks.py\ntest_readme_block_0()\ntest_readme_block_1()\n..."]
end
README --> Regex
Regex --> Parse
Parse --> Wrap
Wrap --> TestFile
README Test Extraction
The extract_readme_tests.py script automatically converts Python code blocks from the README into pytest test functions:
Sources: experiments/bindings/python-ws-client/extract_readme_tests.py:14-45
This ensures that all code examples in the README are automatically tested, preventing documentation drift from the actual API behavior.
graph TB
subgraph "Internal Modules"
RustBinary["simd_r_drive_ws_client_py.so/.pyd\nBinary compiled module"]
RustSymbols["BaseDataStoreWsClient\nNamespaceHasher\nsetup_logging\ntest_rust_logging"]
PythonWrapper["data_store_ws_client.py\nDataStoreWsClient"]
end
subgraph "Package __init__.py"
ImportRust["from .simd_r_drive_ws_client import\n setup_logging, test_rust_logging"]
ImportPython["from .data_store_ws_client import\n DataStoreWsClient, NamespaceHasher"]
AllList["__all__ = [\n 'DataStoreWsClient',\n 'NamespaceHasher',\n 'setup_logging',\n 'test_rust_logging'\n]"]
end
subgraph "Public API"
UserCode["from simd_r_drive_ws_client import DataStoreWsClient"]
end
RustBinary --> RustSymbols
RustSymbols --> ImportRust
PythonWrapper --> ImportPython
ImportRust --> AllList
ImportPython --> AllList
AllList --> UserCode
Package Exports and Public API
The package's public API is defined through the __init__.py file, which controls what symbols are available when users import the package.
Export Structure
Sources: experiments/bindings/python-ws-client/simd_r_drive_ws_client/__init__.py:1-14
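The __all__ list explicitly defines the public API surface: it determines which names a wildcard import (from simd_r_drive_ws_client import *) exposes and signals to readers and tooling which symbols are intended for public use. Names left out of __all__ can still be imported explicitly, so the list is a convention rather than an access control. This follows Python best practices for package design.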
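The effect of `__all__` on wildcard imports can be demonstrated with a throwaway module built at runtime. The module name `demo_pkg` and the `_InternalHelper` class are hypothetical; only the four exported names mirror the list described above.

```python
import sys
import types

# Build a stand-in module whose __all__ mirrors the four public names.
demo_pkg = types.ModuleType("demo_pkg")
exec(
    "import os\n"
    "class DataStoreWsClient: ...\n"
    "class NamespaceHasher: ...\n"
    "class _InternalHelper: ...\n"
    "def setup_logging(): ...\n"
    "def test_rust_logging(): ...\n"
    "__all__ = ['DataStoreWsClient', 'NamespaceHasher',\n"
    "           'setup_logging', 'test_rust_logging']\n",
    demo_pkg.__dict__,
)
sys.modules["demo_pkg"] = demo_pkg

# A wildcard import copies only the names listed in __all__.
star_namespace = {}
exec("from demo_pkg import *", star_namespace)
exported = sorted(name for name in star_namespace if not name.startswith("__"))

# An explicit import can still reach names that __all__ omits.
explicit_namespace = {}
exec("from demo_pkg import _InternalHelper", explicit_namespace)
```

Here `exported` contains exactly the four public names, and neither `os` nor `_InternalHelper` leaks through the wildcard import.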