Hey everyone!
I've built and open-sourced ar-io-mlflow, a plugin for MLflow that adds cryptographic provenance across the ML lifecycle (training runs, model registration, stage promotions, inference, and datasets).
What it does
- Creates Ed25519-signed cryptographic proofs of your runs, models, and predictions.
- Permanently anchors these small proofs (~500 bytes) to the **ar.io network** (built on Arweave decentralized storage).
- Provides a `VerifiedModel` wrapper that checks integrity *before* loading/serving and raises an error on tampering.
- Includes `ArioMlflowClient` (drop-in replacement) and a simple `ario_mlflow.anchor()` call.
- Language-neutral verification + CLI tools for auditors.
The idea is to give teams a lightweight, independent verification layer for their workflows.
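To make the "check integrity before loading" idea concrete, here is a minimal stdlib-only sketch. It is not the plugin's actual code: `build_proof`, `verify_before_load`, the proof fields, and the local `IntegrityError` class are all hypothetical names made up for illustration. The real proofs are additionally signed with Ed25519 and anchored to ar.io; that part is left out here because it needs a third-party crypto library.

```python
import hashlib
import hmac
import json


class IntegrityError(Exception):
    """Local stand-in: raised when an artifact no longer matches its proof."""


def build_proof(artifact: bytes, run_id: str) -> dict:
    """Build a small proof record for an artifact (hypothetical fields)."""
    return {
        "run_id": run_id,
        "algorithm": "sha256",
        "digest": hashlib.sha256(artifact).hexdigest(),
    }


def verify_before_load(artifact: bytes, proof: dict) -> bytes:
    """Return the artifact only if its SHA-256 digest matches the proof."""
    actual = hashlib.sha256(artifact).hexdigest()
    # compare_digest avoids leaking where the two digests first differ
    if not hmac.compare_digest(actual, proof["digest"]):
        raise IntegrityError(f"digest mismatch for run {proof['run_id']}")
    return artifact


model_bytes = b"pretend these are serialized model weights"
proof = build_proof(model_bytes, run_id="example-run")

# Canonical JSON (sorted keys, no extra whitespace) keeps the record
# byte-stable, which is what you would want before signing and anchoring it.
proof_json = json.dumps(proof, sort_keys=True, separators=(",", ":"))
print(len(proof_json.encode("utf-8")), "bytes in the unsigned proof")

verify_before_load(model_bytes, proof)  # untouched artifact is returned as-is

try:
    verify_before_load(model_bytes + b"!", proof)  # one extra byte
except IntegrityError as exc:
    print("rejected:", exc)
```

The point of the design is that the proof is tiny and independent of the artifact store, so an auditor only needs the proof and the artifact bytes to repeat the check.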
Quick Examples
Training run anchoring
```python
import mlflow
import ario_mlflow

with mlflow.start_run():
    # ... your normal training + mlflow.log_model() ...
    result = ario_mlflow.anchor()
    print("Anchored transaction:", result["tags"]["ario.training_tx"])
```
Verified Inference
```python
from ario_mlflow import VerifiedModel

# Integrity is checked before loading/serving; tampering raises IntegrityError
vm = VerifiedModel("models:/my_model/1")  # works with Unity Catalog too
prediction = vm.predict(data)  # data: your usual model input
print(prediction.proof_status)
```
Full installation, docs, architecture, and more examples are here: https://github.com/ar-io/ar-io-mlflow
It's an early version, so I'd love any feedback from the community to help me improve it 🙂
A few questions if that's ok:
- Does verifiable decentralized provenance solve a real pain point for your MLOps or governance workflows?
- Any specific integration ideas or feature requests?
Thanks for reading this far. Happy to answer any questions, review issues/PRs, or collaborate.
Will