Ultralytics YOLO [1] (You Only Look Once) is one of the most widely used computer vision frameworks. It is fast, accurate, and well supported, with a range of model sizes (from nano to extra-large) so you can trade off speed and accuracy for edge or server deployment. Training and inference are straightforward with a Python API and practical documentation, and the ecosystem features readily available pretrained weights, support for standard datasets (e.g. COCO), and ongoing active model development, as exemplified by recent advancements in YOLO11.
For teams adopting computer vision (CV) tasks on Databricks, Ultralytics YOLO is a practical choice for both prototyping and production pipelines. The framework supports multiple CV tasks — object detection, classification, segmentation, pose estimation, and oriented bounding boxes (OBB) — each with models in several sizes (nano to extra-large, often denoted as n, s, m, l, x).
Figure 1: Common computer vision tasks and their associated annotation type.
This post demonstrates a single-node workflow for training an object detection model on Databricks AI Runtime — scalable, serverless NVIDIA GPU compute. We use the nano YOLO model, YOLO11n, for real-time performance that outputs bounding boxes, class labels, and confidence scores. The process covers training YOLO11n on the COCO128 dataset (demo-only; refer to Data preparation for production guidance) and deploying it to Model Serving. Deployment includes a custom MLflow Pyfunc wrapper to handle base64 image input to the YOLO model and structured bounding-box output from model prediction.
Critically, running YOLO on Databricks AI Runtime lets you train and iterate without provisioning or managing clusters: you get GPU compute on demand, pay for what you use, and when you are done the compute is terminated. This makes it ideal for experimentation, proof-of-concept, and small-to-medium training jobs—and MLflow and Unity Catalog keep experiments and artifacts organized.
Single-node (one GPU instance) keeps the workflow simple and sufficient for many object-detection use cases. YOLO11n is a small model; training on datasets in the low thousands to tens of thousands of images often fits comfortably on one GPU (e.g. A10). A single node avoids distributed-training setup, multi-worker debugging, and extra cost—so you can focus on data, labels, and the MLflow-to-serving path.
When your dataset or model grows and training time becomes a bottleneck, you can move to multi-GPU or multi-node patterns; the same registration and deployment steps in this post still apply.
Add the notebook via Import a notebook or Clone a Git repo (Repos), then attach to Serverless GPU:
From the Compute dropdown at the top of the notebook, choose Connect to configure the notebook for AI Runtime. In the Environment panel on the right-hand edge of the notebook, select A10 for the Accelerator and AI v4 for the Base environment. Finally, click Apply and then Confirm, as shown in the figure below.
Figure 2: Connecting to AI Runtime Serverless GPU cluster and configuring the Notebook Environment:
Connect → Serverless GPU → Environment → Accelerator: A10, Environment: AI v4 → Apply & Confirm.
Because packages and dependencies are installed in the first notebook cell, there is no need to add them in the cluster environment panel.
The notebook walks through six steps in order; each builds on the previous one.
Figure 3. An end-to-end workflow.
After attaching to Serverless GPU (see above: Connect → Serverless GPU → Accelerator A10, Environment AI v4 → Apply and Confirm), the first steps are to install the required Python packages and configure your Unity Catalog project structure.
Install MLflow, Ultralytics, and supporting packages (e.g. nvidia-ml-py, threadpoolctl). Restart Python after the first %pip cell, then set a writable YOLO config directory to avoid permission issues.
# Package installation for AI Runtime (run once, then restart Python)
%pip install -U "mlflow>=3.0"
%pip install ultralytics==8.3.204
%pip install nvidia-ml-py==13.580.82
%pip install threadpoolctl==3.1.0
dbutils.library.restartPython()
# Set writable YOLO config dir (avoids permission errors)
import os, uuid
config_dir = f'/tmp/yolo_config_{uuid.uuid4().hex[:8]}'
os.environ['YOLO_CONFIG_DIR'] = config_dir
os.makedirs(config_dir, exist_ok=True)
Next, create or use a catalog, schema, and Unity Catalog Volume for data, raw models, and model checkpoints from training runs. Use widgets for catalog, schema, volume, and model name so the same notebook can be reused across workspaces.
# Widgets for catalog, schema, volume, model name (create with defaults, then read)
dbutils.widgets.text("catalog_name", "main")
dbutils.widgets.text("schema_name", "default")
dbutils.widgets.text("volume_name", "yolo_sgc")
dbutils.widgets.text("model_name", "yolo11n")
catalog_name = dbutils.widgets.get("catalog_name")
schema_name = dbutils.widgets.get("schema_name")
volume_name = dbutils.widgets.get("volume_name")
model_name = dbutils.widgets.get("model_name")
spark.sql(f"CREATE SCHEMA IF NOT EXISTS `{catalog_name}`.`{schema_name}`")
spark.sql(f"CREATE VOLUME IF NOT EXISTS `{catalog_name}`.`{schema_name}`.`{volume_name}`")
project_location = f'/Volumes/{catalog_name}/{schema_name}/{volume_name}/'
os.makedirs(f'{project_location}runs/', exist_ok=True)
os.makedirs(f'{project_location}data/', exist_ok=True)
os.makedirs(f'{project_location}raw_model/', exist_ok=True)
The dataset is configured via a YAML file (path, splits, class names). We download the Ultralytics coco128.yaml config and the data to the UC Volume, then split the images into train (62.5%), validation (18.75%), and test (18.75%) with a fixed seed, updating the YAML with the new paths. For custom data, substitute your own config (e.g., data.yaml) and adjust the paths and class names.
# Download COCO128 dataset configuration to UC Volume
import yaml
os.makedirs(f'{project_location}data/coco128', exist_ok=True)
config_url = "https://github.com/ultralytics/ultralytics/raw/main/ultralytics/cfg/datasets/coco128.yaml"
config_path = f"{project_location}data/coco128.yaml"
download_file(config_url, config_path, "COCO128 config")
# Then load config, set data['path'] to volume path, download/extract dataset if needed, save updated YAML
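The "load config and update paths" step summarized in the comment above can be sketched as follows. This is a minimal sketch: `update_dataset_config` is an illustrative name (not the notebook's helper), and the real step may also download and extract the images.

```python
import yaml


def update_dataset_config(config_path, dataset_root):
    """Point the dataset YAML's 'path' at the Unity Catalog Volume copy.

    config_path:  path to the downloaded coco128.yaml (or your data.yaml)
    dataset_root: e.g. /Volumes/<catalog>/<schema>/<volume>/data/coco128
    """
    with open(config_path) as f:
        cfg = yaml.safe_load(f)
    cfg["path"] = dataset_root  # splits like 'train'/'val' stay relative to this
    with open(config_path, "w") as f:
        yaml.safe_dump(cfg, f, sort_keys=False)
    return cfg
```

The split paths set in the next step are relative to this `path`, so updating it once keeps the rest of the YAML unchanged.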
Split the data and update the YAML with train/val/test image paths:
train_size, val_size, test_size = split_dataset(
    source_images_dir=f"{project_location}data/coco128/images/train2017",
    source_labels_dir=f"{project_location}data/coco128/labels/train2017",
    base_images_dir=f"{project_location}data/coco128/images",
    base_labels_dir=f"{project_location}data/coco128/labels",
    train_ratio=0.625,   # 62.5%
    val_ratio=0.1875,    # 18.75%
    random_seed=42
)
# In the notebook: update data.yaml so 'train', 'val', 'test' point to the new split dirs (e.g. .../images/train, .../images/val, .../images/test)
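To make the ratios concrete, here is a minimal sketch of the partitioning logic a helper like `split_dataset` performs. `split_image_list` is an illustrative name, not the notebook's helper, which also moves the image and label files into the split directories.

```python
import random


def split_image_list(image_paths, train_ratio=0.625, val_ratio=0.1875, seed=42):
    """Deterministically shuffle and partition paths into train/val/test.

    Whatever remains after the train and val slices becomes the test split.
    """
    paths = sorted(image_paths)           # sort first so the split is reproducible
    random.Random(seed).shuffle(paths)    # fixed seed => same split every run
    n_train = int(len(paths) * train_ratio)
    n_val = int(len(paths) * val_ratio)
    return (paths[:n_train],
            paths[n_train:n_train + n_val],
            paths[n_train + n_val:])
```

With COCO128's 128 images, these ratios give 80 train, 24 validation, and 24 test images.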
Important note: COCO128 is used here only for demonstration. With ~128 images it is too small for production and will overfit. For real use cases, use larger datasets (e.g. 100K+ images or 1K+ domain-specific images). The same workflow applies—update data paths and config as needed.
To deploy the trained YOLO model to Model Serving, we need a single, serializable API: the endpoint will receive requests (e.g., base64-encoded images) and return structured responses (e.g., bounding boxes). YOLO’s native API expects file paths or NumPy arrays and returns a rich in-memory object, which is not what the serving layer expects.
The notebook therefore defines an MLflow custom PyFunc wrapper, YOLOWrapper(mlflow.pyfunc.PythonModel), that accepts a DataFrame with an image_base64 column and returns a DataFrame of detections (class, confidence, and bbox columns).
The wrapper class has three methods: load_context loads the .pt artifact into self.model when the model is loaded (e.g. at serving startup); predict accepts a DataFrame with image_base64, decodes each image, runs YOLO, and returns a DataFrame via _format_predictions; _format_predictions converts YOLO’s Results (.boxes, .names) into a single DataFrame with class name, class id, confidence, and bbox columns (xyxy and xywh). We define the wrapper now so it's ready to use immediately after training completes in Step 4.
class YOLOWrapper(mlflow.pyfunc.PythonModel):
    """Custom MLflow wrapper for YOLO models using base64-encoded images."""

    def load_context(self, context):
        """Load YOLO model from artifacts (called once when model is loaded)."""
        from ultralytics import YOLO
        model_path = context.artifacts["yolo_model"]
        self.model = YOLO(model_path, task='detect')

    def _format_predictions(self, predictions):
        """Convert YOLO Results to a single DataFrame with class, confidence, bbox columns."""
        import pandas as pd
        all_results = []
        for prediction in predictions:
            if prediction.boxes is not None:
                boxes = prediction.boxes
                for i in range(len(boxes)):
                    box_xyxy = boxes.xyxy[i].cpu().numpy()
                    box_xywh = boxes.xywh[i].cpu().numpy()
                    all_results.append({
                        "class_name": prediction.names[int(boxes.cls[i])],
                        "class_num": int(boxes.cls[i]),
                        "confidence": float(boxes.conf[i]),
                        "bbox_x1": float(box_xyxy[0]), "bbox_y1": float(box_xyxy[1]),
                        "bbox_x2": float(box_xyxy[2]), "bbox_y2": float(box_xyxy[3]),
                        "bbox_center_x": float(box_xywh[0]), "bbox_center_y": float(box_xywh[1]),
                        "bbox_width": float(box_xywh[2]), "bbox_height": float(box_xywh[3]),
                    })
        return pd.DataFrame(all_results)

    def predict(self, context, model_input):
        """Accept DataFrame with image_base64; decode, run YOLO, return DataFrame of detections."""
        import base64
        import io
        import numpy as np
        from PIL import Image
        if 'image_base64' not in model_input.columns:
            raise ValueError("DataFrame must contain 'image_base64' column")
        all_predictions = []
        for image_base64 in model_input['image_base64'].tolist():
            image_bytes = base64.b64decode(image_base64)
            image_array = np.array(Image.open(io.BytesIO(image_bytes)))
            predictions = self.model.predict(image_array, verbose=False)
            all_predictions.extend(predictions)
        return self._format_predictions(all_predictions)
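For reference, a caller prepares the wrapper's expected input by base64-encoding image files into a single-column DataFrame. `to_input_df` is an illustrative helper, not part of the notebook:

```python
import base64

import pandas as pd


def to_input_df(image_paths):
    """Build the single-column DataFrame that YOLOWrapper.predict() expects."""
    encoded = []
    for path in image_paths:
        with open(path, "rb") as f:
            # base64-encode raw file bytes, then decode to a UTF-8 string
            encoded.append(base64.b64encode(f.read()).decode("utf-8"))
    return pd.DataFrame({"image_base64": encoded})
```

The same DataFrame shape works for the local PyFunc test in Step 5 and the serving endpoint call in Step 6.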
3.1 MLflow Configuration
We infer the model signature from a sample prediction using this custom wrapper (DataFrame with image_base64 input and detection columns output). We also set the MLflow experiment (e.g., under /Workspace/Shared/) and enable system metrics logging, along with YOLO’s MLflow integration and MLflow autologging.
# Infer signature from a sample image (input: base64, output: bbox columns)
signature, input_example = infer_model_signature(model_path, sample_images[0])
# Enable system metrics and set experiment
experiment_name, experiment_id = setup_mlflow_experiment(
    use_workspaceUsers_path=False,
    expt_name_suffix="Experiments_YOLO_CoCo"
)
The model is then registered to Unity Catalog (mlflow.pyfunc.log_model) using this wrapper and the best checkpoint.pt artifact (called after training in Step 4):
mlflow.pyfunc.log_model(
    name="model",
    python_model=YOLOWrapper(),
    artifacts={"yolo_model": model_path},
    signature=signature,
    input_example=input_example,
    registered_model_name=registered_model_name,
    pip_requirements=["ultralytics==...", "cloudpickle==...", "torch", "torchvision", "pillow", "numpy"],
)
Train YOLO11n with your chosen hyperparameters (epochs, batch size, learning rate, patience, dropout, weight decay); these are specified in the config variables within model.train() as shown in the code snippet. Training runs in a unique temp directory, and the results and validation metrics are copied into the volume under a named run folder ({task}_{model}_{dataset}_{timestamp}_run_{run_id}). The best checkpoint is saved and is then registered to Unity Catalog with the custom PyFunc wrapper (base64 in, structured detections out) defined in the previous step.
model = YOLO(model_path)

results = model.train(
    task="detect",
    batch=4,
    device=0,               # Single GPU for Serverless AI Runtime
    data=data_yaml_path,
    epochs=100,
    lr0=0.001,
    project=project_location,
    name=f"run_{timestamp}",
    patience=5,             # Update where appropriate
    dropout=0.2,
    weight_decay=0.0005,
    save=True,
)
run_id = mlflow.last_active_run().info.run_id
# Register to Unity Catalog with custom PyFunc wrapper (base64 in, bbox out)
registered_model_name = register_yolo_model(
    run_id=run_id,
    model_path=best_model_path,
    catalog_name=catalog_name,
    schema_name=schema_name,
    model_name=model_name,
    signature=signature,
    input_example=input_example,
    data_yaml_path=data_yaml_path,
)
Evaluate the registered model on validation and test sets (sample predictions and metrics), then run a local serving test by loading the model via mlflow.pyfunc.load_model() and calling it with base64-encoded images to confirm the same interface the endpoint will use.
# Local serving test: same I/O as the deployed endpoint
model_uri = f"models:/{registered_model_name}/{latest_version}"
serving_model = mlflow.pyfunc.load_model(model_uri)
with open(test_image_path, 'rb') as f:
    image_base64 = base64.b64encode(f.read()).decode('utf-8')
input_df = pd.DataFrame({"image_base64": [image_base64]})
predictions = serving_model.predict(input_df) # DataFrame with class_name, confidence, bbox_*
After a manual checkpoint (e.g. a “Proceed with Deployment” widget), you can create or update a Custom Model Serving endpoint. The deployment configuration includes:
6.1 Create the endpoint with AI Gateway enabled:
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import (
    ServedEntityInput, EndpointCoreConfigInput,
    AiGatewayConfig, AiGatewayInferenceTableConfig,
)

w = WorkspaceClient()
w.serving_endpoints.create(
    name=endpoint_name,
    config=EndpointCoreConfigInput(
        served_entities=[
            ServedEntityInput(
                entity_name=registered_model_name,
                entity_version=str(model_version),
                workload_size="Small",
                scale_to_zero_enabled=True,
            )
        ]
    ),
    ai_gateway=AiGatewayConfig(
        inference_table_config=AiGatewayInferenceTableConfig(
            catalog_name=catalog_name,
            schema_name=schema_name,
            table_name_prefix=endpoint_name,
            enabled=True,
        )
    ),
)
Test the deployed endpoint by calling it with a base64-encoded image and verify the structured bounding-box response.
6.2 Call the endpoint with base64 input (same as local PyFunc test):
import base64

with open(test_image_path, 'rb') as f:
    image_base64 = base64.b64encode(f.read()).decode('utf-8')

response = w.serving_endpoints.query(
    name=endpoint_name,
    dataframe_records=[{"image_base64": image_base64}],
)
# Response contains DataFrame with class_name, confidence, bbox_x1, bbox_y1, ...
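The response mirrors the wrapper's output columns, one record per detection. A small illustrative post-processing helper follows; detections_to_df and the 0.5 confidence threshold are assumptions for demonstration, not part of the notebook:

```python
import pandas as pd


def detections_to_df(predictions, min_confidence=0.5):
    """Turn the endpoint's list-of-dict predictions into a filtered DataFrame.

    Keeps detections at or above min_confidence, most confident first.
    """
    df = pd.DataFrame(predictions)
    if df.empty:
        return df  # no detections: return the empty frame unchanged
    return (df[df["confidence"] >= min_confidence]
            .sort_values("confidence", ascending=False)
            .reset_index(drop=True))
```

Thresholding at the client keeps the endpoint generic: the wrapper returns every detection, and each consumer applies the confidence cutoff appropriate to its use case.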
A successful response confirms, end to end, that the custom PyFunc wrapper handles base64-encoded image input and returns structured bounding-box output from the Model Serving endpoint.
A few details are worth calling out for implementation and operations:
Conclusion
The ability to train or fine-tune and deploy YOLO (You Only Look Once) models on the Databricks Data Intelligence Platform gives enterprises a high-performance, cost-optimized, and easily adoptable computer vision (CV) solution. Our walkthrough shows a complete path from raw images to a live YOLO endpoint on Databricks: no cluster provisioning, full MLflow tracking, Unity Catalog governance, and production-ready serving with built-in request logging. Swap COCO128 for your own dataset and the same workflow applies. As your data or model complexity grows, the same registration and deployment steps extend to multi-GPU and multi-node training patterns.
Figure 4: Example validation of YOLO object detection inference on a sample of COCO128 images
Here's how you can try this out:
Stay tuned for the follow-up post on multi-GPU and multi-node YOLO model training on AI Runtime!
[1] Ultralytics YOLO is dual-licensed: AGPL-3.0 (default) or Enterprise for commercial use. Users should review https://www.ultralytics.com/license to determine which applies to their use case.