Error when uploading MLFlow artifacts to DBFS
04-24-2025 06:12 PM
Hi everyone,
I'm attempting to use MLFlow experiment tracking from a local machine, but I'm encountering difficulties in uploading artifacts.
I tried a sample as simple as the following.
import mlflow
import os
os.environ["DATABRICKS_HOST"] = "https://XXXXXX.cloud.databricks.com/"
os.environ["DATABRICKS_TOKEN"] = "dapiXXXXX"
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("XXXX")
with mlflow.start_run() as run:
    mlflow.log_param("param1", 5)
    mlflow.log_metric("foo", 1, step=0)
    mlflow.log_metric("foo", 2, step=1)
    mlflow.log_metric("foo", 3, step=2)
    mlflow.log_metric("foo", 4, step=3)
    mlflow.log_metric("foo", 5, step=4)
    mlflow.log_artifact("main.py")
This code successfully created a new run in the target MLFlow experiment and logged the parameter "param1" and the metric "foo" correctly. However, it failed to log the artifact and displayed an error message like the following.
mlflow.exceptions.MlflowException: 403 Client Error: Forbidden for url: https://dbstorage-prod-whkxn.s3.ap-southeast-2.amazonaws.com/ws/xxxxxxxxxxxxxxxxx (an AWS presigned URL). Response text: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>xxxxxxxxxxxxxxxx</RequestId><HostId>xxxxxxxxxxxxx</HostId></Error>
Do I need any further settings to make artifact logging available?
04-25-2025 10:15 AM
A couple things:
1. If you don't own the MLflow experiment, you need edit permissions on it (required for logging). The default artifact location in DBFS (`dbfs:/databricks/mlflow-tracking/`) requires explicit write permissions.
2. Make sure you have the proper entitlements to write to the artifact location you are targeting.
3. If you are using Unity Catalog, writing to a volume requires `USE CATALOG` on the catalog, `USE SCHEMA` on the schema, and `WRITE VOLUME` on the volume.
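To check point 2, it can help to see which location the experiment actually writes to. The following is a minimal sketch, assuming the experiment already exists and that `DATABRICKS_HOST` and `DATABRICKS_TOKEN` are set as in the original post. `describe_artifact_location` and `show_experiment_location` are hypothetical helpers, not part of MLflow, and the prefix check only covers the paths mentioned in this thread.

```python
def describe_artifact_location(location: str) -> str:
    """Roughly classify an MLflow artifact location by its path prefix."""
    if location.startswith("dbfs:/Volumes/"):
        return "Unity Catalog volume"
    if location.startswith("dbfs:/databricks/mlflow-tracking/"):
        return "default DBFS location"
    return "other"


def show_experiment_location(experiment_name: str) -> None:
    """Print where the named experiment stores its artifacts."""
    # Imported here so the classifier above works without mlflow installed.
    import mlflow

    mlflow.set_tracking_uri("databricks")
    experiment = mlflow.get_experiment_by_name(experiment_name)
    if experiment is None:
        print(f"Experiment {experiment_name!r} not found")
        return
    location = experiment.artifact_location
    print(location, "->", describe_artifact_location(location))
```

Calling `show_experiment_location("XXXX")` with your experiment name should show whether the run is targeting the default DBFS location or a volume.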
Hope this helps, Louis.
05-02-2025 04:23 AM
Hi,
Thank you for the advice! I managed to upload artifacts by creating a Unity Catalog volume and explicitly setting it as the artifact location.
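For anyone who lands here later, the approach can be sketched roughly as follows. This is an illustration under assumptions, not the exact code I ran: the catalog, schema, and volume names are placeholders, `volume_artifact_location` and `create_experiment_on_volume` are hypothetical helpers, and the volume must already exist with write access granted. As far as I know, the artifact location is fixed when the experiment is created, so this uses a new experiment.

```python
def volume_artifact_location(catalog: str, schema: str, volume: str, subdir: str = "") -> str:
    """Build a dbfs:/Volumes/... artifact location for a Unity Catalog volume."""
    base = f"dbfs:/Volumes/{catalog}/{schema}/{volume}"
    subdir = subdir.strip("/")
    return f"{base}/{subdir}" if subdir else base


def create_experiment_on_volume(experiment_name: str, artifact_location: str) -> str:
    """Create an experiment whose artifacts are stored under the given location."""
    # Imported here so the path helper above works without mlflow installed.
    import mlflow

    mlflow.set_tracking_uri("databricks")
    # artifact_location is a documented argument of mlflow.create_experiment.
    return mlflow.create_experiment(experiment_name, artifact_location=artifact_location)
```

With placeholder names, usage would look like `create_experiment_on_volume("XXXX", volume_artifact_location("my_catalog", "my_schema", "my_volume", "mlflow"))`, followed by `mlflow.set_experiment("XXXX")` and the same logging code as in the first post.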
However, I am still wondering whether it is possible to upload artifacts to the default DBFS artifact location. How can I grant explicit write permission on the default location?
05-02-2025 04:40 AM
It is considered best practice not to store any production data or assets in DBFS (Databricks File System). The primary reason is that DBFS does not provide robust security controls: anyone with workspace access can potentially access items stored there. Instead, Databricks strongly recommends using Unity Catalog for managing and securing your data and AI assets. Unity Catalog offers centralized access control, fine-grained permissions, and enhanced auditing capabilities, making it the preferred solution for production workloads.
DBFS is now considered a legacy storage option and both DBFS mounts and root storage are deprecated due to security risks and their incompatibility with Unity Catalog’s governance model. While there is no official deprecation date yet, it is advisable to migrate your production assets to Unity Catalog Volumes to ensure future compatibility and security.
In summary, use Unity Catalog for all production data and AI assets, and avoid storing anything critical in DBFS.
Hope this helps, Big Roux.