Hi @GergoBo,
There are a few approaches depending on whether you need playback in a notebook or within a Dash/Flask web app. Here is a breakdown of each scenario.
OPTION 1: VIDEO PLAYBACK IN A DATABRICKS NOTEBOOK
For notebooks, you can read the MP4 file from the volume path and embed it as a base64-encoded data URI. This works well for files under ~50 MB:
import base64
from IPython.display import HTML

video_path = "/Volumes/<catalog>/<schema>/<volume>/path/to/video.mp4"

with open(video_path, "rb") as f:
    data = f.read()

b64_video = base64.b64encode(data).decode()

HTML(f"""
<video width="640" height="480" controls>
  <source src="data:video/mp4;base64,{b64_video}" type="video/mp4">
</video>
""")
This loads the entire file into memory, and base64 encoding makes the embedded payload roughly a third larger than the original file, so it is best suited to smaller files.
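If you want the notebook cell to fail fast on oversized files rather than stall the browser, the embedding step can be wrapped in a small function with a size check. This is only a sketch: video_html and its max_bytes default are hypothetical names and values (the default simply mirrors the ~50 MB guideline above), not anything provided by Databricks or IPython.

```python
import base64
import os


def video_html(video_path, max_bytes=50 * 1024 * 1024, width=640, height=480):
    """Return an HTML <video> snippet with the file embedded as a base64 data URI.

    Raises ValueError when the file is larger than max_bytes (an assumed
    limit, not a documented one).
    """
    size = os.path.getsize(video_path)
    if size > max_bytes:
        raise ValueError(f"{video_path} is {size} bytes; limit is {max_bytes}")
    with open(video_path, "rb") as f:
        b64_video = base64.b64encode(f.read()).decode("ascii")
    return (
        f'<video width="{width}" height="{height}" controls>'
        f'<source src="data:video/mp4;base64,{b64_video}" type="video/mp4">'
        "</video>"
    )
```

In a notebook you would pass the result to IPython.display.HTML, for example HTML(video_html(video_path)).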
OPTION 2: DASH/FLASK APP RUNNING AS A DATABRICKS APP
If you are building your Dash app as a Databricks App, you can serve videos directly from Unity Catalog Volumes by creating a Flask route that streams the file content. Here is the approach:
1. Add a Unity Catalog Volume as a resource in your app.yaml configuration. This gives your app's service principal permission to read from the volume.
2. Use the Databricks SDK (databricks-sdk) to download the file and stream it back to the browser. This avoids loading the entire file into memory at once.
Example Flask/Dash route for streaming:
from flask import Response
from databricks.sdk import WorkspaceClient

# Picks up the app's credentials from the environment.
w = WorkspaceClient()

# `app` is your existing dash.Dash instance; app.server is its Flask app.
@app.server.route("/video/<path:filename>")
def serve_video(filename):
    volume_path = f"/Volumes/<catalog>/<schema>/<volume>/{filename}"

    def generate():
        # resp.contents is a file-like stream of the file's bytes.
        resp = w.files.download(volume_path)
        with resp.contents as f:
            while True:
                chunk = f.read(8192)
                if not chunk:
                    break
                yield chunk

    return Response(generate(), mimetype="video/mp4")
Then in your Dash layout, reference the video with a standard HTML5 video element:
# Dash 2.0+; the standalone dash_html_components package is deprecated.
from dash import html

html.Video(
    src="/video/path/to/video.mp4",
    controls=True,
    width="640",
    height="480",
)
This streams the video in 8 KB chunks, so it handles larger files without memory issues.
3. Make sure your requirements.txt includes databricks-sdk.
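For the app.yaml resource configuration, you would add something like the following. Treat the key names as illustrative and confirm them against the resources documentation linked below:
resources:
  - name: video-volume
    type: unity-catalog-volume
    path: /Volumes/<catalog>/<schema>/<volume>
    permission: READ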
Documentation reference for Databricks Apps resources:
https://docs.databricks.com/aws/en/dev-tools/databricks-apps/resources
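One thing worth guarding against in the streaming route above: filename comes straight from the URL, so a request containing ".." segments or an absolute path could end up pointing outside the volume. Below is a minimal sketch of a check, assuming a hypothetical helper named resolve_volume_path (it is not part of Flask or the Databricks SDK) and POSIX-style volume paths.

```python
import posixpath


def resolve_volume_path(volume_root, filename):
    """Join filename onto volume_root and reject anything outside the root.

    Hypothetical helper for illustration; raises ValueError for empty names,
    absolute paths, and paths that climb out of volume_root via "..".
    """
    root = posixpath.normpath(volume_root)
    candidate = posixpath.normpath(posixpath.join(root, filename))
    # After normalization the result must sit strictly below the root.
    if not candidate.startswith(root + "/"):
        raise ValueError(f"path is outside the volume: {filename!r}")
    return candidate
```

In the route you could call it inside a try block and return a 404 (for example with flask.abort(404)) when it raises ValueError.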
OPTION 3: FLASK/DASH ON A CLUSTER (NOT A DATABRICKS APP)
If you are running your Flask/Dash server directly on a cluster (e.g., via a notebook or driver proxy), you can read from the volume using the standard file system path since the cluster has direct FUSE access to /Volumes/:
from flask import Response

@app.server.route("/video/<path:filename>")
def serve_video(filename):
    volume_path = f"/Volumes/<catalog>/<schema>/<volume>/{filename}"

    def generate():
        with open(volume_path, "rb") as f:
            while True:
                chunk = f.read(8192)
                if not chunk:
                    break
                yield chunk

    return Response(generate(), mimetype="video/mp4")
The driver proxy URL would be something like:
https://<workspace-url>/driver-proxy/o/<org-id>/<cluster-id>/<port>/video/my_video.mp4
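The same 8 KB read loop appears in both the SDK-based route and the direct-file route, so it could be factored into one generator that accepts any binary file-like object. This is a sketch only; iter_chunks is a hypothetical name, and the 8192-byte default just mirrors the chunk size used in the routes above.

```python
def iter_chunks(stream, chunk_size=8192):
    """Yield successive chunks from a binary file-like object, then close it."""
    try:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            yield chunk
    finally:
        # Runs when the stream is exhausted or the generator is closed early.
        stream.close()
```

With this in place, the cluster route could return Response(iter_chunks(open(volume_path, "rb")), mimetype="video/mp4"), and the Databricks App route could pass w.files.download(volume_path).contents instead of the opened file.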
SUPPORTING RANGE REQUESTS (SEEK/SCRUB)
For a better user experience with video scrubbing (seeking to specific positions), you can add HTTP Range request support to your Flask route. This lets the browser request specific byte ranges instead of downloading the entire file:
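import os
import re
from flask import Response, request

MAX_RANGE_BYTES = 1024 * 1024  # serve at most 1 MiB per partial response

@app.server.route("/video/<path:filename>")
def serve_video(filename):
    volume_path = f"/Volumes/<catalog>/<schema>/<volume>/{filename}"
    file_size = os.path.getsize(volume_path)
    range_header = request.headers.get("Range", "")

    # Handle single ranges of the form "bytes=start-" or "bytes=start-end".
    # Any other form falls through to the full response below.
    match = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header.strip())
    if match:
        byte_start = int(match.group(1))
        requested_end = int(match.group(2)) if match.group(2) else file_size - 1
        if byte_start >= file_size or requested_end < byte_start:
            # Invalid or unsatisfiable range.
            return Response(status=416, headers={"Content-Range": f"bytes */{file_size}"})
        byte_end = min(requested_end, byte_start + MAX_RANGE_BYTES - 1, file_size - 1)
        content_length = byte_end - byte_start + 1

        def generate_range():
            with open(volume_path, "rb") as f:
                f.seek(byte_start)
                yield f.read(content_length)

        return Response(
            generate_range(),
            status=206,
            mimetype="video/mp4",
            headers={
                "Content-Range": f"bytes {byte_start}-{byte_end}/{file_size}",
                "Accept-Ranges": "bytes",
                "Content-Length": str(content_length),
            },
        )

    def generate():
        with open(volume_path, "rb") as f:
            while True:
                chunk = f.read(8192)
                if not chunk:
                    break
                yield chunk

    # Advertise range support so the browser knows it can seek.
    return Response(
        generate(),
        mimetype="video/mp4",
        headers={"Accept-Ranges": "bytes", "Content-Length": str(file_size)},
    )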
Note: Range request support works most naturally with direct file access (cluster or Databricks App with FUSE mount). If using the SDK download method, you would need to handle partial reads differently since the SDK streams the full file.
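Range parsing is the part of this route that is easiest to get subtly wrong, so it can help to pull it into a pure function that you can test without Flask. The sketch below is an assumption-laden illustration rather than a complete implementation of the HTTP specification: parse_range and max_length are hypothetical names, only single ranges are handled, and the 1 MiB cap mirrors the route above.

```python
import re

_RANGE_RE = re.compile(r"bytes=(\d*)-(\d*)")


def parse_range(header, file_size, max_length=1024 * 1024):
    """Return inclusive (start, end) byte offsets for a single-range header.

    Returns None when the header is missing, malformed, or cannot be
    satisfied, leaving the caller to choose between a full response and a 416.
    """
    if not header or file_size <= 0:
        return None
    match = _RANGE_RE.fullmatch(header.strip())
    if not match:
        return None
    start_text, end_text = match.groups()
    if not start_text and not end_text:
        return None
    if not start_text:
        # Suffix form "bytes=-N": the last N bytes of the file.
        length = int(end_text)
        if length == 0:
            return None
        start = max(file_size - length, 0)
        end = file_size - 1
    else:
        start = int(start_text)
        end = int(end_text) if end_text else file_size - 1
        if start >= file_size or end < start:
            return None
        end = min(end, file_size - 1)
    # Cap the span so one response never reads more than max_length bytes.
    end = min(end, start + max_length - 1)
    return start, end
```

For example, parse_range("bytes=100-199", 1000) gives (100, 199), and the suffix form parse_range("bytes=-500", 1000) gives (500, 999).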
RELEVANT DOCUMENTATION
- Unity Catalog Volumes overview: https://docs.databricks.com/aws/en/connect/unity-catalog/volumes
- Working with files in volumes: https://docs.databricks.com/aws/en/volumes/volume-files
- Databricks Apps overview: https://docs.databricks.com/aws/en/dev-tools/databricks-apps/index
- Databricks Apps resources: https://docs.databricks.com/aws/en/dev-tools/databricks-apps/resources
- Databricks SDK for Python: https://docs.databricks.com/aws/en/dev-tools/sdk-python
* This reply was researched and drafted by an agent system I built, drawing on the wide set of documentation I have available and on previous memory. I personally review each draft for obvious issues, monitor the system's reliability, and update it when I detect drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand-new features.
If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.