This post is written by Pascal Vogel, Solutions Architect, and Anton Vlasov, Senior Solutions Engineer.
Databricks Apps makes it easy to build secure data and AI applications on the Databricks Data Intelligence Platform. To build great apps, developers need more than a powerful platform: they need a great developer experience.
If you're coming from notebook-based development, you might be missing out on the fast feedback loops, robust debugging tools, and seamless deployments that modern application developers take for granted.
This post offers an opinionated guide to setting up a great developer experience for Databricks Apps, combining the power of the Databricks platform with the productivity of your favorite local integrated development environment (IDE).
If you’re new to developing Python-based web applications and more used to implementing data pipelines and data science projects on Databricks, you may be tempted to start developing apps in a Databricks notebook, the workspace files UI, or using Databricks Connect from your local IDE.
However, these approaches share several drawbacks that make them a poor fit for app development:
❌ No live feedback loop: you need to redeploy your app each time you want to see changes in action.
❌ Limited support for syntax highlighting, debugging, code linting, type checking, and unit testing.
❌ Cumbersome dependency management to match the Databricks Apps environment.
Instead, consider adopting the following best practices to develop applications for Databricks Apps.
To get a fast feedback loop when building Databricks Apps, develop on your local machine.
Developing Databricks Apps does not require any service-specific tooling, so your favorite IDE, such as Visual Studio Code or PyCharm, will work fine. AI-assisted IDEs like Cursor or Windsurf are a great choice for developing Databricks Apps as well.
Most Python application frameworks, including Dash, Streamlit, and Flask, offer live refresh functionality. In the case of Streamlit, make sure to enable "Always rerun" in the application UI or set the runOnSave configuration option to true. Dash and Flask support live reloading through their built-in debug modes.
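For example, you can enable this behavior for Streamlit directly from the command line when starting your app locally:
streamlit run app.py --server.runOnSave true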
Optionally, you can improve your IDE development experience with extensions.
The Databricks extension for Visual Studio Code makes it easy to define, deploy, and run Databricks Asset Bundles (DABs), which let you apply CI/CD best practices to Databricks projects, including apps. In addition, the extension lets you sync files from your local machine to your Databricks workspace files, from where you can easily deploy an app.
The Databricks PyCharm plugin similarly allows you to sync local files to your Databricks workspace.
Most Databricks Apps leverage other Databricks resources, such as SQL warehouses, Model Serving endpoints, or Jobs. When developing apps locally, we recommend connecting directly to live Databricks resources running in a development workspace.
This way, you always develop against live API definitions, rate limits, and other configurations that may be difficult to reproduce with mocked resources.
By using OAuth user-to-machine (U2M) authentication, your app running locally can interact with live Databricks resources using your personal Databricks credentials.
With the Databricks CLI installed, run the following command to trigger the OAuth U2M authentication flow:
databricks auth login --host <workspace-url> --profile DEFAULT
You can find more detailed setup instructions in the Databricks documentation.
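With the profile in place, you can verify that authentication works, for example by fetching your own user identity:
databricks current-user me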
Using this approach has two key benefits: you develop against live resources rather than mocks, and the Databricks SDK's unified authentication lets the exact same code run locally with your personal credentials and in production with the app's service principal credentials.
Consider the following example code which uses the Databricks SQL Connector:
# app.py
from databricks import sql
from databricks.sdk.core import Config

# Picks up your Databricks CLI profile locally and the app's
# service principal credentials when deployed to Databricks Apps
cfg = Config()

conn = sql.connect(
    server_hostname=cfg.host,
    http_path="<your-warehouse-http-path>",
    credentials_provider=lambda: cfg.authenticate,
)

query = "SELECT * FROM main.sandbox.sales_customers LIMIT 1000"

with conn.cursor() as cursor:
    cursor.execute(query)
    df = cursor.fetchall_arrow().to_pandas()

print(df.head())
conn.close()
This code works without any changes both when running the app locally, authenticated via OAuth U2M, and when deployed to Databricks Apps using the app's service principal credentials.
The same mechanism applies to the Databricks SDK:
# app.py
import streamlit as st
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ChatMessage, ChatMessageRole

# Uses the "default" Databricks CLI profile when running locally,
# but service principal credentials when deployed to Databricks Apps
w = WorkspaceClient()

response = w.serving_endpoints.query(
    name="databricks-llama-4-maverick",
    messages=[
        ChatMessage(
            role=ChatMessageRole.SYSTEM,
            content="You are a helpful assistant.",
        ),
        ChatMessage(
            role=ChatMessageRole.USER,
            content="What is Databricks?",
        ),
    ],
)

st.json(response.as_dict())
In a Databricks workspace, your Databricks App is deployed in a serverless compute environment with a specific Python version (currently 3.11.12) and a selection of pre-installed Python packages. You can find a detailed list of pre-installed packages under Pre-installed Python libraries in the Databricks Apps docs.
When developing locally, create a Python virtual environment and install any packages required by your application there.
We recommend using uv to manage virtual environments and dependencies.
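For example, the following commands create a uv-managed project that matches the Databricks Apps Python version and add some dependencies (the package selection is illustrative):
uv init
uv python pin 3.11
uv add streamlit databricks-sdk databricks-sql-connector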
When deploying a Databricks App, the service looks for a requirements.txt file in the root directory of your application and installs any packages listed there. You can use the requirements.txt to install any additional packages or adjust the versions of pre-installed packages. See Manage dependencies for a Databricks app for more information.
Therefore, when preparing your app for deployment, export a list of all dependencies into a requirements.txt file. For example, using uv:
uv export --no-annotate --no-hashes --format requirements-txt > requirements.txt
Visit the Environment tab in the Databricks Apps UI to see installed packages, the Python version, and active environment variables.
The Databricks CLI provides the run-local command, which simplifies dependency management, debugging, simulating HTTP headers, and injecting environment variables using the same configuration (app.yaml and requirements.txt) as the Databricks Apps service.
The command orchestrates your local development environment by handling:
Configuration management: the command reads your app.yaml file to determine how to start your app and what environment variables to set. You can also specify an alternative configuration file using the --entry-point flag (e.g., app-debug.yaml for debugging scenarios).
Simulating HTTP headers: a lightweight proxy runs locally to forward requests to your app while injecting the Databricks-specific headers (prefixed with X-*) that your app would normally receive when deployed. This includes headers like X-Forwarded-User and X-Forwarded-Email that contain your authentication context.
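For example, a Streamlit app could read the forwarded identity like this (a minimal sketch, assuming Streamlit 1.37+ where st.context.headers is available):
# app.py
import streamlit as st

# Injected by Databricks Apps when deployed, or by the run-local proxy during development
user_email = st.context.headers.get("X-Forwarded-Email")
st.write(f"Signed in as: {user_email}")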
Process management and authentication: the command runs the start commands defined in your app.yaml file and automatically handles the authentication setup, ensuring your local app can seamlessly connect to live Databricks resources (such as SQL warehouses and Model Serving endpoints) using your personal credentials, without any code changes between the local and live Databricks environments.
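For reference, a minimal app.yaml for a Streamlit app might look like this (the start command and environment variable are illustrative):
command: ["streamlit", "run", "app.py"]
env:
  - name: "DATABRICKS_WAREHOUSE_HTTP_PATH"
    value: "/sql/1.0/warehouses/abc123"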
Run the command in your application root directory:
databricks apps run-local
To specify a custom entry point file when running locally, run:
databricks apps run-local --entry-point app-debug.yaml
When deploying a Databricks App, any environment variables defined in the app.yaml file are injected into the Databricks Apps environment. Similarly, the run-local command automatically picks up any environment variables from your app.yaml when running locally.
You can override environment variables present in your app.yaml or inject additional ones by running:
databricks apps run-local --env DB_HOST=localhost
Run the following command to learn more about the available run-local options:
databricks apps run-local --help
Under the hood, run-local leverages uv and debugpy.
During development, you may want to deploy your application to a live development workspace to test it.
To deploy an application, the Databricks Apps service gets the application code from a folder in the Databricks workspace files.
To copy your local application code to a workspace files folder, you can run the following command in the Databricks CLI:
databricks sync . /Workspace/Users/user@example.com/my-app
Add the --watch flag to continuously keep your local directory in sync with your Databricks workspace. The sync command excludes files from syncing based on a .gitignore file if present. Take a look at the CLI documentation for the sync command group for details.
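For example, to keep watching for local changes and syncing them to the same workspace folder:
databricks sync --watch . /Workspace/Users/user@example.com/my-app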
Once the files are uploaded, create an app resource and deploy the app:
# Run once to create the app compute resource
databricks apps create my-app
# Run for each deployment
databricks apps deploy my-app --source-code-path /Workspace/Users/user@example.com/my-app
Alternatively, you can use Databricks Asset Bundles (DABs) to deploy your app from your local machine to different Databricks environments using the databricks bundle deploy and databricks bundle run commands.
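As a rough sketch, a databricks.yml defining an app resource might look like this (the bundle name, app name, and source path are illustrative; see the linked documentation for the full schema):
bundle:
  name: my-app-bundle

resources:
  apps:
    my_app:
      name: "my-app"
      source_code_path: ./app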
Take a look at Manage Databricks apps using Databricks Asset Bundles for detailed instructions.
To set up a full CI/CD pipeline for Databricks Apps with DABs, take a look at Automate your Databricks Apps deployments with GitHub Actions and Databricks Asset Bundles.
Building great Databricks Apps starts with a great developer experience. By developing locally in your preferred IDE, using OAuth U2M authentication to connect to live resources, and leveraging tools like databricks apps run-local, you can achieve the fast feedback loops and seamless deployments that modern development demands.
With this setup, you can focus on what matters most: building powerful data and AI applications on the Databricks platform.