This post is written by Pascal Vogel, Solutions Architect, and Anton Vlasov, Senior Solutions Engineer.
Databricks Apps makes it easy to build secure data and AI applications on the Databricks Data Intelligence Platform. To build great apps, developers need more than a powerful platform: they need a great developer experience.
If you're coming from notebook-based development, you might be missing out on the fast feedback loops, robust debugging tools, and seamless deployments that modern application developers take for granted.
This post offers an opinionated guide to setting up a great developer experience for Databricks Apps, combining the power of the Databricks platform with the productivity of your favorite local integrated development environment (IDE).
If you’re new to developing Python-based web applications and more used to implementing data pipelines and data science projects on Databricks, you may be tempted to start developing apps in a Databricks notebook, the workspace files UI, or using Databricks Connect from your local IDE.
However, these approaches share several drawbacks that make them a poor fit for app development:
❌ No live feedback loop: you need to redeploy your app each time you want to see changes in action.
❌ Limited support for syntax highlighting, debugging, code linting, type checking, and unit testing.
❌ Cumbersome dependency management to match the Databricks Apps environment.
Instead, consider adopting the following best practices to develop applications for Databricks Apps.
To get a fast feedback loop when building Databricks Apps, develop on your local machine.
Developing Databricks Apps does not require any service-specific tooling, so your favorite IDE, such as Visual Studio Code or PyCharm, will work fine. AI-assisted IDEs like Cursor or Windsurf are a great choice for developing Databricks Apps as well.
Most Python application frameworks, including Dash, Streamlit, and Flask, offer live refresh functionality. In the case of Streamlit, make sure to enable "Always rerun" in the application UI or set the runOnSave configuration option to true. Dash and Flask support live reloading through their built-in debug modes.
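For example, you can enable this behavior for Streamlit directly from the command line when starting your app locally:
streamlit run app.py --server.runOnSave true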
Optionally, you can improve your IDE development experience with extensions.
The Databricks extension for Visual Studio Code makes it easy to define, deploy, and run Databricks Asset Bundles (DABs), which let you apply CI/CD best practices to Databricks projects, including apps. In addition, the extension lets you sync files from your local machine to your Databricks workspace files, from where you can easily deploy an app.
The Databricks PyCharm plugin similarly allows you to sync local files to your Databricks workspace.
Most Databricks Apps leverage other Databricks resources, such as SQL warehouses, Model Serving endpoints, or Jobs. When developing apps locally, we recommend connecting directly to live Databricks resources running in a development workspace.
This way, you always develop against live API definitions, rate limits, and other configurations that may be difficult to reproduce with mocked resources.
By using OAuth user-to-machine (U2M) authentication, your app running locally can interact with live Databricks resources using your personal Databricks credentials.
With the Databricks CLI installed, run the following command to trigger the OAuth U2M authentication flow:
databricks auth login --host <workspace-url> --profile DEFAULT
You can find more detailed setup instructions in the Databricks documentation.
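With the profile in place, you can verify that authentication works, for example by fetching your own user identity:
databricks current-user me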
Using this approach has two key benefits: you develop against live resources rather than mocks, and the Databricks SDK's unified authentication lets the exact same code run locally with your personal credentials and in production with the app's service principal credentials.
Consider the following example code which uses the Databricks SQL Connector:
# app.py
from databricks import sql
from databricks.sdk.core import Config

# Picks up your Databricks CLI profile locally and the app's
# service principal credentials when deployed to Databricks Apps
cfg = Config()

conn = sql.connect(
    server_hostname=cfg.host,
    http_path="<your-warehouse-http-path>",
    credentials_provider=lambda: cfg.authenticate,
)

query = "SELECT * FROM main.sandbox.sales_customers LIMIT 1000"

with conn.cursor() as cursor:
    cursor.execute(query)
    df = cursor.fetchall_arrow().to_pandas()

print(df.head())
conn.close()
This code works without any changes both when running the app locally, authenticated via OAuth U2M, and when deployed to Databricks Apps using the app's service principal credentials.
The same mechanism applies to the Databricks SDK:
# app.py
import streamlit as st
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ChatMessage, ChatMessageRole

# Uses the "default" Databricks CLI profile when running locally,
# but service principal credentials when deployed to Databricks Apps
w = WorkspaceClient()

response = w.serving_endpoints.query(
    name="databricks-llama-4-maverick",
    messages=[
        ChatMessage(
            role=ChatMessageRole.SYSTEM,
            content="You are a helpful assistant.",
        ),
        ChatMessage(
            role=ChatMessageRole.USER,
            content="What is Databricks?",
        ),
    ],
)

st.json(response.as_dict())
In a Databricks workspace, your Databricks App is deployed in a serverless compute environment with a specific Python version (currently 3.11.12) and a selection of pre-installed Python packages. You can find a detailed list of pre-installed packages under Pre-installed Python libraries in the Databricks Apps docs.
When developing locally, create a Python virtual environment and install any packages required by your application there.
We recommend using uv to manage virtual environments and dependencies.
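For example, the following commands create a uv-managed project that matches the Databricks Apps Python version and add some dependencies (the package selection is illustrative):
uv init
uv python pin 3.11
uv add streamlit databricks-sdk databricks-sql-connector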
When deploying a Databricks App, the service looks for a requirements.txt file in the root directory of your application and installs any packages listed there. You can use the requirements.txt to install any additional packages or adjust the versions of pre-installed packages. See Manage dependencies for a Databricks app for more information.
Therefore, when preparing your app for deployment, export a list of all dependencies into a requirements.txt file. For example, using uv:
uv export --no-annotate --no-hashes --format requirements-txt > requirements.txt
Visit the Environment tab in the Databricks Apps UI to see installed packages, the Python version, and active environment variables.
The Databricks CLI provides the run-local command, which simplifies dependency management, debugging, simulating HTTP headers, and injecting environment variables using the same configuration (app.yaml and requirements.txt) as the Databricks Apps service.
The command orchestrates your local development environment by handling:
Configuration management: the command reads your app.yaml file to determine how to start your app and what environment variables to set. You can also specify an alternative configuration file using the --entry-point flag (e.g., app-debug.yaml for debugging scenarios).
Simulating HTTP headers: a lightweight proxy runs locally to forward requests to your app while injecting the Databricks-specific headers (prefixed with X-*) that your app would normally receive when deployed. This includes headers like X-Forwarded-User and X-Forwarded-Email that contain your authentication context.
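For example, a Streamlit app could read the forwarded identity like this (a minimal sketch, assuming Streamlit 1.37+ where st.context.headers is available):
# app.py
import streamlit as st

# Injected by Databricks Apps when deployed, or by the run-local proxy during development
user_email = st.context.headers.get("X-Forwarded-Email")
st.write(f"Signed in as: {user_email}")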
Process management and authentication: the command runs the start commands defined in your app.yaml file and automatically handles the authentication setup, ensuring your local app can seamlessly connect to live Databricks resources (such as SQL warehouses and Model Serving endpoints) using your personal credentials, without any code changes between the local and live Databricks environments.
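For reference, a minimal app.yaml for a Streamlit app might look like this (the start command and environment variable are illustrative):
command: ["streamlit", "run", "app.py"]
env:
  - name: "DATABRICKS_WAREHOUSE_HTTP_PATH"
    value: "/sql/1.0/warehouses/abc123"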
Run the command in your application root directory:
databricks apps run-local
To specify a custom entry point file when running locally, run:
databricks apps run-local --entry-point app-debug.yaml
When deploying a Databricks App, any environment variables defined in the app.yaml file are injected into the Databricks Apps environment. Similarly, the run-local command automatically picks up any environment variables from your app.yaml when running locally.
You can override environment variables present in your app.yaml or inject additional ones by running:
databricks apps run-local --env DB_HOST=localhost
Run the following command to learn more about the available run-local options:
databricks apps run-local --help
Under the hood, run-local leverages uv and debugpy.
During development, you may want to deploy your application to a live development workspace to test it.
To deploy an application, the Databricks Apps service gets the application code from a folder in the Databricks workspace files.
To copy your local application code to a workspace files folder, you can run the following command in the Databricks CLI:
databricks sync . /Workspace/Users/user@example.com/my-app
Add the --watch flag to continuously keep your local directory in sync with your Databricks workspace. The sync command excludes files from syncing based on a .gitignore file if present. Take a look at the CLI documentation for the sync command group for details.
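For example, to keep watching for local changes and syncing them to the same workspace folder:
databricks sync --watch . /Workspace/Users/user@example.com/my-app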
Once the files are uploaded, create an app resource and deploy the app:
# Run once to create the app compute resource
databricks apps create my-app
# Run for each deployment
databricks apps deploy my-app --source-code-path /Workspace/Users/user@example.com/my-app
Alternatively, you can use Databricks Asset Bundles (DABs) to deploy your app from your local machine to different Databricks environments using the databricks bundle deploy and databricks bundle run commands.
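As a rough sketch, a databricks.yml defining an app resource might look like this (the bundle name, app name, and source path are illustrative; see the linked documentation for the full schema):
bundle:
  name: my-app-bundle

resources:
  apps:
    my_app:
      name: "my-app"
      source_code_path: ./app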
Take a look at Manage Databricks apps using Databricks Asset Bundles for detailed instructions.
To set up a full CI/CD pipeline for Databricks Apps with DABs, take a look at Automate your Databricks Apps deployments with GitHub Actions and Databricks Asset Bundles.
Building great Databricks Apps starts with a great developer experience. By developing locally in your preferred IDE, using OAuth U2M authentication to connect to live resources, and leveraging tools like databricks apps run-local, you can achieve the fast feedback loops and seamless deployments that modern development demands.
With this setup, you can focus on what matters most: building powerful data and AI applications on the Databricks platform.