Taras_Chaikovsk
Databricks Employee

Introduction

Databricks Lakeflow enables data teams to design and operate data pipelines at scale, where speed and reliability directly influence the time to market for insights. As pipeline complexity grows, test automation becomes essential to maintaining data quality and ensuring smooth, predictable production workflows. With Databricks Asset Bundles, setting up CI/CD and automated testing on Databricks has become significantly simpler, empowering teams to build with confidence and deliver value faster.

Different layers of testing play a role in building reliable data pipelines:

  • Unit tests validate individual functions or transformations and can run locally (in an IDE or on a CI agent) using Apache Spark™ and other dependencies in local mode, which is ideal for providing fast feedback during development (see the sketch after this list). While unit tests are lightweight and efficient, they cannot cover certain Databricks-specific capabilities, especially newer features not yet available in open source Spark, Delta, or Unity Catalog. Additionally, they are not intended for testing interactions with external dependencies like Kafka, REST APIs, or databases.
  • Component tests verify how multiple functions or transformations collaborate within a single bounded data pipeline component, ensuring that local logic behaves as intended before integrating with other parts of the data pipeline. 
  • Integration tests then validate complete workflows running on Databricks, confirming that data dependencies, configurations, and platform features behave correctly end-to-end, making them essential for achieving full reliability.
  • Acceptance and performance tests validate business requirements and scalability in production-like environments.
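
For illustration, a local-mode unit test of a simple transformation might look like the following minimal sketch; the function, column, and marker names are illustrative and not part of the blueprint:

import pytest
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F


def add_trip_category(df: DataFrame) -> DataFrame:
    # Example transformation under test (hypothetical)
    return df.withColumn(
        "trip_category",
        F.when(F.col("trip_distance") > 10, "long").otherwise("short"),
    )


@pytest.fixture(scope="session")
def spark():
    # Plain local-mode Spark session, no Databricks connectivity required
    return SparkSession.builder.master("local[1]").getOrCreate()


@pytest.mark.unit_test
def test_add_trip_category(spark):
    df = spark.createDataFrame([(1.0,), (15.0,)], "trip_distance double")
    result = add_trip_category(df)
    assert result.filter("trip_category = 'long'").count() == 1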

In this blog, we focus on integration testing, examining two different approaches and presenting a blueprint for one of them.

 

Integration Testing Approaches

Choosing the right approach for integration testing your Lakeflow jobs in Databricks environments is critical for CI/CD maturity, developer productivity, and confidence when deploying changes to a production environment. Two commonly adopted strategies are Databricks workflow-based integration testing and local integration tests with tools such as Pytest. Let’s dive into both approaches to see how they work and weigh their advantages and disadvantages.

Approach 1: Using Databricks Lakeflow Jobs for Integration Testing

In this approach, tests are implemented as Databricks notebooks or Python scripts, orchestrated and scheduled via Databricks Lakeflow Jobs. Tests run within the Databricks cloud environment, accessing clusters, data sources, and job code directly. This method requires developers to create and deploy two jobs for each pipeline: a main ETL job and a dedicated integration testing job. The main job handles the ETL tasks, while the integration testing job orchestrates test notebooks to set up environments, execute main jobs, validate results, and clean up resources.

The approach consists of the following phases, each executed as a separate task within the integration test job:

  • Setup task: Create isolated test resources (catalogs, schemas, tables) and populate them with test data
  • Trigger task: Run the actual Lakeflow job against the prepared test environment
  • Validation task: Assert expected outcomes and validate data quality
  • Teardown task: Clean up test resources to prevent catalog/workspace clutter

Typically, the integration testing job is deployed and run only in specific environments (e.g., test, acceptance) through a CI/CD process. To ensure isolation and environment-specific functionality, the main ETL job must be parameterized, allowing all environment-dependent values (catalogs, schemas, paths) to be configurable via notebook parameters. The CI agent is responsible for deploying, executing, and validating the integration test results, thereby determining whether the build succeeds or fails.
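
For example, inside a Databricks notebook (where dbutils and spark are predefined), the main ETL logic might read its environment-dependent values from notebook widgets; a minimal, hypothetical sketch with illustrative names:

# Read environment-dependent values passed as notebook parameters
catalog = dbutils.widgets.get("catalog")  # e.g. an isolated test catalog during integration tests
schema = dbutils.widgets.get("schema")

# The same transformation logic runs unchanged against whichever
# catalog and schema the orchestrating job supplies
raw = spark.read.table(f"{catalog}.{schema}.raw_events")
cleaned = raw.dropDuplicates()
cleaned.write.mode("overwrite").saveAsTable(f"{catalog}.{schema}.clean_events")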

However, this approach presents several drawbacks, primarily due to the necessity of creating and deploying a secondary job for testing alongside the main job. These include:

  • Defining setup and teardown logic
  • Defining the secondary integration testing job with the correct tasks and order
  • Deploying both jobs to the appropriate environment (either locally or through a CI agent)
  • Polling and parsing job results

Additionally, the approach slows down the feedback loop during both test and job development. Running tests requires deploying both jobs with the right parameters to the right environment, running them, waiting for results, and interpreting those results. This context switching breaks the developer flow.

Approach 2: Using Pytest for Integration Testing

In this approach, tests are written locally using Pytest and developed in your preferred IDE with full debugging capabilities. Tests leverage Databricks Connect to establish a remote execution context on Databricks clusters while orchestrating everything from your local machine or a CI agent. Pytest fixtures handle setting up and tearing down resources and triggering the job under test, while Pytest assertions validate the results afterward.

The typical workflow is:

  • Test Setup: Pytest fixtures use Databricks Connect to create test-specific resources (catalogs, schemas, tables) using the remote cluster, establishing an isolated test environment.
  • Job Execution: The Lakeflow job is triggered to run on Databricks, operating against the test data and environment.
  • Result Validation: After job completion, Pytest assertions query results through Databricks Connect, validating outputs, data quality, and business logic.
  • Teardown: Test fixtures clean up resources, removing test-created catalogs, schemas, and tables to maintain workspace hygiene and prevent test interference.

Similar to the first approach, the main job under test must be parameterized for all environments with test-specific values configured during test execution. This ensures test isolation, as resources are created specifically for each test run. Developers can execute the test either locally or via a CI agent without deploying additional jobs to Databricks, enabling them to assert job execution results remotely. Pytest also offers test reporting, which helps in validating and quickly identifying successful and failing tests.
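
As an illustration, the setup and teardown steps can be expressed as a plain Pytest fixture on top of Databricks Connect; a minimal sketch, assuming authentication via environment variables and a catalog named main (all names are illustrative):

import uuid

import pytest
from databricks.connect import DatabricksSession


@pytest.fixture
def ephemeral_schema():
    # Remote Spark session via Databricks Connect (picks up DATABRICKS_* settings)
    spark = DatabricksSession.builder.getOrCreate()
    schema = f"main.it_{uuid.uuid4().hex[:8]}"  # unique schema per test run

    # Setup: create an isolated schema for this test
    spark.sql(f"CREATE SCHEMA IF NOT EXISTS {schema}")
    try:
        yield schema
    finally:
        # Teardown: drop the schema and everything created inside it
        spark.sql(f"DROP SCHEMA IF EXISTS {schema} CASCADE")

The databricks-labs-pytester package described later provides prebuilt fixtures for exactly this kind of setup and teardown.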

This approach provides the benefits of:

  • Rapid Development and Iteration: Tests are developed, executed, and debugged locally without additional deployments or workflow scheduling.
  • Tight Feedback Loops: Developers see results in the same IDE without the need to switch between different tools and UI during active development.
  • Complete Pytest Ecosystem Access: Access to fixtures for complex setup/teardown logic, parametrization for testing multiple scenarios, powerful assertions, and hundreds of plugins for specialized testing needs.
  • Remote Execution: Still executes within the Databricks environment through remote execution using Databricks Connect.

 

Approach 2 Deep Dive

Blueprint Overview

At its core, this blueprint defines a pytest-driven framework to run integration tests on each Lakeflow job (with multiple tasks). Every test case follows a standardized structure focused on reproducibility, isolation, and full verification of pipeline outputs.

Pytest Template Structure

Setup Phase

  • Establishes a predefined, isolated clean state by creating new ephemeral Unity Catalog schemas and volumes required for each test run, ensuring every run is consistent and independent of prior (and parallel) executions. Note that Unity Catalog schema and volume creation privileges are required to execute integration tests.

Trigger Phase

  • Invokes the deployed Databricks workflow via the Workspace client, initiated from within a pytest fixture.
  • Supports parameterization to test multiple pipeline configurations, allowing each test to run in parallel and in isolation on its own Unity Catalog schema.
  • The trigger emulates a production job, validating orchestration logic and Databricks-specific dependencies that local Spark tests cannot cover.

Assertion and Cleanup Phase

  • After workflow completion, assertions validate the resulting state in Unity Catalog tables.
  • Typical validations include:
    • Record counts in Delta or Iceberg tables.
    • Comparing records to stored snapshots.
  • Cleanup logic tears down or deletes the ephemeral resources, such as tables, checkpoints, and volumes.

Each test execution provides consistent validation against real Databricks runtime behavior, from workflow orchestration to data persistence, while preserving the speed and productivity of local workflows.
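
For instance, the validation step might combine a record-count check with a comparison against expected records; a minimal sketch with illustrative table, column, and value names:

from pyspark.testing import assertDataFrameEqual


def validate_output(spark, catalog: str, schema: str):
    actual = spark.read.table(f"{catalog}.{schema}.my_table")

    # Record-count validation
    assert actual.count() > 0

    # Comparison against expected records (these could equally be loaded
    # from a stored snapshot)
    expected = spark.createDataFrame(
        [("yellow", 2.5), ("green", 3.1)],
        "taxi_type string, avg_trip_distance double",
    )
    assertDataFrameEqual(actual, expected)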

Orchestrating Tests with databricks-labs-pytester

The databricks-labs-pytester package is a valuable utility for orchestrating integration tests with Databricks, enhancing pytest by providing native Databricks capabilities for resource management, Spark session handling and test isolation.

Key benefits

  • Prebuilt fixtures handle environment setup, such as the creation of ephemeral Unity Catalog schemas and volumes required for each test.
  • Workspace integration exposes convenient APIs (through the ws fixture) to trigger and monitor Databricks workflows.
  • Lifecycle consistency guarantees clean setup and teardown execution even when tests fail.

Here’s an example integration test showing these concepts in action:

import pytest
from databricks.sdk.service.jobs import RunResultState


@pytest.mark.integration_test
def test_job(spark, make_schema, ws, job_id):
    catalog_name = "main"

    # Ephemeral schema creation
    schema_name = make_schema(catalog_name=catalog_name).name

    # Trigger a Databricks Workflow run
    run_wait = ws.jobs.run_now(
        job_id=job_id,
        python_params=[catalog_name, schema_name]
    )

    # Wait for run completion and validate success
    run_result = run_wait.result()
    result_status = run_result.state.result_state
    assert result_status == RunResultState.SUCCESS

    # Validate result data written to Unity Catalog table
    df = spark.read.table(f"{catalog_name}.{schema_name}.my_table")
    assert df.count() > 0


@pytest.fixture
def job_id(ws, request):
    job_name = request.config.getoption("--job-name")
    job_id = next((job.job_id for job in ws.jobs.list() if job.settings.name == job_name), None)
    if job_id is None:
        raise ValueError(f"Job '{job_name}' not found.")

    return job_id

In this test:

  • make_schema creates an ephemeral Unity Catalog schema required for this specific test.
  • ws gives a handle to the Databricks Workspace client, allowing direct workflow triggering via run_now.
  • Post-execution, data is validated through the spark fixture powered by Databricks Connect, demonstrating full end-to-end validation from orchestration to data verification.

This minimal example demonstrates both Pytester’s fixture-driven simplicity and its tight integration with Lakeflow jobs, enabling readable, fully automated tests optimized for CI/CD.
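
Note that the --job-name option consumed by the job_id fixture is a custom Pytest option; a minimal conftest.py sketch for registering it (and the integration_test marker) might look like this:

# conftest.py
def pytest_addoption(parser):
    # Custom CLI option used by the job_id fixture above
    parser.addoption(
        "--job-name",
        action="store",
        default=None,
        help="Name of the deployed Lakeflow job to test",
    )


def pytest_configure(config):
    # Register the marker so `-m integration_test` selection works cleanly
    config.addinivalue_line(
        "markers", "integration_test: tests that run against a Databricks workspace"
    )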

Leveraging DB Connect

While Pytester manages Databricks resource orchestration and improves the test code structure, Databricks Connect powers the actual Spark execution on any Databricks compute, including serverless compute. Databricks Serverless provides instant, on-demand compute, which is critical for fast test completion within both inner and outer development loops.

Through Databricks Connect, tests:

  • Operate natively with Unity Catalog assets on Databricks.
  • Execute Spark transformations and queries for setup or assertion phases remotely on the Databricks runtime.
  • Maintain parity with production configurations without requiring local Spark clusters.
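
For reference, obtaining the remote Spark session is a one-liner; a minimal sketch that relies on the DATABRICKS_* environment variables shown in the demo section below:

from databricks.connect import DatabricksSession

# Spark session backed by remote Databricks compute (classic or serverless)
spark = DatabricksSession.builder.getOrCreate()

# Everything below executes remotely, with direct access to Unity Catalog
spark.sql("SELECT current_catalog(), current_schema()").show()

When using databricks-labs-pytester, the spark fixture shown in the earlier example already provides this session.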

 

Managing Python Environments with the uv package manager

Unit and integration tests depend on different Spark and Databricks runtime contexts. The uv package manager makes it easy to enforce dependency isolation across these layers through pyproject.toml dependency groups.

Example configuration from pyproject.toml:

[dependency-groups]
dev = [
    "pytest>=8.3.4",
    "databricks-labs-pytester"
]
unit-tests = [
    # These dependencies will break integration tests relying on databricks-connect
    "pyspark>=4.0.0,<5.0.0"
]
integration-tests = [
    "databricks-connect==17.1.0"
]

Usage patterns

Run integration tests:

#bash
uv sync --only-group integration-tests
uv run python -m pytest -rsx -m integration_test --job-name="<job-name-to-test>"

Run unit tests:

#bash
uv sync --only-group unit-tests
uv run python -m pytest -m unit_test

With isolated dependency groups, data teams can confidently test across multiple layers while maintaining a single, consistent repository. Shared dependencies under the dev group reduce duplication and ensure identical setups both locally and within CI/CD executions.

 

Demo

The following section showcases how this setup will run in an actual Databricks environment. The full demo code is available in the Databricks Blogposts GitHub repository.

Setting up the environment

Prerequisites:

  • Install and set up the following dependencies:

#bash
uv sync --only-group integration-tests

  • Set up the following environment variables for Pytester:

#bash
export DATABRICKS_HOST=<your-dev-workspace-url>
export DATABRICKS_CLUSTER_ID=<cluster-id> # Used to run Spark code on Databricks
export DATABRICKS_WAREHOUSE_ID=<warehouse-id> # Used by pytester fixtures to run SQL queries

# Optionally, if you want to use serverless instead of classic compute, replace DATABRICKS_CLUSTER_ID with this:
export DATABRICKS_SERVERLESS_COMPUTE_ID=auto

Bundling and deploying the job

Set up a Databricks Asset Bundle for the main job, which makes it easy to define and deploy the job to different target environments.

Example job definition: 

# ./resources/workflow_test_automation_blueprint.job.yml
resources:
  jobs:
    workflow_test_automation_blueprint_job:
      name: workflow_test_automation_blueprint_job
      tasks:
        - task_key: calculate_avg_trip_distance
          python_wheel_task:
            entry_point: main
            package_name: ps_test_blueprint
            parameters: ["main","default"]
          environment_key: default
      environments:
        - environment_key: default
          spec:
            environment_version: '3'
            dependencies:
              - ../dist/*.whl
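
The python_wheel_task above passes the catalog and schema ("main", "default") as positional parameters to the wheel’s main entry point. A hypothetical sketch of such an entry point is shown below; the actual ps_test_blueprint implementation may differ:

import sys

from databricks.connect import DatabricksSession


def main():
    # Positional parameters supplied by the job definition (or by the integration test)
    catalog, schema = sys.argv[1], sys.argv[2]

    spark = DatabricksSession.builder.getOrCreate()

    # Compute the average trip distance from the Databricks sample dataset
    trips = spark.read.table("samples.nyctaxi.trips")
    result = trips.selectExpr("avg(trip_distance) AS avg_trip_distance")

    # Write into the parameterized schema so tests can redirect the output
    # to an ephemeral, test-specific schema
    result.write.mode("overwrite").saveAsTable(f"{catalog}.{schema}.my_table")


if __name__ == "__main__":
    main()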

Example databricks.yml file

bundle:
  name: workflow_test_automation_blueprint

include:
  - resources/*.yml

artifacts:
  default:
    type: whl
    build: uv build --wheel --package ps_test_blueprint
    path: .

targets:
  dev:
    mode: development
    default: true

Once the YAML files are defined and configured properly, deploy the main job to the Databricks environment where you want to test it. In this example, it is deployed to a dev environment.

databricks bundle deploy -t dev


 

Running the integration tests

Next, execute the integration tests, which will set up the environment through fixtures, trigger the main job deployed in the previous step, and validate the results of the job run. The tests can be run multiple times against the same deployed job.

uv run python -m pytest -rsx -m integration_test --job-name="workflow_test_automation_blueprint_job"

Once the test runs, a new test-specific schema is created to store the output tables and is passed to the job as a parameter, and the main job is triggered.


The dummy_* schema is created by the Pytester fixture specifically for this test.


 

Results and cleanup

The main job ran successfully, and the results are returned to the test for validation, marking the test as a success and producing the pytest report.


The schema and any resources created specifically for the test are automatically deleted after the test run.


 

Conclusion

In this blog, we explored two approaches for integration testing of Lakeflow Jobs and presented a practical blueprint for Approach 2 using Pytest and Databricks Connect. By combining Pytest’s framework, including advanced fixture management through databricks-labs-pytester, with Databricks Connect’s remote execution, data teams gain faster feedback cycles, higher developer productivity, and more reliable data pipelines. This streamlined workflow empowers teams to test the entire lifecycle, from orchestration and external dependencies to data persistence, directly within their IDE or CI environment.