Data Engineering

Help with Integration Testing for SQL Notebooks in Databricks

jorperort
New Contributor III

Hi everyone,

I’m looking for the best way to implement integration tests for SQL notebooks in an environment that uses Unity Catalog and workflows to execute these notebooks.

For unit tests on SQL functions, I’ve reviewed the Databricks documentation on testing notebooks (https://docs.databricks.com/en/notebooks/testing.html#language-SQL), and it seems to provide an appropriate approach. However, I’m unsure how to tackle integration testing for complete SQL notebooks.

My goal is to run both unit and integration tests in a CI/CD pipeline using Databricks Asset Bundles. If I were working with Python notebooks, I’d use pytest for unit tests and simulate data in dedicated notebooks for integration tests.

Has anyone faced this kind of challenge with SQL notebooks? How have you approached it? Any advice, experience, or best practices you could share would be greatly appreciated.

Thank you in advance for your help!

Accepted Solution

VZLA
Databricks Employee

Hello, thank you for your question about integration testing for SQL notebooks in Databricks. Here’s a very concise approach:

For integration testing, simulate the data environment by creating temporary tables or views in Unity Catalog. This allows your SQL notebooks to process controlled test data without impacting production. Use workflows to orchestrate the process, including steps for setting up test data, running the SQL notebook, and validating results. Parameterize the workflows to easily switch between test and production configurations.
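
For example, a data-setup step for an integration test might be a small SQL notebook along these lines. This is just a minimal sketch; the catalog, schema, and table names (test_catalog, integration_tests, orders_input) are placeholders, not anything specific to your environment:

```sql
-- Minimal sketch of a test-setup notebook (placeholder names throughout).
-- Create an isolated schema in Unity Catalog for integration-test data.
CREATE SCHEMA IF NOT EXISTS test_catalog.integration_tests;

-- Controlled input data that the SQL notebook under test will read.
CREATE OR REPLACE TABLE test_catalog.integration_tests.orders_input (
  order_id   BIGINT,
  amount     DECIMAL(10, 2),
  order_date DATE
);

INSERT INTO test_catalog.integration_tests.orders_input VALUES
  (1, 10.50, DATE '2024-01-01'),
  (2, 99.99, DATE '2024-01-02');
```

The notebook under test then reads from this schema instead of production, for example by receiving the catalog and schema names as job parameters.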

To incorporate this into a CI/CD pipeline, deploy notebooks to a test workspace using Databricks Asset Bundles, trigger workflows via the Databricks CLI or REST API, and validate outputs by querying test tables. Finally, ensure your validation logic uses assertions to compare expected and actual results, and surface these as pass/fail indicators in the pipeline.
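
For the validation step, here is a sketch of what the assertions could look like in SQL (again with placeholder table names, including a hypothetical orders_expected table holding the expected output). assert_true raises an error when the condition is false, so a failed check fails the notebook task and therefore the workflow run, which your pipeline can pick up as the pass/fail signal after triggering the job:

```sql
-- Minimal sketch of a validation notebook (placeholder names throughout).
-- Fails the task if the row count is not what we expect.
SELECT assert_true(count(*) = 2, 'Expected exactly 2 rows in orders_output')
FROM test_catalog.integration_tests.orders_output;

-- Fails the task if the actual output differs from the expected output table.
SELECT assert_true(count(*) = 0, 'orders_output does not match orders_expected')
FROM (
  SELECT * FROM test_catalog.integration_tests.orders_output
  EXCEPT
  SELECT * FROM test_catalog.integration_tests.orders_expected
);
```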

Hope this helps as a starting point!


