Databricks regression test suite
05-09-2023 04:10 AM
Hi, I am new to Databricks and am setting up a non-prod environment. Is there a way to run a regression suite so that the existing setup does not break when a feature is added? Also, how can I make prod data available in non-prod? Would a shallow copy work?

05-13-2023 08:38 AM
@deepak prasad :
Yes, you can run regression tests to ensure that your changes do not break existing functionality. Databricks works with a number of testing frameworks, such as pytest, which can be used to automate regression testing. You can write test cases that cover the different scenarios and use cases of your application and run them automatically after each code change.
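As a rough illustration of the kind of test case described above, here is a minimal pytest-style sketch. The function `add_order_total` and the sample rows are hypothetical and stand in for your own transformation logic; with Spark you would compare DataFrames rather than lists of dicts.

```python
# Hypothetical transformation under test. In a real project this would live
# in your own module and might operate on a Spark DataFrame instead.
def add_order_total(rows):
    """Return new rows with a 'total' field equal to quantity * unit_price."""
    return [{**row, "total": row["quantity"] * row["unit_price"]} for row in rows]


# Small fixed input and the output recorded from a known-good version.
SAMPLE_INPUT = [
    {"order_id": 1, "quantity": 2, "unit_price": 5},
    {"order_id": 2, "quantity": 1, "unit_price": 20},
]
EXPECTED_OUTPUT = [
    {"order_id": 1, "quantity": 2, "unit_price": 5, "total": 10},
    {"order_id": 2, "quantity": 1, "unit_price": 20, "total": 20},
]


def test_add_order_total_matches_known_good_output():
    # Fails if a later code change alters the result for the recorded input.
    assert add_order_total(SAMPLE_INPUT) == EXPECTED_OUTPUT


def test_add_order_total_does_not_mutate_input():
    # Guards against a change that starts modifying the caller's rows.
    snapshot = [dict(row) for row in SAMPLE_INPUT]
    add_order_total(SAMPLE_INPUT)
    assert SAMPLE_INPUT == snapshot
```

pytest collects any function whose name starts with `test_`, so saving this in a file such as `test_orders.py` (name is just an example) and running `pytest` would execute both checks.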
To make production data available in non-production, you can use a number of techniques such as database replication, backup and restore, or data cloning. One approach could be to take regular backups of your production databases and restore them in non-production environments. You can also use data masking and obfuscation techniques to protect sensitive data in non-production environments. Another approach is to use a data virtualization platform that can create a virtualized copy of the production data on demand, without actually copying the data. This can help reduce the storage requirements in non-production environments.
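To make the masking idea a bit more concrete, here is a small sketch in plain Python. The column names, the salt, and the helper names `mask_value` and `mask_rows` are all assumptions for illustration; in practice you would apply equivalent logic to the tables you copy, and choose a masking scheme that meets your own compliance requirements.

```python
import hashlib

# Hypothetical set of columns treated as sensitive.
SENSITIVE_COLUMNS = {"email", "phone"}


def mask_value(value, salt="non-prod-salt"):
    """Replace a value with a short, deterministic SHA-256 based token."""
    digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
    return digest[:16]


def mask_rows(rows, sensitive=SENSITIVE_COLUMNS):
    """Return copies of the rows with sensitive, non-null fields masked."""
    return [
        {
            key: (mask_value(val) if key in sensitive and val is not None else val)
            for key, val in row.items()
        }
        for row in rows
    ]
```

Because the token is deterministic, the same input always maps to the same masked value, so joins on a masked column still line up across tables in non-prod.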
05-14-2023 09:58 PM
Hi @Suteja Kanuri ,
I can create the test cases using any framework, such as pytest or Great Expectations, but how do I run the regression suite after a code change? Is there a blog or documentation on non-prod setup or on running regressions? Can you please share some references?

06-09-2023 02:38 AM
@deepak prasad :
Here you go
- https://docs.greatexpectations.io/docs/
- You can search here - https://www.databricks.com/blog
01-08-2025 09:26 AM
Regression testing after code changes can be automated easily. Once you've created test cases with pytest or Great Expectations, you can set up a CI/CD pipeline using tools like Jenkins or GitHub Actions that runs them on every change. For a non-prod setup, Docker is great for replicating the environment consistently.
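One way a CI job could trigger the suite is through a small Python entry point like the sketch below. The directory name `tests`, the report file name, and the function names are assumptions; `-q` and `--junitxml` are standard pytest options, and pytest must be installed in the job's environment.

```python
import subprocess
import sys


def build_pytest_command(test_dir="tests", junit_xml="regression-results.xml"):
    """Build the command line a CI step would run for the regression suite."""
    return [sys.executable, "-m", "pytest", test_dir, "-q", f"--junitxml={junit_xml}"]


def run_regression(test_dir="tests"):
    """Run pytest in a subprocess and return its exit code (0 means all passed)."""
    completed = subprocess.run(build_pytest_command(test_dir))
    return completed.returncode
```

A pipeline step in Jenkins or GitHub Actions could call `sys.exit(run_regression())` from a script, so that a failing test fails the build and blocks the change.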
If you're looking for more details, this blog might help: Regression Testing and Stat Studio. It explains tools and processes for smoother regression testing.
Hope this helps! Let me know if you have any specific questions.

