Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Exploring Data Quality Frameworks in Databricks

jommo
New Contributor

I’m currently investigating solutions for Data Quality (DQ) within the Databricks environment and would love to hear what frameworks or approaches you are using for this purpose.

In the past, I’ve worked with Deequ, but I’ve noticed that it’s not as widely used anymore, and I’ve heard great expectations around other solutions. I’m curious to learn about your experiences:

  1. What frameworks or tools are you using for Data Quality in Databricks today?
  2. How do you approach DQ monitoring, validation, and automation in your pipelines?
  3. Are there any specific challenges or best practices you'd like to share?

Any insights or recommendations would be greatly appreciated. Looking forward to hearing your thoughts!

1 REPLY

SparkJun
Databricks Employee

Delta Live Tables (DLT) expectations (reference: https://docs.databricks.com/en/delta-live-tables/expectations.html)

  • Expectations: DLT lets you define data quality constraints on datasets using expectations. You attach them to queries with Python decorators or SQL constraint clauses. When a record violates an expectation, you can keep it and log a warning, drop it, or fail the update; invalid records can also be routed to a quarantine table.
  • Advanced Validation: You can perform complex data quality checks by defining materialized views using aggregate and join queries.
  • Portability and Reusability: Data quality rules can be maintained separately from the pipeline code, for example stored in a Delta table, grouped by tag, and looked up by tag when a dataset is defined.
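
To make the points above more concrete, here is a minimal sketch of tag-based rules feeding DLT expectations. It is only an illustration under some assumptions: the rule names, column names, and the `raw_orders` / `clean_orders` tables are hypothetical, and the in-memory `RULES` list stands in for a Delta table of rules. The `dlt` module is only available inside a Databricks pipeline, so the DLT part is wrapped in a function that is defined but not called here.

```python
# Hypothetical rules. In a real pipeline these rows would be read from a
# Delta table with "name", "constraint", and "tag" columns.
RULES = [
    {"name": "valid_order_id", "constraint": "order_id IS NOT NULL", "tag": "validity"},
    {"name": "positive_amount", "constraint": "amount > 0", "tag": "validity"},
    {"name": "recent_order", "constraint": "order_date >= '2020-01-01'", "tag": "freshness"},
]


def get_rules(tag, rules=RULES):
    """Return {rule name: SQL constraint} for every rule carrying the given tag."""
    return {r["name"]: r["constraint"] for r in rules if r["tag"] == tag}


def quarantine_expression(rules):
    """Build a SQL boolean expression that is true when a row breaks at least one rule."""
    if not rules:
        return "false"  # no rules, so nothing is flagged for quarantine
    return "NOT(" + " AND ".join(f"({c})" for c in rules.values()) + ")"


def define_pipeline():
    """Sketch of the DLT wiring; call this only inside a Databricks pipeline."""
    import dlt  # assumption: running in a DLT pipeline where this module exists

    @dlt.table(name="clean_orders")                 # hypothetical target table
    @dlt.expect_all_or_drop(get_rules("validity"))  # drop rows failing any rule
    def clean_orders():
        return dlt.read("raw_orders")               # hypothetical source dataset


if __name__ == "__main__":
    validity = get_rules("validity")
    print(validity)
    print(quarantine_expression(validity))
```

Swapping `dlt.expect_all_or_drop` for `dlt.expect_all` would keep invalid rows and only record the violations, while `dlt.expect_all_or_fail` would stop the update. The expression from `quarantine_expression` could be used to flag rows for a separate quarantine table instead of dropping them.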
