Data Engineering

What kind of data quality rules can be run using Unity Catalog?

laksh
New Contributor II

We are trying to build a data quality process: initial file-level / data-ingestion-level checks for the bronze layer, more specific business rules for the silver layer, and business-related aggregate checks for the gold layer.
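For illustration, a minimal sketch of what such layer-level checks could look like with Delta Live Tables expectations on Databricks (all table names, paths, and column names below are hypothetical placeholders, not our actual pipeline):

    # Minimal sketch of layer-level quality checks with Delta Live Tables expectations.
    # Table names, paths, and columns are hypothetical placeholders.
    import dlt
    from pyspark.sql import functions as F

    @dlt.table(comment="Bronze: raw ingestion with basic record-level checks")
    @dlt.expect("non_null_id", "order_id IS NOT NULL")          # warn only: keep rows, record a metric
    def bronze_orders():
        return (
            spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/Volumes/main/raw/orders/")                  # hypothetical landing path
        )

    @dlt.table(comment="Silver: business rules on cleaned data")
    @dlt.expect_or_drop("valid_amount", "amount > 0")           # drop violating rows
    @dlt.expect_or_drop("valid_status", "status IN ('NEW', 'PAID', 'CANCELLED')")
    def silver_orders():
        return dlt.read_stream("bronze_orders")

    @dlt.table(comment="Gold: aggregate-level sanity checks")
    @dlt.expect_or_fail("non_negative_total", "total_amount >= 0")  # fail the update on violation
    def gold_daily_orders():
        return (
            dlt.read("silver_orders")
            .groupBy(F.to_date("order_ts").alias("order_date"))
            .agg(F.count("*").alias("order_count"), F.sum("amount").alias("total_amount"))
        )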

5 REPLIES

mk1987c
New Contributor III

If I understood your question correctly, I think you are looking for a list of data quality checks which can be used with Unity Catalog.

Please find below a few DQ checks which you can use, based on your project requirements (a couple of them are sketched in PySpark after the list):

  • Duplicate check
  • Null value check
  • Format check
  • Data check
  • Unique values check
  • Composite column unique value check
  • Column type check
  • Column max length check
  • Column min length check
  • Column min/max value check
  • Column length within a range
  • Column values present in a list of specific values
  • Column values not present in a list
  • Column value matches a regex
  • Column is numeric
  • Column is alphanumeric
  • Verify presence of a column

laksh
New Contributor II

Hi, thanks for the reply. But are these the default quality checks that are available from Unity Catalog? If there are business-level rules that are more complex, do we need to use other tools, or can we still create more complex rules using Unity Catalog?
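
(For reference, simple rules can be enforced natively on a Delta table in Unity Catalog with SQL constraints; a minimal sketch with hypothetical table, column, and constraint names:)

    # Minimal sketch: enforcing simple rules as Delta table constraints.
    # Table, column, and constraint names are hypothetical.
    spark.sql("ALTER TABLE main.silver.orders ALTER COLUMN order_id SET NOT NULL")
    spark.sql("""
        ALTER TABLE main.silver.orders
        ADD CONSTRAINT positive_amount CHECK (amount > 0)
    """)
    # Any write that violates the constraint is rejected with an error.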

yatharth
New Contributor III

I am exploring exactly what you need, and I found that AWS Glue provides a similar capability, where it writes its own data quality rules. I tried creating the same with Databricks Assistant, but the result was not even close.

Anonymous
Not applicable

Hi @arun laksh

Hope all is well! Just wanted to check in to see if you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!

joarobles
New Contributor III

Hi @laksh!

You could take a look at Rudol Data Quality; it has native Databricks integration and covers both basic and advanced data quality checks. Basic checks can be configured by non-technical roles using a no-code interface, but there is also the option to configure complex validations with SQL and anomaly detection.

Have a high-quality week! 
