What kinds of data quality rules can be run using Unity Catalog?
02-28-2023 04:23 PM
We are trying to build a data quality process that covers file-level or ingestion-level checks for the bronze layer, more specific business rules for the silver layer, and business-related aggregate checks for the gold layer.
02-28-2023 08:52 PM
If I understood your question correctly, you are looking for a list of data quality checks that can be used with Unity Catalog.
Here are a few DQ checks you can use, depending on your project requirements:
- Duplicate check
- Null value check
- Format check
- Data check
- Unique values check
- Composite column unique value check
- Column type check
- Column max length check
- Column min length check
- Column min/max value check
- Column length within a range
- Column values present in a list of allowed values
- Column values not present in a list of disallowed values
- Column value matches a regex
- Column is numeric
- Column is alphanumeric
- Column presence check
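To make a few of the checks above concrete, here is a minimal sketch in plain Python. It is not a Unity Catalog API: the function names, the sample rows, and the thresholds are all hypothetical, and the rows are a list of dicts so the example runs without Spark. In Databricks the same logic would more likely be written as DataFrame operations or SQL.

```python
import re
from collections import Counter

# Hypothetical sample rows standing in for a bronze-layer table.
rows = [
    {"id": 1, "email": "a@example.com", "age": 34, "status": "active"},
    {"id": 2, "email": None, "age": 151, "status": "active"},
    {"id": 2, "email": "not-an-email", "age": 28, "status": "unknown"},
]

def column_present(rows, col):
    """Presence check: True if every row carries the column."""
    return all(col in r for r in rows)

def null_check(rows, col):
    """Null value check: indices of rows where the column is None or missing."""
    return [i for i, r in enumerate(rows) if r.get(col) is None]

def duplicate_check(rows, cols):
    """Unique / composite unique check: key tuples seen more than once."""
    counts = Counter(tuple(r.get(c) for c in cols) for r in rows)
    return [key for key, n in counts.items() if n > 1]

def regex_check(rows, col, pattern):
    """Regex check: indices of non-null values that do not fully match."""
    rx = re.compile(pattern)
    return [i for i, r in enumerate(rows)
            if r.get(col) is not None and not rx.fullmatch(str(r[col]))]

def range_check(rows, col, lo, hi):
    """Min/max value check: indices of non-null values outside [lo, hi]."""
    return [i for i, r in enumerate(rows)
            if r.get(col) is not None and not (lo <= r[col] <= hi)]

def allowed_values_check(rows, col, allowed):
    """Allowed-list check: indices of non-null values not in the list."""
    return [i for i, r in enumerate(rows)
            if r.get(col) is not None and r[col] not in allowed]

# Each entry holds the offending row indices (or duplicate keys) for one rule.
report = {
    "email_null": null_check(rows, "email"),
    "id_duplicates": duplicate_check(rows, ["id"]),
    "email_format": regex_check(rows, "email", r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "age_range": range_check(rows, "age", 0, 120),
    "status_allowed": allowed_values_check(rows, "status", {"active", "inactive"}),
}
print(report)
```

Each function returns the offending row indices or keys rather than a pass/fail flag, so the caller can decide whether to fail the load, quarantine the rows, or only log them.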
03-01-2023 11:33 AM
Hi, thanks for the reply. But are these the default quality checks that are available in Unity Catalog? If there are more complex business-level rules, do we need to use other tools, or can we still create those rules using Unity Catalog?
03-07-2024 04:13 AM
I am exploring exactly the same thing. I found that AWS Glue offers something similar, where it writes its own data quality rules. I tried creating the same with Databricks Assistant, but the result was not even close.
03-20-2023 10:05 PM
Hi @arun laksh
Hope all is well! Just checking in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.
We'd love to hear from you.
Thanks!
07-25-2024 08:45 AM
Hi @laksh!
You could take a look at Rudol Data Quality. It has native Databricks integration and covers both basic and advanced data quality checks. Basic checks can be configured by non-technical roles using a no-code interface, and there is also the option to configure complex validations with SQL and anomaly detection.
Have a high-quality week!

