Data-quality help: Save Data Profile dbutils.data.summarize(df) to table
12-20-2022 12:05 PM
Hi there,
We would like to create a data quality database that helps us understand how complete our data is. We would like to run a job each day that outputs the same table data as dbutils.data.summarize(df) for a given table and saves it to a table in Databricks.
Any ideas on how we could do that?
Thanks,
Avkash
1 REPLY 1
12-22-2022 05:17 AM
As far as I know, there's no easy way to capture the output of dbutils.data.summarize() as a DataFrame.
You can still write your own Python/PySpark code to profile your data and save the output.
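As a rough illustration of that custom approach, here is a minimal sketch that computes per-column completeness and appends it to a table. It is not a reproduction of what dbutils.data.summarize() shows. It covers only row counts, null counts, and the non-null fraction. The function names, the profile schema, and the table names in the usage note are all hypothetical, and it assumes `spark` is an active SparkSession.

```python
from datetime import date


def quote(name):
    """Backtick-quote a column name for use in a Spark SQL expression."""
    return "`" + name.replace("`", "``") + "`"


def completeness_exprs(columns):
    """Build SQL expressions: total row count plus one non-null count per column."""
    exprs = ["count(1) AS row_count"]
    # count(col) skips NULLs, so it yields the non-null count for that column.
    exprs += [f"count({quote(c)}) AS nn_{i}" for i, c in enumerate(columns)]
    return exprs


def to_profile_rows(table_name, columns, agg, run_date):
    """Turn the single aggregated row into one record per source column."""
    total = agg["row_count"]
    rows = []
    for i, c in enumerate(columns):
        non_null = agg[f"nn_{i}"]
        # Fraction of non-null values; undefined for an empty table.
        completeness = non_null / total if total else None
        rows.append(
            (run_date, table_name, c, total, non_null, total - non_null, completeness)
        )
    return rows


# Hypothetical layout for the data-quality table (one row per column per run).
PROFILE_SCHEMA = (
    "profile_date date, table_name string, column_name string, "
    "row_count long, non_null_count long, null_count long, completeness double"
)


def profile_table(spark, source_table, target_table, run_date=None):
    """Profile source_table and append the result to target_table."""
    df = spark.table(source_table)
    # One pass over the data: all counts are computed in a single aggregation.
    agg = df.selectExpr(*completeness_exprs(df.columns)).first()
    rows = to_profile_rows(source_table, df.columns, agg, run_date or date.today())
    profile_df = spark.createDataFrame(rows, schema=PROFILE_SCHEMA)
    profile_df.write.mode("append").saveAsTable(target_table)
    return profile_df
```

In a notebook scheduled as a daily job, this could be called as `profile_table(spark, "sales.orders", "dq.column_profile")`, where both table names are placeholders. If you also want numeric statistics such as mean, min, max, and quartiles, the built-in `df.summary()` returns a regular DataFrame that can be written to a table the same way.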

