
Data-quality help: Save Data Profile dbutils.data.summarize(df) to table

Kash
Contributor III

Hi there,

We would like to create a data quality database that helps us understand how complete our data is. We would like to run a job each day that produces the same statistics as dbutils.data.summarize(df) for a given table and saves them to a table in Databricks.

Any ideas on how we could do that?

Thanks,

Avkash

1 REPLY

daniel_sahal
Esteemed Contributor

As far as I know, there's no easy way to save the output of dbutils.data.summarize() into a DataFrame.

You can still write your own Python/PySpark code to profile the data and save the output to a table.
