How do we get logs on read queries from delta lake in Databricks?

User16790091296
Contributor II

I've tried:

df.write.mode("overwrite").format("com.databricks.spark.csv").option("header","true").csv(dstPath)

and

df.write.format("csv").mode("overwrite").save(dstPath)

but both give me 10 CSV files. I need a single file, and I need to give it a specific name.

1 REPLY

Ryan_Chynoweth
Honored Contributor III

The question in your title seems different from the one in the body. I am assuming you are asking how to get only a single CSV file when writing?

To do so, use coalesce(1) before writing:

df.coalesce(1).write.format("csv").mode("overwrite").save(dstPath)

This will save a single CSV file underneath the directory that you provide. Spark still creates a directory at dstPath, and the file inside it gets a generated part-* name, so if you want a specific file name you will need to rename it afterwards. You could use dbutils.fs.cp() to copy the file to a new name, or you can use the Python "os" library to rename it.
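As a rough sketch of the rename step with the Python standard library: the helper below looks for the single part-*.csv file in the output directory and moves it to the name you want. The function name promote_single_csv and the demo paths are made up for illustration, and it assumes the output directory is reachable as a local path (on Databricks that would typically mean going through the /dbfs mount, which is an assumption about your workspace setup). The demo fakes a Spark output directory so it can run anywhere.

```python
import glob
import os
import shutil
import tempfile


def promote_single_csv(spark_output_dir: str, target_path: str) -> str:
    """Move the lone part-*.csv file out of a Spark output directory.

    Hypothetical helper, not part of Spark or dbutils.
    """
    # Spark names its output files part-<index>-<id>.csv
    part_files = glob.glob(os.path.join(spark_output_dir, "part-*.csv"))
    if len(part_files) != 1:
        # More than one file usually means coalesce(1) was not applied
        raise ValueError(
            f"expected exactly one part file, found {len(part_files)}"
        )
    # Make sure the destination directory exists before moving
    os.makedirs(os.path.dirname(target_path) or ".", exist_ok=True)
    shutil.move(part_files[0], target_path)
    return target_path


# Local demonstration with a fake Spark output directory
demo_root = tempfile.mkdtemp()
demo_out = os.path.join(demo_root, "out")
os.makedirs(demo_out)
with open(os.path.join(demo_out, "part-00000-abc.csv"), "w") as f:
    f.write("id,name\n1,a\n")
# Spark also leaves marker files such as _SUCCESS next to the data
open(os.path.join(demo_out, "_SUCCESS"), "w").close()

result_path = promote_single_csv(demo_out, os.path.join(demo_root, "report.csv"))
```

The leftover directory (with _SUCCESS and similar marker files) can be deleted once the file has been moved. If you would rather stay inside Databricks utilities, the dbutils.fs.cp() route mentioned above does the same job with a copy instead of a move.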
