How do we get logs on read queries from delta lake in Databricks?

User16790091296 — Thu, 24 Jun 2021 15:12:47 GMT

I've tried:

df.write.mode("overwrite").format("com.databricks.spark.csv").option("header","true").csv(dstPath)

and

df.write.format("csv").mode("overwrite").save(dstPath)

but now I get 10 CSV files. I need a single file with a specific name.

Re: How do we get logs on read queries from delta lake in Databricks?

Ryan_Chynoweth — Thu, 24 Jun 2021 17:53:04 GMT

The question in the title seems different from the one in the body. I am assuming you are asking how to get a single CSV file when writing.

To do so, use coalesce(1) before writing:

df.coalesce(1).write.format("csv").mode("overwrite").save(dstPath)

This saves a single CSV part file (named part-*.csv, alongside marker files such as _SUCCESS) under the directory you provide. If you want the file to have a specific name, you will need to rename it. You can use dbutils.fs.cp() to copy the file under a new name, or use the Python "os" library to rename it.
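As a rough sketch of the rename step using only the Python standard library: the helper below looks for the single part file in the output directory and moves it to the name you want. The function name promote_single_csv and the example paths are hypothetical, and it assumes the directory is reachable through a local file path (on Databricks that usually means the /dbfs/ prefix for DBFS locations).

```python
import glob
import os
import shutil


def promote_single_csv(output_dir: str, target_path: str) -> str:
    """Move the one part-*.csv file found in output_dir to target_path.

    Hypothetical helper: assumes the write used coalesce(1), so exactly
    one part file exists, and that output_dir is a local-style path.
    """
    part_files = glob.glob(os.path.join(output_dir, "part-*.csv"))
    if len(part_files) != 1:
        raise ValueError(
            f"expected exactly one part file in {output_dir}, "
            f"found {len(part_files)}"
        )
    # shutil.move renames in place when possible, otherwise copies then deletes
    shutil.move(part_files[0], target_path)
    return target_path
```

For example, after the coalesce(1) write you could call promote_single_csv("/dbfs/tmp/out", "/dbfs/tmp/report.csv"), where both paths are placeholders. The marker files such as _SUCCESS stay behind in the output directory, so you may want to delete that directory afterwards.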
