How do we get logs on read queries from delta lake in Databricks?
06-24-2021 08:12 AM
I've tried:
df.write.mode("overwrite").format("com.databricks.spark.csv").option("header","true").csv(dstPath)
and
df.write.format("csv").mode("overwrite").save(dstPath)
but now I have 10 CSV files. I need a single file with a name I choose.
06-24-2021 10:53 AM
The title of your post asks something different from the body. I am assuming you are asking how to get a single CSV file when writing.
To do so, use coalesce(1), which collapses the DataFrame into a single partition before the write:
df.coalesce(1).write.format("csv").mode("overwrite").save(dstPath)
This saves a single CSV part file inside the directory you provide. If you want a specific file name, you will need to rename that file afterwards. You could use dbutils.fs.cp() to copy the file to a new name, or use Python's "os" library to rename it.
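For the renaming step, here is a minimal sketch using only the Python standard library. The helper name promote_single_csv is hypothetical, and the sketch assumes the output directory is reachable through ordinary local file APIs and that Spark named its output with the usual part-*.csv pattern.

```python
import glob
import os
import shutil


def promote_single_csv(src_dir: str, dst_file: str) -> str:
    """Move the lone part-*.csv file in src_dir to dst_file and return dst_file.

    Hypothetical helper: assumes src_dir is a path visible to local file APIs
    and that the write produced exactly one part file (e.g. via coalesce(1)).
    """
    part_files = glob.glob(os.path.join(src_dir, "part-*.csv"))
    if len(part_files) != 1:
        raise ValueError(
            f"expected exactly one part file in {src_dir}, found {len(part_files)}"
        )

    # Create the destination folder if it does not exist yet.
    dst_parent = os.path.dirname(dst_file)
    if dst_parent:
        os.makedirs(dst_parent, exist_ok=True)

    # shutil.move also works when source and destination are on different filesystems.
    shutil.move(part_files[0], dst_file)
    return dst_file
```

On a Databricks cluster where DBFS is exposed under /dbfs, a call might look like promote_single_csv("/dbfs/tmp/out_dir", "/dbfs/tmp/report.csv"); both paths are placeholders. The helper raises an error rather than guessing when it finds zero or several part files, which catches a forgotten coalesce(1).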