Re: Simply writing a dataframe to a CSV file (non-...

chris0706 · ‎10-04-2024

I know this post is a little old, but ChatGPT put together a clean, straightforward solution for me (in Scala):

// Set the temporary output directory and the desired final file path
val tempDir = "/tmp/your_file_name"
val finalOutputPath = "/tmp/your_file_name.csv"

// Get a DataFrame that contains the relevant CSV file data
val df = spark.table("your_table_name")

// Write DataFrame to a single partition in the temporary directory
df.coalesce(1)
  .write
  .mode("overwrite")
  .option("header", "true")
  .csv(tempDir)

// List the files in the temporary directory to find the CSV file
val csvFile = dbutils.fs.ls(tempDir).filter(file => file.name.endsWith(".csv"))(0).path

// Move and rename the CSV file to the desired location
dbutils.fs.mv(csvFile, finalOutputPath)

// Remove the temporary directory
dbutils.fs.rm(tempDir, true)
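If you need this in more than one place, the same steps can be wrapped in a small helper. This is only a rough sketch, assuming a Databricks Scala notebook where `dbutils` is already defined. The names `findCsvFile` and `writeSingleCsv` are made up for illustration and are not part of Spark or Databricks. The one behavioral difference from the snippet above is that it raises a descriptive error when no `.csv` file turns up in the temporary directory, instead of failing on the `(0)` index.

```scala
import org.apache.spark.sql.DataFrame

// Hypothetical helper: return the first path ending in ".csv", if there is one.
def findCsvFile(paths: Seq[String]): Option[String] =
  paths.find(_.endsWith(".csv"))

// Hypothetical helper: write df as one CSV file at finalOutputPath.
// Assumes `dbutils` is in scope, as it is in a Databricks notebook.
def writeSingleCsv(df: DataFrame, tempDir: String, finalOutputPath: String): Unit = {
  // Write a single partition so Spark produces exactly one part file
  df.coalesce(1)
    .write
    .mode("overwrite")
    .option("header", "true")
    .csv(tempDir)

  // Locate the part file among the marker files Spark leaves in the directory
  val csvFile = findCsvFile(dbutils.fs.ls(tempDir).map(_.path)).getOrElse(
    throw new IllegalStateException(s"No .csv file found in $tempDir")
  )

  // Move and rename the file, then remove the temporary directory
  dbutils.fs.mv(csvFile, finalOutputPath)
  dbutils.fs.rm(tempDir, true)
}

// Example call, using the same placeholder names as above:
// writeSingleCsv(spark.table("your_table_name"), "/tmp/your_file_name", "/tmp/your_file_name.csv")
```

Keep in mind that `coalesce(1)` funnels all the data through a single task, so this is best suited to outputs that comfortably fit on one worker.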