Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Write empty dataframe into csv

_not_provid1755
New Contributor

I'm writing my output data frame (entity) to a CSV file. The statement below works well when the data frame is non-empty.

entity.repartition(1).write.mode(SaveMode.Overwrite).format("csv").option("header", "true").save(tempLocation)

It does not work when the data frame is empty. An empty file is created, but I expect at least the header row to be written so that my Tabular model won't fail with an "Invalid column" error.

Has anyone else experienced this issue?

Thanks!

3 REPLIES

mathan_pillai
Databricks Employee

Hi,

Thanks for reaching out on the Databricks forum.

This is a bug in open-source (OSS) Spark, and it is being fixed in Spark 3.

Here is the JIRA ticket for the issue:

https://issues.apache.org/jira/browse/SPARK-26208

Here is the pull request for the fix, which is due to be merged:

https://github.com/apache/spark/pull/23173

Porting the fix to the Databricks runtime versions is in the pipeline.

Please let us know whether this answers your question or if you have any follow-up questions.

Thanks

Sandeep
Contributor III

Since Spark 2.4, writing a dataframe with an empty or nested empty schema using any file format (parquet, orc, json, text, csv, etc.) is not allowed. An exception is thrown when attempting to write a dataframe with an empty schema.

Please find more details here: https://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html#upgrading-from-spark-sql-23-to...

mrnov
New Contributor II

Same problem here: similar code and the same behavior with Spark 2.4.0, running with spark-submit on both Windows and Linux.

dataset.coalesce(1)
        .write()
        .option("charset", "UTF-8")
        .option("header", "true")
        .mode(SaveMode.Overwrite)
        .csv(outputDirPath);
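
Until the fix is available in your runtime, one possible workaround is to write the header row yourself whenever the dataset has no rows. Below is a minimal sketch of that idea in plain Java. The class HeaderOnlyCsv and its methods are hypothetical helpers, not part of Spark. The sketch assumes the output directory is on a local file system (for HDFS or DBFS you would need the matching file system API instead) and that the column names come from dataset.columns(). The quoting rule is a simple CSV convention and may not match Spark's writer options exactly.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Collectors;

/** Hypothetical helper (not part of Spark): writes a CSV file that contains only a header row. */
final class HeaderOnlyCsv {

    /** Quotes a column name if it contains a comma, a double quote, or a line break. */
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    /** Joins the column names into a single comma-separated header line. */
    static String headerLine(String[] columns) {
        return Arrays.stream(columns)
                .map(HeaderOnlyCsv::escape)
                .collect(Collectors.joining(","));
    }

    /** Creates outputDir if needed and writes fileName with just the header line. */
    static Path write(Path outputDir, String fileName, String[] columns) throws IOException {
        Files.createDirectories(outputDir);
        Path target = outputDir.resolve(fileName);
        byte[] bytes = (headerLine(columns) + "\n").getBytes(StandardCharsets.UTF_8);
        Files.write(target, bytes); // creates the file or truncates an existing one
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in a Spark job these names would come from dataset.columns(),
        // and this code would run only when the dataset has no rows.
        String[] columns = {"id", "name", "amount"};
        Path dir = Files.createTempDirectory("empty-csv-demo");
        Path file = write(dir, "part-00000.csv", columns);
        System.out.print(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```

In the job itself, the idea would be to check whether the dataset is empty (for example with dataset.isEmpty(), available since Spark 2.4), call the helper in that case, and otherwise use the normal .csv(outputDirPath) write shown above.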
