Write empty dataframe into csv
03-18-2019 04:42 PM
I'm writing my output (entity) data frame to a CSV file. The statement below works well when the data frame is non-empty.
entity.repartition(1).write.mode(SaveMode.Overwrite).format("csv").option("header", "true").save(tempLocation)
It does not work when the data frame is empty: an empty file is created, whereas I expect at least the header row to be written so that my Tabular model won't fail with an "Invalid column" error.
Has anyone experienced this issue?
Thanks!
Labels: Scala spark
03-22-2019 12:16 PM
Hi,
Thanks for reaching out to the Databricks forum.
This is a bug in open-source (OSS) Spark, and it is being fixed in Spark 3.
Here is the Jira ticket for the issue:
https://issues.apache.org/jira/browse/SPARK-26208
Here is the pull request for the fix, which will be merged:
https://github.com/apache/spark/pull/23173
Porting the fix to the Databricks runtime versions is in the pipeline.
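Until the fix is available in your runtime, one possible workaround is to check whether the data frame has any rows and, if it has none, write the header line yourself. The following is only a sketch under assumptions, not an official fix: `CsvWithHeader`, `csvHeaderLine`, `writeCsvWithHeader`, and the file name `part-00000-header-only.csv` are hypothetical names chosen for illustration, and the header-only branch assumes the default comma delimiter and UTF-8 output.

```scala
import java.nio.charset.StandardCharsets

import org.apache.hadoop.fs.Path
import org.apache.spark.sql.{DataFrame, SaveMode}

object CsvWithHeader {

  // Build a CSV header line; a name is quoted only if it contains
  // a comma, a double quote, or a line break (basic CSV quoting).
  def csvHeaderLine(columns: Seq[String]): String =
    columns.map { name =>
      if (name.exists(c => c == ',' || c == '"' || c == '\n' || c == '\r'))
        "\"" + name.replace("\"", "\"\"") + "\""
      else name
    }.mkString(",")

  // Hypothetical helper: write df as CSV, falling back to a
  // header-only file when df has no rows.
  def writeCsvWithHeader(df: DataFrame, outputDir: String): Unit = {
    if (df.head(1).nonEmpty) {
      // Non-empty case: the same statement as in the question.
      df.repartition(1)
        .write
        .mode(SaveMode.Overwrite)
        .format("csv")
        .option("header", "true")
        .save(outputDir)
    } else {
      // Empty case: write the header line through the Hadoop FileSystem API.
      val dir = new Path(outputDir)
      val fs = dir.getFileSystem(df.sparkSession.sparkContext.hadoopConfiguration)
      fs.delete(dir, true) // assumption: mimic SaveMode.Overwrite by clearing the directory
      val out = fs.create(new Path(dir, "part-00000-header-only.csv"), true)
      try out.write((csvHeaderLine(df.columns.toSeq) + "\n").getBytes(StandardCharsets.UTF_8))
      finally out.close()
    }
  }
}
```

With this in place, `CsvWithHeader.writeCsvWithHeader(entity, tempLocation)` would replace the original write statement. The emptiness check uses `head(1)` so that it also works on Spark versions older than 2.4; it does trigger a small extra job, which may matter for expensive data frames.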
Please let us know whether this answers your question or if you have follow-up questions.
Thanks
03-25-2019 04:35 AM
Since Spark 2.4, writing a dataframe with an empty or nested empty schema using any file format (parquet, orc, json, text, csv, etc.) is not allowed. An exception is thrown when attempting to write a dataframe with an empty schema.
Please find more details here: https://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html#upgrading-from-spark-sql-23-to...
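Note that an empty schema (no columns) is not the same thing as a data frame that has columns but zero rows, which is the case described in the original question. The sketch below shows the difference as I understand it; the object name, the column names `name` and `count`, and the temporary output directory are all made up for illustration, and the exact exception raised by the write may vary by version.

```scala
import java.nio.file.Files

import scala.util.Try

import org.apache.spark.sql.{DataFrame, SparkSession}

object EmptySchemaVsEmptyRows {

  // Returns (a data frame with no columns, a data frame with columns but no rows).
  def build(spark: SparkSession): (DataFrame, DataFrame) = {
    import spark.implicits._
    val noColumns = spark.emptyDataFrame                          // empty schema
    val noRows = Seq.empty[(String, Int)].toDF("name", "count")   // schema, zero rows
    (noColumns, noRows)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("empty-schema-demo")
      .getOrCreate()
    val (noColumns, noRows) = build(spark)

    println(s"noColumns fields: ${noColumns.schema.fieldNames.mkString(",")}")
    println(s"noRows fields:    ${noRows.schema.fieldNames.mkString(",")}")

    // On Spark 2.4 and later this write is expected to be rejected
    // because the schema is empty; Try captures the failure.
    val target = Files.createTempDirectory("empty-schema-demo").toString
    val attempt = Try(noColumns.write.mode("overwrite").csv(target))
    println(s"write of empty-schema data frame failed: ${attempt.isFailure}")

    spark.stop()
  }
}
```

If the output data frame still carries its columns, the restriction quoted above should not apply to it, and the missing header is more likely the behavior tracked in SPARK-26208.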
05-07-2019 07:23 AM
I have the same problem here (similar code and the same behavior with Spark 2.4.0, running with spark-submit on Windows and on Linux):
dataset.coalesce(1)
    .write()
    .option("charset", "UTF-8")
    .option("header", "true")
    .mode(SaveMode.Overwrite)
    .csv(outputDirPath);

