<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Write empty dataframe into csv in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/write-empty-dataframe-into-csv/m-p/28214#M20037</link>
    <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Since Spark 2.4, writing a dataframe with an empty or nested empty schema using any file formats (parquet, orc, json, text, csv etc.) is not allowed. An exception is thrown when attempting to write dataframes with empty schema.&lt;/P&gt;
&lt;P&gt;Please find more details here: &lt;A href="https://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html#upgrading-from-spark-sql-23-to-24" target="test_blank"&gt;https://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html#upgrading-from-spark-sql-23-to-24&lt;/A&gt;&lt;/P&gt; 
&lt;P&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 25 Mar 2019 11:35:27 GMT</pubDate>
    <dc:creator>Sandeep</dc:creator>
    <dc:date>2019-03-25T11:35:27Z</dc:date>
    <item>
      <title>Write empty dataframe into csv</title>
      <link>https://community.databricks.com/t5/data-engineering/write-empty-dataframe-into-csv/m-p/28212#M20035</link>
      <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;I'm writing my output (entity) data frame into a csv file. The statement below works well when the data frame is non-empty.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;entity.repartition(1).write.mode(SaveMode.Overwrite).format("csv").option("header", "true").save(tempLocation)&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;It's not working when the data frame is empty: an empty file is created, but I'm expecting at least the headers to show up so that my Tabular model won't fail with an "Invalid column" error.&lt;/P&gt;
&lt;P&gt;Anyone experienced this issue? &lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt; 
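&lt;P&gt;A common workaround (a sketch, not taken from this thread; the path and column names below are made up) is to check whether the data frame is empty and, if so, write a header-only file yourself. Stripped of Spark, the header-writing step is just:&lt;/P&gt;

```python
# Workaround sketch: Spark 2.4 writes no header row for an empty
# DataFrame, so emit a header-only CSV manually in that case.
# `path` and `columns` are hypothetical example inputs.
import csv

def write_header_only_csv(path, columns):
    """Write a UTF-8 CSV file containing only the header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(columns)

write_header_only_csv("/tmp/empty_entity.csv", ["id", "name", "value"])
```

&lt;P&gt;In Spark itself the emptiness check is typically something like df.head(1).isEmpty before falling back to this manual write.&lt;/P&gt;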
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 18 Mar 2019 23:42:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/write-empty-dataframe-into-csv/m-p/28212#M20035</guid>
      <dc:creator>_not_provid1755</dc:creator>
      <dc:date>2019-03-18T23:42:57Z</dc:date>
    </item>
    <item>
      <title>Re: Write empty dataframe into csv</title>
      <link>https://community.databricks.com/t5/data-engineering/write-empty-dataframe-into-csv/m-p/28213#M20036</link>
      <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;Thanks for reaching out to Databricks forum,&lt;/P&gt;
&lt;P&gt;This is a bug in open-source Spark, which is being fixed in the Spark 3 release.&lt;/P&gt;
&lt;P&gt;Here is the JIRA ticket for the issue:&lt;/P&gt;
&lt;P&gt;&lt;A target="_blank" href="https://issues.apache.org/jira/browse/SPARK-26208"&gt;https://issues.apache.org/jira/browse/SPARK-26208&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Here is the pull request with the fix, which will be merged:&lt;/P&gt;
&lt;P&gt;&lt;A target="_blank" href="https://github.com/apache/spark/pull/23173"&gt;https://github.com/apache/spark/pull/23173&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Porting the fix to the Databricks runtime versions is in the pipeline.&lt;/P&gt;
&lt;P&gt;Please let us know whether this answers your question or if you have a follow-up question.&lt;/P&gt;
&lt;P&gt;Thanks&lt;/P&gt; 
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 22 Mar 2019 19:16:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/write-empty-dataframe-into-csv/m-p/28213#M20036</guid>
      <dc:creator>mathan_pillai</dc:creator>
      <dc:date>2019-03-22T19:16:34Z</dc:date>
    </item>
    <item>
      <title>Re: Write empty dataframe into csv</title>
      <link>https://community.databricks.com/t5/data-engineering/write-empty-dataframe-into-csv/m-p/28214#M20037</link>
      <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Since Spark 2.4, writing a dataframe with an empty schema (or a schema consisting only of nested empty structs) is not allowed for any file format (parquet, orc, json, text, csv, etc.). An exception is thrown when attempting to write such dataframes.&lt;/P&gt;
&lt;P&gt;Please find more details here: &lt;A href="https://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html#upgrading-from-spark-sql-23-to-24" target="_blank"&gt;https://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html#upgrading-from-spark-sql-23-to-24&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 25 Mar 2019 11:35:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/write-empty-dataframe-into-csv/m-p/28214#M20037</guid>
      <dc:creator>Sandeep</dc:creator>
      <dc:date>2019-03-25T11:35:27Z</dc:date>
    </item>
    <item>
      <title>Re: Write empty dataframe into csv</title>
      <link>https://community.databricks.com/t5/data-engineering/write-empty-dataframe-into-csv/m-p/28215#M20038</link>
      <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;I have the same problem (similar code and the same behavior with Spark 2.4.0, running via spark-submit on Windows and on Linux):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;dataset.coalesce(1)
        .write()
        .option("charset", "UTF-8")
        .option("header", "true")
        .mode(SaveMode.Overwrite)
        .csv(outputDirPath);&lt;/CODE&gt;&lt;/PRE&gt;  
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 07 May 2019 14:23:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/write-empty-dataframe-into-csv/m-p/28215#M20038</guid>
      <dc:creator>mrnov</dc:creator>
      <dc:date>2019-05-07T14:23:29Z</dc:date>
    </item>
  </channel>
</rss>