<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Schema Evolution Issue in Streaming in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/schema-evolution-issue-in-streaming/m-p/24026#M16665</link>
    <description>&lt;P&gt;mergeSchema doesn't support every schema change. In some cases .option("overwriteSchema", "true") is needed instead. mergeSchema doesn't support:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Dropping a column&lt;/LI&gt;&lt;LI&gt;Changing an existing column's data type (in place)&lt;/LI&gt;&lt;LI&gt;Renaming column names that differ only by case (e.g., “Foo” and “foo”)&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;More on this topic here: &lt;A href="https://www.databricks.com/blog/2019/09/24/diving-into-delta-lake-schema-enforcement-evolution.html" target="test_blank"&gt;https://www.databricks.com/blog/2019/09/24/diving-into-delta-lake-schema-enforcement-evolution.html&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 03 Nov 2022 10:30:26 GMT</pubDate>
    <dc:creator>Hubert-Dudek</dc:creator>
    <dc:date>2022-11-03T10:30:26Z</dc:date>
    <item>
      <title>Schema Evolution Issue in Streaming</title>
      <link>https://community.databricks.com/t5/data-engineering/schema-evolution-issue-in-streaming/m-p/24025#M16664</link>
      <description>&lt;P&gt;When the schema changes while reading from and writing to a stream, does Spark handle the change automatically, or do we need to set .option("mergeSchema", "true")?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;E.g.:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;df.writeStream&lt;/P&gt;&lt;P&gt;&amp;nbsp;.option("mergeSchema", "true")&lt;/P&gt;&lt;P&gt;&amp;nbsp;.format("delta")&lt;/P&gt;&lt;P&gt;&amp;nbsp;.outputMode("append")&lt;/P&gt;&lt;P&gt;&amp;nbsp;.option("path","/data/")&lt;/P&gt;&lt;P&gt;&amp;nbsp;.option("checkpointLocation","/checkpoint/")&lt;/P&gt;&lt;P&gt;&amp;nbsp;.start()&lt;/P&gt;&lt;P&gt;&amp;nbsp;.awaitTermination()&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Even with .option("mergeSchema", "true") set, the write still throws this error:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;ERROR: A schema mismatch detected when writing to the Delta table&lt;/P&gt;&lt;P&gt;To enable schema migration, please set:&lt;/P&gt;&lt;P&gt;'.option("mergeSchema", "true")'.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Does the query above need any additional options or changes? Could you please advise how to resolve this issue?&lt;/P&gt;</description>
      <pubDate>Thu, 03 Nov 2022 10:17:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/schema-evolution-issue-in-streaming/m-p/24025#M16664</guid>
      <dc:creator>Sandy21</dc:creator>
      <dc:date>2022-11-03T10:17:42Z</dc:date>
    </item>
    <item>
      <title>Re: Schema Evolution Issue in Streaming</title>
      <link>https://community.databricks.com/t5/data-engineering/schema-evolution-issue-in-streaming/m-p/24026#M16665</link>
      <description>&lt;P&gt;mergeSchema doesn't support every schema change. In some cases .option("overwriteSchema", "true") is needed instead. mergeSchema doesn't support:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Dropping a column&lt;/LI&gt;&lt;LI&gt;Changing an existing column's data type (in place)&lt;/LI&gt;&lt;LI&gt;Renaming column names that differ only by case (e.g., “Foo” and “foo”)&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;More on this topic here: &lt;A href="https://www.databricks.com/blog/2019/09/24/diving-into-delta-lake-schema-enforcement-evolution.html" target="test_blank"&gt;https://www.databricks.com/blog/2019/09/24/diving-into-delta-lake-schema-enforcement-evolution.html&lt;/A&gt;&lt;/P&gt;</description>
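      <!--
      The three unsupported cases listed in this reply can be checked before a write
      is attempted. The following is a minimal plain-Python sketch of that check, not
      Delta Lake's own logic: the function name blocking_changes and the schema
      representation (a dict of column name to type name) are hypothetical, and it
      treats every type difference as blocking, which is a simplification.

```python
from typing import Dict, List


def blocking_changes(table: Dict[str, str], incoming: Dict[str, str]) -> List[str]:
    """List the schema changes that, per the reply, mergeSchema alone would reject.

    Both arguments map column name to a type name (hypothetical representation).
    An empty result means the change is purely additive (new columns only).
    """
    problems = []
    # Map lower-cased table column names back to their original spelling.
    table_by_lower = {name.lower(): name for name in table}
    incoming_lower = {name.lower() for name in incoming}

    # Case 1: a column that exists in the table is missing from the incoming data.
    for name in table:
        if name.lower() not in incoming_lower:
            problems.append(f"dropped column: {name}")

    for name, dtype in incoming.items():
        existing = table_by_lower.get(name.lower())
        if existing is None:
            continue  # brand-new column: additive, so not a blocking change
        if existing != name:
            # Case 3: same column name except for letter case.
            problems.append(f"case-only rename: {existing} to {name}")
        elif table[existing] != dtype:
            # Case 2: same column, different data type (assumed always blocking).
            problems.append(f"type change: {name} {table[existing]} to {dtype}")
    return problems


if __name__ == "__main__":
    current = {"id": "int", "Foo": "string"}
    print(blocking_changes(current, {"id": "int", "Foo": "string", "city": "string"}))
    print(blocking_changes(current, {"id": "long", "foo": "string"}))
```

      An empty list suggests the change only adds columns, which is the situation
      mergeSchema is meant for; a non-empty list points to the cases where the reply
      recommends overwriteSchema.
      -->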
      <pubDate>Thu, 03 Nov 2022 10:30:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/schema-evolution-issue-in-streaming/m-p/24026#M16665</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-11-03T10:30:26Z</dc:date>
    </item>
    <item>
      <title>Re: Schema Evolution Issue in Streaming</title>
      <link>https://community.databricks.com/t5/data-engineering/schema-evolution-issue-in-streaming/m-p/24027#M16666</link>
      <description>&lt;P&gt;Thanks @Hubert Dudek. When writing to a streaming table, do we also need to change the checkpoint location, in addition to adding .option("mergeSchema", "true"), if the incoming data contains a new column?&lt;/P&gt;</description>
      <pubDate>Fri, 04 Nov 2022 06:13:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/schema-evolution-issue-in-streaming/m-p/24027#M16666</guid>
      <dc:creator>Sandy21</dc:creator>
      <dc:date>2022-11-04T06:13:29Z</dc:date>
    </item>
  </channel>
</rss>

