<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic DLT pipeline evolution schema error in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-evolution-schema-error/m-p/54596#M30127</link>
    <description>&lt;P&gt;Hello everyone,&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am currently working on my first dlt pipeline, and I stumped on a problem which I am struggling to solve.&lt;/P&gt;&lt;P&gt;I am working on several tables where I have a column called "my_column" with an array of json with two keys : 1 key : score, 2nd key : score_name.&lt;/P&gt;&lt;P&gt;I want to create a column for each score, with the name of the columns being the value of the "score_name" key.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Example :&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;my_column = df.select("my_column").rdd.flatMAp(lambda x:x).collect()&lt;/P&gt;&lt;P&gt;where :&lt;/P&gt;&lt;P&gt;my_column = [[row("score_name" = "name1", score = 2), row("score_name" = "name2", score = 10)],&lt;/P&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;[row("score_name" = "name1", score = -1), row("score_name" = "name2", score = 11)],&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;...,&lt;DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ]&lt;/DIV&gt;&lt;DIV&gt;I want :&lt;BR /&gt;name1 = df.select("name1").rdd.flatMAp(lambda x:x).collect()&lt;/DIV&gt;&lt;DIV&gt;name2 = df.select("name2").rdd.flatMAp(lambda x:x).collect()&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;where:&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;name1 =&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;SPAN&gt;[2,&lt;/SPAN&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;-1,&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;...,&lt;DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ]&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;name2 = &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;SPAN&gt;[10,&lt;/SPAN&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;11,&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;...,&lt;DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ]&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;End of exemple.&lt;/EM&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;The schema of "my_column" columns is :&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;ArrayType&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;StructType&lt;/SPAN&gt;&lt;SPAN&gt;([&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;StructField&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"score_name"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;StringType&lt;/SPAN&gt;&lt;SPAN&gt;()),&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;StructField&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"score"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;IntegerType&lt;/SPAN&gt;&lt;SPAN&gt;())&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ])&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;So what I do for that is the following :&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;@&lt;/SPAN&gt;&lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;gold_table1&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;table &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; dlt.&lt;/SPAN&gt;&lt;SPAN&gt;read&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"my_silver_table1"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;table_to_join &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; table.&lt;/SPAN&gt;&lt;SPAN&gt;select&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"uniqueId"&lt;/SPAN&gt;&lt;SPAN&gt;), &lt;/SPAN&gt;&lt;SPAN&gt;explode&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"my_column"&lt;/SPAN&gt;&lt;SPAN&gt;)).&lt;/SPAN&gt;&lt;SPAN&gt;alias&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"explosion"&lt;/SPAN&gt;&lt;SPAN&gt;))&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;table_to_join &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;table_to_join.&lt;/SPAN&gt;&lt;SPAN&gt;select&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"uniqueId"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"explosion.score_name"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"explosion.score"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;table_to_join &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;table_to_join.&lt;/SPAN&gt;&lt;SPAN&gt;withColumn&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"score"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"score"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;cast&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;IntegerType&lt;/SPAN&gt;&lt;SPAN&gt;()))&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;table_to_join &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;table_to_join.&lt;/SPAN&gt;&lt;SPAN&gt;groupBy&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"uniqueId"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;pivot&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"score_name"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;sum&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"score"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;return table.join(table_to_join, on="uniqueId")&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;This join process works pretty well when being tested in a regular spark notebook environment, however when using it in the context of a dlt pipeline, I receive the following error message :&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;EM&gt;"""To overwrite your schema or change partitioning, please set: '.option("overwriteSchema", "true")'."""&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;From the error message, I believe it has something to do with authorizing schema change or not using in the correct way the schema change.&lt;/P&gt;&lt;P&gt;If you know the solution or have any advice on how to solve this issue I would love to hear it. Thanks !&lt;/P&gt;</description>
    <pubDate>Mon, 04 Dec 2023 10:36:49 GMT</pubDate>
    <dc:creator>TiagoMag</dc:creator>
    <dc:date>2023-12-04T10:36:49Z</dc:date>
    <item>
      <title>DLT pipeline evolution schema error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-evolution-schema-error/m-p/54596#M30127</link>
      <description>&lt;P&gt;Hello everyone,&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am currently working on my first dlt pipeline, and I stumped on a problem which I am struggling to solve.&lt;/P&gt;&lt;P&gt;I am working on several tables where I have a column called "my_column" with an array of json with two keys : 1 key : score, 2nd key : score_name.&lt;/P&gt;&lt;P&gt;I want to create a column for each score, with the name of the columns being the value of the "score_name" key.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Example :&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;my_column = df.select("my_column").rdd.flatMAp(lambda x:x).collect()&lt;/P&gt;&lt;P&gt;where :&lt;/P&gt;&lt;P&gt;my_column = [[row("score_name" = "name1", score = 2), row("score_name" = "name2", score = 10)],&lt;/P&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;[row("score_name" = "name1", score = -1), row("score_name" = "name2", score = 11)],&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;...,&lt;DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ]&lt;/DIV&gt;&lt;DIV&gt;I want :&lt;BR /&gt;name1 = df.select("name1").rdd.flatMAp(lambda x:x).collect()&lt;/DIV&gt;&lt;DIV&gt;name2 = df.select("name2").rdd.flatMAp(lambda x:x).collect()&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;where:&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;name1 =&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;SPAN&gt;[2,&lt;/SPAN&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;-1,&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;...,&lt;DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ]&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;name2 = &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;SPAN&gt;[10,&lt;/SPAN&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;11,&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;...,&lt;DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ]&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;End of exemple.&lt;/EM&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;The schema of "my_column" columns is :&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;ArrayType&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;StructType&lt;/SPAN&gt;&lt;SPAN&gt;([&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;StructField&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"score_name"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;StringType&lt;/SPAN&gt;&lt;SPAN&gt;()),&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;StructField&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"score"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;IntegerType&lt;/SPAN&gt;&lt;SPAN&gt;())&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ])&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;So what I do for that is the following :&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;@&lt;/SPAN&gt;&lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;gold_table1&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;table &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; dlt.&lt;/SPAN&gt;&lt;SPAN&gt;read&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"my_silver_table1"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;table_to_join &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; table.&lt;/SPAN&gt;&lt;SPAN&gt;select&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"uniqueId"&lt;/SPAN&gt;&lt;SPAN&gt;), &lt;/SPAN&gt;&lt;SPAN&gt;explode&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"my_column"&lt;/SPAN&gt;&lt;SPAN&gt;)).&lt;/SPAN&gt;&lt;SPAN&gt;alias&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"explosion"&lt;/SPAN&gt;&lt;SPAN&gt;))&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;table_to_join &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;table_to_join.&lt;/SPAN&gt;&lt;SPAN&gt;select&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"uniqueId"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"explosion.score_name"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"explosion.score"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;table_to_join &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;table_to_join.&lt;/SPAN&gt;&lt;SPAN&gt;withColumn&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"score"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"score"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;cast&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;IntegerType&lt;/SPAN&gt;&lt;SPAN&gt;()))&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;table_to_join &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;table_to_join.&lt;/SPAN&gt;&lt;SPAN&gt;groupBy&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"uniqueId"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;pivot&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"score_name"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;sum&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"score"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;return table.join(table_to_join, on="uniqueId")&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;This join process works pretty well when being tested in a regular spark notebook environment, however when using it in the context of a dlt pipeline, I receive the following error message :&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;EM&gt;"""To overwrite your schema or change partitioning, please set: '.option("overwriteSchema", "true")'."""&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;From the error message, I believe it has something to do with authorizing schema change or not using in the correct way the schema change.&lt;/P&gt;&lt;P&gt;If you know the solution or have any advice on how to solve this issue I would love to hear it. Thanks !&lt;/P&gt;</description>
      <pubDate>Mon, 04 Dec 2023 10:36:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-evolution-schema-error/m-p/54596#M30127</guid>
      <dc:creator>TiagoMag</dc:creator>
      <dc:date>2023-12-04T10:36:49Z</dc:date>
    </item>
    <item>
      <title>Re: DLT pipeline evolution schema error</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-evolution-schema-error/m-p/54683#M30142</link>
      <description>&lt;P&gt;Thank you for your answer! I found a way to complete the pipeline; I had to use&amp;nbsp;&lt;STRONG&gt;spark_conf = {"spark.databricks.delta.schema.autoMerge.enabled": "true"}&lt;/STRONG&gt;,&amp;nbsp;inside the &lt;EM&gt;decorator dlt.table&lt;/EM&gt; of my table.&lt;/P&gt;&lt;P&gt;I still don't know exactly why the schema change here creates an error. When I create new columns through other pathways, such as with the method from_json, I don't get similar errors.&lt;/P&gt;&lt;P&gt;I will also try to follow your advice and see if it works or if there is a difference!&lt;/P&gt;</description>
      <pubDate>Tue, 05 Dec 2023 17:29:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-evolution-schema-error/m-p/54683#M30142</guid>
      <dc:creator>TiagoMag</dc:creator>
      <dc:date>2023-12-05T17:29:28Z</dc:date>
    </item>
  </channel>
</rss>

