<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: DLT Pipeline and Pivot tables in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-and-pivot-tables/m-p/130921#M48943</link>
    <description>&lt;P&gt;Thank you for the reply - I have tried this (it was suggested in earlier solutions); but that may well be a side effect of the above function.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;query&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"""&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;STRONG&gt;SELECT pivot_key,&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;STRONG&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;nbsp;{select_clause}&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; FROM&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; data_to_pivot&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; GROUP BY&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; pivot_key&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; """&lt;BR /&gt;&lt;BR /&gt;However on Pipeline initialisation it failed with an invalid SQL error as the {select_clause} was empty. I believe this is the root cause as there is no schema defined at this point in the process; so DLT just assumes an empty string.&lt;BR /&gt;&lt;BR /&gt;When the autoMerge was added - the job worked, but no columns from the select statement were added.&lt;BR /&gt;&lt;BR /&gt;For a beginner this is all very strange; but I assume linked to the way DLT relies on Sparks lazy loading (hence certain functions that require full data loading are prohibited e..g collect(), pivot())?&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;</description>
    <pubDate>Fri, 05 Sep 2025 05:27:43 GMT</pubDate>
    <dc:creator>stucas</dc:creator>
    <dc:date>2025-09-05T05:27:43Z</dc:date>
    <item>
      <title>DLT Pipeline and Pivot tables</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-and-pivot-tables/m-p/130643#M48859</link>
      <description>&lt;P&gt;&lt;STRONG&gt;TLDR:&lt;/STRONG&gt;&lt;BR /&gt;Can DLT determine a dynamic schema - one which is generated from the results of a pivot?&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Issue&lt;/STRONG&gt;&lt;BR /&gt;I know you cant use spark `.pivot` in DLT pipeline and that if you wish to pivot data you need to do that outside of the DLT decorated functions. I have created a function a pivot on a Dataframe (from a DLT table) that pivots it turning values to columns - but as I am not the "owner" of this data I cannot know the full set of values and thus create a schema....&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .... without knowing the schema I dont think DLT is able to know/derive the columns returned from the pitovt function and my data is not added to output. A common thread (though not&amp;nbsp;&lt;EM&gt;explicitly&lt;/EM&gt; stated) in other pivot questions/solutions is that schemas are always provided.&lt;/P&gt;&lt;P&gt;This has all been tested in regular python/notebooks and works; only when using within a DLT context does it fail.&lt;/P&gt;&lt;P&gt;It's entirely possible that another issue in DLT that affected the pivot function, casuing it to apparently be generated without columns is aslo causing an issue. Coould it be that the SQL is generated by DLT in the intialisation phase?&lt;/P&gt;&lt;P&gt;As my data is source from a DLT table and needs to create a DLT table... How canI create the necessary schema (if that indeed, is the right inference on my part)&lt;BR /&gt;&lt;BR /&gt;Pivot function:&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="100%"&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;# ----------------------------------------------------------------------------------------------------&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;# Pivot a table without using "pivot"&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;# ----------------------------------------------------------------------------------------------------&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;dlt_pivot_table&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;df_unpivoted&lt;/SPAN&gt;&lt;SPAN&gt;: DataFrame, &lt;/SPAN&gt;&lt;SPAN&gt;keys&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;list&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;pivot_column&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;data_column&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;) -&amp;gt; DataFrame:&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"""&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; Pivots a given DataFrame using specified keys, pivoting the values in the column &lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; pivot_column as a column and the its values&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; from data_column.&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; Parameters&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; df_unpivoted : DataFrame&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; The DataFrame to pivot&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; keys: list&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; A list containing all the keys to pivot on&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; pivot_column : str&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; The name of the column to be pivoted&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; data_column: str&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; The name of the column that contains the data to be pivoted&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; Returns&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; DataFrame&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; A DataFrame pivoted by the columns in keys&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; """&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# create a pivot key in the datasert&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df_work &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df_unpivoted.&lt;/SPAN&gt;&lt;SPAN&gt;withColumn&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"pivot_key"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;concat_ws&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"_"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;keys))&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# df_back_up will hold the pivot key and original keys&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df_back_up &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df_work&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# drop the original keys (we're pivoting on the pivot key generated from the given keys)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df_work &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df_work.&lt;/SPAN&gt;&lt;SPAN&gt;drop&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;keys)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# create a temporary view that we can select from&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df_work.&lt;/SPAN&gt;&lt;SPAN&gt;createOrReplaceTempView&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"data_to_pivot"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# derive the values that will be used to create the new columns&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; pivot_cols &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; [row[pivot_column] &lt;/SPAN&gt;&lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; row &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; df_work.&lt;/SPAN&gt;&lt;SPAN&gt;select&lt;/SPAN&gt;&lt;SPAN&gt;(pivot_column).&lt;/SPAN&gt;&lt;SPAN&gt;distinct&lt;/SPAN&gt;&lt;SPAN&gt;().&lt;/SPAN&gt;&lt;SPAN&gt;collect&lt;/SPAN&gt;&lt;SPAN&gt;()]&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# create the clauses that will cause the pivot and convert to sql like string&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; # NOTE: DLT seems to generate a blank select statement - so we append the key after....&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; select_clauses &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; [&lt;/SPAN&gt;&lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"FIRST(CASE WHEN &lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;pivot_column&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;SPAN&gt; = '&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;name&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;SPAN&gt;' THEN &lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;data_column&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;SPAN&gt; END, TRUE) AS &lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;name&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt; &lt;SPAN&gt;for&lt;/SPAN&gt;&lt;SPAN&gt; name &lt;/SPAN&gt;&lt;SPAN&gt;in&lt;/SPAN&gt;&lt;SPAN&gt; pivot_cols] &amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; select_clauses.&lt;/SPAN&gt;&lt;SPAN&gt;append&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"pivot_key"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; select_clause &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;",&lt;/SPAN&gt;&lt;SPAN&gt;\n&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;join&lt;/SPAN&gt;&lt;SPAN&gt;(select_clauses)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# create the query embedding the clauses&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; query &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"""&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; SELECT &lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;select_clause&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; FROM&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; data_to_pivot&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; GROUP BY &lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; pivot_key&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; """&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Execute the query&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df_pivot &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; util_spark&lt;/SPAN&gt;&lt;SPAN&gt;.sql&lt;/SPAN&gt;&lt;SPAN&gt;(query)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# df_pivot contains the pivoted data, it does not though have the original key values (only the pivot_key)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# So we rejoin the original keys to the pivoted data, by joining the table to the df_back_up. As df_back_up&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# is not pivoted, the key list, must be made 'distinct' else the join will have multiple rows per key&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; col_list &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; [&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;keys]&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; col_list.&lt;/SPAN&gt;&lt;SPAN&gt;append&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"pivot_key"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df_keys &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df_back_up.&lt;/SPAN&gt;&lt;SPAN&gt;select&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;col_list).&lt;/SPAN&gt;&lt;SPAN&gt;distinct&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df_pivot &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df_keys.&lt;/SPAN&gt;&lt;SPAN&gt;join&lt;/SPAN&gt;&lt;SPAN&gt;(df_pivot, &lt;/SPAN&gt;&lt;SPAN&gt;on&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"pivot_key"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;how&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"left"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# drop the pivot key from the pivoted dataframe&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; df_pivot &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df_pivot.&lt;/SPAN&gt;&lt;SPAN&gt;drop&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"pivot_key"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; df_pivot&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;</description>
      <pubDate>Wed, 03 Sep 2025 12:10:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-and-pivot-tables/m-p/130643#M48859</guid>
      <dc:creator>stucas</dc:creator>
      <dc:date>2025-09-03T12:10:49Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline and Pivot tables</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-and-pivot-tables/m-p/130835#M48925</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/148523"&gt;@stucas&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Adding following configuration spark.databricks.delta.schema.autoMerge.enabled = true&lt;BR /&gt;to the DLT pipeline will allow new pivoted columns to be merged into the target table automatically. However, DLT still requires a defined schema at initialization, so this won’t fully solve the issue of handling a completely dynamic pivot schema. It can help if your pivoted values evolve gradually, but DLT doesn’t natively support unknown schemas upfront.&lt;/P&gt;</description>
      <pubDate>Thu, 04 Sep 2025 12:43:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-and-pivot-tables/m-p/130835#M48925</guid>
      <dc:creator>SP_6721</dc:creator>
      <dc:date>2025-09-04T12:43:04Z</dc:date>
    </item>
    <item>
      <title>Re: DLT Pipeline and Pivot tables</title>
      <link>https://community.databricks.com/t5/data-engineering/dlt-pipeline-and-pivot-tables/m-p/130921#M48943</link>
      <description>&lt;P&gt;Thank you for the reply - I have tried this (it was suggested in earlier solutions); but that may well be a side effect of the above function.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;query&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"""&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;STRONG&gt;SELECT pivot_key,&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;STRONG&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;nbsp;{select_clause}&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; FROM&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; data_to_pivot&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; GROUP BY&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; pivot_key&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; """&lt;BR /&gt;&lt;BR /&gt;However on Pipeline initialisation it failed with an invalid SQL error as the {select_clause} was empty. I believe this is the root cause as there is no schema defined at this point in the process; so DLT just assumes an empty string.&lt;BR /&gt;&lt;BR /&gt;When the autoMerge was added - the job worked, but no columns from the select statement were added.&lt;BR /&gt;&lt;BR /&gt;For a beginner this is all very strange; but I assume linked to the way DLT relies on Sparks lazy loading (hence certain functions that require full data loading are prohibited e..g collect(), pivot())?&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 05 Sep 2025 05:27:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dlt-pipeline-and-pivot-tables/m-p/130921#M48943</guid>
      <dc:creator>stucas</dc:creator>
      <dc:date>2025-09-05T05:27:43Z</dc:date>
    </item>
  </channel>
</rss>

