Re: DLT Pipeline and Pivot tables

stucas · 09-04-2025

Thank you for the reply. I have tried this (it was suggested in earlier solutions), but that may well be a side effect of the above function.

query = f"""
SELECT pivot_key,
       {select_clause}
FROM data_to_pivot
GROUP BY pivot_key
"""

However, on pipeline initialisation it failed with an invalid SQL error because {select_clause} was empty. I believe this is the root cause: no schema is defined at this point in the process, so DLT just ends up with an empty string.
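For illustration, here is a minimal sketch of one way to avoid the empty clause: build it from a list of pivot values that is known up front rather than derived from the data at initialisation time. The helper name build_pivot_query, the column names attribute and value, and the MAX aggregate are all hypothetical and not taken from my pipeline; the values are assumed to be trusted, as they are interpolated directly into the SQL.

```python
from typing import Sequence


def build_pivot_query(pivot_values: Sequence[str],
                      table: str = "data_to_pivot",
                      key_col: str = "pivot_key",
                      name_col: str = "attribute",   # hypothetical column name
                      value_col: str = "value") -> str:  # hypothetical column name
    """Build the pivot SELECT from a list of values known up front."""
    if not pivot_values:
        # An empty list would render "SELECT pivot_key, FROM ...",
        # which is exactly the invalid SQL seen at initialisation.
        raise ValueError("pivot_values is empty; cannot build the select clause")

    # One conditional aggregate per expected pivot value (MAX is an assumption).
    select_clause = ",\n       ".join(
        f"MAX(CASE WHEN {name_col} = '{v}' THEN {value_col} END) AS `{v}`"
        for v in pivot_values
    )
    return (
        f"SELECT {key_col},\n"
        f"       {select_clause}\n"
        f"FROM {table}\n"
        f"GROUP BY {key_col}"
    )


query = build_pivot_query(["height", "weight"])
```

Failing early with a clear error seems preferable to passing a malformed statement on to the pipeline, and a fixed list also means the output columns are known before any data is read.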

When the autoMerge setting was added, the job worked, but none of the columns from the select statement were added.

For a beginner this is all very strange, but I assume it is linked to the way DLT relies on Spark's lazy evaluation (hence certain functions that require the data to be fully loaded, e.g. collect() and pivot(), are prohibited)?