09-04-2025 10:27 PM
Thank you for the reply. I have already tried this (it was suggested in earlier solutions), but that may well be a side effect of the above function.
query = f"""
SELECT pivot_key,
{select_clause}
FROM
data_to_pivot
GROUP BY
pivot_key
"""
However, on pipeline initialisation it failed with an invalid SQL error because {select_clause} was empty. I believe this is the root cause: no schema is defined at this point in the process, so DLT just assumes an empty string.
When autoMerge was added, the job worked, but none of the columns from the select statement were added.
For a beginner this is all very strange, but I assume it is linked to the way DLT relies on Spark's lazy evaluation (hence certain functions that require the data to be fully loaded, e.g. collect() and pivot(), are prohibited)?
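To illustrate the empty {select_clause} problem, here is a minimal sketch that assumes the pivot values are known up front (hard-coded or taken from configuration) rather than collected from the data. The function name build_pivot_query and the column names pivot_column and value_column are hypothetical; only data_to_pivot and pivot_key come from the query above. It builds the clause in plain Python and fails early with a clear message instead of emitting invalid SQL.

```python
def build_pivot_query(pivot_values, pivot_col="pivot_column", value_col="value_column"):
    """Build the pivot SQL from a list of pivot values that is known in advance.

    pivot_col and value_col are placeholder column names (assumptions),
    not names taken from the original pipeline.
    """
    if not pivot_values:
        # An empty list would produce "SELECT pivot_key, FROM ...", which is invalid SQL
        raise ValueError("pivot_values is empty; cannot build the SELECT list")

    # One conditional aggregate per pivot value; single quotes are doubled for the literal
    select_clause = ",\n    ".join(
        "MAX(CASE WHEN {col} = '{lit}' THEN {val} END) AS `{alias}`".format(
            col=pivot_col, lit=str(v).replace("'", "''"), val=value_col, alias=v
        )
        for v in pivot_values
    )

    return f"""
SELECT pivot_key,
    {select_clause}
FROM
    data_to_pivot
GROUP BY
    pivot_key
"""


# Example usage with two assumed pivot values
query = build_pivot_query(["height", "weight"])
print(query)
```

Because the list is passed in rather than derived with collect(), nothing has to be read from the source table while the pipeline is being initialised. Whether a fixed list is acceptable depends on how often new pivot values appear in the data.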