I have created a DLT pipeline which reads data from JSON files which are stored in a Databricks volume
02-20-2024 05:20 AM
I have created a DLT pipeline that reads data from JSON files stored in a Databricks volume and writes the data into a streaming table.
This was working fine.
When I try to read the data that was inserted into the table and compare the values with pre-calculated ones within the same DLT pipeline, it fails.
Is it because DLT treats this as an initialization stage and executes the comparison before setting up the tables or inserting data into them?
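For context, a minimal sketch of the setup described above, assuming Auto Loader is used to read the JSON files (the table name and volume path are placeholders, not the real ones):

```python
import dlt

@dlt.table(name="raw_events")  # hypothetical streaming table name
def raw_events():
    # Auto Loader incrementally ingests new JSON files from the volume;
    # `spark` is the session provided by the DLT runtime.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/my_catalog/my_schema/landing/")  # hypothetical volume path
    )

# Any code placed here that reads raw_events back and compares it against
# pre-calculated values runs while DLT is still resolving the pipeline graph,
# i.e. before the table has been populated.
```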
02-20-2024 06:01 PM
Hey @zero234
Yes, your assumption matches mine: your pipeline reads data from JSON files, inserts it into a streaming table, and then tries to compare values in the table with pre-calculated values before any data has been written. This leads to a comparison against an empty table, resulting in the error.
Possible solutions:
- Don't perform the comparison in the same notebook as the table creation. Create a separate notebook or trigger that runs after the table has received data, so the comparison only happens when there is actual data to compare. You can also set this up as a job that triggers the DLT pipeline first and runs the comparison afterwards.
- Before comparing, modify the comparison code to explicitly check whether the streaming table has received any data. You can use isEmpty() or similar logic to confirm there is data before proceeding (see the sketch after this list).
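A minimal sketch of that check, run in a separate notebook or job task after the pipeline update completes (the catalog, table name, and expected value are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already available in a Databricks notebook

df = spark.read.table("my_catalog.my_schema.raw_events")  # hypothetical target table

if df.isEmpty():
    # Table exists but has not received any data yet; skip the comparison.
    print("raw_events is empty, skipping validation")
else:
    actual = df.count()
    expected = 42  # placeholder for the pre-calculated value
    assert actual == expected, f"Row count mismatch: {actual} != {expected}"
```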
Palash
02-21-2024 04:24 AM
Keep your DLT code separate from your comparison code, and run your comparison code once your DLT data has been ingested.
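One way to wire this up, sketched with the Databricks SDK for Python (the job name, pipeline ID, and notebook path are placeholders, and the compute configuration for the notebook task is omitted for brevity):

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

# Two-task workflow: run the DLT pipeline first, then the comparison notebook.
w.jobs.create(
    name="dlt-ingest-then-validate",  # hypothetical job name
    tasks=[
        jobs.Task(
            task_key="ingest",
            pipeline_task=jobs.PipelineTask(pipeline_id="<dlt-pipeline-id>"),
        ),
        jobs.Task(
            task_key="validate",
            depends_on=[jobs.TaskDependency(task_key="ingest")],
            notebook_task=jobs.NotebookTask(notebook_path="/Workspace/path/to/compare_values"),
        ),
    ],
)
```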

