10-02-2025 07:19 AM
Yesterday (10/1), starting around 12 PM EST, we started getting the following error in our Lakeflow Declarative Pipelines (LDP) process. We get this in environments where none of our code has changed. From the serverless compute details, I found the Databricks Runtime version from before the error and from after.
This is the version on our last successful run: dlt:16.4.9-delta-pipelines-dlt-release-dp-20250918-rc0-commit-015f1e5-image-20d2b81
And this is the version on all new runs getting the error: dlt:16.4.9-delta-pipelines-dlt-release-dp-20250925-rc0-commit-a18ed15-image-7ea1ff2
Also, the LDPs are built on a Python metadata-driven process (similar to DLT-META) that dynamically creates tables (in bronze and silver) using the dlt.table and dlt.create_streaming_table functions. We have tried many different things, but debugging LDPs seems very challenging since you can't use print or logging statements (at least not that I've seen).
10-03-2025 07:48 AM
So, we actually found the issue: it was code that had been working before but has just now stopped working. Basically, in the Python we were calling the function that creates the tables in parallel using this:
import concurrent.futures

max_workers = min(len(configs), 20, concurrent.futures.ThreadPoolExecutor()._max_workers or 20)
with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
    list(executor.map(create_single_table, configs))
We just changed it to a for loop over the configs and that fixed it. Out of curiosity, I changed the 20 above to 1 and it still gave the error. That would have run them one at a time, so the issue seems to lie with the concurrent.futures code itself rather than with actually calling create_single_table in parallel.
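For anyone hitting the same thing, here is a minimal sketch of the sequential version. The body of create_single_table and the contents of configs are hypothetical stand-ins so the snippet runs on its own; in the real pipeline the function would call dlt.table or dlt.create_streaming_table for each metadata entry.

```python
# Hypothetical stand-ins: the real create_single_table would define a table via
# dlt.table / dlt.create_streaming_table; here it only records the table name.
registered = []

def create_single_table(config):
    # Placeholder body (assumption): remember which table was requested.
    registered.append(config["name"])

# Hypothetical metadata entries; the real configs come from the metadata process.
configs = [{"name": "bronze_orders"}, {"name": "silver_orders"}]

# Sequential replacement for executor.map(create_single_table, configs):
# every call runs on the main thread, in list order.
for config in configs:
    create_single_table(config)
```

The loop keeps the same call signature as the executor.map version, so only the dispatch changes and not create_single_table itself.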
Here's the error for reference:
We have solved the issue, but only after many hours of debugging Declarative Pipelines, which is not easy at all. It brings up a bigger issue: we can't change the DBR version, so things can just break, and the actual error is very difficult to find.
10-03-2025 07:00 AM
Hi @cpollock
Could you also please paste the screenshot directly into the post window? The attachment is not really working and is stuck on scanning.
10-06-2025 12:28 AM
Hi @cpollock, indeed, Lakeflow/Delta Live Tables pipeline definitions are not thread-safe. Avoid using threading, multiprocessing, or async code for pipeline definitions. Thanks
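The thread does not say which mechanism actually breaks. One possible explanation, offered purely as an assumption and not confirmed for Lakeflow, is that the framework keeps some state per thread. The plain-Python sketch below shows why that would fail even with a pool size of 1: the single worker is still a different thread from the main one. The names _context and read_context are hypothetical and are not Databricks APIs.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for per-thread framework state (not a Databricks API).
_context = threading.local()
_context.pipeline = "active"  # set on the main thread only

def read_context(_):
    # Return whatever the calling thread sees in the thread-local slot.
    return getattr(_context, "pipeline", None)

# Called directly, the function runs on the main thread and sees the value.
main_view = read_context(None)

# With max_workers=1 the call still runs on a separate worker thread,
# where the thread-local attribute was never set.
with ThreadPoolExecutor(max_workers=1) as executor:
    worker_view = list(executor.map(read_context, [None]))[0]

print(main_view, worker_view)
```

If something like this is in play, a plain for loop avoids it because every definition call stays on the main thread.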