Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I am building out a new DLT pipeline and have since had to rebuild it from scratch. Having deleted the old pipeline and constructed a new one, I now get this error: Table 'X' is already managed by pipeline 'Y'. As I only have the one pipeline, how would...
Rename the function under your @dlt.table decorator, for example:

import dlt

@dlt.table(
    comment="example",
    table_properties={"example": "example"},
    partition_cols=["a", "b", "c"]
)
def modify_this_name():
    ...  # your existing table definition

By default the table takes its name from the decorated function, so renaming the function registers a table that no longer collides with the one still owned by the deleted pipeline.
Hi, I have a pipeline running. I have updated one file in a Delta table that was already processed. Now I am getting this error: com.databricks.sql.transaction.tahoe.DeltaUnsupportedOperationException: Detected a data update. This is currently not supported. If ...
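For reference, a commonly suggested workaround for this error is to let the streaming read skip rewritten files. A minimal sketch, assuming the stream reads from the updated Delta table; the table name below is a hypothetical stand-in:

# skipChangeCommits (newer runtimes) skips commits that only update or delete
# existing rows; the older ignoreChanges option instead passes the rewritten
# files through, which can re-emit already-processed rows downstream.
df = (spark.readStream
      .option("skipChangeCommits", "true")  # or .option("ignoreChanges", "true")
      .table("source_table"))               # hypothetical table name

Either option trades strict exactly-once semantics for availability, so downstream consumers must tolerate duplicates or missed updates.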
Hi! I have a problem. I'm using Auto Loader to ingest data from raw into a Delta Lake, but when my pipeline starts, I want to apply it only to the new data. Auto Loader ingests data into the Delta Lake, but now, how can I distinguish the...
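A minimal sketch of how Auto Loader already covers this, assuming ingestion from a cloud path (all paths and table names below are hypothetical): Auto Loader records every file it discovers in the stream's checkpoint, so keeping the same checkpointLocation means each run only processes files it has not seen before.

# Auto Loader tracks discovered files in the checkpoint, so restarting the
# pipeline with the same checkpointLocation picks up only new files.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")  # assuming json input
      .load("/mnt/raw/source"))             # hypothetical raw path

(df.writeStream
   .option("checkpointLocation", "/mnt/checkpoints/source")  # keep this stable
   .trigger(availableNow=True)  # process the backlog, then stop
   .toTable("bronze.source"))   # hypothetical target table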
Hi @Alejandro Piury Pinzón We haven't heard from you since the last response from @Tyler Retzlaff, and I was checking back to see if their suggestions helped you. Otherwise, if you have any solution, please share it with the community, as it can be he...
Hi @William Scardua Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...
OK. So I think I'm probably missing the obvious and tying myself in knots here. Here is the scenario:
- batch datasets arrive in json format in an Azure data lake
- each batch is a complete set of "current" records (the complete table)
- these are processed us...
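If it helps frame the question: since each batch is a complete snapshot, one common pattern (a hedged sketch, not necessarily what this pipeline does; every path and table name here is a hypothetical stand-in) is to overwrite the target table from the newest batch:

# Each batch is the complete current table, so replacing the target wholesale
# keeps it consistent with the latest snapshot.
batch_path = "/mnt/lake/landing/latest_batch"  # hypothetical landing path
snapshot = spark.read.json(batch_path)
(snapshot.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("silver.current_records"))  # hypothetical target table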
Hi @Kearon McNicol Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answer...
Hi guys, you're right, the problem is with the child notebook. All my notebooks are failing at this point. I can't seem to make any headway with this error.
For a batch job I can use ADF with a Databricks notebook activity to create a pipeline. Similarly, what Azure stack should I use to run a Structured Streaming Databricks notebook as a production-ready pipeline?
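One commonly used setup, offered as an assumption rather than the only option: instead of triggering the notebook from ADF, schedule it as a long-running Databricks Job with retries and a single concurrent run. A minimal sketch of the notebook body, with the rate source standing in for the real input and a hypothetical checkpoint path:

# Runs indefinitely; intended to be scheduled as a Databricks Job that
# restarts on failure, rather than as an ADF batch activity.
df = spark.readStream.format("rate").option("rowsPerSecond", 10).load()
query = (df.writeStream
           .option("checkpointLocation", "/tmp/checkpoints/demo")  # hypothetical path
           .trigger(processingTime="1 minute")
           .toTable("silver.events"))  # hypothetical target table
query.awaitTermination()  # keep the notebook alive while the stream runs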
I have a pipeline with 20+ streams running based on Auto Loader. The pipeline crashed, and after the crash I'm unable to start the streams; they fail with one of the following messages: 1) The metadata file in the streaming source checkpoint direct...
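Not an official fix, just a hedged recovery sketch: when a stream's checkpoint metadata is corrupted and cannot be restored from a backup, one option is to restart that stream against a fresh checkpoint directory and accept that already-ingested files may be rediscovered; all paths here are hypothetical.

# Pointing the failed stream at a new checkpoint directory resets its file
# tracking; deduplicate downstream if reprocessing old files is a problem.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .load("/mnt/raw/source"))  # hypothetical source path

(df.writeStream
   .option("checkpointLocation", "/mnt/checkpoints/source_v2")  # fresh location
   .toTable("bronze.source"))   # hypothetical target table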