Need help skipping previously executed cells in a failed Databricks job calling a notebook with multiple SQL cells
04-18-2023 12:58 AM
In Azure Databricks, I have a job that calls a notebook containing multiple cells with SQL queries. If one of the cells fails and we restart the job, how can we skip the cells that already ran and resume from the failed cell? Any lead would be helpful.
- Labels:
  - Azure Databricks
  - Databricks SQL
04-18-2023 02:47 AM
Hi Sandy!
My two cents on your issue.
This looks more like a design issue than a technical one. From the sound of it, your notebook performs too many operations, and when a failure occurs everything repeats, which is not ideal and can cause problems (e.g. data duplication).
A good strategy for any ETL process is to make it "restartable": if it fails, it should be able to restart, clean up its own mess, and continue with what it was supposed to do.
So instead of keeping everything in one notebook and trying to figure out how to skip previously executed cells, I would separate the notebooks by logical operation and make sure each unit is restartable. For instance, if one cell creates a table and you want that command to be idempotent, use CREATE TABLE IF NOT EXISTS instead of CREATE TABLE. That way, if the cell runs again and the table already exists, nothing happens. That is just one example, of course, but you get the point.
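To illustrate the idea, here is a minimal sketch of making a cell idempotent. The table and column names are hypothetical; the MERGE statement is one common pattern for idempotent upserts in Databricks SQL (on Delta tables), shown here as an alternative to a plain INSERT that would duplicate rows on rerun:

```sql
-- Idempotent DDL: rerunning this cell is a no-op if the table exists
CREATE TABLE IF NOT EXISTS sales_summary (
  order_id BIGINT,
  amount   DECIMAL(10, 2)
);

-- Idempotent load: MERGE updates existing rows and inserts new ones,
-- so rerunning the cell does not create duplicates
MERGE INTO sales_summary AS t
USING staged_sales AS s
  ON t.order_id = s.order_id
WHEN MATCHED THEN
  UPDATE SET t.amount = s.amount
WHEN NOT MATCHED THEN
  INSERT (order_id, amount) VALUES (s.order_id, s.amount);
```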
04-18-2023 03:36 AM
Thank you for your reply. Yes, I agree with your point about the design. However, in the current project we have thousands of SQL statements, and sets of them are kept in a single notebook to perform operations. Your suggestion about CREATE TABLE IF NOT EXISTS sounds good, and I just searched for how to make the INSERT command idempotent and found this: https://learn.microsoft.com/en-us/azure/databricks/ingestion/copy-into/
We will also revisit whether we can make use of Databricks' retries option.
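For what it's worth, the COPY INTO command from that link is idempotent by default: files that were already loaded into the target table are skipped on rerun. A minimal sketch, with a hypothetical table name and source path:

```sql
-- COPY INTO tracks which source files have already been ingested,
-- so restarting the job does not reload them
COPY INTO sales_raw
FROM '/mnt/landing/sales/'
FILEFORMAT = CSV
FORMAT_OPTIONS ('header' = 'true', 'inferSchema' = 'true')
COPY_OPTIONS ('mergeSchema' = 'true');
```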
04-23-2023 09:33 PM
Hi @Sandip Rath
Thank you for posting your question in our community! We are happy to assist you.
To help us provide the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?
This will also help other community members who may have similar questions in the future. Thank you for your participation, and let us know if you need any further assistance!