Need help skipping previously executed cells in a failed Databricks job calling a notebook with multiple SQL cells

Sandy84
New Contributor II

In Azure Databricks, I have a job that calls a notebook containing multiple cells with SQL queries. If one cell fails and we restart the job, how can we skip the cells that already ran and resume only from the failed cell? Any lead would be helpful.

3 REPLIES

Serlal
New Contributor III

Hi Sandy!

My 2 cents on your issue.

This looks more like a design issue than a technical one. From the sound of it, your notebook performs too many operations, and if a failure occurs everything runs again, which is not ideal and can cause problems (e.g. data duplication).

A good strategy for every ETL process is to make it "restartable": if it fails, it should be able to restart, clean up after itself, and continue with what it is supposed to do.

So instead of keeping everything in one notebook and trying to figure out how to skip previously executed cells, I would separate the notebooks by logical operation and make sure each unit is restartable. For instance, if one cell creates a table and you want that command to be idempotent, use CREATE TABLE IF NOT EXISTS instead of CREATE TABLE. That way, if the cell runs again and the table already exists, nothing happens. That is just one example, of course, but you get my point.
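To make the idempotency idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for a Databricks SQL warehouse (the IF NOT EXISTS semantics are the same in both dialects): the second execution of the DDL is a harmless no-op instead of an error, so the cell is safe to re-run after a job restart.

```python
import sqlite3

# sqlite3 stands in for the Databricks SQL warehouse in this sketch.
conn = sqlite3.connect(":memory:")

ddl = "CREATE TABLE IF NOT EXISTS sales (id INTEGER, amount REAL)"
conn.execute(ddl)  # first run: creates the table
conn.execute(ddl)  # re-run after a restart: no-op, no error

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
print(tables)  # ['sales'] -- still exactly one table
```

A plain CREATE TABLE in the same situation would raise an "already exists" error on the second run, which is exactly what makes a restarted job fall over.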

Sandy84
New Contributor II

Thank you for your reply. Yes, I agree with your point about the design. However, in the current project we have thousands of SQL statements, and sets of them are kept in a single notebook to perform operations. Your suggestion about CREATE TABLE IF NOT EXISTS sounds good, though, and I just searched for how to make the INSERT command idempotent and found COPY INTO: https://learn.microsoft.com/en-us/azure/databricks/ingestion/copy-into/

Will revisit this and see if we can also make use of Databricks' retries option.
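If restructuring the notebooks is not feasible right away, the "skip what already ran" behaviour can also be approximated with a small checkpoint wrapper around the ordered steps. Below is a hedged sketch, not a Databricks API: `run_steps` is a hypothetical helper that records each completed step (here in a local JSON file; in a real job you might use a Delta table or workspace file), so a restarted run skips steps that are already recorded and resumes at the failed one.

```python
import json
import os
import tempfile

def run_steps(steps, checkpoint_path):
    """Run named steps in order, skipping any recorded as done.

    Hypothetical helper for illustration only. `steps` is a list of
    (name, callable) pairs; the checkpoint file holds the names of
    steps that have already completed successfully.
    """
    done = set()
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = set(json.load(f))
    executed = []
    for name, fn in steps:
        if name in done:
            continue  # already ran in a previous attempt -- skip
        fn()  # may raise; the checkpoint then still reflects progress
        done.add(name)
        executed.append(name)
        with open(checkpoint_path, "w") as f:
            json.dump(sorted(done), f)  # persist after each step
    return executed

# Simulate a first run where the second step fails, then a restart.
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
state = {"fail": True}

def step_a():
    pass  # stands in for a SQL cell that succeeds

def step_b():
    if state["fail"]:
        raise RuntimeError("simulated cell failure")

try:
    run_steps([("a", step_a), ("b", step_b)], path)  # 'a' runs, 'b' fails
except RuntimeError:
    pass

state["fail"] = False
rerun = run_steps([("a", step_a), ("b", step_b)], path)
print(rerun)  # ['b'] -- step 'a' was skipped on restart
```

The per-step persistence is the important design choice: the checkpoint is written after every successful step, so a crash at any point leaves an accurate record of what can be skipped.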

Anonymous
Not applicable

Hi @Sandip Rath

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 
