Data Engineering

Does cancelling a job run roll back any actions performed by the query plan?

brendanc19
New Contributor III

If I were to stop a rather large job run, say halfway through execution, would any actions already performed on our Delta tables persist, or would they be rolled back?

Are there any other risks I need to be aware of when cancelling a job run halfway through?

1 ACCEPTED SOLUTION


Anonymous
Not applicable

@Brendan Carey:

If you stop a job run in the middle of execution, any writes that have already been committed to your Delta tables will persist. A write that was still in progress never commits, so it is not so much rolled back as never made visible: each affected table stays at its last committed version. Data files the cancelled write had already produced may be left in storage, but the transaction log does not reference them, so readers ignore them and VACUUM can remove them later.
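To make the commit behaviour concrete, here is a toy model in plain Python of a log-based commit. It is only a sketch and not Delta's actual implementation: the `_log` directory, the file names, and every function name are hypothetical. It shows a write that is cancelled after staging its data file but before its log entry is created, which leaves an orphaned file and no visible change.

```python
import json
import os
import tempfile


def write_data_file(table_dir, name, rows):
    """Stage a data file; readers do not see it until a log entry references it."""
    with open(os.path.join(table_dir, name), "w") as f:
        json.dump(rows, f)
    return name


def commit(table_dir, version, file_names):
    """Publish a version by creating its log entry, the only step readers observe."""
    log_dir = os.path.join(table_dir, "_log")
    os.makedirs(log_dir, exist_ok=True)
    # Mode "x" fails if the entry already exists, so each version is written once.
    with open(os.path.join(log_dir, f"{version:05d}.json"), "x") as f:
        json.dump({"add": file_names}, f)


def referenced_files(table_dir):
    """List the data files named by committed log entries, in commit order."""
    log_dir = os.path.join(table_dir, "_log")
    names = []
    if os.path.isdir(log_dir):
        for entry in sorted(os.listdir(log_dir)):
            with open(os.path.join(log_dir, entry)) as f:
                names.extend(json.load(f)["add"])
    return names


def read_table(table_dir):
    """Return rows from committed files only; unreferenced files are ignored."""
    rows = []
    for name in referenced_files(table_dir):
        with open(os.path.join(table_dir, name)) as f:
            rows.extend(json.load(f))
    return rows


table = tempfile.mkdtemp()

# First write runs to completion: data file staged, then committed.
commit(table, 0, [write_data_file(table, "part-0.json", [1, 2])])

# Second write is "cancelled" after staging its file but before committing.
write_data_file(table, "part-1.json", [3, 4])

visible_rows = read_table(table)
staged = {f for f in os.listdir(table) if f.startswith("part-")}
orphaned = sorted(staged - set(referenced_files(table)))
```

In this model the cancelled write changes nothing a reader can see; the only trace is the unreferenced `part-1.json`, which a cleanup pass could delete.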

Keep in mind that stopping a job run midway can still leave your data inconsistent across tables. Delta commits are atomic per table, so if the job updates several tables and you stop it before every update has committed, some tables will reflect the new data and others will not. Likewise, if the job performs operations that bypass Delta's transaction log and cannot easily be undone, such as deleting or overwriting files directly or writing to an external system, stopping it midway can leave that work half finished.

To reduce these risks, design your jobs so they can be stopped and restarted safely. For example, you can break the job into smaller, atomic steps that can be run independently, and use checkpoints to confirm that each step has completed successfully before moving on to the next. You can also use logging and monitoring tools to track the job's progress and catch issues before they become critical.
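As a rough illustration of the step-and-checkpoint idea, the sketch below records each finished step in a small JSON state file so that a rerun skips completed work. Everything here is hypothetical (the `run_steps` helper, the step names, the state file location, and the `Cancelled` exception used to simulate a cancel); in a real job each step would typically be a single-table write.

```python
import json
import os
import tempfile


class Cancelled(Exception):
    """Stand-in for a job cancellation (hypothetical, for this sketch only)."""


def run_steps(steps, state_path):
    """Run named steps in order, skipping any already recorded in state_path."""
    done = []
    if os.path.exists(state_path):
        with open(state_path) as f:
            done = json.load(f)
    for name, func in steps:
        if name in done:
            continue  # finished in an earlier run
        func()  # if this raises, the step is not recorded as done
        done.append(name)
        # Write to a temp file and rename so the state file is never half-written.
        tmp_path = state_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(done, f)
        os.replace(tmp_path, state_path)
    return done


calls = []
state_file = os.path.join(tempfile.mkdtemp(), "job_state.json")


def flaky_step():
    # Hypothetical step that is interrupted on its first attempt only.
    calls.append("orders")
    if calls.count("orders") == 1:
        raise Cancelled("simulated cancel")


steps = [
    ("customers", lambda: calls.append("customers")),
    ("orders", flaky_step),
    ("summary", lambda: calls.append("summary")),
]

try:
    run_steps(steps, state_file)
except Cancelled:
    pass  # first run stopped part way through

completed = run_steps(steps, state_file)  # rerun resumes at the interrupted step
```

Note that the interrupted step runs again on restart, so each step needs to be idempotent, meaning it is safe to repeat.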


6 REPLIES


brendanc19
New Contributor III

Thank you @Suteja Kanuri

Anonymous
Not applicable

You're welcome! Happy learning

Vartika
Databricks Employee

Hey @Brendan Carey

Hope everything is going great.

Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark @Suteja Kanuri's answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you.

Cheers!

brendanc19
New Contributor III

Will do, thank you Vartika

fabian_r
New Contributor II

Hi, as of 2024, is there any way in the Delta protocol to get transaction control across multiple tables when a job fails?
