Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Run failed with error message Cluster was terminated. Reason: JOB_FINISHED (SUCCESS)

holychs
New Contributor III

I am running a notebook through a workflow on an all-purpose cluster ("data_security_mode": "USER_ISOLATION"), and I am seeing some strange cluster behaviour during the run. While the job is still running, the cluster gets terminated with Reason: JOB_FINISHED (SUCCESS). This causes the running job to fail with the error "Cluster was terminated". I cannot find any details in the cluster event log or the driver log.

1 REPLY

Raman_Unifeye
Contributor III

@holychs - Well, this behaviour needs troubleshooting, I imagine.

- What is the auto-termination value? Try increasing it to a much higher value and observe whether the behaviour is the same.

- Does your workflow have multiple notebook tasks? If Task A finishes while Task B is still running, a glitch in the job context can sometimes trigger a cluster teardown if the cluster was pinned to the job.

- Does your notebook contain conditional logic that calls dbutils.notebook.exit("Success")?

- Are you triggering this job manually while someone else is using the cluster?
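For the auto-termination check in the list above, here is a rough sketch that reads the setting and the recent termination reasons from outside the UI. It assumes the Clusters REST API endpoints `/api/2.0/clusters/get` and `/api/2.0/clusters/events` and a personal access token. The helper names (`api_call`, `termination_summary`, `check_cluster`) are made up for illustration, and the host, token and cluster ID are yours to fill in.

```python
import json
import urllib.request
from datetime import datetime, timezone


def api_call(host, token, path, payload=None):
    """Call a Databricks REST endpoint: GET when payload is None, else POST."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(
        f"{host}{path}",
        data=data,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def termination_summary(events):
    """Return (UTC timestamp, reason code) for every TERMINATING event."""
    summary = []
    for ev in events:
        if ev.get("type") != "TERMINATING":
            continue
        # Event timestamps are in milliseconds since the epoch.
        ts = datetime.fromtimestamp(ev["timestamp"] / 1000, tz=timezone.utc)
        code = ev.get("details", {}).get("reason", {}).get("code", "UNKNOWN")
        summary.append((ts.isoformat(), code))
    return summary


def check_cluster(host, token, cluster_id):
    """Print the auto-termination setting and recent termination reasons."""
    info = api_call(host, token, f"/api/2.0/clusters/get?cluster_id={cluster_id}")
    print("autotermination_minutes:", info.get("autotermination_minutes"))
    resp = api_call(
        host,
        token,
        "/api/2.0/clusters/events",
        {"cluster_id": cluster_id, "event_types": ["TERMINATING"], "limit": 25},
    )
    for ts, code in termination_summary(resp.get("events", [])):
        print(ts, code)
```

Calling `check_cluster("https://<your-workspace-host>", "<token>", "<cluster-id>")` should print the configured idle timeout followed by one line per termination, which you can compare against the start and end times of the failed run.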

Also, check the Runs on the cluster. Go to the Compute page -> Select your Cluster -> Runs tab. This will show you exactly which jobs/notebooks were attached to that cluster at the moment of termination.
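If you would rather script that check, the sketch below filters a list of job runs down to those that used a given cluster. It assumes you have already fetched the `runs` array from the Jobs API (`/api/2.1/jobs/runs/list`, with tasks expanded) and that each run or task carries a `cluster_instance.cluster_id` field, which is how I understand the response to look. The function name and the sample payload are made up for illustration.

```python
def runs_using_cluster(runs, cluster_id):
    """Return (run_id, run_name) for runs whose run or any task used cluster_id."""
    matches = []
    for run in runs:
        # The cluster can be recorded on the run itself or on individual tasks.
        instances = [run.get("cluster_instance", {})]
        instances += [t.get("cluster_instance", {}) for t in run.get("tasks", [])]
        if any(ci.get("cluster_id") == cluster_id for ci in instances):
            matches.append((run.get("run_id"), run.get("run_name", "")))
    return matches


# Made-up sample payload, shaped like the assumed API response.
sample_runs = [
    {"run_id": 1, "run_name": "nightly", "cluster_instance": {"cluster_id": "abc"}},
    {"run_id": 2, "run_name": "adhoc", "tasks": [{"cluster_instance": {"cluster_id": "abc"}}]},
    {"run_id": 3, "run_name": "other", "cluster_instance": {"cluster_id": "xyz"}},
]
shared = runs_using_cluster(sample_runs, "abc")
```

If more than one run shows up for the same cluster around the termination time, that points towards another job finishing and taking the cluster down with it.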

 


RG #Driving Business Outcomes with Data Intelligence