Re: How to restart the Spark session within the no... - Databricks Community - 18763



Hi All,

I want to run an ETL pipeline sequentially in my Databricks notebook. If I run it without resetting the Spark session or restarting the cluster, I get a data frame key error. I think this might be caused by the Spark cache, because if I restart the cluster and run the next item in the sequence, I don't get this issue. I clear the Spark cache at the end of the ETL pipeline for each run, but I still face the same issue. I thought stopping the Spark session and starting it again would fix this, but that reattaches the notebook, and execution starts from the beginning rather than from the next item in the sequence.

Thanks,

Chandan
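For reference, the "clear the cache after each run" approach described above can be sketched roughly as follows. This is only an illustration, not a confirmed fix for the key error: `run_items_sequentially`, `items`, and `run_item` are hypothetical names, and `spark` is assumed to be the session object that a Databricks notebook already provides. The only Spark call used is `spark.catalog.clearCache()`.

```python
def run_items_sequentially(spark, items, run_item):
    """Run each pipeline item in order, clearing Spark's cache after each one.

    `run_item` is a hypothetical callable taking (spark, item); it stands in
    for whatever one ETL run does in the notebook.
    """
    completed = []
    for item in items:
        try:
            run_item(spark, item)
            completed.append(item)
        finally:
            # Drop all cached tables and DataFrames from the in-memory cache,
            # even if this item failed, so the next item starts without them.
            spark.catalog.clearCache()
    return completed


# Hypothetical usage inside a notebook cell:
# run_items_sequentially(spark, ["item_1", "item_2"], my_etl_function)
```

Note that this only clears Spark's cache. Python variables, temp views, and module-level state left over from a previous item are not touched, so if the error comes from one of those, clearing the cache alone would not be expected to help.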

2 REPLIES

Hi @Kaniz Fatma,


Is there a solution to the above problem? I also would like to restart SparkSession to free my cluster's resources, but when calling

spark.stop()

the notebook automatically detaches and the following error occurs:

The spark context has stopped and the driver is restarting. Your notebook will be automatically reattached.

Is there a recommended way to restart SparkSession?
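The thread does not record an answer to this. One option that avoids `spark.stop()` entirely, offered as a sketch rather than a recommended practice, is to run each step in a session created with `spark.newSession()`. In PySpark that call returns a session with its own SQL configuration, temporary views, and UDFs, while sharing the same SparkContext and table cache. The helper name `run_in_fresh_session` and the `step` callable are hypothetical, and this assumes the cluster's access mode permits `newSession()`.

```python
def run_in_fresh_session(spark, step):
    """Run `step` in a new session derived from `spark`, without stopping Spark.

    `step` is a hypothetical callable that takes a session and returns a result.
    """
    # New session: separate SQL conf, temp views, and UDFs;
    # the SparkContext and table cache are still shared with `spark`.
    session = spark.newSession()
    try:
        return step(session)
    finally:
        # The table cache is shared, so clear it explicitly after the step.
        session.catalog.clearCache()


# Hypothetical usage inside a notebook cell:
# result = run_in_fresh_session(spark, lambda s: s.range(10).count())
```

This does not restart the driver or release executors, so it is not a true session restart. It only isolates session-level state between steps and drops cached data.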
