a week ago
Hi there,
Short answer
You should call spark.stop() when you're done with each session. What you're doing now (not calling it) works, but it's not ideal: you're relying on the server-side idle timeout to clean up after you, and in the meantime each orphaned session consumes memory on the cluster for its SQLConf and SessionState. On a busy API with many requests, that can accumulate until the cluster eventually reclaims them.
Why the docs say "don't call stop"
The documentation warning about not calling stop() is aimed at a different scenario: running inside a Databricks notebook or workspace environment where the session lifecycle is managed for you. In that context, calling stop() can tear down shared infrastructure you didn't create. It doesn't apply to your situation, where you're an external client creating sessions explicitly via Databricks Connect from an AKS pod. (Databricks Connect in notebooks)
What about automatic cleanup?
You may have read that Databricks Connect handles session cleanup automatically — and that's partially true. There are two mechanisms:
- Process exit / shutdown hooks: PySpark registers an atexit handler that calls stop() on active sessions when the Python process terminates. If you were running a short-lived script (start, do work, exit), this would clean things up for you automatically. However, your API is a long-lived server process, so it doesn't exit between requests. The shutdown hook only fires when the pod itself restarts or scales down, not after each request completes.
- Server-side idle timeout: The Spark Connect server passively cleans up idle sessions after a period of inactivity. The release notes confirm: "Databricks Connect now automatically closes expired sessions on the client side." So sessions do eventually get reclaimed, but in the meantime they're sitting there consuming driver memory. (Databricks Connect release notes)
For a long-running API server creating a new session per request, neither mechanism gives you prompt cleanup. You'd accumulate sessions until the timeout kicks in.
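To make the shutdown-hook point concrete, here is a small standard-library sketch with no Spark involved. It runs a child interpreter that registers an atexit handler and then serves three fake "requests". The names fake_stop, handle_request and run_child are hypothetical stand-ins, not anything from PySpark or Databricks Connect.

```python
import subprocess
import sys
import textwrap

# Child script: registers an atexit hook, then serves three fake "requests".
CHILD = textwrap.dedent("""
    import atexit

    def fake_stop():
        # Hypothetical stand-in for the cleanup a shutdown hook would perform.
        print("cleanup")

    atexit.register(fake_stop)

    def handle_request(i):
        print(f"request {i} done")

    for i in range(3):
        handle_request(i)
""")


def run_child():
    """Run the child script in a fresh interpreter and return its output lines."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout.splitlines()


if __name__ == "__main__":
    # The "cleanup" line appears once, after all requests, when the child exits.
    print(run_child())
```

The handler fires only once, at interpreter exit. In a server process that stays up for days, that is effectively never from the point of view of an individual request.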
What spark.stop() actually does in Databricks Connect
Since version 14.2.0, calling stop() on a Databricks Connect session sends a ReleaseSession RPC to the server, which:
- Interrupts any running operations tied to that session
- Releases server-side resources (memory, cached state)
- Closes the gRPC channel on the client side
Since version 15.1.0, stop() is also idempotent: calling it on an already-closed or expired session won't throw an error. So it's safe to call in a finally block without worrying about race conditions with the idle timeout. (Databricks Connect release notes)
Recommended pattern
```python
from databricks.connect import DatabricksSession


def handle_request():
    spark = DatabricksSession.builder.create()  # new session per request
    try:
        # your Spark work here
        result = spark.sql("SELECT ...")
        return result.collect()
    finally:
        spark.stop()  # clean up immediately
```
A few notes:
- Use .create() rather than .getOrCreate(); you've already figured this out. The create() API was introduced in 16.0 specifically for this use case (it always creates a fresh session rather than returning an existing one). (Databricks Connect release notes)
- Wrap stop() in a finally block so it runs even if your Spark work throws an exception.
- If you're on 18.1.1 as you mentioned, you have all the idempotent-stop and transient-retry improvements, so this is straightforward.
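If several endpoints need the same try/finally handling, one option is to fold it into a context manager. This is only a sketch: spark_session, the factory parameter and _default_factory are hypothetical helpers introduced here for illustration, not part of Databricks Connect. The factory argument is there so the wrapper can be exercised with a stand-in session in tests.

```python
from contextlib import contextmanager


def _default_factory():
    # Imported lazily so this module still loads where databricks-connect
    # is not installed (for example in unit tests).
    from databricks.connect import DatabricksSession

    return DatabricksSession.builder.create()


@contextmanager
def spark_session(factory=_default_factory):
    """Yield a fresh session and always stop it afterwards.

    `factory` is a hypothetical injection point, assumed for this sketch.
    """
    spark = factory()
    try:
        yield spark
    finally:
        # Runs on normal return and when the body raises.
        spark.stop()


def handle_request(query, factory=_default_factory):
    with spark_session(factory) as spark:
        return spark.sql(query).collect()
```

Any exception raised inside the with block still propagates to the caller after stop() has run, so your API's error handling stays unchanged.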
What happens if you don't call stop
It's not catastrophic — the server will eventually clean up idle sessions via the timeout. But you'll accumulate orphaned sessions in the interim, each holding memory on the driver. Under sustained load this can contribute to driver memory pressure.
That said, if your request volume is modest and you're not seeing issues, the idle timeout is probably handling things adequately. It's more of a "doing it properly" thing than a "this will definitely break" thing.
Relevant docs
- Databricks Connect overview: general setup and architecture
- Databricks Connect release notes: where the create() API, idempotent stop(), and session expiry handling are documented
- Compute configuration for Databricks Connect: session builder configuration options
- Databricks Connect in notebooks (workspace behaviour): explains when stop() should not be called (i.e. not your case)
- Spark Connect vs Spark Classic: best practices for the Spark Connect protocol your client uses