Getting client.session.cache.size warning in PySpark code using Databricks Connect
04-18-2024 04:28 AM
Hi Community,
I have set up a Jupyter notebook on a server and installed Databricks Connect in its kernel so that the notebook can use my Databricks cluster's compute and I can write PySpark code.
Whenever I run my code, it gives me the warning below:
```
WARN SparkClientManager: DBConnect client for session <session_id> has been closed as the client cache reached the maximum size: 20. You can change the cache size by changing the conf value for spark.databricks.service.client.session.cache.size
```
Is this warning a concern, and what does it mean?
04-18-2024 07:21 AM
The warning indicates that the client cache (used to manage connections between your local environment and the Databricks cluster) has reached its maximum size (20 sessions). When this limit is reached, the oldest session is closed to make room for a new one.
As the warning suggests, you can raise the limit by changing the conf value spark.databricks.service.client.session.cache.size.
The warning itself is not critical, but if you frequently open and close sessions, you may see performance overhead from the cache management.
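For reference, here is a minimal sketch of how to keep the cache from churning in the first place; it assumes Databricks Connect v2 (databricks-connect 13.x or later) and an already-configured connection profile, and the cache-size value shown in the comment (50) is only an illustrative example:

```python
# Minimal sketch, assuming Databricks Connect v2 and a configured default profile.
# Reusing one session via getOrCreate() avoids creating many short-lived sessions
# that fill up the server-side client cache and trigger this warning.
from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.getOrCreate()

df = spark.range(10)
df.show()

# The limit itself (spark.databricks.service.client.session.cache.size) is a
# cluster-side Spark conf. If you do need to raise it, set it in the cluster's
# Spark configuration (Compute > your cluster > Advanced options > Spark config),
# for example (50 is just an illustrative value):
#   spark.databricks.service.client.session.cache.size 50
```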
04-19-2024 11:13 PM
Thank you @Riyakh

