09-30-2024 08:05 AM
Can anyone please tell me why df.cache() and df.persist() are not supported in Serverless compute?
Many Thanks
09-30-2024 08:32 AM
Global caching functionality (and other global state used on classic clusters) is conceptually hard to represent on serverless compute.
The serverless Spark cluster optimizes caching on your behalf, rather than relying on user-managed caching.
09-30-2024 08:59 AM
Many Thanks
05-13-2025 05:50 AM
Hi. I'm not fully convinced that Serverless can optimize Spark cache better than the user, since I still see query plans with recomputed operations. What is the recommended best practice to avoid recomputation in a Serverless environment? Write out intermediate dataframes?
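One common workaround for the recomputation question above is to materialize an intermediate DataFrame to a table and read it back, so downstream queries scan the stored result instead of re-running the upstream plan. A minimal sketch of such a helper is below; `materialize` is a hypothetical name (not a Spark API), and it assumes `df` and `spark` follow the standard PySpark `DataFrameWriter` / `DataFrameReader` interfaces:

```python
def materialize(df, spark, table_name):
    """Write an intermediate DataFrame out once, then return a fresh
    DataFrame backed by the stored table. The returned DataFrame's plan
    is a simple table scan, so downstream reuse does not recompute the
    original transformations. Hypothetical helper, not a Spark API."""
    # Persist the intermediate result to a managed table
    # (overwrite so reruns of the job are idempotent).
    df.write.mode("overwrite").saveAsTable(table_name)
    # Read it back; this breaks the lineage to the original plan.
    return spark.read.table(table_name)
```

Usage would look like `clean = materialize(raw.filter(...), spark, "tmp_clean")`, after which multiple aggregations over `clean` scan the table rather than repeating the filter. The trade-off versus `cache()` is extra storage I/O in exchange for behavior that works identically on serverless and classic compute.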
02-27-2025 01:49 PM
What I do wish were possible is for serverless to warn that caching is not supported, rather than error on the call. The current behavior makes switching between compute types (serverless and all-purpose) brittle and keeps code from being interoperable regardless of compute type, which is significant friction against adopting serverless completely. Even a parameter (e.g. .cache(try=True)) would support this kind of workflow more elegantly.
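Until something like that exists, a small wrapper can give the desired "warn, don't error" behavior today. This is a sketch of a hypothetical helper (not a Databricks or Spark API); it assumes the only signal available is the exception raised when caching is unsupported:

```python
import warnings

def cache_if_supported(df):
    """Try to cache a DataFrame; on compute that rejects caching
    (e.g. serverless), warn and return the uncached DataFrame instead
    of raising. Hypothetical helper, not a Spark API."""
    try:
        return df.cache()
    except Exception as exc:
        # Caching is a performance hint, not a correctness requirement,
        # so degrading to the uncached DataFrame is safe.
        warnings.warn(f"cache() not supported on this compute: {exc}")
        return df
```

The same code path then runs unchanged on all-purpose and serverless compute; on classic clusters the DataFrame is cached as usual, and on serverless the call degrades to a no-op with a warning.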