Generative AI
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.

Large datasets in Databricks

maltasa
New Contributor II

How can I efficiently handle large datasets in Databricks when performing group-by operations to avoid out-of-memory errors? Are there any best practices or optimizations for improving performance, such as partitioning or caching, especially when working with Spark DataFrames?

1 REPLY

Takuya-Omi
Valued Contributor II

Hi, @maltasa 

I believe this article might help answer your question.

Comprehensive Guide to Optimize Databricks, Spark and Delta Lake Workloads 

--------------------------
Takuya Omi (尾美拓哉)
