
Learning Series | Databricks Performance Optimization

Tushar_Parekar
Databricks Employee

Databricks Academy offers the free Databricks Performance Optimization course to help data engineers improve workload performance on the Databricks Data Intelligence Platform. As part of the Advanced Data Engineering with Databricks series, it focuses on practical ways to optimize Spark and Delta Lake workloads, improve physical data layout, and use the Spark UI to troubleshoot performance issues.

You’ll learn to:

  • Use the Spark UI to debug performance: Learn how to identify jobs, stages, tasks, and key signals like shuffle, spill, file pruning, and storage access.
  • Improve data layout for better query speed: Understand small-file issues, compare Z-Ordering and Liquid Clustering, and use Databricks features to keep file layouts healthy.
  • Optimize code and query execution: Diagnose skew, shuffle, and UDF bottlenecks, then apply techniques like AQE, broadcast joins, and native functions to improve performance.
  • Choose the right compute for the workload: Learn how to select cluster types, instance families, and Photon settings using practical Databricks rules of thumb.

Designed for:

  • Data engineers working with Spark and Delta Lake workloads
  • Users with basic Databricks development skills
  • Learners with intermediate PySpark and Delta Lake experience

Course format & details:

  • Series: Part of the Advanced Data Engineering with Databricks series
  • Syllabus: 4 sections | 19 lessons
  • Duration: 2 hours
  • Skill level: Professional
  • Cost: Free
  • Includes labs: No

🔗 Enroll Now
