
Learning Series | Data Preparation for Machine Learning

Tushar_Parekar
Databricks Employee

Databricks Academy offers the free Data Preparation for Machine Learning course to help associate-level data scientists and ML practitioners prepare data for traditional machine learning on the Databricks Data Intelligence Platform.
As the first course in the “Machine Learning with Databricks” series, it focuses on practical steps for exploring, cleaning, and organizing data so it’s ready for your models.

You’ll learn to:

  • Understand how Databricks supports machine learning: Learn how core storage, governance, and collaboration features help you prepare data and run ML safely at scale.
  • Explore and clean your data at scale: Use Spark and visualizations to profile your data, spot issues like missing values or outliers, and apply practical cleaning steps.
  • Engineer and manage useful features: Build, transform, and organize features with Spark, then keep them consistent across training and inference with Feature Store and Lakebase.
  • Use modern Databricks ML workflows: Streamline analysis and development with Genie Code (Agent Mode), Serverless Compute, and updated notebooks that reduce setup time.

Recent updates:

  • Integrated Genie Code (Agent Mode) into the course to support more guided, conversational exploratory analysis and streamlined development
  • Updated notebooks for full compatibility with Serverless Compute, removing dependencies on classic clusters and simplifying setup
  • Expanded coverage of Lakebase in the online feature store discussion to reflect the latest Databricks capabilities for managing and serving features at scale

Designed for:

  • ML practitioners and associate-level data scientists on Databricks who want stronger data preparation skills
  • Learners who have completed Get Started with Databricks for Machine Learning (Onboarding) or have equivalent Databricks ML experience
  • Users comfortable with Python and common data libraries, basic ML concepts, and core lakehouse fundamentals

Course format & details:

  • Syllabus: 3 sections | 17 lessons
  • Duration: 2 hours
  • Skill level: Associate
  • Cost: Free

 
