Databricks Academy offers the free DevOps Essentials for Data Engineering course, designed to help data engineers apply software engineering best practices and DevOps principles on the Databricks Data Intelligence Platform. Instead of going deep into every tool, this course focuses on the core habits that make data pipelines easier to build, test, and maintain over time.
You’ll learn to:
- Explain the core principles of software engineering best practices for data engineering, including code quality, version control, documentation, and testing
- Describe what DevOps means for data teams, including its main components, benefits, and how CI/CD fits into day-to-day workflows
- Apply modularity principles in PySpark to break code into reusable functions and components
- Design and run unit tests for PySpark functions with pytest, and perform integration testing for Databricks data pipelines using Spark Declarative Pipelines and Jobs
- Use Git operations in Databricks with Git Folders to support basic continuous integration workflows
- Compare the deployment options for Databricks assets (REST API, CLI, SDK, and Databricks Asset Bundles, or DABs) so you know which approaches exist and when each is a good fit
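To give a flavor of the modularity and testing objectives above, here is a minimal sketch of the underlying pattern: keep row-level business logic in pure Python functions so pytest can exercise it without a Spark session. The function and test names are hypothetical illustrations, not material from the course.

```python
# Hypothetical example of the modularity pattern: isolate business
# logic in a pure function so it is unit-testable with plain pytest.

def normalize_region(region: str) -> str:
    """Map free-text region values onto a small canonical set."""
    cleaned = region.strip().lower()
    aliases = {"us": "north_america", "usa": "north_america", "eu": "europe"}
    return aliases.get(cleaned, cleaned)

# A pytest-style unit test for the pure function:
def test_normalize_region():
    assert normalize_region("  USA ") == "north_america"
    assert normalize_region("eu") == "europe"
    assert normalize_region("apac") == "apac"
```

In a pipeline, a function like this could then be applied to a DataFrame column (for example via a UDF or a driver-side mapping), while the logic itself stays trivially testable in CI.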
Designed for:
- Data engineers on Databricks who want to improve the quality and reliability of their pipelines
- Practitioners with solid Databricks Platform experience (workspaces, Delta Lake, Medallion Architecture, Unity Catalog, Delta Live Tables, Workflows)
- Users comfortable with PySpark, intermediate SQL, Python, and basic Git version control
Course format & details:
- Syllabus: 3 sections | 24 lessons
- Duration: 2 hours
- Skill level: Associate
- Cost: Free
🔗 Enroll Now 👈