dgomezm, Databricks Employee

In an era where data drives innovation and competitive advantage, protecting it is a non-negotiable priority. When sensitive information is involved, even minor lapses can translate into significant risk and loss. For organizations leveraging Databricks, regular Databricks Runtime (DBR) migrations aren't simply about staying current; they are essential to safeguarding your data, ensuring optimal performance, and driving business value from your analytics and AI investments.

For serverless workloads, Databricks offers a versionless experience that removes the need for customers to manage or upgrade runtime versions altogether. But not all workloads are suited to serverless today. For teams managing classic compute environments, staying current with DBR versions remains a best practice, and at scale it can be challenging.

In this blog, we delve into the DBR migration process, address the challenges organizations may face, and offer actionable best practices along with automation techniques to streamline the transition. Keeping your dependencies up to date not only enhances performance but also serves as a critical defense against vulnerabilities, ensuring that your data remains secure. 

Why do we need to migrate?

At its core, the Databricks Runtime is built upon Apache Spark and enriched by additional libraries and components designed to simplify your analytics and AI workloads. Regular updates help ensure your workflows run securely and smoothly, enabling your teams to focus on driving impactful insights and innovations rather than managing disruptions. 

Migrations also play a critical role in addressing vulnerabilities. Each release includes essential security patches and performance enhancements that strengthen the stability and resiliency of your Databricks environment—future-proofing your analytics and AI strategy. 

Important: End-of-support does not mean your workloads will stop running. However, once a DBR version reaches end-of-support, it no longer receives security updates, bug fixes, or technical support.

Understanding the DBR Lifecycle:

 

| DBR Version | Spark Version | Release Date | End-of-Support Date | Key Changes |
|---|---|---|---|---|
| 15.4 LTS | 3.5.0 | Aug 19, 2024 | Aug 19, 2027 | Stability improvements for large-scale workloads |
| 14.3 LTS | 3.5.0 | Feb 1, 2024 | Feb 1, 2027 | Predictive Optimization GA; automated Delta table maintenance |
| 13.3 LTS | 3.4.1 | Aug 22, 2023 | Aug 22, 2026 | Scala support for Unity Catalog shared clusters; volumes support for storing artifacts |
| 12.2 LTS | 3.3.2 | Mar 1, 2023 | Mar 1, 2026 | Delta Lake performance optimizations; new techniques for joins and aggregations |
| 11.3 LTS | 3.3.0 | Oct 19, 2022 | Oct 19, 2025 | Predictive I/O for accelerated reads (Photon engine) |
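
A quick way to see where a given cluster sits in this lifecycle is to check from a notebook. A minimal sketch in Python (the printed values are illustrative; DATABRICKS_RUNTIME_VERSION is set by the runtime environment):

```python
import os

# Spark version bundled with the current runtime, e.g. "3.5.0"
print(spark.version)

# DBR version exposed by the runtime environment, e.g. "15.4"
print(os.environ.get("DATABRICKS_RUNTIME_VERSION"))
```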

 

In addition to mitigating risks, upgrading to the latest DBR version unlocks a host of significant benefits. Each new release brings value enhancements, such as improved query performance, optimized resource utilization, and advanced data governance features like enhanced metadata management, robust access controls, and lineage tracking. These upgrades not only boost workload efficiency but also help ensure your entire data estate remains compliant with industry standards. For instance, DBR 12.2 leverages Unity Catalog to enable powerful features like row-level and column-level security.

Planning Your DBR Migration: 

Assess your workspace

A successful migration begins by assessing your existing workspace and clearly identifying the resources affected by the upcoming runtime transition. To simplify this critical step, we’ve developed an assessment dashboard [link] to help you quickly identify and prioritize workloads based on DBR versions and job spend, minimizing risk and accelerating your migration. 

Note: This dashboard is not officially supported by Databricks and is provided as a community contribution. Estimates may not reflect actual billing and require system tables and a Unity Catalog-enabled workspace.

[Image: DBR migration assessment dashboard]

Note: Be sure to account for external dependencies—such as Azure Data Factory, Apache Airflow, or other third-party tools directly triggering job clusters—as these will need to be updated accordingly. The impact to these external dependencies falls outside the scope of this dashboard.
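
If you want to explore the underlying data yourself, the same signal can be pulled directly from system tables. A minimal sketch in Python, assuming system tables are enabled; the join and the 30-day window are illustrative simplifications (system.compute.clusters keeps a row per cluster change, so a production query should deduplicate to the latest record):

```python
# Rank job workloads by recent DBU consumption, grouped by DBR version.
df = spark.sql("""
    SELECT
        c.dbr_version,
        u.usage_metadata.job_id AS job_id,
        SUM(u.usage_quantity)   AS dbus_last_30d
    FROM system.billing.usage u
    JOIN system.compute.clusters c
        ON u.usage_metadata.cluster_id = c.cluster_id
       AND u.workspace_id = c.workspace_id
    WHERE u.usage_date >= current_date() - INTERVAL 30 DAYS
      AND u.usage_metadata.job_id IS NOT NULL
    GROUP BY c.dbr_version, u.usage_metadata.job_id
    ORDER BY dbus_last_30d DESC
""")
display(df)
```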

Establishing a Development Environment:

A dedicated development environment is essential for safe testing and validation. This isolated workspace lets you test DBR upgrades, identify potential issues, and iterate without risking production workloads.

If you don’t already have a dev environment, we recommend setting up a separate Databricks workspace specifically for testing purposes. This setup offers better governance, separation of duties, and reproducibility of test results. [Create a Databricks workspace]

Note: Skip this section if a development environment already exists, but ensure the mirrored jobs exactly match their production counterparts to guarantee accurate testing.

Manually Mirroring Jobs into a Development Environment

Carefully replicate existing job configurations into your new development environment:

  1. Navigate to the targeted jobs in your current workspace.
  2. Within the Jobs UI, click on the ellipsis menu (...) and choose “View JSON”.
  3. Click “Create” to automatically generate a Databricks CLI command to recreate this job.
  4. Authenticate with your development workspace via Databricks CLI and run the command to replicate the job. 
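
If you have more than a handful of jobs, the same flow can be scripted. Here is a minimal sketch using the Databricks SDK for Python; the prod and dev profile names and the job_id are assumptions for illustration:

```python
from databricks.sdk import WorkspaceClient

# Assumed profile names from ~/.databrickscfg; adjust to your setup.
src = WorkspaceClient(profile="prod")
dst = WorkspaceClient(profile="dev")

# 123 is an illustrative job_id.
job = src.jobs.get(job_id=123)
s = job.settings

# Recreate the job in the dev workspace. The schedule is deliberately
# not copied, so the mirrored job never fires on its own.
created = dst.jobs.create(
    name=f"{s.name} (dev mirror)",
    tasks=s.tasks,
    job_clusters=s.job_clusters,
    tags=s.tags,
)
print(f"Created dev job {created.job_id}")
```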

 

 

[Animation: generating a job-recreation command via “View JSON” → “Create” in the Jobs UI]

 

Warning: Ensure all dependencies, including notebooks and libraries, are appropriately migrated.

Testing Jobs

With jobs mirrored into your development environment, validate compatibility proactively to identify potential issues early without impacting production workloads. Update each job’s DBR version and closely monitor performance, diagnosing and resolving issues directly within this isolated environment. 
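
Updating the runtime on a mirrored job can also be scripted. A minimal sketch with the Databricks SDK for Python; the dev profile, the job_id, and the target version string are illustrative:

```python
from databricks.sdk import WorkspaceClient

NEW_DBR = "15.4.x-scala2.12"  # illustrative target runtime

w = WorkspaceClient(profile="dev")   # assumed profile name
job = w.jobs.get(job_id=456)         # illustrative job_id
settings = job.settings

# Bump the runtime on job-level clusters...
for jc in settings.job_clusters or []:
    jc.new_cluster.spark_version = NEW_DBR

# ...and on any task-level clusters.
for task in settings.tasks or []:
    if task.new_cluster:
        task.new_cluster.spark_version = NEW_DBR

# reset() overwrites the job definition with the modified settings.
w.jobs.reset(job_id=456, new_settings=settings)
```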

Important: Watch for subtle behavioral changes, not just explicit errors. For example, Spark’s size() function behaves differently across versions in ANSI mode:
| ANSI Mode Enabled | ANSI Mode Disabled |
|---|---|
| size(NULL) → NULL | size(NULL) → -1 |
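
You can reproduce this difference directly in a notebook. A minimal sketch in Python (ANSI mode is a runtime SQL configuration, so it can be toggled per session):

```python
# With ANSI mode disabled (the historical default), size(NULL) returns -1.
spark.conf.set("spark.sql.ansi.enabled", "false")
spark.sql("SELECT size(NULL)").show()

# With ANSI mode enabled, the same expression returns NULL.
spark.conf.set("spark.sql.ansi.enabled", "true")
spark.sql("SELECT size(NULL)").show()
```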

Thorough testing with representative datasets ensures smoother migrations and robust CI/CD practices. For comprehensive testing guidelines, refer to Databricks’ documentation.

Once validated, migrate these configurations confidently to your production environment, ensuring minimal risk and seamless continuity.

Exploring automation opportunities

Manual migration quickly becomes resource-intensive, especially at scale. Databricks strongly encourages automation to simplify complex migrations and reduce the manual effort involved.

  • Databricks Terraform Exporter: Automates extraction and replication of workspace resources, including users, groups, jobs, and notebooks. [Find more here.]

We’re actively developing new automation tools to further streamline your DBR migration experience. Stay tuned for updates and new developments. 

Common Pitfalls and Best Practices:

Avoid migration complexity by:

  • Thoroughly documenting each step: Job configurations, compatibility issues, dependencies, and solutions. 

If you encounter challenges, reach out to your Databricks account team. They provide valuable resources and direct support to keep your migration on track. 

Transforming Migrations Into Strategic Advantage

Regular DBR migrations aren’t just technical housekeeping; they’re strategic opportunities. For example, companies migrating to DBR 14 leveraged Predictive Optimization to significantly reduce query costs and accelerate insights, unlocking new analytics-driven opportunities.

Conclusion

Migrating your Databricks Runtime is not merely routine; it is a strategic imperative. Proactively managing migrations enhances performance, strengthens data security, and simplifies governance, empowering your organization to leverage data and AI more effectively to solve your toughest challenges. Stay proactive and informed, and ensure your Databricks environment remains secure, agile, and ready to support your evolving data initiatives.