kamalendubiswas
Databricks Employee

Database (or schema) versioning is the practice of tracking and managing changes to a database schema over time. It is important for several reasons:

  1. Consistency: Ensures all environments (development, testing, production) have the same schema structure.
  2. Traceability: Provides a history of changes, making it easier to troubleshoot issues.
  3. Collaboration: Allows multiple developers to work on the database simultaneously without conflicts.
  4. Rollback capability: Enables reverting to previous versions if needed.
  5. Automation: Facilitates automated deployments and continuous integration/continuous delivery (CI/CD) pipelines.
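To make the idea of tracking schema changes concrete, here is a minimal sketch of a version history table, using Python's built-in `sqlite3` module as a stand-in database. The `schema_history` table, the `MIGRATIONS` list, and the `migrate` function are hypothetical names chosen for illustration; real tools keep richer metadata (checksums, execution time, installer) and this is not how any specific tool is implemented.

```python
import sqlite3

# Hypothetical migrations: (version, description, SQL). Illustrative only.
MIGRATIONS = [
    (1, "create customers", "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "add email column", "ALTER TABLE customers ADD COLUMN email TEXT"),
]


def migrate(conn: sqlite3.Connection) -> list[int]:
    """Apply pending migrations in version order and return the versions applied."""
    # The history table records which versions have already been applied.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_history ("
        "version INTEGER PRIMARY KEY, description TEXT, "
        "applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_history")}

    newly_applied = []
    for version, description, sql in sorted(MIGRATIONS):
        if version in applied:
            continue  # already applied in an earlier run
        conn.execute(sql)
        conn.execute(
            "INSERT INTO schema_history (version, description) VALUES (?, ?)",
            (version, description),
        )
        newly_applied.append(version)
    conn.commit()
    return newly_applied
```

Running `migrate` twice against the same connection applies each change only once, which is what keeps development, testing, and production environments on the same schema and gives a queryable history of changes.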

Two popular tools for schema migration and versioning are:

  • Flyway: An open-source schema migration tool that supports a wide range of database systems. It uses SQL or Java-based migrations and can be integrated into various build and deployment processes.
  • Liquibase: Another open-source tool for tracking, managing, and applying database schema changes. It supports multiple formats for defining changes, including XML, YAML, JSON, and SQL.
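As an illustration of SQL-based migrations, Flyway's default naming convention for versioned migrations is `V<version>__<description>.sql`, and files are applied in version order. The sketch below parses such filenames and sorts them numerically. The file names and the `Migration`, `parse_migration`, and `order_migrations` helpers are hypothetical; this mimics the ordering idea only and is not Flyway's own code.

```python
import re
from typing import NamedTuple


class Migration(NamedTuple):
    version: tuple[int, ...]
    description: str
    filename: str


# Default Flyway-style pattern: V<version>__<description>.sql
# Version parts may be separated by dots or single underscores (e.g. V1_1 or V1.1).
_PATTERN = re.compile(r"^V(\d+(?:[._]\d+)*)__(.+)\.sql$")


def parse_migration(filename: str) -> Migration:
    """Split a versioned migration filename into version and description."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a versioned migration file: {filename}")
    version = tuple(int(part) for part in re.split(r"[._]", match.group(1)))
    description = match.group(2).replace("_", " ")
    return Migration(version, description, filename)


def order_migrations(filenames: list[str]) -> list[Migration]:
    """Sort by numeric version, so V10 comes after V2 (unlike a plain string sort)."""
    return sorted((parse_migration(name) for name in filenames), key=lambda m: m.version)
```

For example, `order_migrations(["V10__drop_legacy.sql", "V2__create_orders.sql", "V1_1__add_email.sql"])` yields the files in the order 1.1, 2, 10.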

Both Flyway and Liquibase also offer Enterprise/Pro editions with advanced features such as drift detection, policy- or rule-based change control, a UI application, and premium support.

The Databricks SQL SME team has published two informative blogs that provide detailed guidance on integrating Flyway and Liquibase with Databricks for schema versioning and migrations.

These blogs cover the following key points:

  1. An overview of Flyway/Liquibase and their benefits for schema versioning:
    • Explanation of schema versioning concepts
    • Advantages of using tools like Flyway and Liquibase for managing schema changes
    • How these tools improve collaboration, traceability, and automation

  2. Steps to set up Flyway/Liquibase with Databricks:
    • Detailed configuration instructions for connecting to Databricks
    • Driver setup and compatibility information
    • Sample configuration files and connection strings

  3. Examples of common migration scenarios:
    • Creating new tables and modifying existing schemas
    • Handling data migrations
    • Managing environment-specific configurations

  4. Tips for integrating into Databricks workflows and CI/CD pipelines:
    • Best practices for organising migration scripts
    • Strategies for incorporating database changes into existing development processes
    • Guidance on automating migrations as part of CI/CD pipelines
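The connection and automation steps can be sketched as follows. This is a rough outline rather than the configuration from the linked blogs: it assumes personal-access-token authentication and the `jdbc:databricks://` URL form used by recent Databricks JDBC drivers, so check it against the driver version you use. The function names, the host, HTTP path, and token arguments, and the `./migrations` folder are hypothetical placeholders. `FLYWAY_URL` and `FLYWAY_LOCATIONS` are Flyway's environment-variable equivalents of the `url` and `locations` settings.

```python
import os
import subprocess


def databricks_jdbc_url(host: str, http_path: str, token: str) -> str:
    """Build a JDBC URL for a SQL warehouse (assumes token auth, AuthMech=3)."""
    return (
        f"jdbc:databricks://{host}:443;"
        f"httpPath={http_path};AuthMech=3;UID=token;PWD={token}"
    )


def flyway_invocation(host: str, http_path: str, token: str,
                      locations: str = "filesystem:./migrations"):
    """Return the Flyway command and the environment variables it needs.

    Passing the URL through FLYWAY_URL keeps the token off the command line.
    """
    env = {
        "FLYWAY_URL": databricks_jdbc_url(host, http_path, token),
        "FLYWAY_LOCATIONS": locations,
    }
    return ["flyway", "migrate"], env


def run_migrations(host: str, http_path: str, token: str) -> None:
    """Run `flyway migrate`; assumes the flyway CLI and JDBC driver are installed."""
    cmd, env = flyway_invocation(host, http_path, token)
    subprocess.run(cmd, env={**os.environ, **env}, check=True)
```

In a CI/CD pipeline, the host, HTTP path, and token would typically come from the pipeline's secret store, and `run_migrations` would be called once per target environment.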

These blogs provide comprehensive guidance for Databricks users looking to implement robust database/schema versioning and migration practices using industry-standard tools like Flyway and Liquibase.

2 Comments
maciejm
New Contributor

Hi, 

Part of my SQL scripts is written in Python. I noticed that this part of the code is ignored by Flyway. Any ideas on how to get it executed as part of the Flyway migration?

Thanks! 

zl-redgate
New Contributor

(1) By “SQL scripts in Python,” do you mean you are putting the Python code inside a .sql migration file? If so, this won’t work, since Flyway just passes the file contents as SQL statements to the Databricks JDBC driver.

You can create a separate .py Python file and make that a migration script, which Flyway should then run. It just cannot be combined with SQL inside the same .sql file. More information on creating a separate script migration with a .py file: https://documentation.red-gate.com/flyway/flyway-concepts/migrations/script-migrations

Another question (2) - since Databricks has many options, what is running - e.g., Java in Oracle Database or .NET in SQL Server?