Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

table deployment (DDL) from one catalog to another

dpc
New Contributor III

Hello

We have a development, a test and a production environment

How do you generally deploy DDL changes?

So: alter a table in development, then apply the same change to test and then production

e.g.

table1 has column1, column2, column3

I add column4

I now want to deploy this change to production

I also want to retain data
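
In SQL terms, the change I want to roll forward is a single statement (the column type here is only an example):

ALTER TABLE table1 ADD COLUMNS (column4 STRING);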

 

Thanks.

2 REPLIES

filipniziol
Contributor

Hi @dpc ,

  • Managing DDL changes across environments (dev, test, prod) should be part of your CI/CD pipeline.
  • Create a Deployments folder with subfolders, where each deployment is represented by a timestamped folder. Example folder structure:

 

/Deployments
    /Deployments_20240924
        /01_add_column4_table1.sql
        /02_alter_other_table.sql
    /Deployments_20240930
        /01_create_new_table.sql
        /02_update_table1.sql

 

  • Inside each folder, every script (e.g., 01_add_column4_table1.sql) represents an individual DDL change, like adding a column or altering a table (see the sketches below this list).
  • You can then integrate your CI/CD pipeline with the deployment folder:
    1. Add the deployment folder name to your CI/CD pipeline.
    2. After your notebooks are deployed, add a step to the pipeline that runs the scripts inside that deployment folder.
  • Alternatively, you can create a deployment job for every deployment. Each job has multiple tasks, where each task represents a specific DDL change. In the example, you would create a Deployment_20240924 job with two tasks: run 01_add_column4_table1.sql, then run 02_alter_other_table.sql.
    1. Running the job could also be a step in your CI/CD pipeline.
    2. Alternatively, the job could be scheduled to run at a given time (but then you need to know when your deployment is going to happen).
    3. Alternatively, the job could be run manually after the CI/CD pipeline finishes deploying the notebooks.
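
As a sketch, a script like 01_add_column4_table1.sql could look as follows (the prod catalog and my_schema schema are placeholders; parameterize them per environment):

-- 01_add_column4_table1.sql
-- Point the script at the target environment's catalog and schema.
USE CATALOG prod;
USE SCHEMA my_schema;

-- On a Delta table, ADD COLUMNS is a metadata-only operation:
-- existing rows are kept and read back NULL for the new column.
ALTER TABLE table1 ADD COLUMNS (column4 STRING COMMENT 'added in Deployments_20240924');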
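
Either way, a minimal post-deployment smoke test (catalog and schema names again illustrative) can confirm the change landed by querying the Unity Catalog information schema; it returns a row only if column4 exists:

-- Smoke test: confirm column4 is present in production.
SELECT column_name
FROM prod.information_schema.columns
WHERE table_schema = 'my_schema'
  AND table_name = 'table1'
  AND column_name = 'column4';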

dpc
New Contributor III

Thanks.

I'll step through this solution and see if I can get it working
