Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Changing paths to tables

IONA
New Contributor III

Hi

My organization has many notebooks that reference tables in schemas using the three-part path

catalog.schema.tablename

With a lack of foresight we hardcoded all of these paths in the code. Now the inevitable is happening: we need to restructure the catalogs and schemas, which would immediately break these paths, so they have to be changed. In the new version I expect we will parameterize the paths so that they depend on the environment they run in, and future changes will not cause the headache we face now!
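To make the parameterization idea concrete, here is a minimal sketch of what I have in mind, not a finished design. Everything named in it is an assumption for illustration: the environment names, the catalog names, the `ENV` variable, and the helpers `current_env`, `resolver_for` and `TableResolver` are all hypothetical.

```python
import os
from dataclasses import dataclass

# Hypothetical mapping from environment name to catalog name.
# Restructuring catalogs later would mean editing this mapping only.
CATALOG_BY_ENV = {
    "dev": "dev_catalog",
    "test": "test_catalog",
    "prod": "prod_catalog",
}


@dataclass(frozen=True)
class TableResolver:
    """Builds three-part table names for one catalog."""

    catalog: str

    def table(self, schema: str, name: str) -> str:
        # Returns catalog.schema.tablename
        return f"{self.catalog}.{schema}.{name}"


def current_env(default: str = "dev") -> str:
    # Assumption: the environment is passed in an ENV variable.
    # In a notebook it could equally come from a job parameter or widget.
    return os.environ.get("ENV", default)


def resolver_for(env: str) -> TableResolver:
    """Return a resolver for the given environment, failing loudly if unknown."""
    if env not in CATALOG_BY_ENV:
        raise ValueError(f"Unknown environment: {env!r}")
    return TableResolver(CATALOG_BY_ENV[env])
```

A notebook would then build its paths with something like `tables = resolver_for(current_env())` followed by `tables.table("sales", "orders")` (schema and table names made up here), instead of spelling out the three-part path inline.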

For your information, our notebooks are linked to an Azure Git repository, and our workflow tasks point to this repository to run the jobs.

My thought was to create a migration branch of our repository and make the necessary code changes there. Then, when the new catalog is in place and ready to be used, we could merge this migration branch into the main branch and the paths in the notebooks would be correct.

However, while work goes on in the migration branch to parameterize these paths, normal code changes to the main branch must continue in order to service organizational needs. This risks conflicts when the migration branch is merged into main, and we cannot afford to spend time sorting these out on merge day.

So really we want to make the path changes in migration while development continues in main, and regularly pull the new work from main into migration. That way, when the final pull from migration to main happens, the files differ only in the path changes made in migration.

My question: is this feasible? That is, a regular merge of main => migration to keep the notebooks in sync before the final migration => main merge takes place.

Or is there a better way that my limited DevOps experience is missing?

Thanks

Gary

 

 

1 REPLY

szymon_dybczak
Esteemed Contributor III

Hi @IONA ,

Definitely. I would say that's even common practice. Create a feature branch and make the necessary changes there, but once a day merge into that feature branch all the changes that have appeared on your main branch. That way you avoid the typical problems of long-running feature branches. If you merge frequently, the conflicts stay small and are much easier to handle.
