cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Does "databricks bundle deploy" clean up old files?

xhead
New Contributor II

I'm looking at this page (Databricks Asset Bundles development work tasks) in the Databricks documentation.

When repo assets are deployed to a databricks workspace, it is not clear if the "databricks bundle deploy" will remove files from the target workspace that aren't in the source repo. For example, if a repo contained a notebook named "test1.py" and had been deployed, but then "test1.py" was removed from the repo and a new notebook "test2.py" was created, what is the content of the target workspace after? I believe it will contain both "test1.py" and "test2.py".

Secondly, the description of "databricks bundle destroy" does not indicate that it would remove all files from the workspace - only that it will remove all the artifacts referenced by the bundle. So when the "test1.py" file has been removed from the repo, and the "databricks bundle destroy" is run, will it only remove "test2.py" (which has not yet been deployed)?

I am trying to determine how to ensure that the shared workspace contains only the files that are in the repo - that whatever I do in a release pipeline, I will only have the latest assets in the workspace that are in the repo, and none of the old files that were previously in the repo.

The semantics of "databricks bundle deploy" (in particular the term "deploy") would indicate to me that it should do a clean up of assets in the target workspace as part of the deployment.

But if that is not the case, then if I did a "databricks bundle destroy" prior to the "databricks bundle deploy", would that adequately clean up the target workspace? Or do I need to do something with "databricks fs rm" to delete all the files in the target workspace folder prior to the bundle deploy?

15 REPLIES 15

ganapati
New Contributor II

@JamesGraham this issue is related to "databricks bundle deploy" command itself, when run inside ci/cd pipeline, i am still seeing old configs in bundle.tf.json. Ideally it should be updated to changes done from previous run. But i am still seeing error for old configs. If this is still unrelated i can create new thread๐Ÿ˜€

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local communityโ€”sign up today to get started!

Sign Up Now