3 weeks ago
I'm looking at this page (Databricks Asset Bundles development work tasks) in the Databricks documentation.
When repo assets are deployed to a Databricks workspace, it is not clear whether "databricks bundle deploy" will remove files from the target workspace that are not in the source repo. For example, suppose a repo contained a notebook named "test1.py" and had been deployed, but then "test1.py" was removed from the repo and a new notebook "test2.py" was created. What are the contents of the target workspace after the next deployment? I believe it will contain both "test1.py" and "test2.py".
Secondly, the description of "databricks bundle destroy" does not indicate that it will remove all files from the workspace - only that it will remove the artifacts referenced by the bundle. So when "test1.py" has been removed from the repo and "databricks bundle destroy" is run, will it only remove "test2.py" (which has not yet been deployed)?
I am trying to determine how to ensure that the shared workspace contains only the files that are in the repo - that whatever I do in a release pipeline, the workspace will hold only the latest assets from the repo, and none of the old files that were previously in it.
The semantics of "databricks bundle deploy" (in particular the term "deploy") suggest to me that it should clean up assets in the target workspace as part of the deployment.
But if that is not the case, would running "databricks bundle destroy" before "databricks bundle deploy" adequately clean up the target workspace? Or do I need to use "databricks fs rm" to delete all the files in the target workspace folder before the bundle deploy?
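For context, the two cleanup strategies being asked about could be sketched as a release-pipeline script like the one below. This is a sketch, not a confirmed answer: it assumes a recent Databricks CLI with bundle support, the target name "prod" and the workspace folder path are placeholders, and note that destroy removes the bundle's jobs/pipelines as well as its files, which may not be desirable in every environment.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run from the repo root that contains databricks.yml.

# Option A (assumption to be verified): tear down everything the previous
# deployment created, then redeploy from the current repo contents.
databricks bundle destroy -t prod --auto-approve
databricks bundle deploy -t prod

# Option B: brute-force cleanup of the workspace folder before deploying.
# The path below is a placeholder. Workspace notebooks/files are managed
# with "databricks workspace delete"; "databricks fs rm" targets DBFS paths.
# databricks workspace delete "/Workspace/Shared/my-bundle" --recursive
# databricks bundle deploy -t prod
```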
3 weeks ago
Hi @xhead ,
When deploying repo assets to a Databricks workspace using the "databricks bundle deploy" command, it's essential to understand how it interacts with existing files in the target workspace.
Let's address your concerns:
The behaviour of "databricks bundle deploy":
"databricks bundle destroy":
Ensuring Workspace Consistency:
Semantic Implications:
Remember to tailor your approach based on your specific requirements and workflow. Happy bundling!
3 weeks ago
One further question:
Which bundle configuration files? The ones in the repo, or are there bundle configuration files in the target workspace location that are used? If the previous version of the bundle contained a reference to test1.py and was deployed to a shared workspace, and the new version of the repo no longer contains test1.py, will the destroy command remove test1.py from the shared workspace?
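One way to investigate this is to inspect what the bundle actually wrote to the workspace after a deploy. The sketch below assumes the CLI's default workspace root layout; the user email, bundle name, and target name are placeholders, and the exact folder layout may vary by CLI version:

```shell
# Assumed default workspace root for a deployed bundle (placeholders shown):
#   /Workspace/Users/<you>/.bundle/<bundle_name>/<target>/
#     files/   <- synced copy of the repo's files (where test1.py would live)
#     state/   <- deployment state the CLI consults on deploy/destroy
databricks workspace list "/Users/someone@example.com/.bundle/my_bundle/dev"
databricks workspace list "/Users/someone@example.com/.bundle/my_bundle/dev/files"
```

Comparing the contents of "files/" before and after a deploy (and after a destroy) would show directly whether a file dropped from the repo, like test1.py, is cleaned up.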