<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Does &quot;databricks bundle deploy&quot; clean up old files? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/53194#M29725</link>
    <description>&lt;P&gt;I'm looking at &lt;A href="https://learn.microsoft.com/en-us/azure/databricks/dev-tools/bundles/work-tasks" target="_blank" rel="noopener"&gt;this page (Databricks Asset Bundles development work tasks)&lt;/A&gt;&amp;nbsp;in the Databricks documentation.&lt;/P&gt;&lt;P&gt;When repo assets are deployed to a Databricks workspace, it is not clear whether "databricks bundle deploy" will remove files from the target workspace that aren't in the source repo. For example, if a repo contained a notebook named "test1.py" and had been deployed, but then "test1.py" was removed from the repo and a new notebook "test2.py" was created, what is the content of the target workspace after the next deploy? I believe it will contain both "test1.py" and "test2.py".&lt;/P&gt;&lt;P&gt;Secondly, the description of "databricks bundle destroy" does not indicate that it would remove all files from the workspace - only that it will remove the artifacts referenced by the bundle. So when the "test1.py" file has been removed from the repo and "databricks bundle destroy" is run, will it only remove "test2.py" (which has not yet been deployed)?&lt;/P&gt;&lt;P&gt;I am trying to determine how to ensure that the shared workspace contains only the files that are in the repo - that whatever I do in a release pipeline, the workspace ends up with only the latest assets from the repo, and none of the old files that were previously in it.&lt;/P&gt;&lt;P&gt;The semantics of "databricks bundle deploy" (in particular the term "deploy") suggest to me that it should clean up assets in the target workspace as part of the deployment.&lt;/P&gt;&lt;P&gt;But if that is not the case, would running "databricks bundle destroy" prior to "databricks bundle deploy" adequately clean up the target workspace? Or do I need to do something with "databricks fs rm" to delete all the files in the target workspace folder prior to the bundle deploy?&lt;/P&gt;</description>
    <pubDate>Mon, 20 Nov 2023 21:43:21 GMT</pubDate>
    <dc:creator>xhead</dc:creator>
    <dc:date>2023-11-20T21:43:21Z</dc:date>
    <item>
      <title>Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/53194#M29725</link>
      <description>&lt;P&gt;I'm looking at &lt;A href="https://learn.microsoft.com/en-us/azure/databricks/dev-tools/bundles/work-tasks" target="_blank" rel="noopener"&gt;this page (Databricks Asset Bundles development work tasks)&lt;/A&gt;&amp;nbsp;in the Databricks documentation.&lt;/P&gt;&lt;P&gt;When repo assets are deployed to a Databricks workspace, it is not clear whether "databricks bundle deploy" will remove files from the target workspace that aren't in the source repo. For example, if a repo contained a notebook named "test1.py" and had been deployed, but then "test1.py" was removed from the repo and a new notebook "test2.py" was created, what is the content of the target workspace after the next deploy? I believe it will contain both "test1.py" and "test2.py".&lt;/P&gt;&lt;P&gt;Secondly, the description of "databricks bundle destroy" does not indicate that it would remove all files from the workspace - only that it will remove the artifacts referenced by the bundle. So when the "test1.py" file has been removed from the repo and "databricks bundle destroy" is run, will it only remove "test2.py" (which has not yet been deployed)?&lt;/P&gt;&lt;P&gt;I am trying to determine how to ensure that the shared workspace contains only the files that are in the repo - that whatever I do in a release pipeline, the workspace ends up with only the latest assets from the repo, and none of the old files that were previously in it.&lt;/P&gt;&lt;P&gt;The semantics of "databricks bundle deploy" (in particular the term "deploy") suggest to me that it should clean up assets in the target workspace as part of the deployment.&lt;/P&gt;&lt;P&gt;But if that is not the case, would running "databricks bundle destroy" prior to "databricks bundle deploy" adequately clean up the target workspace? Or do I need to do something with "databricks fs rm" to delete all the files in the target workspace folder prior to the bundle deploy?&lt;/P&gt;</description>
      <pubDate>Mon, 20 Nov 2023 21:43:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/53194#M29725</guid>
      <dc:creator>xhead</dc:creator>
      <dc:date>2023-11-20T21:43:21Z</dc:date>
    </item>
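    As a minimal sketch only: the two clean-up strategies the question above weighs (destroy-then-deploy versus wiping the workspace folder first) could look like the following. The target name "staging" and the workspace folder path are hypothetical placeholders, and the thread does not confirm either sequence as the recommended approach.

    ```shell
    # Option A (hypothetical): remove previously deployed resources, then redeploy.
    databricks bundle destroy -t staging --auto-approve
    databricks bundle deploy -t staging

    # Option B (hypothetical): delete the bundle's workspace folder first, then
    # redeploy. Note that workspace files are removed with "databricks workspace
    # delete" rather than "databricks fs rm", which targets DBFS paths. Only safe
    # if nothing else lives under this folder.
    databricks workspace delete /Workspace/DAB/my_bundle/staging --recursive
    databricks bundle deploy -t staging
    ```

    Both options require an authenticated Databricks CLI and a real target workspace, so this is a usage sketch rather than something runnable in isolation.
    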
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/53306#M29776</link>
      <description>&lt;P&gt;One further question:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The purpose of&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;“databricks bundle destroy”&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;is to remove all previously-deployed jobs, pipelines, and artifacts that are defined in &lt;EM&gt;the bundle configuration files&lt;/EM&gt;.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Which bundle configuration files? The ones in the repo? Or are there bundle configuration files in the target workspace location that are used? If the previous version of the bundle contained a reference to test1.py and it has been deployed to a shared workspace, and the new version of the repo no longer contains test1.py, will the destroy command remove test1.py from the shared workspace?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 21 Nov 2023 14:53:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/53306#M29776</guid>
      <dc:creator>xhead</dc:creator>
      <dc:date>2023-11-21T14:53:59Z</dc:date>
    </item>
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/64038#M32440</link>
      <description>&lt;P&gt;With the newer Databricks CLI (v0.215.0) this seems to be broken.&amp;nbsp; Now I can't destroy a bundle if it doesn't exist - it used to be idempotent.&amp;nbsp; Now I get this error (I shortened my deploy area to &amp;lt;ws&amp;gt; below):&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Starting plan computation&lt;BR /&gt;Planning complete and persisted at &amp;lt;ws&amp;gt;/dab-stage/pytest/.databricks/bundle/new-cluster/terraform/plan&lt;/P&gt;&lt;P&gt;No resources to destroy in plan. Skipping destroy!&lt;BR /&gt;Error: open &amp;lt;ws&amp;gt;/dab-stage/pytest/.databricks/bundle/new-cluster/terraform/terraform.tfstate: no such file or directory&lt;BR /&gt;make: *** [test-on-cluster] Error 1&lt;/P&gt;</description>
      <pubDate>Mon, 18 Mar 2024 23:18:39 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/64038#M32440</guid>
      <dc:creator>fbaxter</dc:creator>
      <dc:date>2024-03-18T23:18:39Z</dc:date>
    </item>
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/64825#M32671</link>
      <description>&lt;P&gt;Will you add a synchronization option that does not remove existing jobs and pipelines?&lt;/P&gt;&lt;P&gt;We are using DAB for DBT and generally it works well; however, lifecycling models is a bit of an issue at the moment &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 27 Mar 2024 16:54:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/64825#M32671</guid>
      <dc:creator>db_allrails</dc:creator>
      <dc:date>2024-03-27T16:54:03Z</dc:date>
    </item>
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/82573#M36688</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Quick update on this&lt;/STRONG&gt;: Now if you remove a file locally (or from Git in the case of CI/CD) and run "bundle deploy" from the CLI, it will remove the corresponding file from your Databricks workspace.&amp;nbsp;&lt;/P&gt;&lt;P&gt;e.g.&lt;BR /&gt;1. Add new file locally, run "bundle deploy"&lt;BR /&gt;2. File appears in Databricks workspace&lt;BR /&gt;3. Remove file locally, run "bundle deploy"&lt;BR /&gt;4. File is removed automatically from the Databricks workspace&lt;/P&gt;&lt;P&gt;Therefore, I don't think there's a need to manually do a cleanup of files.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 09 Aug 2024 15:21:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/82573#M36688</guid>
      <dc:creator>jgraham0325</dc:creator>
      <dc:date>2024-08-09T15:21:52Z</dc:date>
    </item>
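    The four-step round trip described in the post above can be sketched as a hypothetical command sequence; the target name "dev" and the file path "src/test2.py" are assumptions, not taken from the thread.

    ```shell
    # 1. Add a new file locally and deploy: it appears in the workspace.
    echo "print('hello')" > src/test2.py
    databricks bundle deploy -t dev

    # 2. Remove the file locally and deploy again: per the post above,
    #    the corresponding workspace file is removed automatically.
    rm src/test2.py
    databricks bundle deploy -t dev
    ```

    This assumes the file lives under a path synced by the bundle (e.g. covered by the bundle's sync/include settings) and an authenticated CLI.
    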
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/82726#M36728</link>
      <description>&lt;P&gt;xhead&amp;nbsp;I think the configuration files it's referring to are the local ones in your repo. It checks these against what has been deployed in the workspace and will remove anything that you've got rid of in your repo in the new version. Behind the scenes it uses a Terraform state file to keep track of what has been deployed, which is saved in the workspace along with the other files in the bundle.&amp;nbsp;&lt;/P&gt;&lt;P&gt;In your example, yes, it should remove test1.py from the shared workspace.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Aug 2024 10:33:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/82726#M36728</guid>
      <dc:creator>JamesGraham</dc:creator>
      <dc:date>2024-08-12T10:33:05Z</dc:date>
    </item>
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/105532#M42171</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/103679"&gt;@JamesGraham&lt;/a&gt;&amp;nbsp;that makes sense depending on the workflow that was implemented. When deploying bundles from a local clone of the repo, the tfstate will be local and (hopefully) kept intact, and then the behavior will be what you describe.&lt;BR /&gt;&lt;BR /&gt;But what happens when &lt;STRONG&gt;databricks bundle&lt;/STRONG&gt; is issued from inside a CI/CD pipeline on an ephemeral environment? The .tfstate in that ephemeral env will be lost at the end of the pipeline, and then, if a newer version is later deployed with changes to the bundle definition, any previously deployed resource that got removed would be abandoned in the environment instead of cleaned up.&lt;/P&gt;</description>
      <pubDate>Tue, 14 Jan 2025 07:57:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/105532#M42171</guid>
      <dc:creator>RobertoBruno</dc:creator>
      <dc:date>2025-01-14T07:57:59Z</dc:date>
    </item>
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/106434#M42491</link>
      <description>&lt;P&gt;Similar issue. Databricks bundle is issued from inside a CI/CD pipeline. If we rename a job, the old job will not be deleted in test or production workspaces. How do we fix it, optimally the job would be the same with a new name, but the alternative is that at least the old job would be deleted.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jan 2025 10:49:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/106434#M42491</guid>
      <dc:creator>pernilak</dc:creator>
      <dc:date>2025-01-21T10:49:54Z</dc:date>
    </item>
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/108657#M43106</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/115511"&gt;@jgraham0325&lt;/a&gt;&amp;nbsp;What CLI version are you using?&lt;/P&gt;</description>
      <pubDate>Mon, 03 Feb 2025 20:56:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/108657#M43106</guid>
      <dc:creator>js54123875</dc:creator>
      <dc:date>2025-02-03T20:56:10Z</dc:date>
    </item>
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/109249#M43256</link>
      <description>&lt;P&gt;I'm using v0.240.0&lt;/P&gt;</description>
      <pubDate>Thu, 06 Feb 2025 17:00:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/109249#M43256</guid>
      <dc:creator>JamesGraham</dc:creator>
      <dc:date>2025-02-06T17:00:50Z</dc:date>
    </item>
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/109256#M43259</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/26535"&gt;@RobertoBruno&lt;/a&gt;&amp;nbsp;The tfstate used is actually the one stored in the Databricks workspace, not on the local filesystem. So providing you keep using the same root_path for your DAB, it should still correctly clean up any jobs you remove from your code in Git.&lt;/P&gt;
&lt;P&gt;e.g. the root_path for staging should be fixed for each DAB, and only the service principal running the CI process should be able to run this deployment:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;  staging:
    workspace:
      host: https://adb-123456789.1.azuredatabricks.net/
      root_path: /Workspace/DAB/${bundle.name}/${bundle.target}&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;A way to demonstrate this is:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Create new job in DAB locally&lt;/LI&gt;
&lt;LI&gt;Run bundle deploy&lt;/LI&gt;
&lt;LI&gt;Job appears in Databricks Workspace's list of jobs&lt;/LI&gt;
&lt;LI&gt;Delete new job locally&lt;/LI&gt;
&lt;LI&gt;Delete local .databricks folder (containing local tfstate). This simulates a new CI/CD run on a fresh build agent.&lt;/LI&gt;
&lt;LI&gt;Run bundle deploy&lt;/LI&gt;
&lt;LI&gt;Result: Job is deleted from the Databricks Workspace&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Thu, 06 Feb 2025 17:18:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/109256#M43259</guid>
      <dc:creator>JamesGraham</dc:creator>
      <dc:date>2025-02-06T17:18:56Z</dc:date>
    </item>
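    The demonstration above hinges on the bundle's root_path staying fixed across deploys, so that each CI/CD run finds the state stored in the workspace from the previous run. A minimal databricks.yml sketch along those lines might look like the following; the bundle name and host URL are placeholders, not values from the thread.

    ```yaml
    # Hypothetical databricks.yml fragment: the root_path is pinned per target,
    # so repeated deploys (even from ephemeral CI agents) reuse the same
    # workspace-stored Terraform state and can clean up removed resources.
    bundle:
      name: my_bundle

    targets:
      staging:
        workspace:
          host: https://adb-123456789.1.azuredatabricks.net/
          root_path: /Workspace/DAB/${bundle.name}/${bundle.target}
    ```

    Because the path is built from bundle variables rather than, say, a per-run build identifier, every pipeline run resolves to the same workspace folder, which is the property the demonstration relies on.
    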
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/109258#M43260</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/92609"&gt;@pernilak&lt;/a&gt;&amp;nbsp;what are you using for the root_path in your test and production workspaces?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;When renaming a job, it should create a new job with the new name and delete the old job.&lt;/P&gt;
&lt;P&gt;This holds provided that:&lt;/P&gt;
&lt;P&gt;- root_path is kept the same&lt;/P&gt;
&lt;P&gt;- only 1 user is doing the deployment, ideally a service principal&lt;/P&gt;</description>
      <pubDate>Thu, 06 Feb 2025 17:25:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/109258#M43260</guid>
      <dc:creator>JamesGraham</dc:creator>
      <dc:date>2025-02-06T17:25:27Z</dc:date>
    </item>
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/120046#M46040</link>
      <description>&lt;P&gt;I am facing the issue below.&lt;/P&gt;&lt;P&gt;I am deploying a DLT pipeline using CI/CD. I tried to parameterize the pipeline name based on the environment.&amp;nbsp;&lt;/P&gt;&lt;P&gt;But when I deploy the pipeline using CI/CD, I see the error below.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="ganapati_0-1747989273212.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/17074i09780C62A3BD92F8/image-size/medium?v=v2&amp;amp;px=400" role="button" title="ganapati_0-1747989273212.png" alt="ganapati_0-1747989273212.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I have already fixed this target.parameter.env and opted for a different solution. What is the fix for this issue?&lt;/P&gt;</description>
      <pubDate>Fri, 23 May 2025 08:40:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/120046#M46040</guid>
      <dc:creator>ganapati</dc:creator>
      <dc:date>2025-05-23T08:40:27Z</dc:date>
    </item>
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/120053#M46042</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/165652"&gt;@ganapati&lt;/a&gt;&amp;nbsp;that looks like an unrelated issue to this thread, I'd suggest creating a new thread for it&lt;/P&gt;</description>
      <pubDate>Fri, 23 May 2025 09:38:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/120053#M46042</guid>
      <dc:creator>JamesGraham</dc:creator>
      <dc:date>2025-05-23T09:38:47Z</dc:date>
    </item>
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/120054#M46043</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/103679"&gt;@JamesGraham&lt;/a&gt;&amp;nbsp;this issue is related to the "databricks bundle deploy" command itself: when it is run inside a CI/CD pipeline, I am still seeing old configs in bundle.tf.json. Ideally it should be updated with the changes from the previous run, but I am still seeing errors for the old configs. If this is still unrelated I can create a new thread&lt;span class="lia-unicode-emoji" title=":grinning_face:"&gt;😀&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 23 May 2025 10:01:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/120054#M46043</guid>
      <dc:creator>ganapati</dc:creator>
      <dc:date>2025-05-23T10:01:49Z</dc:date>
    </item>
    <item>
      <title>Re: Does "databricks bundle deploy" clean up old files?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/126697#M47741</link>
      <description>&lt;P&gt;In our case, the root_path is kept the same and the deployment is run via a service principal. Still, after deploying a new job, if it is deleted from the repository it does not get deleted from the workspace when the deployment is run again.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Jul 2025 12:41:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-quot-databricks-bundle-deploy-quot-clean-up-old-files/m-p/126697#M47741</guid>
      <dc:creator>atikiwala</dc:creator>
      <dc:date>2025-07-28T12:41:02Z</dc:date>
    </item>
  </channel>
</rss>