
Databricks Asset Bundles - Don't deploy to the workspace, update only the repo

Chalki
New Contributor III

Hello Guys,

 

My team and I have a bunch of jobs that point directly to a remote repo rather than to code in the workspace of the related environment. Is there a way to update only the repo in our Databricks environment instead of deploying to the workspace? We don't need our code to reside in the workspace for the sake of job execution.
Also, I couldn't quite understand the difference between using a "Python wheel as part of a Databricks Asset Bundle" and deploying the bundle directly with the CLI. The workflows and the commands are basically identical.

1 REPLY

Kaniz
Community Manager

Hi @Chalki, let’s address both parts of your question:

 

Updating the Repo in Databricks Environment:

  • If your jobs already point directly to a remote repository, each run checks out the configured branch, tag, or commit from that remote, so new commits are picked up without copying code into the workspace.
  • Databricks Asset Bundles let you keep job definitions, dependencies, and configuration in source control alongside your code (e.g., a Python wheel) and other necessary files. Note that deploying a bundle does write to the workspace: the bundle's files are uploaded under its workspace root path, and the jobs it defines are created or updated.
  • A job defined in a bundle can still reference the remote repository instead of the uploaded files, so you can manage the job definition with a bundle while the code itself keeps coming from Git.
  • If what you want to refresh is a Git folder (repo) checked out in the workspace, that can be updated to a branch or tag on its own, without a bundle deployment.
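To make the two repo-related options concrete, here is a minimal Python sketch. It only builds the payloads and does not send anything. The first function shows job settings whose task code comes from a remote repo via `git_source`; the second builds a Repos API request that moves a workspace Git folder to a given branch. The host, token, repo ID, job name, and notebook path are all placeholders, and the GitHub provider is an assumption about where your repo lives.

```python
import json
import urllib.request


def job_settings_with_git_source(name, git_url, branch, notebook_path):
    """Job settings whose task code is checked out from a remote Git repo at run time."""
    return {
        "name": name,
        "git_source": {
            "git_url": git_url,
            "git_provider": "gitHub",  # assumption: the repo is hosted on GitHub
            "git_branch": branch,
        },
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {
                    "notebook_path": notebook_path,  # path relative to the repo root
                    "source": "GIT",  # read the notebook from git_source, not the workspace
                },
                # Compute settings are omitted here; add the ones your job needs.
            }
        ],
    }


def build_repo_update_request(host, token, repo_id, branch):
    """Build (but do not send) a PATCH request that checks out `branch` in a workspace Git folder."""
    body = json.dumps({"branch": branch}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/repos/{repo_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    settings = job_settings_with_git_source(
        "nightly-etl", "https://github.com/example-org/example-repo", "main", "notebooks/etl"
    )
    print(json.dumps(settings, indent=2))
    request = build_repo_update_request("https://example.cloud.databricks.com", "TOKEN", 123, "main")
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually perform the update, so a step like that could run in CI after a merge. Jobs that use `git_source` do not need it, since they fetch from the remote on every run.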

Difference Between Python Wheel in an Asset Bundle and Direct Deployment with CLI:

  • These are not really two separate methods, which is why the workflows and commands look identical: a bundle is deployed with the Databricks CLI either way. The difference is in what the bundle contains:
    • Python Wheel as Part of a Databricks Asset Bundle:
      • Bundles are a way to package and manage your code, dependencies, and configurations.
      • The bundle configuration declares your Python wheel (built using setuptools or Poetry) as an artifact, so the wheel is built and uploaded as part of the deployment, along with any other necessary files.
      • Because the bundle lives in source control, it can be versioned, and the jobs or pipelines it defines reference the wheel.
      • This approach is useful when you want your code packaged as an isolated, installable unit.
    • Direct Deployment with the Databricks CLI:
      • `databricks bundle deploy` is the CLI command that deploys any bundle, with or without a wheel.
      • Without a wheel artifact there is no build step, so the bundle is simpler to set up.
      • It’s suitable when your jobs or pipelines run notebooks or Python files rather than an installed package.
      • In both cases the deployed files reside in the workspace, which may not be ideal if you want to keep your code out of it.
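As a rough illustration of why the commands look identical, the sketch below lists the Databricks CLI invocations for a typical bundle workflow. The same three commands apply whether or not the bundle declares a wheel artifact. The target name `dev` and the job key `my_wheel_job` are hypothetical; use the names from your own bundle configuration.

```python
import shlex


def bundle_commands(target, job_key):
    """Return the CLI invocations for validating, deploying, and running a bundle."""
    return [
        ["databricks", "bundle", "validate", "-t", target],
        ["databricks", "bundle", "deploy", "-t", target],
        ["databricks", "bundle", "run", "-t", target, job_key],
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them, so this is safe to run anywhere.
    for argv in bundle_commands("dev", "my_wheel_job"):
        print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` from the bundle's root directory if you want to drive the deployment from a script.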

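In summary, if you want your code packaged and versioned as an installable unit, build it as a Python wheel inside a Databricks Asset Bundle. If your jobs already run straight from a remote repo, a bundle without a wheel, or no change at all, may be enough. Either way, deployment goes through the same Databricks CLI commands, so choose whichever best aligns with your team’s workflow and requirements! 🚀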
