Data Engineering

Databricks Asset Bundles - Don't deploy to the workspace, update only the repo

Chalki
New Contributor III

Hello Guys,

 

My team and I have a bunch of jobs that point directly to a remote repo rather than to the workspace of the related environment. Is there a way to update only the repo part in our Databricks environment instead of deploying to the workspace? We don't need our code to reside in the workspace for the jobs to run.
Also, I couldn't quite understand the difference between using the "Python wheel as part of a Databricks Asset Bundle" approach and deploying the bundle directly with the CLI. The workflows and the commands look identical.

1 REPLY

Kaniz_Fatma
Community Manager

Hi @Chalki, let's address both aspects of your question:

 

Updating the Repo in Databricks Environment:

  • If your jobs are currently pointing directly to a remote repository and you want to update the code without deploying it to the workspace, you can achieve this using Databricks Asset Bundles.
  • Databricks Asset Bundles allow you to package and manage your code, dependencies, and configurations separately from the workspace. You can create a bundle that includes your Python code (e.g., a Python wheel) and other necessary files.
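  • By using Databricks Asset Bundles, you can update your job definitions while the task code stays in the remote repository: a job declared in a bundle can keep its Git source settings, so each run reads the code from the repo. Keep in mind that deploying a bundle still uploads the bundle's own files to a bundle folder in the workspace.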
  • This approach provides flexibility and isolation, especially when you don't want to clutter your workspace with code files.
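
As a rough illustration of a job that reads its code from a remote repo, the sketch below builds the job settings as a plain Python dict. The field names (`git_source`, `git_url`, `git_provider`, `git_branch`, `notebook_task`, `source`) follow the Databricks Jobs API; the repository URL, job name, task key, and notebook path are hypothetical, and compute settings are left out for brevity. Treat it as a sketch under those assumptions, not as the poster's actual configuration.

```python
import json


def build_git_job(repo_url: str, branch: str, notebook_path: str) -> dict:
    """Return job settings whose task code is read from a remote Git repo."""
    return {
        "name": "example-git-job",  # hypothetical job name
        "git_source": {
            "git_url": repo_url,
            "git_provider": "gitHub",
            "git_branch": branch,
        },
        "tasks": [
            {
                "task_key": "main",  # hypothetical task key
                "notebook_task": {
                    # Path is relative to the repository root, not the workspace.
                    "notebook_path": notebook_path,
                    "source": "GIT",
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical repository and notebook path, for illustration only.
    job = build_git_job(
        "https://github.com/example-org/example-repo", "main", "notebooks/etl"
    )
    print(json.dumps(job, indent=2))
```

A job resource in a bundle's configuration file uses the same field names, so the same structure can be written there in YAML.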

Difference Between Python Wheel in an Asset Bundle and Direct Deployment with CLI:

  • Both methods allow you to deploy Python wheels, but they serve different purposes:
    • Python Wheel as Part of a Databricks Asset Bundle:
      • Bundles are a way to package and manage your code, dependencies, and configurations.
      • You create a bundle that includes your Python wheel (built using setuptools or Poetry) and any other necessary files.
      • Bundles can be versioned, and you can reference them directly in your jobs or pipelines.
      • This approach is useful when you want to keep your code separate from the workspace and manage it as an isolated unit.
    • Direct Deployment with the Databricks CLI:
      • When you deploy a Python wheel directly using the Databricks CLI, you're essentially updating the code in a specific job or pipeline.
      • This method is more straightforward and doesn't involve creating a separate bundle.
      • It's suitable when you need to quickly update a job or pipeline without managing additional bundle artifacts.
      • However, the code resides in the workspace, which may not be ideal if you want to keep it separate.
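
For the wheel case, a task that runs a packaged entry point might look like the following sketch. `python_wheel_task`, `package_name`, `entry_point`, and the `whl` library key are Jobs API fields; the package name, entry point, task key, and wheel path are hypothetical placeholders, and compute settings are again omitted.

```python
def build_wheel_task(package_name: str, entry_point: str, wheel_path: str) -> dict:
    """Return a task definition that runs an entry point from a Python wheel."""
    return {
        "task_key": "run_wheel",  # hypothetical task key
        "python_wheel_task": {
            "package_name": package_name,
            "entry_point": entry_point,
        },
        # The wheel is attached to the task as a library.
        "libraries": [{"whl": wheel_path}],
    }


if __name__ == "__main__":
    # Hypothetical package, entry point, and wheel location.
    task = build_wheel_task("my_package", "main", "dist/my_package-0.1.0-py3-none-any.whl")
    print(task)
```

Whether this definition lives in a bundle or is sent to the Jobs API directly, the task fields are the same; what changes is how the job definition and the wheel artifact get to the workspace.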

In summary, if you prefer isolation and versioning, consider using Databricks Asset Bundles. If simplicity and direct deployment are your priorities, stick with the CLI approach. Choose the one that best aligns with your team's workflow and requirements! 🚀
