Data Engineering

Databricks Asset Bundle (DAB) from a Git repo?

dbx_687_3__1b3Q
New Contributor III

My earlier question was about creating a Databricks Asset Bundle (DAB) from an existing workspace. I was able to get that working but after further consideration and some experimenting, I need to alter my question. My question is now "how do I create a DAB from a Git repo?" 

We have a shared workspace where we've been building our project's Databricks assets (notebooks, python scripts, DLT pipelines, and workflows). The assets are eventually committed to our Azure DevOps Git repo. We create pull requests, code gets reviewed, and eventually the assets are merged into our release branch cleverly named "release". I need to use the release branch as the source for my DAB.

Based on the example provided in the response, I dug a little deeper and discovered there is a workspace command to export an entire folder (export-dir). I modified the command to use export-dir and passed the path to the repo, "/Repos/release/repo_name_here". Along the way I also needed to change the arguments but, in the end, I could not run the command successfully because the specified path could not be found. I think that makes sense, since it's likely looking in the workspace and I'm trying to use assets in a repo.

How do I build a DAB from a specific branch in an Azure DevOps Git repo?

1 ACCEPTED SOLUTION

Accepted Solutions

nicole_lu_PM
Contributor III

We are very close to having an end-to-end solution for deploying DABs from a Git folder (Repo) in the Workspace!

Check out my DAIS24 talk here: https://github.com/databricks/dais-cow-bff (video link in the README). We are waiting for the feature that allows `databricks bundle deploy` to work in the workspace web terminal. It should be available within the next month.


 



9 REPLIES

erima21
New Contributor III

Any help with this?

 

erima21
New Contributor III

This is how I solved this. Hope this works for you.
- bundle: [screenshot: erima21_2-1701460647993.png]
- jobs: [screenshot: erima21_3-1701460677983.png]
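
For readers who cannot see the attached screenshots, a bundle whose job pulls its code directly from a Git branch might look roughly like the sketch below. This is not erima21's actual configuration. The bundle name, resource key, task key, notebook path, and the `<...>` placeholders are all hypothetical, and compute settings are omitted.

```yaml
# databricks.yml (hypothetical sketch, not the configuration from the screenshots)
bundle:
  name: release_bundle              # assumed bundle name

resources:
  jobs:
    release_job:                    # assumed resource key
      name: release_job
      git_source:
        git_url: https://dev.azure.com/<org>/<project>/_git/<repo>   # placeholder
        git_provider: azureDevOpsServices
        git_branch: release
      tasks:
        - task_key: main            # assumed task key
          notebook_task:
            # With a Git source, the path is relative to the repo root
            # (no leading '/').
            notebook_path: notebooks/main
            source: GIT

targets:
  dev:
    default: true
    workspace:
      host: https://<workspace-url>   # placeholder
```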

 

 

ะ•mil
New Contributor III

Hi @erima21 

Do you run the job under a service principal account?
I have git authentication issue as it appears there is no way to get the git provider to use Microsoft Entra ID (formerly Azure Active Directory) authentication. When the job is executed, it appears that it cannot access Git...
I already posted about this issue here, and I wonder if you had a similar experience...

Thanks

ะ•mil
New Contributor III

Sorry, the link above doesn't work because my post on the Databricks Community forum was marked as spam.
I have explained my issue on Stack Overflow here.

> I have git authentication issue as it appears there is no way to get the git provider to use Microsoft Entra ID (formerly Azure Active Directory) authentication.

We published a joint blog post with Microsoft explaining how to get Azure DevOps to use Microsoft Entra ID (formerly Azure Active Directory) authentication. Check it out!

https://www.databricks.com/blog/integrating-entra-id-azure-devops-and-databricks-better-security-cic...



 


AFDCO
New Contributor II

Hi, I'm in the same situation as you. Trying to export an entire workspace to deploy it in other environments. I'm having trouble especially with jobs, which I'm just able to do it with the api and not with bundle since it throws an error with paths --Error: Path (******) doesn't start with '/'--, which ironically is a requirement if you use git as source in the jobs.
It also exports the referenced notebooks to the source paths, ignoring parent directories...

How did you manage to solve it?

Thank you 

Hi AFDCO,

> able to do it with the api and not with bundle since it throws an error with paths --Error: Path (******) doesn't start with '/'--, which ironically is a requirement if you use git as source in the jobs.

Bundles deploy jobs that read from the Workspace, not from Git. For an example of the currently recommended approach, see https://www.youtube.com/watch?v=DMwilNpDCiQ
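
As a rough illustration of a job that reads from the Workspace rather than from Git, a bundle resource might look like the sketch below. The file name, job name, task key, and notebook path are hypothetical, and compute settings are omitted. The job has no `git_source` block; the notebook path is a local path relative to the YAML file that defines it, and the bundle files are uploaded to the Workspace when `databricks bundle deploy` runs.

```yaml
# resources/release_job.yml (hypothetical file name and contents)
resources:
  jobs:
    release_job:                    # assumed resource key
      name: release_job
      tasks:
        - task_key: main            # assumed task key
          notebook_task:
            # Local path relative to this file; no git_source is set,
            # so the job runs the copy deployed to the Workspace.
            notebook_path: ../src/main_notebook.ipynb
```

Under this approach, the branch that gets deployed (for example `release`) is simply whichever branch is checked out in the folder where `databricks bundle deploy` is run.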


nicole_lu_PM
Contributor III

`databricks bundle deploy` now works in the workspace web terminal without using a special version of Databricks CLI! 🔥
