Databricks Jobs and CICD
11-02-2022 09:07 AM
Hi,
We currently use Azure DevOps to source-control our notebooks and CI/CD to publish them to different environments, and this works very well.
We do not have the same functionality for Databricks jobs (the ability to source-control job definitions or deploy them through CI/CD). Could we call the Databricks Jobs create API to promote job definitions through environments, or is there a better way of doing this?
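For example, something along these lines is what we had in mind (a rough sketch only; the workspace URL and jobs.json are placeholders for our setup, not a tested recipe):

```bash
# Rough sketch: promote a job definition to a target workspace via
# the Jobs 2.1 create endpoint. Assumes jobs.json holds the job
# settings exported from dev, and DATABRICKS_TOKEN is a PAT for the
# target workspace (both are our own conventions, not a standard).
curl -X POST "https://<target-workspace>.azuredatabricks.net/api/2.1/jobs/create" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -H "Content-Type: application/json" \
  --data @jobs.json
```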
Look forward to hearing from you
- Labels: CICD, Databricks jobs
11-23-2022 07:58 AM
Hi @Kaniz Fatma, an interesting new feature for sure, but I'm not sure how it helps with this challenge. As an aside, there probably also needs to be something similar for Databricks tables, since table definitions are essentially code.
Echoing @Anders Bergmål-Manneråk's comments from yesterday...
11-22-2022 06:57 AM
We're also missing this feature for easy version control and CI/CD for Jobs/Workflows. The answer from @Kaniz Fatma does not cover the question as I see it.
Right now the only options we see are Terraform or the Jobs API 2.1, as superficially described in this post (rough sketch below). Both require some sort of custom setup. What we really need is a more integrated experience like the one we have for notebooks using Databricks Repos.
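For reference, the Jobs API 2.1 route looks roughly like this (a sketch only; JOB_ID, job.json, and the workspace URL are illustrative placeholders, not an official recipe):

```bash
# Rough sketch of the "custom setup" route: overwrite an existing
# job's settings from a version-controlled JSON file via the Jobs
# 2.1 reset endpoint. JOB_ID and job.json are illustrative names.
curl -X POST "https://<workspace>.azuredatabricks.net/api/2.1/jobs/reset" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -H "Content-Type: application/json" \
  --data "{\"job_id\": $JOB_ID, \"new_settings\": $(cat job.json)}"
```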
It would be great to get some updates from the Databricks Workflows team on this. Is this feature on the roadmap? If not, it would be awesome to get some official advice on handling source control and CI/CD for Workflows/Jobs.
12-14-2022 01:43 PM
Hi @Brian Labrom and @Anders Bergmål-Manneråk - my team is looking at creating a set of features to provide simpler source control and CI/CD. Can you please send me an email at saad dot ansari at databricks dot com?
I would love to get more feedback incorporated into our approach here.
12-15-2022 01:36 AM
@Saad Ansari popped you an email now.
12-15-2022 02:00 AM
@Saad Ansari I have done the same.
01-26-2023 12:39 PM
My team is currently looking at establishing repos for source control to start. I know I've seen some documentation on automatically updating the main branch in the Databricks remote repo when a merge is completed. Does anyone have a template and/or best practices here? The rough pattern we have in mind is sketched below.
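The closest we've pieced together is something like this (a sketch, assuming the Databricks CLI's repos commands are available on the pipeline agent; the repo path is hypothetical):

```bash
# Sketch: run from an Azure DevOps pipeline triggered on completed
# merges to main, to sync the workspace repo to the latest main.
# Assumes the Databricks CLI is installed and authenticated on the
# agent; /Repos/Production/my-project is a hypothetical repo path.
databricks repos update --path /Repos/Production/my-project --branch main
```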
a month ago
To manage Databricks jobs within a DevOps pipeline:
- Export the job configuration as a JSON file from the Databricks workspace.
- Parameterize the JSON by replacing environment-specific values with placeholders.
- Commit the parameterized JSON to your DevOps repository and have your CI/CD tooling substitute the actual values during pipeline execution.
- Create the job with the Databricks CLI: databricks jobs create --json-file path_to_your_json_file.json
- Additionally, set up a separate pipeline to delete retired jobs: databricks jobs delete --job-id <job-id>
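Putting it together, a minimal sketch of that flow (the job ID, file names, and the __ENV__ placeholder convention are illustrative, not an official template):

```bash
# 1. Export the job settings from the source workspace (the get
#    response wraps the settings, so strip the wrapper with jq).
databricks jobs get --job-id 123 | jq '.settings' > job-template.json

# 2. Commit job-template.json with environment-specific values
#    replaced by placeholders such as __ENV__; at pipeline runtime,
#    substitute the real values for the target environment.
sed 's/__ENV__/prod/g' job-template.json > job.json

# 3. Create the job in the target workspace.
databricks jobs create --json-file job.json

# 4. A separate cleanup pipeline can remove retired jobs.
databricks jobs delete --job-id <job-id>
```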

