Tuesday
I’d like to gather insights, best practices, and real-world experiences from anyone who has worked with Databricks Repos for team-based development. Databricks Repos is designed to streamline source control, versioning, and automation, but the workflows can vary depending on how your team structures projects and CI/CD pipelines.
Specifically, I’m looking to understand:
Whether you’ve built a fully automated DevOps pipeline or are just experimenting with Repos for collaboration, your experiences can help others understand what works well and what to avoid.
Looking forward to hearing how you’re using Databricks Repos in real development environments!
Tuesday
In my view, Git repos should be mandatory for any project on any platform; otherwise a lot of issues will arise sooner or later, and automation will never be possible.
That said, and beyond what the official Databricks documentation covers, my experience with Git repos in Databricks has been very satisfactory when combined with external tools such as Visual Studio Code and Azure DevOps. I'll try to summarize:
Let's say I try to use the best of each tool. Visual Studio Code lets me develop code very fast, with strong code quality checks (Pylint, Pylance, etc.), AI assistance (GitHub Copilot, AI agents, etc.), and a wide variety of plugins. From there I can either execute the code in Databricks directly via the extension, or push it to the Azure DevOps Git repo first and then use Databricks Git repos to pull it and run it in Databricks. All in all, depending on the task, I switch from Visual Studio Code to the Databricks UI or vice versa, but always with the Git repo as the common, shared place for code. Besides, Visual Studio Code lets me easily deploy Databricks Asset Bundles (DAB) to a DEV or TEST environment by running the proper commands in the integrated Databricks CLI. This is only my personal opinion: I like to develop most of the code in VS Code and then switch to the Databricks UI to test it on a personal or shared cluster, instead of running it directly from VS Code (that is much slower in my case and could raise security concerns in corporate scenarios).
In my case, Azure DevOps is used for CI/CD. It has very straightforward tools for integrating the Databricks CLI and deploying DABs in a simple and secure manner. Just as important, Azure DevOps is where Pull Requests are run and reviewed, as Databricks Repos does not offer that yet.
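To make the DAB deployment step a bit more concrete, here is a minimal sketch of what a pipeline step (or a local script) could run. It only wraps the `databricks bundle validate` and `databricks bundle deploy` CLI commands; the target name `dev` is an assumption and has to match a target defined in your own `databricks.yml`, and I'm assuming authentication is already configured for the CLI (for example through pipeline secrets). The helper names are made up for this example.

```python
import subprocess


def bundle_commands(target):
    """Return the CLI invocations to validate and then deploy a bundle to one target."""
    return [
        ["databricks", "bundle", "validate", "-t", target],
        ["databricks", "bundle", "deploy", "-t", target],
    ]


def deploy_bundle(target, dry_run=False):
    """Run validate and deploy in order; stop at the first failing command."""
    for cmd in bundle_commands(target):
        if dry_run:
            # Only show what would be executed, without calling the CLI.
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if the CLI exits non-zero,
            # which makes the pipeline step fail as well.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "dev" is a hypothetical target name; dry_run avoids touching a workspace.
    deploy_bundle("dev", dry_run=True)
```

Run it from the folder that contains `databricks.yml`. Validating before deploying keeps a broken bundle definition from reaching the workspace.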
yesterday
@Coffee77 I completely agree; I have the very same setup and usage. However, I do execute on Databricks clusters from VS Code using Databricks Connect and have never faced any slow behaviour.
Another thing: I switch to the Databricks UI for DLT coding, as the Databricks AI assistant is very useful at times.
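For anyone who hasn't tried Databricks Connect from VS Code, a rough sketch of what it looks like is below. It assumes the `databricks-connect` package is installed and that the workspace and cluster are already configured (for example through a Databricks CLI profile). The helper names and the table name are only placeholders for illustration.

```python
def get_remote_session():
    """Create a Spark session backed by a remote Databricks cluster.

    The import is done here so the rest of the module can still be loaded
    (and unit tested) on a machine without databricks-connect installed.
    """
    from databricks.connect import DatabricksSession

    return DatabricksSession.builder.getOrCreate()


def count_rows(spark, table_name):
    """Count the rows of a table; the work runs on the cluster, not locally."""
    return spark.table(table_name).count()


# Example usage (needs a configured workspace; the table name is an assumption):
# spark = get_remote_session()
# print(count_rows(spark, "samples.nyctaxi.trips"))
```

Passing the session in as a parameter keeps the logic testable locally with a stub, and the same function works unchanged inside a Databricks notebook, where `spark` already exists.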
yesterday - last edited yesterday
You are very lucky, @Raman_Unifeye. The issue in my case is related to very strict security rules in my company network (I haven't looked deeper into it to find out exactly why...), but if launching from VS Code works fine and fast, that is superb 🙂
Wednesday
Databricks Repos enables seamless collaborative development by integrating Git repositories for version control, allowing multiple users to work on notebooks and code simultaneously. To integrate CI/CD, link your Databricks workspace with CI/CD tools like Jenkins or GitHub Actions. Use Databricks CLI or REST APIs to automate testing, deployment, and job execution. Manage secrets securely and automate code deployment through the pipeline. This streamlines collaboration, ensures code quality, and automates workflows for efficient data engineering and data science processes.
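As one example of the REST API route, a pipeline can move a workspace repo to the latest commit of a branch after a merge by calling the Repos API (`PATCH /api/2.0/repos/{repo_id}`). The sketch below only builds the request with the standard library; the host, token, repo ID and branch shown are placeholders, and the function name is made up for this example.

```python
import json
import urllib.request


def build_repo_update_request(host, token, repo_id, branch):
    """Build a PATCH request that checks out `branch` in a workspace repo."""
    url = f"{host.rstrip('/')}/api/2.0/repos/{repo_id}"
    body = json.dumps({"branch": branch}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Example usage (all values are hypothetical; read the token from a secret store):
# req = build_repo_update_request(
#     "https://<your-workspace-host>", token, repo_id=123, branch="main"
# )
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

In a real pipeline the token would come from the CI tool's secret variables rather than from code, in line with the advice above on managing secrets.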
Wednesday
Databricks Repos enable teams to manage notebooks and code with Git integration. Start by linking your repo (GitHub, GitLab, or Azure DevOps) to Databricks and clone your project. Use feature branches for development and collaborative reviews. For CI/CD, integrate with tools like GitHub Actions or Azure Pipelines to automate testing, validation, and deployment of notebooks and jobs. Regularly pull changes, resolve conflicts, and maintain consistent coding standards to streamline collaboration and production releases.
Wednesday
Databricks Repos make collaborative development easy by connecting notebooks to Git. You can work on branches, track changes, and sync with your team. Plus, they integrate with CI/CD pipelines, allowing automated testing and deployment of notebooks or workflows — making teamwork and production deployments much smoother.