Get Started Discussions

How does Databricks handle versioning of notebooks or jobs, and what good practices should newcomers follow?

Suheb
New Contributor II

When you create notebooks or jobs in Databricks, how does Databricks keep track of different versions or changes? And what should beginners do to manage versions safely and effectively?

2 REPLIES

szymon_dybczak
Esteemed Contributor III

Hi @Suheb,

The best practice for versioning your assets is to use Git folders. This is the recommended approach:

What is Databricks Git folders | Databricks on AWS

But out of the box, Databricks provides some versioning capabilities if you don't want to configure Git integration for now: every notebook automatically records a version history that you can browse and restore from the notebook's Version history panel.
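
If you do go the Git folders route, here's a minimal sketch of creating one programmatically with the Databricks SDK for Python (databricks-sdk); the repo URL and workspace path below are placeholders, not anything from this thread:

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # reads credentials from env vars or ~/.databrickscfg

# Create a Git folder (repo) in the workspace, linked to a remote repository.
repo = w.repos.create(
    url="https://github.com/my-org/my-project.git",  # placeholder URL
    provider="gitHub",
    path="/Repos/me@example.com/my-project",         # placeholder path
)
print(f"Created Git folder {repo.path} (id={repo.id})")
```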


bianca_unifeye
New Contributor II

Hi @Suheb,

That's a great question; version control is one of the most important things to get right early on.

As a best practice, you should never run notebooks directly in production. Instead, treat notebooks as development assets: once validated, they should be packaged, version-controlled, and deployed through a proper CI/CD pipeline.

1. Use Git integration

  • Databricks integrates directly with GitHub, Azure DevOps, and GitLab.

  • Always link your workspace to a Git repo and commit your notebook changes regularly; this keeps a full version history and supports collaboration (see the branch-sync sketch after this list).

2. Package and deploy, don't run manually

  • Convert notebooks into production-ready code (Python modules or .py scripts); a minimal module sketch follows this list.

  • Use Databricks Asset Bundles (DAB) or your CI/CD pipeline to deploy jobs, pipelines, and workflows, not raw notebooks.

  • This keeps environments (Dev, Test, Prod) consistent and auditable.

3. Automate with Workflows

  • Use Jobs or Workflows to orchestrate your pipelines instead of manual runs.

  • Parameters, retries, and alerts can all be managed centrally (see the job-creation sketch after this list).

4. Keep documentation handy

  • Databricks provides extensive documentation for both Git integration and CI/CD with DAB, with plenty of examples for different setups (GitHub Actions, Azure DevOps, Jenkins, etc.).
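
To make point 1 concrete, here's a hedged sketch with the Databricks SDK for Python (databricks-sdk) of the branch-sync step a CI pipeline might run after each merge, so the workspace Git folder never drifts from the remote; the repo ID and branch are placeholders:

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # reads credentials from env vars or ~/.databrickscfg

# 12345 is a placeholder repo ID; look yours up with w.repos.list().
# Fast-forward the workspace Git folder to the head of main.
w.repos.update(repo_id=12345, branch="main")
```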
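
For point 2, this is roughly what "convert notebooks into production-ready code" can look like: the notebook's logic moves into a plain Python module that a job can run and a unit test can import. The table names are hypothetical:

```python
# etl_job.py - notebook logic refactored into a version-controlled module.
from pyspark.sql import DataFrame, SparkSession


def transform(orders: DataFrame) -> DataFrame:
    """Pure transformation logic, testable without a Databricks cluster."""
    return orders.filter(orders.amount > 0).groupBy("customer_id").sum("amount")


def main() -> None:
    spark = SparkSession.builder.getOrCreate()  # reuses the cluster's session
    orders = spark.read.table("main.sales.orders")        # hypothetical table
    transform(orders).write.mode("overwrite").saveAsTable("main.sales.totals")


if __name__ == "__main__":
    main()
```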
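
And for point 3, a hedged sketch of registering that script as a Workflows job with retries and failure alerts via the Python SDK; the cluster ID, file path, and e-mail address are placeholders:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

job = w.jobs.create(
    name="daily-etl",
    tasks=[
        jobs.Task(
            task_key="run_etl",
            spark_python_task=jobs.SparkPythonTask(
                python_file="/Workspace/Repos/ci-bot/my-project/etl_job.py",
            ),
            existing_cluster_id="1234-567890-abcde123",  # placeholder cluster
            max_retries=2,                    # retry policy lives on the task
            min_retry_interval_millis=60_000,
        )
    ],
    email_notifications=jobs.JobEmailNotifications(
        on_failure=["data-team@example.com"]  # placeholder address
    ),
)
print(f"Created job {job.job_id}")
```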

In short:

Develop in notebooks, version in Git, deploy with DAB, and run in production via Jobs/Workflows, never directly from a notebook.