Transitioning from ADF to Databricks Workflows: Best Practices in a Multi-Workspace (Dev/Prod) Setup

Darshan137

Hi Community,

We have a data processing framework running on Azure Databricks with Unity Catalog, and we're evaluating options to consolidate our orchestration entirely within the Databricks ecosystem.

CURRENT ARCHITECTURE:

  • ~20 use cases, each containing 3-6 Python notebooks organized by business domain
  • A shared Python utility package (with __init__.py) used across all use cases
  • Two Databricks workspaces: Development and Production
  • Unity Catalog for data governance and storage
  • Azure Data Factory for orchestrating notebook execution (task ordering, dependencies)
  • Azure DevOps CI/CD pipelines (one per use case) deploying notebooks to workspaces via Terraform templates
  • Environment-specific configs (Key Vault names, service connections, catalog references) managed through ADO variable groups and YAML templates
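
To make the last point concrete, here is a minimal Python sketch of what our environment-specific configuration boils down to. All names and values (EnvConfig, dev_catalog, kv-dev, and so on) are hypothetical placeholders rather than our real settings, and the actual values live in ADO variable groups, not in code:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvConfig:
    """Settings that differ between the dev and prod workspaces."""
    catalog: str
    key_vault: str


# Hypothetical values for illustration only.
ENVIRONMENTS = {
    "dev": EnvConfig(catalog="dev_catalog", key_vault="kv-dev"),
    "prod": EnvConfig(catalog="prod_catalog", key_vault="kv-prod"),
}


def resolve_config(env: str) -> EnvConfig:
    """Return the settings for the given environment name."""
    try:
        return ENVIRONMENTS[env]
    except KeyError:
        raise ValueError(f"Unknown environment: {env!r}") from None


def notebook_parameters(env: str) -> dict[str, str]:
    """Flatten the settings into the key/value pairs a notebook would receive."""
    cfg = resolve_config(env)
    return {"catalog": cfg.catalog, "key_vault": cfg.key_vault}
```

Whatever replaces ADF would need to pass the equivalent of notebook_parameters("dev") or notebook_parameters("prod") into the same unchanged notebooks.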

WHAT WE WANT TO ACHIEVE:

  • Replace ADF orchestration with native Databricks orchestration (Lakeflow Jobs / Pipelines)
  • Manage environment-specific parameters (dev/prod catalog names, Key Vault, etc.) cleanly across workspaces
  • Keep our shared Python utility package working across all use cases without duplication
  • Zero changes to existing notebook code

QUESTIONS:

  1. Orchestration: What is the recommended Databricks-native approach to replace ADF for orchestrating notebook execution with task dependencies? We need both sequential and parallel task support.

  2. Project structure: With ~20 use cases, what is the recommended way to organize job/pipeline definitions? One monolithic config vs. modular per-use-case definitions?

  3. Shared library code: Our notebooks import from a shared Python package. What is the best way to handle this: sync the entire repo, or package it as a wheel?

  4. Cross-workspace promotion: For promoting from the dev to the prod workspace, which authentication method works best: a service principal with OAuth machine-to-machine (M2M) authentication, or personal access tokens (PATs)? Are there any Unity Catalog permission considerations?

  5. CI/CD: We currently use Azure DevOps plus Terraform to deploy notebook code and job definitions to both workspaces. For those who have made a similar migration: does it make sense to replace Azure DevOps with a Databricks-native deployment approach, or do most teams keep an external CI/CD tool alongside Databricks orchestration?

  6. Incremental migration: Can we migrate one use case at a time while others still run via the legacy ADF setup, without conflicts?
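
To illustrate what question 1 means by sequential and parallel support, here is a small Python sketch of the kind of dependency graph a typical use case has. The task names are hypothetical, and this uses only the standard-library graphlib module to show the shape of the problem; it is not Databricks code:

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
TASKS = {
    "ingest": set(),
    "clean_orders": {"ingest"},
    "clean_customers": {"ingest"},
    "publish": {"clean_orders", "clean_customers"},
}


def execution_waves(tasks):
    """Group tasks into waves: tasks in one wave can run in parallel,
    and each wave starts only after the previous wave has finished."""
    sorter = TopologicalSorter(tasks)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

Here clean_orders and clean_customers can run in parallel once ingest finishes, while publish must wait for both. This is the pattern ADF handles for us today.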

Any real-world experience, recommended approaches, or reference architectures would be very helpful. If a tutorial is available, please share the link as well.

Thanks!