Hi Community,
We have a data processing framework running on Azure Databricks with Unity Catalog, and we're evaluating options to consolidate our orchestration entirely within the Databricks ecosystem.
CURRENT ARCHITECTURE:
- ~20 use cases, each containing 3-6 Python notebooks organized by business domain
- A shared Python utility package (with an __init__.py) used across all use cases
- Two Databricks workspaces: Development and Production
- Unity Catalog for data governance and storage
- Azure Data Factory for orchestrating notebook execution (task ordering, dependencies)
- Azure DevOps CI/CD pipelines (one per use case) deploying notebooks to workspaces via Terraform templates
- Environment-specific configs (Key Vault names, service connections, catalog references) managed through ADO variable groups and YAML templates
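For context, our current parameterization looks roughly like this; the variable group and template names below are placeholders, not our real ones:

```yaml
# One ADO pipeline per use case pulls environment values from a variable group
# and passes them into a shared deployment template (hypothetical names).
variables:
  - group: usecase-sales-dev        # holds Key Vault name, catalog name, workspace URL, etc.

stages:
  - stage: DeployDev
    jobs:
      - template: templates/deploy-notebooks.yml   # shared Terraform-based deployment template
        parameters:
          workspaceUrl: $(databricksWorkspaceUrl)
          catalogName: $(catalogName)
          keyVaultName: $(keyVaultName)
```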
WHAT WE WANT TO ACHIEVE:
- Replace ADF orchestration with native Databricks orchestration (Lakeflow Jobs / Pipelines)
- Manage environment-specific parameters (dev/prod catalog names, Key Vault, etc.) cleanly across workspaces
- Keep our shared Python utility package working across all use cases without duplication
- Zero changes to existing notebook code
QUESTIONS:
Orchestration: What is the recommended Databricks-native approach to replace ADF for orchestrating notebook execution with task dependencies? We need both sequential and parallel task support.
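To make the question concrete, this is roughly what we imagine a Lakeflow Job definition could look like in a Databricks Asset Bundle, with one task fanning out to two parallel tasks and a final task waiting on both (job, task, and notebook names are made up for illustration):

```yaml
resources:
  jobs:
    sales_daily:
      name: sales-daily
      tasks:
        - task_key: ingest
          notebook_task:
            notebook_path: ../notebooks/ingest.py
        - task_key: transform_orders        # runs in parallel with transform_customers
          depends_on:
            - task_key: ingest
          notebook_task:
            notebook_path: ../notebooks/transform_orders.py
        - task_key: transform_customers
          depends_on:
            - task_key: ingest
          notebook_task:
            notebook_path: ../notebooks/transform_customers.py
        - task_key: publish                 # waits for both transforms to finish
          depends_on:
            - task_key: transform_orders
            - task_key: transform_customers
          notebook_task:
            notebook_path: ../notebooks/publish.py
```

Is this the pattern people use in practice, or is there a better-suited construct for this kind of DAG?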
Project structure: With ~20 use cases, what is the recommended way to organize job/pipeline definitions? One monolithic config vs. modular per-use-case definitions?
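For example, we are wondering whether one bundle with per-use-case resource files is the way to go; the layout and workspace hosts below are purely illustrative:

```yaml
# Imagined repo layout:
#
#   databricks.yml
#   resources/
#     usecase_sales.job.yml
#     usecase_finance.job.yml
#     ...
#
# databricks.yml
bundle:
  name: data-platform

include:
  - resources/*.yml

targets:
  dev:
    default: true
    workspace:
      host: https://adb-1111111111111111.11.azuredatabricks.net   # placeholder dev workspace
  prod:
    workspace:
      host: https://adb-2222222222222222.22.azuredatabricks.net   # placeholder prod workspace
```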
Shared library code: Our notebooks import from a shared Python package. What is the best way to handle this - sync the entire repo, or package it as a wheel?
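Our current thinking is that the wheel route might look something like this in the bundle, building the shared package and attaching it to tasks (this assumes the package has its own pyproject.toml; paths and names are placeholders):

```yaml
artifacts:
  shared_utils:
    type: whl
    path: ./shared_utils          # folder containing the shared package source

resources:
  jobs:
    sales_daily:
      name: sales-daily
      tasks:
        - task_key: ingest
          notebook_task:
            notebook_path: ../notebooks/ingest.py
          libraries:
            - whl: ../shared_utils/dist/*.whl   # built wheel attached to the task's cluster
```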
Cross-workspace promotion: For promoting from dev to prod workspace, what authentication method works best - Service Principal with OAuth (M2M) or PAT tokens? Any Unity Catalog permission considerations?
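For reference, this is the kind of wiring we assume for Service Principal OAuth (M2M) with the Databricks CLI in the release pipeline; the variable names are placeholders and the values would come from a variable group or Key Vault:

```yaml
steps:
  - script: |
      databricks bundle deploy --target prod
    displayName: Deploy bundle to prod
    env:
      DATABRICKS_HOST: $(prodWorkspaceUrl)
      DATABRICKS_CLIENT_ID: $(spClientId)          # service principal application ID
      DATABRICKS_CLIENT_SECRET: $(spClientSecret)  # Databricks OAuth secret for the SP
```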
CI/CD: We currently use Azure DevOps plus Terraform for deploying notebook code and job definitions to both workspaces. For those who have made a similar migration - does it make sense to replace Azure DevOps with a Databricks-native deployment approach, or do most teams keep an external CI/CD tool alongside Databricks orchestration?
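If keeping Azure DevOps is the common pattern, we picture the pipeline skeleton staying roughly as it is today, just swapping the Terraform notebook deployment for bundle commands (stage and job names are illustrative; auth env vars as in the previous snippet are omitted for brevity):

```yaml
stages:
  - stage: Validate
    jobs:
      - job: validate
        steps:
          - script: databricks bundle validate --target dev
            displayName: Validate bundle
  - stage: DeployDev
    jobs:
      - job: deploy_dev
        steps:
          - script: databricks bundle deploy --target dev
            displayName: Deploy to dev workspace
  - stage: DeployProd
    dependsOn: DeployDev
    jobs:
      - deployment: deploy_prod            # deployment job so prod can be gated by an environment approval
        environment: prod
        strategy:
          runOnce:
            deploy:
              steps:
                - script: databricks bundle deploy --target prod
                  displayName: Deploy to prod workspace
```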
Incremental migration: Can we migrate one use case at a time while others still run via the legacy ADF setup, without conflicts?
Any real-world experience, recommended approaches, or reference architectures would be very helpful. If there is a tutorial covering this kind of migration, please share the link as well.
Thanks!