<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Service Principal access notebooks created under /Workspace/Users in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/152177#M53780</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/147238"&gt;@DineshOjha&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;Given your constraints (per‑application service principals, isolation at the volume/schema level, and not wanting to use /Workspace/Shared), the flow you described&amp;nbsp;aligns with how Bundles are meant to be used in&amp;nbsp;production. Bundles are the recommended CI/CD mechanism, and using service principals as run identities in non‑dev targets is explicitly encouraged.&lt;/P&gt;
&lt;P class="wnfdntd"&gt;A couple of clarifications and direct answers to your questions:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;1. Do you think this is a good approach for notebook based implementation or do you suggest anything else?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Yes, this is a solid pattern for notebook‑based implementations.&amp;nbsp;&lt;SPAN&gt;Git serves as the source of truth, with personal workspaces used solely for development. Bundles deploy notebooks and job definitions into the workspace. In non-development targets, run_as is configured to use the per-application service principal, so all production runs use that principal’s permissions, including access to the appropriate volume/schema. That is what keeps the deployment consistent and secure.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;The only design choice you still need is where in the workspace Bundles deploy to. You don’t have to use /Workspace/Shared; you can pick any isolated path, for example&amp;nbsp;/Workspace/.bundle/prod/${bundle.name} or&amp;nbsp;/Workspace/Projects/&amp;lt;app_name&amp;gt;/..., and lock that path down so only the application service principal, a&amp;nbsp;small operator group, and optionally CI/CD deployer principals&amp;nbsp;have access. The path naming is up to you: Bundles just need a root_path per target, and you control the permissions there.&lt;/P&gt;
&lt;P&gt;So I would keep your 4‑step approach and add a per‑app workspace root (instead of /Shared), with ACLs granting access only to the relevant SP and operators.&lt;/P&gt;
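&lt;P&gt;As a sketch of what such a per‑app target could look like (the app name and SP application ID below are placeholders, not values from your environment):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class="language-yaml"&gt;# databricks.yml (sketch) -- substitute your own app name and SP application ID
targets:
  prod:
    workspace:
      root_path: /Workspace/Projects/&amp;lt;app_name&amp;gt;/${bundle.name}
    run_as:
      service_principal_name: &amp;lt;sp-application-id&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;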
&lt;P&gt;&lt;STRONG&gt;2. The service principal exists only in Databricks, so what email and PAT should be provided to enable GIT access?&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;With the Bundles‑from‑Azure‑DevOps pattern you described, the important nuance is that your Databricks service principal does not need to talk to Git directly.&amp;nbsp;In a typical Azure DevOps setup, the pipelines clone the Git repo themselves using the identity configured in DevOps (service connection, PAT, or Microsoft Entra–backed principal).&amp;nbsp;Once the code is on the build agent, the pipeline calls databricks bundle validate/deploy/run, using the Databricks service principal to authenticate to Databricks, not to Git.&lt;/P&gt;
&lt;P&gt;In that model, you do not need to configure a Git email/PAT on the Databricks SP at all.&amp;nbsp;Git credentials live entirely in Azure DevOps (for checking out the repo). The Databricks SP is only used for workspace authentication (via OAuth M2M, workload identity federation, or an ARM service connection).&amp;nbsp;You only need Git credentials on the Databricks SP if you also want it to use Git folders / Repos in the workspace, or run Git‑backed jobs directly from Databricks (using Git‑with‑jobs / Git folders).&lt;/P&gt;
&lt;P&gt;In that case, the email/PAT would belong to a non‑human Azure DevOps identity (service principal or technical user) that has access to the repo. You then link those Git credentials to the Databricks SP via the Git integration tab in the workspace.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;3. How will service principal get access to the Azure GIT repo (ADO repository)?&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;In a two-layer setup, the first layer involves Azure DevOps and a Git repository. In this configuration, you create a service principal or technical user with at least Basic access and repository permissions in Azure DevOps. This identity is utilised for your pipelines to check out the code, and it is managed within Azure DevOps, not in Databricks.&lt;/P&gt;
&lt;P&gt;The second layer connects Azure DevOps to Databricks through a Databricks service principal. To set this up, you configure an Azure DevOps service connection that authenticates to Databricks using methods such as OAuth M2M, Azure Resource Manager connection, or the recommended workload identity federation (which avoids long-lived secrets). Your pipeline steps will involve commands like `databricks bundle validate -t prod`, `databricks bundle deploy -t prod`, and `databricks bundle run -t prod &amp;lt;job_name&amp;gt;`, with the Databricks CLI already authenticated as the service principal.&lt;/P&gt;
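&lt;P&gt;A minimal sketch of those pipeline steps, assuming OAuth M2M with the SP’s client ID/secret stored as pipeline variables (the variable names are placeholders, and the Databricks CLI is assumed to already be installed on the agent):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class="language-yaml"&gt;# azure-pipelines.yml (sketch) -- variable names are placeholders
steps:
  - script: |
      databricks bundle validate -t prod
      databricks bundle deploy -t prod
      databricks bundle run -t prod &amp;lt;job_name&amp;gt;
    displayName: Deploy and run bundle
    env:
      DATABRICKS_HOST: $(DATABRICKS_HOST)
      DATABRICKS_CLIENT_ID: $(SP_CLIENT_ID)          # SP application ID
      DATABRICKS_CLIENT_SECRET: $(SP_CLIENT_SECRET)  # SP OAuth secret
&lt;/CODE&gt;&lt;/PRE&gt;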
&lt;P&gt;For a Bundles-only flow, the Databricks service principal does not require direct Git access. It simply needs to support CLI/API calls from the pipeline. However, if you want the Databricks service principal to operate on Git folders within the workspace, you must grant the DevOps identity access to the repository (Basic + repo permissions) and link its Git credentials to your Databricks service principal under the settings for Git integration with Azure DevOps (using PAT or Entra-based authentication).&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;4. Is there any other access that the service principal needs for this approach, for bundles etc ?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="p8i6j01 paragraph"&gt;For the exact permission model and how to wire this up, the official docs cover it in more detail..&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="http://&amp;nbsp;https://learn.microsoft.com/azure/databricks/dev-tools/ci-cd" target="_self"&gt;&lt;STRONG&gt;CI/CD on Azure Databricks (incl. Bundles + service principals)&lt;/STRONG&gt;&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://&amp;nbsp;https://learn.microsoft.com/azure/databricks/dev-tools/auth/service-principals" target="_self"&gt;&lt;STRONG&gt;Service principals for CI/CD&lt;/STRONG&gt; (what entitlements/permissions they need)&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://&amp;nbsp;https://learn.microsoft.com/azure/databricks/repos/automate-with-ms-entra" target="_self"&gt;&lt;STRONG&gt;Authorize a Microsoft Entra service principal to access Git folders&lt;/STRONG&gt; &lt;/A&gt;(if you decide the SP should access Git folders directly)&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://&amp;nbsp;https://learn.microsoft.com/azure/databricks/dev-tools/auth/auth-with-azure-devops" target="_self"&gt;&lt;STRONG&gt;Authenticate with Azure DevOps on Azure Databricks&lt;/STRONG&gt;&lt;/A&gt; (how the DevOps pipeline authenticates as the SP)&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://&amp;nbsp;https://docs.databricks.com/dev-tools/bundles/settings#run_as" target="_self"&gt;&lt;STRONG&gt;Bundles &lt;CODE class="p8i6j0f"&gt;run_as&lt;/CODE&gt; configuration&lt;/STRONG&gt; &lt;/A&gt;(how to set the SP as the run identity in targets)&lt;/LI&gt;
&lt;/UL&gt;
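&lt;P&gt;For reference, a minimal run_as sketch in a target (the application ID is a placeholder):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class="language-yaml"&gt;targets:
  prod:
    mode: production
    run_as:
      service_principal_name: &amp;lt;sp-application-id&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;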
&lt;P class="p1"&gt;&lt;FONT size="2" color="#FF6600"&gt;&lt;STRONG&gt;&lt;I&gt;If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.&lt;/I&gt;&lt;/STRONG&gt;&lt;/FONT&gt;&lt;I&gt;&lt;/I&gt;&lt;/P&gt;
&lt;DIV class="tk0j8o1 _1ibi0s31a _1ibi0s3dn"&gt;&amp;nbsp;&lt;/DIV&gt;</description>
    <pubDate>Thu, 26 Mar 2026 13:55:20 GMT</pubDate>
    <dc:creator>Ashwin_DSA</dc:creator>
    <dc:date>2026-03-26T13:55:20Z</dc:date>
    <item>
      <title>Service Principal access notebooks created under /Workspace/Users</title>
      <link>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/151760#M53704</link>
      <description>&lt;DIV&gt;What permissions does a Service Principal need to run Databricks jobs that reference notebooks created by a user and stored in Git?&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;We are exploring the &lt;STRONG&gt;notebooks‑first development approach&lt;/STRONG&gt; with &lt;STRONG&gt;Databricks Bundles&lt;/STRONG&gt;, and we’ve run into a workspace‑permissions challenge involving Service Principals.&lt;/P&gt;&lt;H3&gt;&lt;STRONG&gt;Our setup&lt;/STRONG&gt;&lt;/H3&gt;&lt;OL&gt;&lt;LI&gt;Developers create notebooks under their &lt;STRONG&gt;personal workspace paths&lt;/STRONG&gt;:&lt;PRE&gt;/Workspace/Users/&amp;lt;user_email&amp;gt;/project/notebook&lt;/PRE&gt;&lt;/LI&gt;&lt;LI&gt;These notebooks are synced to Git using Databricks Git folders.&lt;/LI&gt;&lt;LI&gt;We want to create &lt;STRONG&gt;Databricks Bundles jobs&lt;/STRONG&gt; that reference these notebooks and&lt;BR /&gt;&lt;STRONG&gt;run them using a Service Principal (SP)&lt;/STRONG&gt; for production automation.&lt;/LI&gt;&lt;/OL&gt;&lt;H3&gt;&lt;STRONG&gt;The problem&lt;/STRONG&gt;&lt;/H3&gt;&lt;P&gt;A Service Principal &lt;STRONG&gt;cannot access&lt;/STRONG&gt; user workspace paths such as:&lt;/P&gt;&lt;PRE&gt;/Workspace/Users/&amp;lt;user_email&amp;gt;/...&lt;/PRE&gt;&lt;P&gt;We also &lt;STRONG&gt;cannot&lt;/STRONG&gt;:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Move the notebook into /Workspace/Shared/...&lt;/LI&gt;&lt;LI&gt;Grant the SP access to individual user workspace directories&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;So the SP has no way to read or execute the notebook, and therefore cannot run the job.&lt;/P&gt;&lt;H3&gt;&lt;STRONG&gt;Our question&lt;/STRONG&gt;&lt;/H3&gt;&lt;P&gt;&lt;STRONG&gt;How should we structure our workspace, Git folders, or permissions so the Service Principal can run Bundle‑based jobs, without granting SP access to personal user 
directories?&lt;/STRONG&gt;&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;</description>
      <pubDate>Mon, 23 Mar 2026 22:53:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/151760#M53704</guid>
      <dc:creator>DineshOjha</dc:creator>
      <dc:date>2026-03-23T22:53:03Z</dc:date>
    </item>
    <item>
      <title>Re: Service Principal access notebooks created under /Workspace/Users</title>
      <link>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/151878#M53720</link>
      <description>&lt;P class="wnfdntd" data-pm-slice="1 1 []"&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/147238"&gt;@DineshOjha&lt;/a&gt;,&lt;/P&gt;
&lt;P class="wnfdntd"&gt;This is a good question, and researching this helped me learn some best practices along the way. What you’re seeing is actually expected behaviour. Service principals aren’t meant to execute notebooks directly from users’ personal workspace paths. That limitation is by design for security and isolation reasons.&lt;/P&gt;
&lt;P&gt;Given you’re using Databricks Bundles and a notebooks‑first workflow, the recommended pattern is to treat Git as the source of truth.&amp;nbsp;Developers can work on notebooks under their own /Workspace/Users/... paths (or locally) for convenience, then sync them to Git (via Git folders / Repos). Those copies in personal home directories should be considered development artefacts only, not what production jobs execute.&amp;nbsp;In production, jobs should use notebooks deployed from Git into a shared workspace path, or reference&lt;SPAN&gt;&amp;nbsp;Git directly (using jobs with a Git-based notebook source).&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;Instead of pointing jobs to /Workspace/Users/..., configure your bundle target to deploy notebooks into a shared folder where the service principal has at least read/execute access and your team can still inspect the deployed artefacts.&lt;/P&gt;
&lt;P class="wnfdntd"&gt;For example, in your bundle:&lt;/P&gt;
&lt;PRE class="_1ibi0s3cl" dir="auto"&gt;&lt;CODE class="language-yaml"&gt;targets:
  prod:
    workspace:
      host: https://&amp;lt;your-workspace-url&amp;gt;
      root_path: /Workspace/Shared/projects/my-project
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;When you run databricks bundle deploy (ideally from CI/CD, authenticated as the service principal), the notebooks defined in the bundle are materialised under&amp;nbsp;/Workspace/Shared/projects/my-project/...&amp;nbsp;Your bundle’s jobs should reference those deployed notebook paths, not the originals under /Workspace/Users/....&lt;/P&gt;
&lt;P class="wnfdntd"&gt;On the Databricks side, you’ll typically want&amp;nbsp;&lt;/P&gt;
&lt;UL&gt;
&lt;LI class="wnfdntd"&gt;On /Workspace/Shared/projects/my-project -&amp;nbsp;Can Read (or higher) for the service principal, so it can read/execute the notebooks.&lt;/LI&gt;
&lt;LI class="wnfdntd"&gt;On the jobs created by the bundle -&amp;nbsp;Can Manage or Can Run for the service principal, depending on your governance model.&lt;/LI&gt;
&lt;LI class="wnfdntd"&gt;On compute -&amp;nbsp;Permission for the service principal to use the job cluster or shared compute resource configured for the bundle’s jobs.&lt;/LI&gt;
&lt;/UL&gt;
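&lt;P&gt;If you manage these grants declaratively, they can also be expressed with a top‑level permissions block in the bundle itself (a sketch; the SP application ID and group name are placeholders):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class="language-yaml"&gt;# databricks.yml (sketch) -- applies to the bundle's deployed resources
permissions:
  - service_principal_name: &amp;lt;sp-application-id&amp;gt;
    level: CAN_RUN
  - group_name: &amp;lt;operators-group&amp;gt;
    level: CAN_MANAGE
&lt;/CODE&gt;&lt;/PRE&gt;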
&lt;P class="wnfdntd"&gt;With this setup d&lt;SPAN&gt;evelopers continue to use their personal workspace areas for development.&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;Git remains the source of truth. And, t&lt;/SPAN&gt;&lt;SPAN&gt;he service p&lt;/SPAN&gt;rincipal only interacts with the shared, deployed artifacts and never needs access to /Workspace/Users/....&lt;/P&gt;
&lt;P class="wnfdntd"&gt;If you prefer to be fully Git‑centric, you can also configure jobs to pull notebooks directly from Git (e.g. via Repos/git_source) and grant the service principal access to the Git repo, plus job permissions as above. However,&amp;nbsp;the core principle is the same in both approaches...&amp;nbsp;Don’t run production jobs against notebooks in /Workspace/Users/....&amp;nbsp;Use Git as the source of truth, and deploy or reference notebooks from a shared, service‑principal‑readable location.&lt;/P&gt;
&lt;P class="wnfdntd"&gt;Hope that helps clarify the pattern.&lt;/P&gt;
&lt;P class="wnfdntd"&gt;Please let me know if any of the above is unclear.&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;&lt;FONT size="2" color="#FF6600"&gt;&lt;STRONG&gt;&lt;I&gt;If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.&lt;/I&gt;&lt;/STRONG&gt;&lt;/FONT&gt;&lt;I&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 24 Mar 2026 17:59:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/151878#M53720</guid>
      <dc:creator>Ashwin_DSA</dc:creator>
      <dc:date>2026-03-24T17:59:55Z</dc:date>
    </item>
    <item>
      <title>Re: Service Principal access notebooks created under /Workspace/Users</title>
      <link>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/152081#M53753</link>
      <description>&lt;P&gt;Thank you so much for your response.&lt;/P&gt;&lt;P&gt;We dont prefer to keep the notebooks under Shared or run our jobs pointing to the Shared location. We have more than 200 applications and different teams working on them. Each application has a service principal associated with it and only the service principal has access to the specific applications volume and schema.&lt;/P&gt;&lt;P&gt;Based on your response, we are planning to follow the below approach.&lt;/P&gt;&lt;P&gt;1. Create notebooks under personal user account&lt;/P&gt;&lt;P&gt;2. Push the code to GIT&lt;/P&gt;&lt;P&gt;3. Deploy using bundles&lt;/P&gt;&lt;P&gt;4. In the bundles, provide the run_as AS service principal so that the jobs are owned and run using the service principal.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Questions:&lt;/P&gt;&lt;P&gt;1. Do you think this is a good approach for notebook based implementation or do you suggest anything else?&lt;/P&gt;&lt;P&gt;2. The service principal exists only in Databricks, so what email and PAT should be provided to enable GIT access? 3. How will service principal get access to the Azure GIT repo (ADO repository)?&amp;nbsp;&lt;/P&gt;&lt;P&gt;4. Is there any other access that the service principal needs for this approach, for bundles etc ?&lt;/P&gt;</description>
      <pubDate>Wed, 25 Mar 2026 22:53:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/152081#M53753</guid>
      <dc:creator>DineshOjha</dc:creator>
      <dc:date>2026-03-25T22:53:22Z</dc:date>
    </item>
    <item>
      <title>Re: Service Principal access notebooks created under /Workspace/Users</title>
      <link>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/152177#M53780</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/147238"&gt;@DineshOjha&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;Given your constraints (per‑application service principals, isolation at the volume/schema level, and not wanting to use /Workspace/Shared), the flow you described&amp;nbsp;aligns with how Bundles are meant to be used in&amp;nbsp;production. Bundles are the recommended CI/CD mechanism, and using service principals as run identities in non‑dev targets is explicitly encouraged.&lt;/P&gt;
&lt;P class="wnfdntd"&gt;A couple of clarifications and direct answers to your questions:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;1. Do you think this is a good approach for notebook based implementation or do you suggest anything else?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Yes, this is a solid pattern for notebook‑based implementations.&amp;nbsp;&lt;SPAN&gt;Git serves as the source of truth, with personal workspaces used solely for development. Bundles deploy notebooks and job definitions into the workspace. In non-development targets, run_as is configured to use the per-application service principal, so all production runs use that principal’s permissions, including access to the appropriate volume/schema. That is what keeps the deployment consistent and secure.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;The only design choice you still need is where in the workspace Bundles deploy to. You don’t have to use /Workspace/Shared; you can pick any isolated path, for example&amp;nbsp;/Workspace/.bundle/prod/${bundle.name} or&amp;nbsp;/Workspace/Projects/&amp;lt;app_name&amp;gt;/..., and lock that path down so only the application service principal, a&amp;nbsp;small operator group, and optionally CI/CD deployer principals&amp;nbsp;have access. The path naming is up to you: Bundles just need a root_path per target, and you control the permissions there.&lt;/P&gt;
&lt;P&gt;So I would keep your 4‑step approach and add a per‑app workspace root (instead of /Shared), with ACLs granting access only to the relevant SP and operators.&lt;/P&gt;
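&lt;P&gt;As a sketch of what such a per‑app target could look like (the app name and SP application ID below are placeholders, not values from your environment):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class="language-yaml"&gt;# databricks.yml (sketch) -- substitute your own app name and SP application ID
targets:
  prod:
    workspace:
      root_path: /Workspace/Projects/&amp;lt;app_name&amp;gt;/${bundle.name}
    run_as:
      service_principal_name: &amp;lt;sp-application-id&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;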
&lt;P&gt;&lt;STRONG&gt;2. The service principal exists only in Databricks, so what email and PAT should be provided to enable GIT access?&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;With the Bundles‑from‑Azure‑DevOps pattern you described, the important nuance is that your Databricks service principal does not need to talk to Git directly.&amp;nbsp;In a typical Azure DevOps setup, the pipelines clone the Git repo themselves using the identity configured in DevOps (service connection, PAT, or Microsoft Entra–backed principal).&amp;nbsp;Once the code is on the build agent, the pipeline calls databricks bundle validate/deploy/run, using the Databricks service principal to authenticate to Databricks, not to Git.&lt;/P&gt;
&lt;P&gt;In that model, you do not need to configure a Git email/PAT on the Databricks SP at all.&amp;nbsp;Git credentials live entirely in Azure DevOps (for checking out the repo). The Databricks SP is only used for workspace authentication (via OAuth M2M, workload identity federation, or an ARM service connection).&amp;nbsp;You only need Git credentials on the Databricks SP if you also want it to use Git folders / Repos in the workspace, or run Git‑backed jobs directly from Databricks (using Git‑with‑jobs / Git folders).&lt;/P&gt;
&lt;P&gt;In that case, the email/PAT would belong to a non‑human Azure DevOps identity (service principal or technical user) that has access to the repo. You then link those Git credentials to the Databricks SP via the Git integration tab in the workspace.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;3. How will service principal get access to the Azure GIT repo (ADO repository)?&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;In a two-layer setup, the first layer involves Azure DevOps and a Git repository. In this configuration, you create a service principal or technical user with at least Basic access and repository permissions in Azure DevOps. This identity is utilised for your pipelines to check out the code, and it is managed within Azure DevOps, not in Databricks.&lt;/P&gt;
&lt;P&gt;The second layer connects Azure DevOps to Databricks through a Databricks service principal. To set this up, you configure an Azure DevOps service connection that authenticates to Databricks using methods such as OAuth M2M, Azure Resource Manager connection, or the recommended workload identity federation (which avoids long-lived secrets). Your pipeline steps will involve commands like `databricks bundle validate -t prod`, `databricks bundle deploy -t prod`, and `databricks bundle run -t prod &amp;lt;job_name&amp;gt;`, with the Databricks CLI already authenticated as the service principal.&lt;/P&gt;
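&lt;P&gt;A minimal sketch of those pipeline steps, assuming OAuth M2M with the SP’s client ID/secret stored as pipeline variables (the variable names are placeholders, and the Databricks CLI is assumed to already be installed on the agent):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class="language-yaml"&gt;# azure-pipelines.yml (sketch) -- variable names are placeholders
steps:
  - script: |
      databricks bundle validate -t prod
      databricks bundle deploy -t prod
      databricks bundle run -t prod &amp;lt;job_name&amp;gt;
    displayName: Deploy and run bundle
    env:
      DATABRICKS_HOST: $(DATABRICKS_HOST)
      DATABRICKS_CLIENT_ID: $(SP_CLIENT_ID)          # SP application ID
      DATABRICKS_CLIENT_SECRET: $(SP_CLIENT_SECRET)  # SP OAuth secret
&lt;/CODE&gt;&lt;/PRE&gt;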
&lt;P&gt;For a Bundles-only flow, the Databricks service principal does not require direct Git access. It simply needs to support CLI/API calls from the pipeline. However, if you want the Databricks service principal to operate on Git folders within the workspace, you must grant the DevOps identity access to the repository (Basic + repo permissions) and link its Git credentials to your Databricks service principal under the settings for Git integration with Azure DevOps (using PAT or Entra-based authentication).&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;4. Is there any other access that the service principal needs for this approach, for bundles etc ?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="p8i6j01 paragraph"&gt;For the exact permission model and how to wire this up, the official docs cover it in more detail..&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="http://&amp;nbsp;https://learn.microsoft.com/azure/databricks/dev-tools/ci-cd" target="_self"&gt;&lt;STRONG&gt;CI/CD on Azure Databricks (incl. Bundles + service principals)&lt;/STRONG&gt;&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://&amp;nbsp;https://learn.microsoft.com/azure/databricks/dev-tools/auth/service-principals" target="_self"&gt;&lt;STRONG&gt;Service principals for CI/CD&lt;/STRONG&gt; (what entitlements/permissions they need)&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://&amp;nbsp;https://learn.microsoft.com/azure/databricks/repos/automate-with-ms-entra" target="_self"&gt;&lt;STRONG&gt;Authorize a Microsoft Entra service principal to access Git folders&lt;/STRONG&gt; &lt;/A&gt;(if you decide the SP should access Git folders directly)&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://&amp;nbsp;https://learn.microsoft.com/azure/databricks/dev-tools/auth/auth-with-azure-devops" target="_self"&gt;&lt;STRONG&gt;Authenticate with Azure DevOps on Azure Databricks&lt;/STRONG&gt;&lt;/A&gt; (how the DevOps pipeline authenticates as the SP)&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://&amp;nbsp;https://docs.databricks.com/dev-tools/bundles/settings#run_as" target="_self"&gt;&lt;STRONG&gt;Bundles &lt;CODE class="p8i6j0f"&gt;run_as&lt;/CODE&gt; configuration&lt;/STRONG&gt; &lt;/A&gt;(how to set the SP as the run identity in targets)&lt;/LI&gt;
&lt;/UL&gt;
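&lt;P&gt;For reference, a minimal run_as sketch in a target (the application ID is a placeholder):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class="language-yaml"&gt;targets:
  prod:
    mode: production
    run_as:
      service_principal_name: &amp;lt;sp-application-id&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;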
&lt;P class="p1"&gt;&lt;FONT size="2" color="#FF6600"&gt;&lt;STRONG&gt;&lt;I&gt;If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.&lt;/I&gt;&lt;/STRONG&gt;&lt;/FONT&gt;&lt;I&gt;&lt;/I&gt;&lt;/P&gt;
&lt;DIV class="tk0j8o1 _1ibi0s31a _1ibi0s3dn"&gt;&amp;nbsp;&lt;/DIV&gt;</description>
      <pubDate>Thu, 26 Mar 2026 13:55:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/152177#M53780</guid>
      <dc:creator>Ashwin_DSA</dc:creator>
      <dc:date>2026-03-26T13:55:20Z</dc:date>
    </item>
    <item>
      <title>Re: Service Principal access notebooks created under /Workspace/Users</title>
      <link>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/152756#M53870</link>
      <description>&lt;P&gt;Thank you so much Ashwin, this provides a lot of clarity.&lt;/P&gt;&lt;P&gt;1. Where to deploy Bundles in the workspace&lt;/P&gt;&lt;P&gt;We plan to deploy the bundle using a service principal , so the bundle we plan to deploy under /Workspace/&amp;lt;service_principal&amp;gt;&lt;/P&gt;&lt;P&gt;1. Create notebooks under personal user account&lt;BR /&gt;2. Create jobs as .yml files to call these notebooks&lt;BR /&gt;3. Push the code to GIT&lt;BR /&gt;4. Create bundles&lt;BR /&gt;5. Deploy the bundle using azure pipelines using the service principal&lt;/P&gt;&lt;P&gt;This would deploy the bundle under the service_principal account and make it the owner of the jobs as well.&lt;BR /&gt;These jobs would be later executed via a seperate secheduling tool called Control-M&lt;/P&gt;&lt;P&gt;2. Source as Azure GIT repo vs Workspace&lt;/P&gt;&lt;P&gt;From your response we understand that the serive principal needs access to GIT if the source type of our jobs is GIT. But if we define jobs with source: WORKSPACE, serive principal&lt;BR /&gt;need not have access to GIT.&lt;BR /&gt;As these are 2 seperate approaches -&amp;gt; 1. Source type as GIT and 2. Source as Workspace . Is there a benefit of one approach over the other?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;3. CI/CD using DAB&lt;/P&gt;&lt;P&gt;We are currently using the python wheel approach , in which we run the pytests as part of the Azure pipeline.&lt;BR /&gt;When we are using DAB, whats the best process to run these pytests?&lt;BR /&gt;In some places its mentioned that these tests need to be run as a seperate job. I didnt find a place where it defines the best practices for these&lt;BR /&gt;pytests when deploying notebooks using DAB&lt;/P&gt;&lt;P&gt;4. 
Notebooks vs python tasks&lt;BR /&gt;If we are deploying purely python script, is there a recommendation of using 1 over the other?&lt;BR /&gt;In a python wheel approach, we define an entry point, but dont see an option to do that with notebooks, hence need to call the main function explicity. Is that the correct approach&lt;BR /&gt;&lt;BR /&gt;5. Also, for some reason the links that you provided are not opening correctly, not sure if something got changed while pasting them.&lt;/P&gt;&lt;P&gt;Thank you again for your support, highly appreciate you taking the time to research and respond.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Komal&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 31 Mar 2026 15:36:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/152756#M53870</guid>
      <dc:creator>DineshOjha</dc:creator>
      <dc:date>2026-03-31T15:36:52Z</dc:date>
    </item>
    <item>
      <title>Re: Service Principal access notebooks created under /Workspace/Users</title>
      <link>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/152768#M53873</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/147238"&gt;@DineshOjha&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;Updated links below. Will respond to your queries before the end of this week.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="http://https//learn.microsoft.com/azure/databricks/dev-tools/ci-cd" target="_self" rel="nofollow noreferrer"&gt;&lt;STRONG&gt;CI/CD on Azure Databricks (incl. Bundles + service principals)&lt;/STRONG&gt;&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://https//learn.microsoft.com/azure/databricks/dev-tools/auth/service-principals" target="_self" rel="nofollow noreferrer"&gt;&lt;STRONG&gt;Service principals for CI/CD&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;(what entitlements/permissions they need)&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://https//learn.microsoft.com/azure/databricks/repos/automate-with-ms-entra" target="_self" rel="nofollow noreferrer"&gt;&lt;STRONG&gt;Authorize a Microsoft Entra service principal to access Git folders&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/A&gt;(if you decide the SP should access Git folders directly)&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://https//learn.microsoft.com/azure/databricks/dev-tools/auth/auth-with-azure-devops" target="_self" rel="nofollow noreferrer"&gt;&lt;STRONG&gt;Authenticate with Azure DevOps on Azure Databricks&lt;/STRONG&gt;&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;(how the DevOps pipeline authenticates as the SP)&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://https//docs.databricks.com/dev-tools/bundles/settings#run_as" target="_self" rel="nofollow noreferrer"&gt;&lt;STRONG&gt;Bundles&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE class="p8i6j0f"&gt;run_as&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;configuration&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/A&gt;(how to set the SP as the run identity in targets)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="p1"&gt;&lt;FONT size="2" color="#FF6600"&gt;&lt;STRONG&gt;&lt;I&gt;If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.&lt;/I&gt;&lt;/STRONG&gt;&lt;/FONT&gt;&lt;I&gt;&lt;/I&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 31 Mar 2026 16:26:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/152768#M53873</guid>
      <dc:creator>Ashwin_DSA</dc:creator>
      <dc:date>2026-03-31T16:26:33Z</dc:date>
    </item>
    <item>
      <title>Re: Service Principal access notebooks created under /Workspace/Users</title>
      <link>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/154144#M54069</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/147238"&gt;@DineshOjha&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;Apologies for the delay in getting back to this. Please see the responses below.&amp;nbsp;&lt;/P&gt;
&lt;P data-pm-slice="1 1 []"&gt;&lt;STRONG&gt;1. Where to deploy Bundles in the workspace&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Your proposed flow is fully compatible with Bundles and CI/CD best practices. On the workspace location: technically, you can set the target’s root_path to something like&amp;nbsp;/Workspace/&amp;lt;service_principal&amp;gt;/&amp;lt;app_name&amp;gt; and deploy there, as long as the deploying identity (the CI/CD SP) has permission to write into that path and the humans who need to debug (e.g., the app team) have at least read access.&lt;/P&gt;
&lt;P&gt;The Bundles docs commonly show a pattern like&amp;nbsp;/Workspace/.bundle/${bundle.target}/${bundle.name} (or a similar structured path), which you then secure. So structurally you have two good options: a per‑SP home with a per‑app subfolder (/Workspace/&amp;lt;service_principal&amp;gt;/&amp;lt;app_name&amp;gt;), or a neutral “system” root for all bundles (/Workspace/.bundle/prod/${bundle.name}).&lt;/P&gt;
&lt;P&gt;From Databricks’ perspective, both are fine as long as the ACLs are correct. For a large estate (200+ apps), the neutral .bundle namespace tends to age better for discoverability and governance, but your per‑SP approach is not wrong; it’s more of an org‑convention choice.&lt;/P&gt;
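The two layout conventions above can be sketched as target-level root_path settings. This is a sketch only; the paths and the assumption that the SP’s home folder is keyed by its application ID are placeholders, so pick one option per target:

```yaml
# Sketch: two workspace layout conventions (placeholder paths)
targets:
  prod:
    workspace:
      # Option A: per-SP home with a per-app subfolder
      root_path: /Workspace/Users/00000000-0000-0000-0000-000000000000/my_app
      # Option B: neutral "system" namespace shared by all bundles
      # root_path: /Workspace/.bundle/${bundle.target}/${bundle.name}
```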
&lt;P&gt;Running the jobs from Control‑M is also fine. You’re just triggering Databricks jobs via API, and the location of deployed assets doesn’t change that.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;2. Source as Azure GIT repo vs Workspace&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Your understanding is correct.&amp;nbsp;If a job is configured with source = Git (Git‑with‑jobs, or Git folders), then the Databricks identity that pulls from Git (user or SP) needs Git credentials/permissions.&amp;nbsp;If a job is configured with source = Workspace (tasks point at workspace notebook paths), and Azure DevOps does the Git checkout and then calls databricks bundle deploy, then the Databricks service principal does not need Git access: DevOps talks to Git; the SP only talks to Databricks.&lt;/P&gt;
&lt;P&gt;Bundles already assume Git is your source of truth and handle deployment from the checked‑out repo into the workspace.&amp;nbsp;In that model, it’s very common to use WORKSPACE as the job source (tasks reference the deployed notebook/script paths), and let Bundles + CI/CD ensure that workspace state is in sync with Git.&lt;/P&gt;
&lt;P&gt;With the workspace source (with Bundles) approach, the simpler mental model is Git → CI/CD → Bundles → workspace; jobs read workspace. You also get full power of Bundles: targets, run_as, permissions, deployment modes, etc. And, there is no need to manage Git credentials on the Databricks SP unless you also use Git folders directly.&lt;/P&gt;
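As a sketch of that workspace-source model (the job key, job name, and notebook path below are placeholders), a bundle-defined job references notebooks by a path relative to the bundle files, and deployment resolves that reference to the deployed workspace copy:

```yaml
# Sketch: a bundle job whose task reads the deployed workspace copy
resources:
  jobs:
    etl_job:                          # placeholder job key
      name: my-app-etl
      tasks:
        - task_key: main
          notebook_task:
            # Relative path, resolved at deploy time to the workspace path
            notebook_path: ../src/etl_notebook.py
```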
&lt;P&gt;The Git‑source (Git‑with‑jobs) approach is most useful if you aren’t using Bundles and want jobs to pull from Git directly at run time. It supports more limited job/task types, and the job configuration itself isn’t in source control in the same way as with Bundles.&lt;/P&gt;
&lt;P&gt;Given you are standardising on Bundles and already using Azure DevOps, you may want to consider workspace&amp;nbsp;source for jobs (deployed by Bundles), and keep Git access concentrated in Azure DevOps and any interactive developer identities.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;3. CI/CD using DAB&lt;/STRONG&gt;&lt;/P&gt;
&lt;P data-pm-slice="1 1 []"&gt;The core best practice doesn’t change with Bundles.&amp;nbsp;Keep unit tests (pytest) in your CI system, close to the code.&amp;nbsp;This is still the primary mechanism for fast feedback and correctness, regardless of Bundles.&lt;/P&gt;
&lt;P&gt;What Bundles add is a good place for integration tests where you define a test/run-unit-tests job as a resource inside the bundle (for example a small job that runs a test notebook or a script calling your wheel).&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The official Azure DevOps + Bundles example shows this pattern: build/test the artifact, then deploy the bundle, then run a test job from the bundle. So: keep pytest in Azure Pipelines as you do today, and optionally add bundle‑defined test jobs for integration/end‑to‑end checks.&lt;/P&gt;
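A bundle-defined integration-test job might look like the following sketch (the job key, job name, and test-notebook path are placeholders):

```yaml
# Sketch: integration-test job defined as a bundle resource
resources:
  jobs:
    run_integration_tests:            # placeholder job key
      name: my-app-integration-tests
      tasks:
        - task_key: tests
          notebook_task:
            notebook_path: ../tests/integration_tests.py
```

After databricks bundle deploy -t test, the pipeline can trigger it with databricks bundle run run_integration_tests -t test and fail the stage on a non-zero exit.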
&lt;P&gt;&lt;STRONG&gt;4. Notebooks vs python tasks&lt;/STRONG&gt;&lt;/P&gt;
&lt;P data-pm-slice="1 1 []"&gt;A common pattern that fits what you’re doing today...&amp;nbsp;Keep all real logic in a wheel (or at least a Python package).&amp;nbsp;In jobs, either run a python_wheel_task directly (no notebook at all), or use a very thin notebook that imports your wheel and calls main() with parameters.&lt;/P&gt;
&lt;P&gt;That gives you the best of both worlds: testability and CI friendliness from the wheel, plus optional notebook ergonomics when you want them.&lt;/P&gt;
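The python_wheel_task variant can be sketched like this (the package name, entry point, and wheel path are placeholders; it assumes the wheel is built by the bundle or by the pipeline before deploy):

```yaml
# Sketch: wheel-only job, no notebook involved
resources:
  jobs:
    wheel_job:                        # placeholder job key
      name: my-app-wheel-job
      tasks:
        - task_key: main
          python_wheel_task:
            package_name: my_app      # placeholder package name
            entry_point: main         # console_scripts entry point
          libraries:
            - whl: ../dist/*.whl      # wheel built in CI
```

The thin-notebook alternative is the same job with a notebook_task instead, where the notebook just imports the package and calls main() with the job parameters.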
&lt;P&gt;Hope this helps.&lt;/P&gt;
&lt;P class="p1"&gt;&lt;FONT size="2" color="#FF6600"&gt;&lt;STRONG&gt;&lt;I&gt;If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.&lt;/I&gt;&lt;/STRONG&gt;&lt;/FONT&gt;&lt;I&gt;&lt;/I&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 11 Apr 2026 20:56:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/service-principal-access-notebooks-created-under-workspace-users/m-p/154144#M54069</guid>
      <dc:creator>Ashwin_DSA</dc:creator>
      <dc:date>2026-04-11T20:56:46Z</dc:date>
    </item>
  </channel>
</rss>

