
GitHub Actions OIDC with Databricks: wildcard subject for pull_request workflows

Valerio
New Contributor

Hi,
I’m configuring GitHub Actions OIDC authentication with Databricks following the official documentation:
https://docs.databricks.com/aws/en/dev-tools/auth/provider-github

When running a GitHub Actions workflow triggered by pull_request, authentication fails unless the Databricks federation policy subject is configured to exactly match the full sub claim issued by GitHub.

For example, configuring the policy with the full subject string from the token allows authentication to succeed. However, this is not practical, as the subject includes PR-specific and workflow-specific components (such as job_workflow_ref and @refs/pull/<id>/merge), which change across pull requests and workflows.

I would like to configure the federation policy using a wildcard subject that supports multiple pull requests and workflows for the same repository and environment.

What is the recommended way to define the subject pattern in the Databricks federation policy to support this use case?

Thanks!

2 REPLIES

bianca_unifeye
Databricks MVP

 

There isn’t a Databricks-side wildcard subject pattern to solve this. 

Databricks service principal federation policies don’t support wildcard, glob, or regex matching on subject today. The policy’s oidc_policy.subject is effectively an exact match against whatever claim you configure as the “subject claim” (default: sub). The Databricks docs describe subject as the unique identifier for the workload, and only mention switching subject_claim away from sub if sub is not stable: https://docs.databricks.com/aws/en/dev-tools/auth/oauth-federation-policy

So the “recommended” way to make this work across many PRs/workflows is not a wildcard in Databricks, but making the GitHub-issued subject stable.

The recommended solution is to use GitHub Environments, so that sub is repo:<org>/<repo>:environment:<env> even for PR workflows.

If your org has customized sub to include job_workflow_ref/PR refs, update that GitHub customization to a stable template.

Otherwise, fall back to setting subject_claim to a stable claim such as repository, understanding the security implications.
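
As a rough illustration of that last fallback, here is a minimal sketch of a policy body that matches on the repository claim instead of sub. The function name and its two parameters are hypothetical, and the audience value is something you would need to supply yourself; please check the field names against the federation policy docs before relying on it.

```shell
# Print a federation policy body that matches on the `repository` claim
# instead of `sub`. Hypothetical helper, for illustration only.
# $1 = audience (assumption: you supply this), $2 = repository as ORG/REPO
repository_policy_json() {
  cat <<EOF
{
  "oidc_policy": {
    "issuer": "https://token.actions.githubusercontent.com",
    "audiences": ["$1"],
    "subject_claim": "repository",
    "subject": "$2"
  }
}
EOF
}
```

Calling repository_policy_json YOUR-AUDIENCE YOUR-ORG/YOUR-REPO prints JSON that could be passed to the CLI's --json flag. Keep in mind that a policy like this would match any workflow run in the repository, which is the security tradeoff mentioned above.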

 

SteveOstrowski
Databricks Employee
Hi @Valerio,

The challenge you are running into is a common one when setting up OIDC federation for pull_request-triggered workflows. Here is a breakdown of the issue and several approaches to solve it.


UNDERSTANDING THE SUBJECT CLAIM FOR PULL REQUESTS

When a GitHub Actions workflow is triggered by a pull_request event, the default sub (subject) claim in the OIDC token takes the form:

repo:YOUR-ORG/YOUR-REPO:pull_request

This value is actually stable across all pull requests in that repository. It does not include the PR number. The dynamic part you may be seeing (like refs/pull/123/merge) appears in the ref claim and in the job_workflow_ref claim, but not in the sub claim itself.

So if you are using a pull_request trigger without a GitHub Environment, the subject value to set in your Databricks federation policy would simply be:

repo:YOUR-ORG/YOUR-REPO:pull_request

However, the issue becomes more complex when the same service principal needs to authenticate from multiple trigger types (push on main, pull_request, workflow_dispatch, etc.), because each trigger type produces a different subject value.
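
If you want to confirm which claims your workflow actually receives, one option is to decode the token payload and look at sub, ref, and job_workflow_ref directly. The helper below is a minimal sketch (the function name is my own, not part of any tool) and it does not verify the signature, so treat it as a debugging aid only.

```shell
# Print the JSON payload (claims) of a JWT without verifying its signature.
# Debugging aid only: never trust claims from an unverified token.
jwt_payload() {
  # A JWT is header.payload.signature; keep the middle segment and
  # convert base64url characters back to standard base64.
  segment=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the '=' padding, so restore it before decoding.
  case $(( ${#segment} % 4 )) in
    2) segment="${segment}==" ;;
    3) segment="${segment}=" ;;
  esac
  printf '%s' "$segment" | base64 -d
}
```

Inside a job with id-token: write, the runner exposes ACTIONS_ID_TOKEN_REQUEST_URL and ACTIONS_ID_TOKEN_REQUEST_TOKEN, which can be used to request a token to pass to jwt_payload. Avoid printing full tokens to the job log.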


RECOMMENDED APPROACH: USE GITHUB ENVIRONMENTS

Databricks recommends using GitHub Environments for OIDC federation. When a workflow job specifies an environment, the subject claim format changes to:

repo:YOUR-ORG/YOUR-REPO:environment:ENV-NAME

This is stable regardless of whether the workflow was triggered by a push, pull_request, or workflow_dispatch event. The subject stays the same as long as the job references the same environment.

Example workflow:

name: Deploy with OIDC
on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: staging
    env:
      DATABRICKS_AUTH_TYPE: github-oidc
      DATABRICKS_HOST: https://my-workspace.cloud.databricks.com/
      DATABRICKS_CLIENT_ID: your-service-principal-client-id
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks current-user me

Then create your federation policy with the environment-based subject:

databricks account service-principal-federation-policy create SERVICE-PRINCIPAL-ID --json '{
  "oidc_policy": {
    "issuer": "https://token.actions.githubusercontent.com",
    "audiences": ["YOUR-AUDIENCE"],
    "subject": "repo:YOUR-ORG/YOUR-REPO:environment:staging"
  }
}'

This way, both pull_request and push triggers use the same stable subject.
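
To make the effect of the environment concrete, here is a minimal sketch of how the default subject is chosen, based on the formats described in this reply. The function name and argument order are hypothetical, and it assumes your organization has not customized the subject template.

```shell
# Sketch of the default GitHub OIDC subject for a job.
# $1 = ORG/REPO, $2 = event name, $3 = git ref, $4 = environment (optional)
expected_subject() {
  repo=$1 event=$2 ref=$3 environment=${4:-}
  if [ -n "$environment" ]; then
    # A job that references an environment gets the same subject
    # regardless of the trigger.
    printf 'repo:%s:environment:%s\n' "$repo" "$environment"
  elif [ "$event" = "pull_request" ]; then
    printf 'repo:%s:pull_request\n' "$repo"
  else
    printf 'repo:%s:ref:%s\n' "$repo" "$ref"
  fi
}
```

For example, expected_subject YOUR-ORG/YOUR-REPO pull_request refs/pull/123/merge staging and expected_subject YOUR-ORG/YOUR-REPO push refs/heads/main staging both print repo:YOUR-ORG/YOUR-REPO:environment:staging.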


MULTIPLE POLICIES FOR MULTIPLE SUBJECTS

If you cannot use environments, or you need to support multiple distinct subject values, you can create multiple federation policies on the same service principal. Databricks supports up to 20 federation policies per service principal. For example, you could create separate policies for:

- repo:YOUR-ORG/YOUR-REPO:pull_request
- repo:YOUR-ORG/YOUR-REPO:ref:refs/heads/main

Each policy would match a different trigger type. The authentication succeeds if the incoming token matches any of the policies on that service principal.

databricks account service-principal-federation-policy create SERVICE-PRINCIPAL-ID --json '{
  "oidc_policy": {
    "issuer": "https://token.actions.githubusercontent.com",
    "audiences": ["YOUR-AUDIENCE"],
    "subject": "repo:YOUR-ORG/YOUR-REPO:pull_request"
  }
}'

databricks account service-principal-federation-policy create SERVICE-PRINCIPAL-ID --json '{
  "oidc_policy": {
    "issuer": "https://token.actions.githubusercontent.com",
    "audiences": ["YOUR-AUDIENCE"],
    "subject": "repo:YOUR-ORG/YOUR-REPO:ref:refs/heads/main"
  }
}'
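
If the list of subjects grows, generating the commands may be less error-prone than copying them by hand. The sketch below only prints the commands so you can review them before running anything. Both function names are hypothetical, and it assumes the service principal ID is passed as a positional argument to the create command.

```shell
# Build the JSON body for one federation policy.
# $1 = audience, $2 = subject
policy_json() {
  printf '{"oidc_policy": {"issuer": "https://token.actions.githubusercontent.com", "audiences": ["%s"], "subject": "%s"}}' "$1" "$2"
}

# Print (do not run) one create command per subject.
# $1 = service principal ID, $2 = audience, remaining args = subjects
print_policy_commands() {
  sp_id=$1 audience=$2
  shift 2
  for subject in "$@"; do
    printf "databricks account service-principal-federation-policy create %s --json '%s'\n" \
      "$sp_id" "$(policy_json "$audience" "$subject")"
  done
}
```

For example, print_policy_commands SERVICE-PRINCIPAL-ID YOUR-AUDIENCE "repo:YOUR-ORG/YOUR-REPO:pull_request" "repo:YOUR-ORG/YOUR-REPO:ref:refs/heads/main" prints two commands, one per trigger type.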


CUSTOM SUBJECT CLAIMS IN GITHUB

GitHub also supports customizing the subject claim template at the organization or repository level using the GitHub REST API. You can configure it to include only the repository, for example:

curl -L \
  -X PUT \
  -H "Authorization: Bearer YOUR-TOKEN" \
  https://api.github.com/repos/YOUR-ORG/YOUR-REPO/actions/oidc/customization/sub \
  -d '{"use_default": false, "include_claim_keys": ["repo"]}'

With this configuration, the subject claim would become:

repo:YOUR-ORG/YOUR-REPO

This would be the same for all triggers (push, pull_request, workflow_dispatch), making it simple to match with a single federation policy. Be aware that simplifying the subject reduces the granularity of your security controls, so evaluate the tradeoffs for your use case.
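
If you expect to try more than one template, a small helper that builds the request body from a list of claim keys may be handy. This is a minimal sketch with a hypothetical function name; it does not validate that the keys are ones GitHub accepts.

```shell
# Build the request body for the subject customization endpoint
# from a list of claim keys passed as arguments.
sub_template_body() {
  keys=""
  for key in "$@"; do
    # Append each key as a quoted JSON string, comma-separated.
    keys="${keys:+$keys, }\"$key\""
  done
  printf '{"use_default": false, "include_claim_keys": [%s]}\n' "$keys"
}
```

The output of sub_template_body repo can be passed to curl with -d "$(sub_template_body repo)".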

GitHub docs on customizing subject claims:
https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/a...


SUMMARY

1. If possible, use GitHub Environments (recommended by Databricks). The subject claim becomes stable across all trigger types for that environment.
2. If environments are not an option, create multiple federation policies (up to 20 per service principal) to cover each trigger type.
3. You can also customize the GitHub OIDC subject template via the GitHub API to produce a simpler, stable subject.

Relevant documentation:
https://docs.databricks.com/aws/en/dev-tools/auth/oauth-federation
https://docs.databricks.com/aws/en/dev-tools/auth/provider-github
https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/a...

* This reply used an agent system I built to research and draft this response based on the wide set of documentation I have available and previous memory. I personally review the draft for any obvious issues and for monitoring system reliability and update it when I detect any drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand new features.

If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.