Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

PATs sharing in a global data platform

noorbasha534
Valued Contributor II

Hello all

Checking on how others implement sharing of Databricks personal access tokens (PATs) for authentication when you have at least 25+ different technologies extracting data via SQL warehouses (imagine a global data platform that hosts data for use across the company).

1. Some technologies don't support OAuth (for example, Collibra and KNIME), forcing us to generate PATs.

2. Some technologies can't read from a key vault, where we would like to store the PATs centrally (for example, Collibra, KNIME, Informatica).

These situations result in maintenance overhead for us. Although we have PAT expiry alerts, the tokens still need to be regenerated and sent to the respective stakeholders. How to document stakeholders at scale is another topic where I'd like to hear ideas.

Appreciate the mind share...


1 REPLY

Isi
Honored Contributor III

Hey @noorbasha534 

Honestly, I really understand your pain around token management. I face the same situation myself, and it can definitely become a headache, especially when you have multiple technologies in play, some of them open source, and you even end up with overlapping tools that essentially try to do the same job.


From my experience, the best approach is to use a central system such as a secrets manager (e.g. Azure Key Vault) as the secure place to store these PATs. If that's not possible, then try to rely on role assumption so that machines or services can fetch the required secrets dynamically instead of having them embedded everywhere.
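One way to keep the pattern consistent is to put PAT retrieval behind a single helper that every pipeline calls, so the secret store is the only source of truth. A minimal sketch, with the actual Key Vault (or other secrets-manager) call stubbed out behind an environment-variable fallback so it stays self-contained; the secret name `DATABRICKS_PAT` is just an illustrative choice:

```python
import os


def get_pat(secret_name: str) -> str:
    """Fetch a Databricks PAT from the central secret store.

    Sketch only: in production this lookup would hit the secrets
    manager (for Azure Key Vault, roughly
    SecretClient(vault_url, credential).get_secret(name).value);
    here we read an environment variable so the pattern is runnable
    without cloud credentials.
    """
    value = os.environ.get(secret_name)
    if value is None:
        raise KeyError(f"secret {secret_name!r} not found in the store")
    return value
```

The point of the indirection is that when a token rotates, only the secret store changes; no consumer has the PAT hard-coded.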


When it comes to rotation, my recommendation is to use the same system for creation and rotation. For example, if you create PATs via Terraform, avoid rotating them with a separate Cloud Function or Lambda, otherwise you'll constantly introduce drift. A better pattern is to leverage reporting capabilities to identify tokens that are about to expire, and then have a process that both rotates them and notifies stakeholders. I've implemented this email notification with the Graph API client to avoid spam.
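The "about to expire" report can be a small pure function over the workspace token list. A sketch, assuming the shape returned by the Databricks Token Management API (`GET /api/2.0/token-management/tokens`, where `expiry_time` is epoch milliseconds and `-1` means no expiry); fetching the list itself is left to your HTTP client of choice:

```python
import time


def expiring_tokens(token_infos, within_days=14, now_ms=None):
    """Return the tokens that expire within `within_days`.

    `token_infos` is assumed to look like the Token Management API
    response: dicts with at least "token_id" and "expiry_time"
    (epoch ms; -1 means the token never expires).
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    horizon_ms = now_ms + within_days * 86_400_000  # days -> ms

    return [
        t
        for t in token_infos
        if t.get("expiry_time", -1) != -1
        and now_ms <= t["expiry_time"] <= horizon_ms
    ]
```

Feeding this list into the rotation-plus-notification step keeps the whole loop in one process, which is what avoids the drift mentioned above.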


It’s also worth noting that if you work with service principals, you’ll need PATs anyway since they’re not human users. And even if you move to OAuth, you still face expiration periods, meaning you’ll have to reconnect or refresh sessions, which can also break ingestion pipelines or refreshes. For example, I’ve seen this happen with Power BI dashboards, where failed refreshes were reported simply because the OAuth token had expired for the assigned user.
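One way to soften the OAuth expiry problem is to refresh tokens slightly before they lapse rather than on failure. A minimal sketch of that idea, where `fetch` stands in for whatever call obtains a fresh token (e.g. a client-credentials request to the workspace's `/oidc/v1/token` endpoint); the 60-second early-refresh margin is an illustrative choice, not a Databricks requirement:

```python
import time


class TokenCache:
    """Cache an OAuth access token and refresh it before it expires.

    `fetch` is any callable returning (token, lifetime_seconds).
    Refreshing `early_s` seconds ahead of expiry is what prevents a
    long-running refresh or ingestion job from dying mid-request.
    """

    def __init__(self, fetch, early_s=60):
        self._fetch = fetch
        self._early_s = early_s
        self._token = None
        self._expires_at = 0.0

    def get(self, now=None):
        if now is None:
            now = time.time()
        if self._token is None or now >= self._expires_at - self._early_s:
            self._token, lifetime_s = self._fetch()
            self._expires_at = now + lifetime_s
        return self._token
```

Tools that can't host logic like this (the Power BI case above) are exactly where the expiry pain remains, which is why the central rotation-and-notify process still matters.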

Hope this helps, 🙂

Isi
