Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

A way to get Databricks data via REST API on behalf of a user

maikel
New Contributor III

Hello Databricks Community!

I am trying to figure out the best way to get data from Databricks via the REST API (or the Python SDK, though that is less preferable) without losing information about the user's permissions during authentication.

The use case is that a server needs to fetch data (table contents and volumes) from Databricks and return it to the user in JSON format. Authentication should work in such a way that the server keeps the same permissions as the given user.

Here are the options I have found so far:

  • using a personal access token (PAT) - this requires the user to log in to Databricks, copy the token, and paste it into the server configuration, which I would like to avoid so that the solution is completely independent of the DBX UI.
  • using a service principal - my idea is to create one SP per user, each with the same set of permissions as that user. I could then create a client_secret for it, which could be used with the Python SDK.

Questions:

  • Is there a better way to do this authentication while keeping the user's permissions?
  • Do you recommend using the Python SDK? It is currently in Beta.
  • Is there a way to authenticate through the REST API with a service principal's client_id and client_secret (if I do not want to use the Python SDK)?

Thank you very much in advance for your help!

2 REPLIES

amartt
New Contributor III

Use OAuth, either as a user (U2M) or as a service principal (M2M). In both cases the token respects the access granted to that identity through Unity Catalog:

https://docs.databricks.com/aws/en/dev-tools/auth/oauth-u2m#gsc.tab=0

A PAT, by contrast, is tied to the specific user it was issued for.
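To make the user-based option a bit more concrete, here is a minimal sketch of the first step of an authorization code flow with PKCE: generating the verifier/challenge pair and building the URL the user's browser is sent to. The `/oidc/v1/authorize` path and the `all-apis offline_access` scope reflect my reading of the linked OAuth U2M docs, so please verify them there. The function names, host, client ID, and redirect URI are placeholders I made up for illustration.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode


def make_pkce_pair():
    """Return (code_verifier, code_challenge) using the S256 method (RFC 7636)."""
    # 32 random bytes, base64url-encoded without padding, gives a 43-char verifier.
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorize_url(host, client_id, redirect_uri, state, code_challenge):
    """Build the URL the user's browser is redirected to in order to log in.

    Assumption: workspace-level endpoint path and scopes as described in the
    Databricks OAuth U2M docs; check them against your own workspace.
    """
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
        "scope": "all-apis offline_access",
    }
    return f"https://{host}/oidc/v1/authorize?{urlencode(params)}"


if __name__ == "__main__":
    verifier, challenge = make_pkce_pair()
    # All values below are hypothetical placeholders.
    print(build_authorize_url(
        "example.cloud.databricks.com",
        "my-client-id",
        "http://localhost:8020",
        secrets.token_urlsafe(16),
        challenge,
    ))
```

The server keeps the verifier, and after the user logs in it exchanges the returned authorization code (plus the verifier) for an access token that carries that user's identity.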

I would recommend the SDK over calling the API directly. The SDK is built on top of the API, is more stable across endpoint versions, and abstracts away some of the request handling, which makes a solution that talks to the API a lot easier to maintain in the long term.

Databricks recommends pinning the SDK to a specific version, so that nothing breaks when a newer release comes out and dependencies are reinstalled:

https://docs.databricks.com/aws/en/dev-tools/sdk-python#gsc.tab=0
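If the SDK is not an option, the service principal question from the original post can also be handled with plain REST. Below is a rough sketch of a client-credentials token request built with the standard library only. The `/oidc/v1/token` path and the `all-apis` scope are what I understand the Databricks OAuth M2M docs to describe, so treat them as assumptions to verify. The function name and host are placeholders.

```python
import base64
import urllib.request
from urllib.parse import urlencode


def build_m2m_token_request(host, client_id, client_secret):
    """Build (but do not send) an OAuth client-credentials token request.

    Assumption: workspace-level token endpoint and scope as described in the
    Databricks OAuth M2M docs.
    """
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    body = urlencode({"grant_type": "client_credentials", "scope": "all-apis"}).encode("ascii")
    return urllib.request.Request(
        url=f"https://{host}/oidc/v1/token",
        data=body,
        method="POST",
        headers={
            # HTTP Basic auth with the service principal's client ID and secret.
            "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Sending it with `urllib.request.urlopen(req)` should return a JSON body containing an `access_token`, which then goes into an `Authorization: Bearer ...` header on later API calls. Keep in mind that this token carries the service principal's permissions, not those of any end user.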

nayan_wylde
Esteemed Contributor

The recommended approach is OAuth 2.0 with On-Behalf-Of (OBO) authentication. This allows your server or app to act on behalf of the user, using their identity and permissions to access Databricks resources.

  • How it works:

    • The user authenticates via OAuth and grants consent.
    • Your app receives a short-lived access token scoped to the user.
    • You use this token to call Databricks REST APIs or SQL endpoints.
    • The data access respects the user’s permissions (e.g., Unity Catalog ACLs, SQL warehouse access).
  • Benefits:

    • No need for users to manually copy PATs.
    • Permissions are enforced per user.
    • Secure and scalable for enterprise use.
  • Requirements:

    • Your app must be registered with your identity provider (e.g., Entra ID).
    • Databricks workspace must support OAuth federation.
    • You must enable OBO authentication in your Databricks app or service.

https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-on-behalf-of-flow
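As an illustration of the "use this token to call Databricks REST APIs" step, here is a small sketch that builds a SQL Statement Execution API request with the user's access token and turns the response into JSON rows. The endpoint and the response fields (`manifest.schema.columns`, `result.data_array`) follow my understanding of that API's default inline JSON format, so please check them against the API reference. The function names, host, and warehouse ID are hypothetical.

```python
import json
import urllib.request


def build_statement_request(host, user_token, warehouse_id, statement):
    """Build (but do not send) a SQL Statement Execution API request.

    The bearer token is the user-scoped access token, so Unity Catalog
    permissions are evaluated for that user.
    """
    payload = {
        "warehouse_id": warehouse_id,
        "statement": statement,
        "wait_timeout": "30s",  # wait synchronously for up to 30 seconds
    }
    return urllib.request.Request(
        url=f"https://{host}/api/2.0/sql/statements",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {user_token}",
            "Content-Type": "application/json",
        },
    )


def rows_as_json(response_body):
    """Convert a parsed statement response into a JSON list of row objects.

    Assumption: the response uses the inline JSON array format, with column
    names under manifest.schema.columns and rows under result.data_array.
    """
    columns = [col["name"] for col in response_body["manifest"]["schema"]["columns"]]
    rows = response_body.get("result", {}).get("data_array", [])
    return json.dumps([dict(zip(columns, row)) for row in rows])
```

A query against a table the user cannot read should fail with a permission error instead of returning data, which is the behavior the original question is after.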