Administration & Architecture

Accessing Databricks data outside Databricks

maikel
New Contributor

Hi!

What is the best way to access Databricks data from outside Databricks, e.g. from Python code? The main problem is authentication: I want to access the data I have permissions for, but I would like to generate the token outside Databricks (e.g. via a REST endpoint).
The data I would like to access includes both files and tables.

I would be very thankful for your help and, if possible, for examples of how to achieve this.

 

 

3 REPLIES

Chiran-Gajula
New Contributor

Hello Maikel,

You can generate a token in Databricks and use it with the Databricks SQL Connector for Python.
More details here: Databricks SQL Connector for Python | Databricks on AWS
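
As a rough sketch of that approach (not taken from the linked page): the token is read from an environment variable instead of being pasted into the code. The variable names `DATABRICKS_SERVER_HOSTNAME`, `DATABRICKS_HTTP_PATH` and `DATABRICKS_TOKEN`, and the helper names `connection_settings` and `fetch_rows`, are assumptions chosen for illustration. The `http_path` is the one shown in the connection details of a SQL warehouse.

```python
import os


def connection_settings(env=os.environ):
    """Collect connection settings from environment variables.

    The variable names are assumptions for this sketch, not a requirement.
    """
    return {
        "server_hostname": env["DATABRICKS_SERVER_HOSTNAME"],
        "http_path": env["DATABRICKS_HTTP_PATH"],
        "access_token": env["DATABRICKS_TOKEN"],
    }


def fetch_rows(query, settings):
    """Run a query on a SQL warehouse and return all result rows."""
    # Imported lazily so connection_settings() works without the package.
    from databricks import sql  # pip install databricks-sql-connector

    with sql.connect(**settings) as connection:
        with connection.cursor() as cursor:
            cursor.execute(query)
            return cursor.fetchall()
```

With the three variables exported, something like `fetch_rows("SELECT 1 AS ok", connection_settings())` should return a single row if the warehouse is reachable and the token is valid.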

 

G.Chiranjeevi

maikel
New Contributor

Hi!
The problem is that I have to authenticate from the Python code as well, using my corporate account, rather than copying and pasting a token into the code.

Any ideas? 🙂

dkushari
Databricks Employee

Hi @maikel - You can set up a service principal in Databricks and create an OAuth client ID and client secret for it. Then set up a Databricks configuration profile and use that profile from your Python code. Look at the profile section in step 2 to see how a profile can be set up with the client ID and secret for workspace-level operations.

Databricks uses OAuth 2.0 as the preferred protocol for service principal authorization and authentication outside of the UI. Unified client authentication automates token generation and refresh. When a service principal signs in and is granted consent, OAuth issues an access token for the CLI, SDK, or other tool to use on its behalf. Each access token is valid for one hour, after which a new token is automatically requested.
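
If you want to request the token yourself over REST rather than letting the SDK do it, the exchange can be sketched with the standard library as below. This is only an illustration under assumptions: the workspace-level token endpoint path `/oidc/v1/token` and the `all-apis` scope are what I understand the machine-to-machine flow to use, so please verify them against the OAuth documentation for your cloud. The function names are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(host, client_id, client_secret):
    """Build the client-credentials token request for a workspace.

    Assumes the workspace-level endpoint <host>/oidc/v1/token.
    """
    url = host.rstrip("/") + "/oidc/v1/token"
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode()
    # Client ID and secret are sent as HTTP Basic credentials.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    headers = {
        "Authorization": f"Basic {basic}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def fetch_access_token(host, client_id, client_secret):
    """Send the request and return the access token from the JSON response."""
    request = build_token_request(host, client_id, client_secret)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["access_token"]
```

The returned token is then sent as `Authorization: Bearer <token>` on REST calls. Since each token expires after an hour, hand-written code has to request a new one itself, which is the part the SDK automates.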

Here is my example -

Profile -

[fielddemo-sp]
host             = https://workspace.cloud.databricks.com/
client_id        = XXXXXXXXXXXXX
client_secret    = dosedXXXXXXXXXXXXXXXX

Code -

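from databricks.sdk import WorkspaceClient

# Authenticate as the service principal defined in the "fielddemo-sp" profile
w = WorkspaceClient(profile="fielddemo-sp")

# Look up the volume's metadata and print its cloud storage location
volume = w.volumes.read("<<catalog>>.<<schema>>.<<volumename>>")
print(volume.storage_location)

# List the files and directories stored in the volume
vol_path = "/Volumes/catalog/schema/volumename"
for f in w.files.list_directory_contents(vol_path):
    print(f)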
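
To go one step further and read the contents of a file in the volume, a minimal sketch could look like the following. It assumes the SDK's `files.download` call, which as far as I know returns a response whose `contents` attribute is a readable binary stream. The helper names `volume_file_path`, `read_volume_file` and `make_client` are made up for this example.

```python
import posixpath


def volume_file_path(catalog, schema, volume, relative_path):
    """Build the /Volumes/... path for a file inside a Unity Catalog volume."""
    return posixpath.join(
        "/Volumes", catalog, schema, volume, relative_path.lstrip("/")
    )


def read_volume_file(client, file_path):
    """Download a file through the client's Files API and return its bytes."""
    response = client.files.download(file_path)
    return response.contents.read()


def make_client(profile):
    """Create a WorkspaceClient from a named profile in ~/.databrickscfg."""
    # Imported lazily so the helpers above can be used without the SDK.
    from databricks.sdk import WorkspaceClient  # pip install databricks-sdk

    return WorkspaceClient(profile=profile)
```

With the profile above, usage would be along the lines of `read_volume_file(make_client("fielddemo-sp"), volume_file_path("catalog", "schema", "volumename", "some_file.csv"))`, where the file name is a placeholder.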

 
