
Delta Sharing - Alternative to config.share

alex-syk
New Contributor II

I was recently given a credential file to access shared data via delta sharing. I am following the documentation from https://docs.databricks.com/en/data-sharing/read-data-open.html. The documentation wants the contents of the credential file in a folder in DBFS. I would like to use Azure Key Vault instead.

Therefore, instead of using (under "Step 2: Use a notebook to list and read shared tables" in the above URL):

client = delta_sharing.SharingClient(f"/dbfs/<dbfs-path>/config.share")

client.list_all_tables()

I am using:

credentials = dbutils.secrets.get(scope='redacted', key='redacted')

profile = delta_sharing.protocol.DeltaSharingProfile.from_json(credentials)

client = delta_sharing.SharingClient(profile=profile)

client.list_all_tables()

The above works fine. I can list the tables. Now I would like to load a table using Spark. The documentation suggests using

delta_sharing.load_as_spark(f"<profile-path>#<share-name>.<schema-name>.<table-name>", version=<version-as-of>)

But that relies on having stored the contents of the credential file in a folder in DBFS and using that path for <profile-path>. Is there an alternative way to do this with the "profile" variable I am using? By the way, the code is bold instead of formatted in code blocks because I kept getting errors that prevented me from posting.
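One possible workaround, offered only as a sketch: since `load_as_spark` expects a profile path, the secret's contents could be written to a private temporary file at run time and that path used for <profile-path>. The helper names `write_profile_to_temp_file` and `build_table_url` are hypothetical, and whether Spark can resolve a driver-local temp path on a given cluster is an untested assumption.

```python
import json
import os
import tempfile


def write_profile_to_temp_file(credentials_json: str) -> str:
    """Write the credential JSON to a private temp file and return its path.

    Hypothetical helper, not part of the delta_sharing library.
    """
    json.loads(credentials_json)  # fail early if the secret is not valid JSON
    # mkstemp creates the file readable and writable only by the current user on POSIX.
    fd, path = tempfile.mkstemp(suffix=".share")
    with os.fdopen(fd, "w") as handle:
        handle.write(credentials_json)
    return path


def build_table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile-path>#<share-name>.<schema-name>.<table-name>' string."""
    return f"{profile_path}#{share}.{schema}.{table}"


# Intended use in a notebook (not executed here; assumes dbutils and delta_sharing):
# credentials = dbutils.secrets.get(scope='redacted', key='redacted')
# path = write_profile_to_temp_file(credentials)
# url = build_table_url(path, "<share-name>", "<schema-name>", "<table-name>")
# df = delta_sharing.load_as_spark(url)
# os.remove(path)  # delete the file once it is no longer needed
```

This still places the token on disk for a short time, so it reduces rather than removes the exposure that storing the file in DBFS would cause.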

7 REPLIES

Debayan
Databricks Employee

Hi, you can create a secret and store the key in it. Alternatively, you can use a local tool to Base64-encode the contents of your JSON key file, create a secret in a Databricks-backed scope, and paste the Base64-encoded text into the secret value. After that, you can reference the secret in your cluster's Spark config as follows: credentials {{secrets/<scope-name>/<secret-name>}}
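As an illustration of the Base64 step, here is a minimal sketch using only the Python standard library. The function name `encode_credential_json` and the sample values are hypothetical; any Base64 tool would serve equally well.

```python
import base64
import json


def encode_credential_json(credential_json: str) -> str:
    """Return the Base64 text of a credential file's JSON contents."""
    json.loads(credential_json)  # raises ValueError if the input is not valid JSON
    return base64.b64encode(credential_json.encode("utf-8")).decode("ascii")


# Placeholder credential contents; real values come from the credential file.
sample = '{"shareCredentialsVersion":1, "bearerToken":"xyz", "endpoint":"xyz"}'
encoded = encode_credential_json(sample)
# `encoded` is the text that would be pasted into the secret value.
```

Base64 is an encoding, not encryption, so the encoded text needs the same protection as the original file.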

Please tag @Debayan with your next comment, which will get me notified. Thanks!

alex-syk
New Contributor II

Hi @Debayan, thanks for your response!

I'm trying to understand your instructions. The content of my credential file is (I've replaced confidential information with "xyz"):
{"shareCredentialsVersion":1, "bearerToken":"xyz", "endpoint":"xyz", "expirationTime":"2023-09-10T04:10:49.277Z"}

I put that content in a secret in a Databricks-backed scope, and can access it:
credentials = dbutils.secrets.get(scope='redacted', key='redacted')
profile = delta_sharing.protocol.DeltaSharingProfile.from_json(credentials)

Now, instead of doing
delta_sharing.load_as_spark(f"<profile-path>#<share-name>.<schema-name>.<table-name>", version=<version-as-of>)
as suggested in the documentation, I was hoping to use the profile variable I created in place of <profile-path>. Is that possible? I was thinking there has to be a way, because the profile variable holds the same information as the config.share file. That is,

print(profile)
DeltaSharingProfile(share_credentials_version=1, endpoint='xyz', bearer_token='xyz', expiration_time='2023-09-10T04:10:49.277Z', type=None, token_endpoint=None, client_id=None, client_secret=None, username=None, password=None)
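Because the credential carries an expirationTime, it may be worth checking it before debugging anything else. The sketch below uses only the standard library; `parse_expiration` and `is_expired` are hypothetical helper names, and treating a missing expirationTime as "never expires" is an assumption.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def parse_expiration(credentials_json: str) -> Optional[datetime]:
    """Return the credential's expirationTime as an aware datetime, or None if absent."""
    raw = json.loads(credentials_json).get("expirationTime")
    if raw is None:
        return None
    # Replace the trailing 'Z' so fromisoformat also accepts it on Python < 3.11.
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))


def is_expired(credentials_json: str, now: Optional[datetime] = None) -> bool:
    """Return True if the credential's expirationTime is at or before `now` (UTC)."""
    expires = parse_expiration(credentials_json)
    if expires is None:
        return False  # assumption: no expirationTime means no expiry
    current = now if now is not None else datetime.now(timezone.utc)
    return current >= expires
```

In a notebook this could be called as `is_expired(credentials)` on the string returned by `dbutils.secrets.get`.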
 

 

Debayan
Databricks Employee

Hi, the most feasible way would be to Base64-encode the contents of your key file and add only the following line to the Spark config:

credentials <base 64 encoded code>

alex-syk
New Contributor II

Hi @Debayan, do you have some example code you can share?

Debayan
Databricks Employee

Hi, there is no code as such. You only need to add the following line to the Spark config:

credentials <base 64 encoded code>

alex-syk
New Contributor II

Hi @Debayan, how do I mention the syntax with the spark config?

Debayan
Databricks Employee

Hi, you can add it alongside the other Spark configs, for example:

spark.hadoop.google.cloud.auth.service.account.enable true
spark.hadoop.fs.gs.auth.service.account.email <client-email>
spark.hadoop.fs.gs.project.id <project-id>
credentials <base 64 encoded code>
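If the Base64 value does reach the notebook as a string, it would need decoding before it can be used as a profile. The following is a minimal sketch under that assumption. How the value is retrieved is not confirmed in this thread, `decode_credential_json` is a hypothetical helper, and the required keys are simply those shown in the credential file quoted earlier.

```python
import base64
import json

# Keys present in the bearer-token credential file quoted in this thread.
REQUIRED_KEYS = ("shareCredentialsVersion", "endpoint", "bearerToken")


def decode_credential_json(encoded: str) -> str:
    """Decode Base64 text back to the credential JSON and check its keys."""
    decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    profile = json.loads(decoded)
    missing = [key for key in REQUIRED_KEYS if key not in profile]
    if missing:
        raise ValueError(f"credential JSON is missing keys: {missing}")
    return decoded


# Intended use in a notebook (not executed here; assumes delta_sharing is installed
# and `encoded_value` holds the Base64 text):
# profile = delta_sharing.protocol.DeltaSharingProfile.from_json(
#     decode_credential_json(encoded_value)
# )
# client = delta_sharing.SharingClient(profile=profile)
```

This only recovers the same profile object the original post already builds from the secret; it does not by itself give `load_as_spark` a profile path.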

 
