Administration & Architecture

How are authentication and authorization maintained in Delta Sharing when sharing across teams and organizations?

Srikanth_Gupta_
Valued Contributor
 

Accepted Solutions

User16826994223
Honored Contributor III
  1. The recipient's client authenticates to the sharing server (via a bearer token or another method) and asks to query a specific table. The client can also provide filters on the data (e.g. "country=US") as a hint to read just a subset of the data.
  2. The server verifies whether the client is allowed to access the data, logs the request, and then determines which data to send back. This will be a subset of the data objects in S3 or another cloud storage system that actually make up the table.
  3. To transfer the data, the server generates short-lived pre-signed URLs that allow the client to read these Parquet files directly from the cloud provider, so the transfer can happen in parallel at high bandwidth without streaming through the sharing server. This capability, available in all the major clouds, makes it fast, cheap, and reliable to share very large datasets.
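
The three steps above can be sketched as a small in-memory simulation. This is only an illustration of the flow, not the Delta Sharing server's actual implementation: every name here (`TOKENS`, `GRANTS`, `FILES`, `query_table`, `presign`, `verify`, the `storage.example.com` host) is hypothetical, and the HMAC signature merely stands in for whatever signing scheme the cloud provider really uses for pre-signed URLs.

```python
import hashlib
import hmac
import time

# Hypothetical in-memory state, for illustration only.
SIGNING_KEY = b"demo-signing-key"            # stand-in for the storage provider's secret
TOKENS = {"token-abc": "recipient-a"}        # bearer token -> recipient
GRANTS = {"recipient-a": {"sales.orders"}}   # recipient -> tables it may read
FILES = {                                    # table -> (partition values, Parquet path)
    "sales.orders": [
        ({"country": "US"}, "orders/country=US/part-0.parquet"),
        ({"country": "DE"}, "orders/country=DE/part-0.parquet"),
    ]
}
AUDIT_LOG = []


def _sign(path, expires_at):
    # The signature covers both the object path and the expiry time.
    msg = f"{path}:{expires_at}".encode()
    return hmac.new(SIGNING_KEY, msg, hashlib.sha256).hexdigest()


def presign(path, expires_at):
    """Build a short-lived URL for one object."""
    sig = _sign(path, expires_at)
    return f"https://storage.example.com/{path}?expires={expires_at}&sig={sig}"


def verify(path, expires_at, sig, now):
    """What the storage side would check before serving the object."""
    return now <= expires_at and hmac.compare_digest(sig, _sign(path, expires_at))


def query_table(bearer_token, table, hints=None, now=None, ttl_seconds=900):
    now = int(time.time()) if now is None else now

    # Step 1: authenticate the recipient by its bearer token.
    recipient = TOKENS.get(bearer_token)
    if recipient is None:
        raise PermissionError("unknown bearer token")

    # Step 2: authorize, log the request, and pick the files to return.
    allowed = table in GRANTS.get(recipient, set())
    AUDIT_LOG.append((recipient, table, allowed))
    if not allowed:
        raise PermissionError(f"{recipient} may not read {table}")
    hints = hints or {}
    selected = [
        path
        for partition, path in FILES[table]
        if all(partition.get(k) == v for k, v in hints.items())
    ]

    # Step 3: hand back short-lived pre-signed URLs instead of the data itself.
    return [presign(path, now + ttl_seconds) for path in selected]
```

Calling `query_table("token-abc", "sales.orders", {"country": "US"})` in this sketch returns a single URL for the US partition file. The recipient would then fetch each URL directly from storage, in parallel, and the URL stops verifying once its expiry time passes. The sketch applies the filter strictly to keep it short; since the answer describes the filter as a hint, a real server may return more files than the filter strictly requires.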

