10-10-2023 02:46 AM
Hello,
I use an Azure Databricks notebook to access Delta Sharing tables through the open sharing protocol. I've uploaded the 'config.share' file to DBFS. When I run:
import delta_sharing
client = delta_sharing.SharingClient("/dbfs/path/config.share")
client.list_all_tables()
I can observe all table names and schemas. However, when I attempt to display the data using
spark.read.format("deltaSharing")
I get the following error:
FileReadException: Error while reading file delta-sharing:/dbfsXXXX.
Caused by: IOException: java.util.concurrent.ExecutionException: io.delta.sharing.spark.util.UnexpectedHttpStatus: HTTP request failed with status: HTTP/1.1 403 This request is not authorized to perform this operation. {"error":{"code":"AuthorizationFailure","message":"This request is not authorized to perform this operation.\nRequestId:4b5091fb-e01f-004e-1391-fa30ed000000\nTime:2023-10-09T09:16:31.5069373Z"}}
Caused by: ExecutionException: io.delta.sharing.spark.util.UnexpectedHttpStatus: HTTP request failed with status: HTTP/1.1 403 This request is not authorized to perform this operation. {"error":{"code":"AuthorizationFailure","message":"This request is not authorized to perform this operation.\nRequestId:4b5091fb-e01f-004e-1391-fa30ed000000\nTime:2023-10-09T09:16:31.5069373Z"}}
For details: I use the Databricks Standard tier and Runtime 13.1 ML.
Has anyone else experienced the same error?
11-10-2023 01:12 AM
Hi @dbx_deltaSharin,
When individual partitions are queried, the files are read through an S3 access point location, whereas the actual S3 bucket name is used when the table is read as a whole. This information comes from the table metadata itself.
It appears that, in the source metastore, the table metadata points to the S3 bucket location, whereas the partitions are defined with the S3 access point location.
Please review the table and partition metadata of the source table. Update the table metadata so that the table location points to the S3 access point, as is already defined for the partitions.
Also, please review the IAM role that was used. Does it allow access through both the S3 bucket name and the S3 access point name? If one is missing, can it be added? If the restriction is not in the IAM role, access to the actual S3 bucket from outside may be restricted at a higher level (e.g., AWS SCP policies).
10-10-2023 11:18 PM
Hi @dbx_deltaSharin, The error message you're encountering indicates an "AuthorizationFailure", which means that the request you're making is not authorized. This could be due to a variety of reasons such as incorrect or insufficient permissions, or an issue with the authentication method.
Given the information provided, it's difficult to pinpoint the exact cause of the issue.
However, here are a few things you could check:
- Ensure that the 'config.share' file contains the correct and valid credentials.
- Check the permissions on the Delta Sharing tables. The account used in 'config.share' file should have the necessary permissions to read the data.
- If the data is being shared from another platform, ensure that the platform allows access from your Databricks workspace.
Unfortunately, without more information about your specific setup and the exact configuration of your 'config.share' file, it's hard to provide a more precise answer. I would recommend checking the above points and if the problem persists, consider reaching out to Databricks support for further assistance by filing a support ticket.
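As a rough illustration of the first check in the list above, the sketch below inspects a profile file's text for missing fields and an expired token. The function names `check_profile` and `parse_expiration` are hypothetical helpers, not part of the `delta_sharing` package, and the field names are taken from the profile format quoted in this thread. A clean result only shows that the profile is well-formed and unexpired; it does not prove that the shared data is reachable.

```python
import json
import re
from datetime import datetime, timezone

# Field names as they appear in the 'config.share' profile shown in this thread.
REQUIRED_FIELDS = ("shareCredentialsVersion", "endpoint", "bearerToken")

# ISO 8601 timestamp; fractional seconds are matched but ignored.
_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.\d+)?(Z|[+-]\d{2}:\d{2})?$"
)


def parse_expiration(value):
    """Parse a timestamp such as 2023-10-09T09:16:31.506Z (hypothetical helper)."""
    match = _TIMESTAMP.match(value)
    if match is None:
        raise ValueError(f"unrecognized expirationTime: {value!r}")
    seconds, offset = match.group(1), match.group(2)
    if offset in (None, "Z"):
        offset = "+00:00"  # assumption: a missing offset means UTC
    return datetime.strptime(seconds + offset, "%Y-%m-%dT%H:%M:%S%z")


def check_profile(text, now=None):
    """Return a list of problems found in the profile text (empty if none)."""
    now = now or datetime.now(timezone.utc)
    try:
        profile = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(profile, dict):
        return ["profile is not a JSON object"]

    problems = []
    for field in REQUIRED_FIELDS:
        if not profile.get(field):
            problems.append(f"missing or empty field: {field}")

    expiration = profile.get("expirationTime")
    if expiration:
        try:
            if parse_expiration(expiration) <= now:
                problems.append(f"bearer token expired at {expiration}")
        except ValueError as exc:
            problems.append(str(exc))
    return problems
```

In a notebook this could be called as `check_profile(open("/dbfs/path/config.share").read())`, using the same path as in the original question.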
10-10-2023 11:41 PM
Hi,
Thank you @Kaniz_Fatma for responding to my question. For additional information, the 'config.share' file follows this format:
{"shareCredentialsVersion":1,"bearerToken":"valuexxxx","endpoint":"endpointUrl","expirationTime":"expirationTimeValue"}
The data is shared from another Databricks account. Therefore, I'm wondering how it could be an authorization or permission issue, especially since I can already see all table names and schemas using the same 'config.share' file.
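One possible explanation for why listing works while reading fails: in the open sharing protocol, listing tables only talks to the sharing server with the bearer token, whereas reading data downloads the underlying files from cloud storage through short-lived URLs that the server hands out. A 403 on the read path can therefore come from the storage side even when the token is valid. The `AuthorizationFailure` code in the stack trace looks like a storage-service response rather than a sharing-server one, although that is an inference from the error text and not something confirmed in this thread. The sketch below pulls the status, error code, request ID, and timestamp out of the exception message so they can be passed to the data provider for a lookup in their storage logs. `parse_storage_error` is a hypothetical helper, not a library function.

```python
import json
import re


def parse_storage_error(message):
    """Extract HTTP status, error code, request ID and time from an error message."""
    status = re.search(r"HTTP/\d\.\d (\d{3})", message)

    # The JSON error body is embedded in the message; decode it and ignore
    # whatever text follows it.
    details = {}
    start = message.find('{"error"')
    if start != -1:
        try:
            body, _ = json.JSONDecoder().raw_decode(message[start:])
            details = body.get("error", {})
        except json.JSONDecodeError:
            details = {}

    text = details.get("message", "")
    request_id = re.search(r"RequestId:([0-9a-fA-F-]+)", text)
    time = re.search(r"Time:(\S+)", text)
    return {
        "status": int(status.group(1)) if status else None,
        "code": details.get("code"),
        "request_id": request_id.group(1) if request_id else None,
        "time": time.group(1) if time else None,
    }


# Error text copied from the stack trace in the original question.
sample = (
    r'io.delta.sharing.spark.util.UnexpectedHttpStatus: HTTP request failed '
    r'with status: HTTP/1.1 403 This request is not authorized to perform this '
    r'operation. {"error":{"code":"AuthorizationFailure","message":"This request '
    r'is not authorized to perform this operation.\nRequestId:4b5091fb-e01f-004e-'
    r'1391-fa30ed000000\nTime:2023-10-09T09:16:31.5069373Z"}}'
)
info = parse_storage_error(sample)
print(info)
```

If the provider's storage account restricts network access, the request ID and timestamp are typically what they would need to find the rejected request on their side.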