3 weeks ago
I have an Azure Databricks workspace with Unity Catalog set up, using a VNet and private endpoints. Serverless works great; however, the regular clusters have problems showing large results:
Failed to store the result. Try rerunning the command. Failed to upload command result to DBFS. Error message: PUT request to create file failed with statusCode=403, error=HttpResponseProxy{HTTP/1.1 403 This request is not authorized to perform this operation.
Also, I can’t list (ls) anything in DBFS. The event log shows:
DBFS_DOWN.
Perhaps this is an actual Databricks issue. My network and firewall setup has been validated many times. Somehow, the cluster has no access to the DBFS root. But as this is a Databricks-managed resource group, it should all work out of the box, right?
3 weeks ago
What access mode is being used on these clusters?
3 weeks ago
Unrestricted and Single User mode.
3 weeks ago
Can you please try the same on a cluster with shared access mode?
Also, can you please try setting the Spark configuration spark.databricks.driver.enableWriteDbfsCommandResultInDp to false.
This will disable the feature that writes DBFS command results directly in the data plane.
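For reference, that key is a driver setting, so it belongs in the cluster-level Spark config (cluster configuration > Advanced options > Spark config) and takes effect after a cluster restart; a minimal example of the entry, one key and value per line:

```
spark.databricks.driver.enableWriteDbfsCommandResultInDp false
```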
3 weeks ago
The display and show functions work again now, thanks!
The dbutils.fs.ls("dbfs:/") command still results in an error. I really wonder how Databricks has set up those managed resources. I think something is going wrong there.
3 weeks ago
From Spark and dbutils I also get this message:
Caused by: com.microsoft.azure.storage.StorageException: This request is not authorized to perform this operation.
3 weeks ago
I don't understand how the cluster authenticates with the storage account. Perhaps if someone at Databricks could clear this up for me, I would be able to debug the issue better.
3 weeks ago
I solved the issue myself. The Databricks documentation is hard to follow, but it turns out to be necessary to create private endpoints to the Databricks-managed storage account.
a week ago
I'm having the same issue when I try to save a large Delta table (80M rows). Could you please share how you solved the problem?
a week ago
The dbfs (dbstorage) resource in the managed Azure resource group needs to have private endpoints into your virtual network. You can create those manually or through IaC (Bicep/Terraform); see the sketch below.
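As a minimal Terraform sketch (all variables and names are placeholders for your environment; it assumes you already have a subnet for private endpoints and the privatelink.blob / privatelink.dfs private DNS zones, and that the DBFS root account is reached via both the blob and dfs subresources):

```hcl
variable "pe_subnet_id"               { type = string } # subnet for private endpoints
variable "managed_storage_account_id" { type = string } # dbstorage account in the managed RG
variable "blob_dns_zone_id"           { type = string } # privatelink.blob.core.windows.net
variable "dfs_dns_zone_id"            { type = string } # privatelink.dfs.core.windows.net

# One private endpoint per storage subresource the cluster talks to.
resource "azurerm_private_endpoint" "dbfs_root" {
  for_each            = { blob = var.blob_dns_zone_id, dfs = var.dfs_dns_zone_id }
  name                = "pe-dbfsroot-${each.key}"
  location            = "westeurope" # placeholder region
  resource_group_name = "rg-network" # a resource group you control, not the managed one
  subnet_id           = var.pe_subnet_id

  private_service_connection {
    name                           = "psc-dbfsroot-${each.key}"
    private_connection_resource_id = var.managed_storage_account_id
    subresource_names              = [each.key]
    is_manual_connection           = false
  }

  private_dns_zone_group {
    name                 = "default"
    private_dns_zone_ids = [each.value]
  }
}
```

Once the endpoint connections show as approved and the storage account resolves to the private IPs from the cluster subnets, dbutils.fs.ls("dbfs:/") should work again.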