Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Where are Delta Lake files stored for a given path?

a2_ish
New Contributor II

I have the code below, which works for the local-style path but fails when path is set to an Azure storage account URL, even though I have sufficient access to write to and update the storage account. What am I doing wrong? And for the path that does work, how can I physically access it and see the Delta files that were stored?

How is it done?

%sql
USE CATALOG hive_metastore;
CREATE DATABASE IF NOT EXISTS demo_db;
USE DATABASE demo_db;
 
# write the stream to a sink

# this fails with error:
# shaded.databricks.org.apache.hadoop.fs.azure.AzureException:
# hadoop_azure_shaded.com.microsoft.azure.storage.StorageException:
# This request is not authorized to perform this operation using this permission.
path = "wasbs://astoragecontainer@azauditlogs.blob.core.windows.net"
(bronzeDF.writeStream
  .format("delta")
  .outputMode("append")
  .trigger(once=True)
  .option("mergeSchema", "true")
  .option("checkpointLocation", path + "/bronze_checkpoint")
  .toTable("turbine_bronze"))  # .start(path + "/turbine_bronze")
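A common cause of this authorization error is that no storage credential has been configured for the wasbs path, so the cluster presents no (or the wrong) identity to Azure. A minimal sketch of configuring an account key before starting the stream; the secret scope and key names here are hypothetical placeholders, not from the original post:

```python
# Sketch (assumes a Databricks notebook, so `spark` and `dbutils` exist).
# "my-scope" / "storage-key" are hypothetical secret names; substitute your own.
spark.conf.set(
    "fs.azure.account.key.azauditlogs.blob.core.windows.net",
    dbutils.secrets.get(scope="my-scope", key="storage-key"),
)
```

With a SAS token instead of an account key, the config key takes the form fs.azure.sas.&lt;container&gt;.&lt;account&gt;.blob.core.windows.net. Either way the credential must grant write access, since Structured Streaming creates both checkpoint and data files under the target path.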
 
# this path works, but where can I find /Users/ankit.kumar@xyz.com/demo_db?
path = "/Users/ankit.kumar@xyz.com/demo_db"
(bronzeDF.writeStream
  .format("delta")
  .outputMode("append")
  .trigger(once=True)
  .option("mergeSchema", "true")
  .option("checkpointLocation", path + "/bronze_checkpoint")
  .toTable("turbine_bronze"))  # .start(path + "/turbine_bronze")
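One quick way to see where this write actually landed: because the stream writes with .toTable("turbine_bronze"), the data files go to the table's storage location (path here is only used for the checkpoint). DESCRIBE DETAIL reports that location, along with format, size, and file count. A sketch, assuming a notebook session where spark is available:

```python
# The "location" column of DESCRIBE DETAIL gives the table's physical storage path.
display(spark.sql("DESCRIBE DETAIL demo_db.turbine_bronze"))
```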

1 REPLY

Anonymous
Not applicable

@Ankit Kumar:

The error message indicates that the identity used by your Databricks cluster does not have sufficient permission on the Azure Blob Storage account. Even if your own user account can write to the storage account, the credential Databricks presents to Azure (an account key, SAS token, or service principal configured for the cluster) may not carry the required authorization, which is why the request is rejected.

To access the delta files stored in the Azure Blob Storage account, you can use one of the following methods:

  1. Azure Storage Explorer: This is a free, cross-platform tool from Microsoft that allows you to access your Azure Storage account and manage your blobs, files, queues, and tables. You can download and install the Azure Storage Explorer tool on your local machine, and then use it to connect to your Azure Blob Storage account and browse the files.
  2. Azure Storage REST API: You can also use the Azure Storage REST API to programmatically access the delta files stored in the Azure Blob Storage account. The REST API provides a set of HTTP endpoints that allow you to perform various operations on your Azure Storage account, such as listing containers, uploading files, and downloading files.
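Whichever tool you use, what you will find in storage is the standard Delta table layout: Parquet data files plus a _delta_log directory of JSON commit files. A small self-contained sketch of listing such a directory, using only the Python standard library and a mocked-up table layout (no real Delta table or cloud access involved):

```python
import json
import tempfile
from pathlib import Path

def list_delta_table(table_path):
    """Return (data_files, log_files) found under a Delta table directory."""
    root = Path(table_path)
    data_files = sorted(p.name for p in root.glob("*.parquet"))
    log_files = sorted(p.name for p in (root / "_delta_log").glob("*.json"))
    return data_files, log_files

# Build a mock Delta table layout just to demonstrate the structure.
tmp = Path(tempfile.mkdtemp())
(tmp / "_delta_log").mkdir()
(tmp / "part-00000-abc.snappy.parquet").write_bytes(b"")
(tmp / "_delta_log" / "00000000000000000000.json").write_text(
    json.dumps({"commitInfo": {"operation": "STREAMING UPDATE"}})
)

data, log = list_delta_table(tmp)
print(data)  # ['part-00000-abc.snappy.parquet']
print(log)   # ['00000000000000000000.json']
```

In a real table you would see many Parquet files and one numbered JSON file per commit; the _delta_log is what makes the directory a Delta table rather than a plain Parquet folder.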

Regarding the path that works for you, "/Users/ankit.kumar@xyz.com/demo_db": because it has no URI scheme, it is resolved against DBFS, i.e. it is effectively dbfs:/Users/ankit.kumar@xyz.com/demo_db. DBFS is backed by a root storage account that Databricks manages for the workspace, so the files do physically live in cloud storage, just not in a container you would normally browse directly. The easiest way to inspect them is from a notebook, with %fs ls /Users/ankit.kumar@xyz.com/demo_db or dbutils.fs.ls("/Users/ankit.kumar@xyz.com/demo_db").
