
Delta table and AnalysisException: [PATH_NOT_FOUND] Path does not exist

alex-syk
New Contributor II

I am performing some tests with Delta tables. For each test, I write a Delta table to Azure Blob Storage and then manually delete it. After deleting the table and running my code again, I get this error:

 

AnalysisException: [PATH_NOT_FOUND] Path does not exist: /mnt/delta-sharing/temp/df.

 

Here is a minimal working example to reproduce my problem and the exact order of operations I am performing.

Minimal working example:

Databricks notebook cell 1:

 

from delta.tables import DeltaTable

 

Databricks notebook cell 2:

 

df = spark.createDataFrame(
    [
        (0, 1)
    ],
    ('col_1', 'col_2')
)

path = '/mnt/delta-sharing/temp/df'

 

Databricks notebook cell 3:

 

# If delta table does not exist, create it
if not DeltaTable.isDeltaTable(spark, path):
    print('Delta table does not exist. Creating it')
    df.write.format('delta').save(path)
    delta_table = DeltaTable.forPath(spark, path)

# Load existing data in the delta table
delta_table = DeltaTable.forPath(spark, path)

 

Order of operations:

  • Step 1: Check in Azure Blob Storage that the path provided in cell 2 is empty.

  • Step 2: Run all three cells in the notebook. I get the error: 

 

AnalysisException: [PATH_NOT_FOUND] Path does not exist: /mnt/delta-sharing/temp/df.

 


  • Step 3: Don't do anything else except rerun cell 3. This time I do not get an error, and the delta table is created successfully.

  • Step 4: Delete the delta table.

  • Step 5: Rerun cell 3. I get the error: "AnalysisException: [PATH_NOT_FOUND] Path does not exist: /mnt/delta-sharing/temp/df."
  • Step 6: Rerun cell 3. The delta table is created successfully.

As shown above, every time I delete the delta table, I have to rerun cell 3 twice to successfully enter the if statement:

 

if not DeltaTable.isDeltaTable(spark, path)

 

I should note that there are some seemingly random times when, if I restart the cluster or detach and reattach the notebook, the first run of cell 3 works. But after deleting the delta table, I always have to run cell 3 twice for the delta table to be created.

Why is this happening? Is this a problem with Delta tables or Azure Blob Storage? Is there a solution? Is there a best practice for deleting Delta tables that I am violating?

1 REPLY

kumar_ravi
New Contributor III

Yes, it is weird. A workaround for this:

# List the parent directory; create the target path only if it is not in the listing
files = dbutils.fs.ls("s3 bucket or azure blob path")
file_paths = [file.path for file in files]
if target_path not in file_paths:
    dbutils.fs.mkdirs(target_path)
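
Another option, if the stale "table exists" answer really does come from DeltaTable.isDeltaTable, is to drop the check-then-act pattern and simply try to load the table, creating it only when the load fails. A minimal sketch, assuming Spark 3.x (where AnalysisException is importable from pyspark.sql.utils); get_or_create_delta_table is a hypothetical helper name, not an API from the thread:

from delta.tables import DeltaTable
from pyspark.sql.utils import AnalysisException

def get_or_create_delta_table(spark, df, path):
    # Hypothetical helper: try to load the Delta table at `path` first,
    # and fall back to creating it from `df` when the load fails.
    try:
        return DeltaTable.forPath(spark, path)
    except AnalysisException:
        # Covers the PATH_NOT_FOUND case from the question: the path is
        # missing (or has no Delta log yet), so write the DataFrame and reload.
        df.write.format('delta').save(path)
        return DeltaTable.forPath(spark, path)

Because the creation branch is reached through the failed load itself rather than a separate existence probe, a stale answer from isDeltaTable can no longer skip it.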
