
Delta Live Table - Cannot redefine dataset

monojmckvie
New Contributor II

Hi,

I am new to Delta Live Tables.

I am trying to create a Delta Live Table by following the Databricks tutorial.

I have created a notebook and attached an interactive cluster (DBR 14.3 LTS).

I am running the code below.

The first time I ran the cell, it completed successfully.

When I ran the cell a second time, I got this error: AnalysisException: Cannot redefine dataset 'sales_orders_raw'

Can you please help me understand why this is happening?

----------------------------------------------------------------

from pyspark.sql.functions import *
from pyspark.sql.types import *
import dlt

@dlt.create_table(
  comment="The raw sales orders, ingested from /databricks-datasets.",
  table_properties={
    "myCompanyPipeline.quality": "bronze",
    "pipelines.autoOptimize.managed": "true"
  }
)
def sales_orders_raw():
  return (
    spark.readStream.format("cloudFiles")
      .option("cloudFiles.schemaLocation", "/tmp/john.odwyer/pythonsalestest")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.inferColumnTypes", "true")
      .load("/databricks-datasets/retail-org/sales_orders/")
  )
----------------------------------------------------------------

 

2 REPLIES

Walter_C
Valued Contributor II

The error message "AnalysisException: Cannot redefine dataset 'sales_orders_raw'" indicates that you're trying to define a table that already exists. In Databricks, once a Delta Live Table (DLT) is defined, it cannot be redefined or overwritten. This is to ensure the consistency and reliability of your data pipelines.
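
As a rough illustration of how a "cannot redefine" check can produce this behaviour, here is a toy sketch in plain Python. It is not the real dlt module: DatasetRegistry, its table decorator and define_sales_orders_raw are hypothetical names made up for this example, and the assumption that dataset names are tracked in a registry for the lifetime of the session is only a mental model, not something confirmed in this thread.

```python
# Toy model only: this is NOT the real dlt module. All names here are
# hypothetical and exist purely to illustrate a "define once" check.
class DatasetRegistry:
    def __init__(self):
        # Maps dataset name -> defining function for the current session.
        self._datasets = {}

    def table(self, func):
        """Decorator that registers a dataset under the function's name."""
        name = func.__name__
        if name in self._datasets:
            # Same name seen twice in one session: refuse to overwrite.
            raise ValueError(f"Cannot redefine dataset '{name}'")
        self._datasets[name] = func
        return func

    def names(self):
        return sorted(self._datasets)


registry = DatasetRegistry()


def define_sales_orders_raw():
    """Stands in for running the notebook cell once."""
    @registry.table
    def sales_orders_raw():
        return []  # placeholder for the streaming read


define_sales_orders_raw()  # first run: the name is registered
try:
    define_sales_orders_raw()  # second run in the same session
    second_run_error = None
except ValueError as exc:
    second_run_error = str(exc)

print(second_run_error)
```

In this sketch the first call succeeds and the second fails because the registry still remembers the name, which mirrors a cell that works on the first run and fails on the second.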

If you want to modify the table definition, you will need to delete the existing table first. However, be aware that this will also delete all the data in the table. If you want to keep the data, you should create a new table with a different name.

 

Here's how you can delete a DLT:

 

@dlt.delete_table
def sales_orders_raw():
  pass

After running this, you should be able to redefine your sales_orders_raw table.

Remember to be careful when deleting tables, especially in a production environment, as this action cannot be undone.

monojmckvie

Thanks for the suggestion.

However, when I run the delete_table code, I get this error: module 'dlt' has no attribute 'delete_table'

My cluster config:

[Screenshot: monojmckvie_0-1711349958291.png]

Could you please suggest what to try next?
