Delta Live Table - Cannot redefine dataset

monojmckvie
New Contributor II

Hi,

I am new to Delta Live Tables.

I am trying to create a Delta Live Table by following the Databricks tutorial.

I created a notebook and attached it to an interactive cluster (DBR 14.3 LTS).

I am running the code below.

The first time I ran the cell, it succeeded.

The second time I ran it, I got this error: AnalysisException: Cannot redefine dataset 'sales_orders_raw'

Can you please help me understand why this is happening?

----------------------------------------------------------------

from pyspark.sql.functions import *
from pyspark.sql.types import *
import dlt

# Register the bronze table; the function name becomes the dataset name.
@dlt.create_table(
  comment="The raw sales orders, ingested from /databricks-datasets.",
  table_properties={
    "myCompanyPipeline.quality": "bronze",
    "pipelines.autoOptimize.managed": "true"
  }
)
def sales_orders_raw():
  # Incrementally ingest the JSON files with Auto Loader (cloudFiles).
  return (
    spark.readStream.format("cloudFiles")
      .option("cloudFiles.schemaLocation", "/tmp/john.odwyer/pythonsalestest")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.inferColumnTypes", "true")
      .load("/databricks-datasets/retail-org/sales_orders/")
  )
----------------------------------------------------------------

 

2 REPLIES

Walter_C
Databricks Employee

The error message "AnalysisException: Cannot redefine dataset 'sales_orders_raw'" indicates that you're trying to create a table that already exists. In Databricks, once a Delta Live Table (DLT) is defined, it cannot be redefined or overwritten. This ensures the consistency and reliability of your data pipelines.

If you want to modify the table definition, you will need to delete the existing table first. However, be aware that this will also delete all the data in the table. If you want to keep the data, you should create a new table with a different name.
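To illustrate the "cannot redefine" behaviour described above, here is a toy sketch in plain Python. It is only a mental model under stated assumptions: `DatasetRegistry` and `run_cell` are hypothetical names, and this is not how the `dlt` module is actually implemented. It shows a decorator that records each dataset by function name and rejects a second registration of the same name, which is roughly what running the same cell twice looks like.

```python
class DatasetRegistry:
    """Toy stand-in for a pipeline's set of datasets (hypothetical, not the dlt API)."""

    def __init__(self):
        self._datasets = {}

    def table(self, func):
        # The function name is used as the dataset name.
        name = func.__name__
        if name in self._datasets:
            raise ValueError(f"Cannot redefine dataset '{name}'")
        self._datasets[name] = func
        return func


registry = DatasetRegistry()


def run_cell():
    # Simulates executing the notebook cell that defines the table.
    @registry.table
    def sales_orders_raw():
        return []


run_cell()  # first run: the name is registered

try:
    run_cell()  # second run: the same name is registered again
except ValueError as err:
    second_run_error = str(err)
```

In this toy model the second run fails only because the registry still holds the first definition; a fresh `DatasetRegistry()` would accept the name again.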

 

Here's how you can delete a DLT:

 

@dlt.delete_table
def sales_orders_raw():
  pass

After running this, you should be able to redefine your sales_orders_raw table.

Remember to be careful when deleting tables, especially in a production environment, as this action cannot be undone.

Thanks for the suggestion.

But when I run the delete_table command, I get this error: module 'dlt' has no attribute 'delete_table'

My cluster config: [screenshot: monojmckvie_0-1711349958291.png]

Could you please suggest what to do?
