Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Delta Live Table - Cannot redefine dataset

monojmckvie
New Contributor II

Hi,

I am new to Delta Live Tables.

I am trying to create a Delta Live Table by following the Databricks tutorial.

I have created a notebook and attached it to an interactive cluster (DBR 14.3 LTS).

I am running the code below.

The first time I ran it, it completed successfully.

When I ran the cell a second time, I got this error: AnalysisException: Cannot redefine dataset 'sales_orders_raw'

Can you please help me understand why this is happening?

----------------------------------------------------------------

from pyspark.sql.functions import *
from pyspark.sql.types import *
import dlt

@dlt.create_table(
  comment="The raw sales orders, ingested from /databricks-datasets.",
  table_properties={
    "myCompanyPipeline.quality": "bronze",
    "pipelines.autoOptimize.managed": "true"
  }
)
def sales_orders_raw():
  return (
    spark.readStream.format("cloudFiles") \
      .option("cloudFiles.schemaLocation", "/tmp/john.odwyer/pythonsalestest") \
      .option("cloudFiles.format", "json") \
      .option("cloudFiles.inferColumnTypes", "true") \
      .load("/databricks-datasets/retail-org/sales_orders/")
  )
----------------------------------------------------------------

 

2 REPLIES

Walter_C
Honored Contributor

The error message "AnalysisException: Cannot redefine dataset 'sales_orders_raw'" indicates that you are trying to define a dataset under a name that has already been defined. Once a Delta Live Tables (DLT) dataset is defined, it cannot be redefined or overwritten under the same name. This is to ensure the consistency and reliability of your data pipelines.
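To illustrate the behaviour, here is a toy sketch in plain Python. It is not the actual `dlt` implementation: `DatasetRegistry`, `table`, and `define_cell` are hypothetical names, and the assumption is that definitions are tracked by name for the lifetime of the session. Under that assumption, running the same cell twice looks roughly like this:

```python
class DatasetRegistry:
    """Toy stand-in for a pipeline's set of defined datasets (hypothetical, not the dlt API)."""

    def __init__(self):
        self._datasets = {}

    def table(self, func):
        """Decorator that registers a dataset under the decorated function's name."""
        name = func.__name__
        if name in self._datasets:
            # A second definition with the same name is rejected.
            raise ValueError(f"Cannot redefine dataset '{name}'")
        self._datasets[name] = func
        return func


registry = DatasetRegistry()


def define_cell():
    """Simulates running the notebook cell once."""
    @registry.table
    def sales_orders_raw():
        return []


define_cell()  # first run: the name is free, so registration succeeds

try:
    define_cell()  # second run in the same session: the name is already taken
    second_run_error = None
except ValueError as exc:
    second_run_error = str(exc)

print(second_run_error)
```

In this model a new session starts with an empty registry, which would be consistent with the first run succeeding and only the second run failing.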

If you want to modify the table definition, you will need to delete the existing table first. However, be aware that this will also delete all the data in the table. If you want to keep the data, you should create a new table with a different name.

 

Here's how you can delete a DLT:

 

@dlt.delete_table
def sales_orders_raw():
  pass

After running this, you should be able to redefine your sales_orders_raw table.

Remember to be careful when deleting tables, especially in a production environment, as this action cannot be undone.

Thanks for the suggestion.

But when I execute the delete_table command, I get this error: module 'dlt' has no attribute 'delete_table'

My cluster config:

monojmckvie_0-1711349958291.png

Could you please suggest what to do?
