Hi,
I have a Delta Live Tables pipeline with a storage location configured to write to a cloud blob store.
Syntax of the bronze table in the notebook:
===
import dlt

@dlt.table(
    spark_conf={"spark.databricks.delta.schema.autoMerge.enabled": "true"},
    table_properties={
        "quality": "bronze"
    }
)
def sap_mdo_sfc_bronze():
    return (
        spark.readStream
        .schema(schema)  # explicit schema defined elsewhere in the notebook
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.inferColumnTypes", True)
        .option("multiline", "true")
        .option("header", "true")
        .option("cloudFiles.schemaLocation", data_path + "/SCHEMA")
        .load(data_path + "/DATA")
        .select("*")
    )
===
Once the Delta Live Tables pipeline runs, it creates the tables in blob storage and registers their metadata in the Hive metastore under the specified schema.
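For reference, the metastore registration can be checked with standard Spark SQL. This is only a sketch; the schema and table names (tenant_id, table_bronze) are taken from the error message below.
===
# Sketch: confirm the metastore entry and the storage path it points to.
# Schema/table names are taken from the error message below.
spark.sql("SHOW TABLES IN tenant_id").show()
spark.sql("DESCRIBE EXTENDED tenant_id.table_bronze").show(truncate=False)
===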
Issue: when I start or run the pipeline update a second time, it fails with the error below:
====
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: [TABLE_OR_VIEW_ALREADY_EXISTS] Cannot create table or view `tenant_id`.`table_bronze` because it already exists. Choose a different name, drop or replace the existing object, add the IF NOT EXISTS clause to tolerate pre-existing objects, or add the OR REFRESH clause to refresh the existing streaming table.
====
As a workaround, I first delete the table from the Hive metastore and then start the pipeline update; it then runs successfully.
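Concretely, the workaround amounts to running something like this before starting the update (again a sketch; the names come from the error message above):
===
# Drop the stale Hive metastore entry so the pipeline can recreate it.
# Schema/table names (tenant_id.table_bronze) come from the error above.
spark.sql("DROP TABLE IF EXISTS tenant_id.table_bronze")
===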
Can anyone help me understand this issue?
Thanks and regards,
Syed Saqib