DLT SQL schema definition

Sudheer_DB
New Contributor II

Hi All,

When I define an explicit schema for a streaming table created with Auto Loader and DLT in SQL, I get a schema mismatch error between the defined schema and the inferred schema.

CREATE OR REFRESH STREAMING TABLE csv_test
(a0 STRING
,a1 STRING
,a2 STRING
,a3 STRING
,a4 STRING
,a5 STRING
,a6 STRING
,a7 STRING
,a8 STRING
,a9 STRING
,rescue_data STRING
)
AS SELECT
*
 FROM cloud_files("s3://Bucket/test_data/", "csv", map("delimiter", "|", "header", "false"))

Is there a known limitation or am I missing something here?

daniel_sahal
Databricks MVP

@Sudheer_DB 
Your schema has a column named rescue_data, while the default Auto Loader column for faulty data is named _rescued_data.

Sudheer_DB
New Contributor II

Hi @daniel_sahal ,

Thank you for your response. The whole idea is to define my own column names. Shouldn't I be able to rename the _rescued_data column?

daniel_sahal
Databricks MVP

@Sudheer_DB 
You can specify your own name for the _rescued_data column by setting the rescuedDataColumn option.
https://docs.databricks.com/en/ingestion/auto-loader/schema.html#what-is-the-rescued-data-column
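
For illustration, here is a minimal sketch of how the rescuedDataColumn option might be added to the original statement. It reuses the table name, path, and column names from the question and assumes the option is passed through the same map(...) as the other reader options. It has not been run against the poster's data.

```sql
CREATE OR REFRESH STREAMING TABLE csv_test
(a0 STRING
,a1 STRING
,a2 STRING
,a3 STRING
,a4 STRING
,a5 STRING
,a6 STRING
,a7 STRING
,a8 STRING
,a9 STRING
,rescue_data STRING
)
AS SELECT
*
FROM cloud_files(
  "s3://Bucket/test_data/",
  "csv",
  map(
    "delimiter", "|",
    "header", "false",
    -- Ask Auto Loader to name the rescued data column rescue_data
    -- instead of the default _rescued_data, so it matches the declared schema.
    "rescuedDataColumn", "rescue_data"
  )
)
```

This only addresses the rescued data column. With "header" set to "false", the remaining inferred column names may still differ from a0 through a9 (Spark typically names header-less CSV columns _c0, _c1, and so on), so aliases in the SELECT or an explicit reader schema may also be needed.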