18 hours ago
Hi,
I am using Auto Loader to load Parquet files into Unity Catalog with the following settings:
.option("cloudFiles.format", "parquet")
.option("cloudFiles.inferColumnTypes", "true")
.option("cloudFiles.schemaEvolutionMode", "addNewColumnsWithTypeWidening")
.option("cloudFiles.rescuedDataColumn", "_rescued_data")
In one of the newest files, a column that is normally a timestamp is now a Long type. I was under the impression that these faulty records would just be moved to the `_rescued_data` column, but unfortunately the stream breaks and I can only fix my pipeline with the `badRecordsPath` option.
Why does this break my pipeline with `Expected Spark type timestamp, actual Parquet type INT64. SQLSTATE: KD001` instead of moving the bad data to `_rescued_data`?
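For reference, here is a minimal sketch of the reader setup described above, including the `badRecordsPath` workaround. The helper names `autoloader_options` and `build_stream`, the `cloudFiles.schemaLocation` entry, and all paths are assumptions added for illustration; they are not taken from my actual pipeline.

```python
from typing import Dict, Optional


def autoloader_options(schema_location: str,
                       bad_records_path: Optional[str] = None) -> Dict[str, str]:
    """Collect the Auto Loader options from the post into one dict.

    `schema_location` and `bad_records_path` are hypothetical placeholders.
    """
    options = {
        "cloudFiles.format": "parquet",
        "cloudFiles.inferColumnTypes": "true",
        "cloudFiles.schemaEvolutionMode": "addNewColumnsWithTypeWidening",
        "cloudFiles.rescuedDataColumn": "_rescued_data",
        # Assumption: a schema location is set because the schema is inferred.
        "cloudFiles.schemaLocation": schema_location,
    }
    if bad_records_path is not None:
        # Workaround from the post: divert unreadable records/files
        # to a side path so the stream keeps running.
        options["badRecordsPath"] = bad_records_path
    return options


def build_stream(spark, source_path: str, options: Dict[str, str]):
    """Create the streaming DataFrame; `spark` is an existing SparkSession."""
    return (
        spark.readStream
        .format("cloudFiles")
        .options(**options)
        .load(source_path)
    )
```

On a cluster this would be called roughly as `build_stream(spark, "<source path>", autoloader_options("<schema path>", "<bad records path>"))`, with the placeholders replaced by real locations.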
Thanks in advance!