Re: Autoloader [FAILED_READ_FILE.PARQUET_COLUMN_DA...

balajij8 · ‎06-22-2026

The _rescued_data column in Auto Loader works for JSON and CSV formats, not Parquet. Parquet is a strongly typed format in which each column's data type is encoded in the file metadata. When a column that was a timestamp shows up as INT64 in a new file, the result is a file-format-level incompatibility. It surfaces during Parquet reader initialization, before Auto Loader's schema evolution or rescued data logic kicks in.

The error FAILED_READ_FILE.PARQUET_COLUMN_DATA_TYPE_MISMATCH: Expected Spark type timestamp, actual Parquet type INT64 generally comes from the low-level Parquet reader when it detects this metadata mismatch.

schemaEvolutionMode: addNewColumnsWithTypeWidening - Handles widening (int to long), but timestamp to INT64 is not widening. It is an incompatible change.
rescuedDataColumn - Only rescues data for JSON/CSV, where type mismatches are detected during parsing, not for Parquet format-level conflicts.

You can use badRecordsPath for Parquet files with incompatible type changes. It catches file-level read failures and allows the stream to continue while logging the error files.
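
A minimal sketch of how that could look, assuming a Databricks runtime where `spark` is already available and where badRecordsPath is honored for your Auto Loader source. The helper names (`parquet_autoloader_options`, `read_parquet_stream`) and every path are hypothetical placeholders, not something from your pipeline, so please verify the behavior on your runtime version before relying on it.

```python
from typing import Dict


def parquet_autoloader_options(schema_location: str, bad_records_path: str) -> Dict[str, str]:
    """Build reader options for an Auto Loader stream over Parquet files.

    Hypothetical helper: both paths are supplied by the caller.
    """
    return {
        "cloudFiles.format": "parquet",
        # Where Auto Loader keeps the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
        # Covers widening such as int to long, not timestamp to INT64.
        "cloudFiles.schemaEvolutionMode": "addNewColumnsWithTypeWidening",
        # Files that cannot be read are recorded here instead of failing the stream.
        "badRecordsPath": bad_records_path,
    }


def read_parquet_stream(spark, source_path: str, schema_location: str, bad_records_path: str):
    """Return a streaming DataFrame; needs a Databricks SparkSession to run."""
    options = parquet_autoloader_options(schema_location, bad_records_path)
    return spark.readStream.format("cloudFiles").options(**options).load(source_path)
```

You would call it as `read_parquet_stream(spark, "/path/to/landing", "/path/to/_schemas", "/path/to/_bad_records")` with your own locations, then check the bad records location after a run to see which files were skipped and why.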