12 hours ago
String to Integer is a value-level mismatch: the Parquet reader successfully reads the column as STRING from the file. Auto Loader then attempts to cast STRING to INTEGER, which is a Spark-level operation. The cast fails for a value such as "invalid" during Spark's type conversion. Auto Loader's rescued data logic catches this conversion failure and routes the value to _rescued_data.
Timestamp to INT64 is a format-level mismatch: the Parquet reader examines the file metadata and sees conflicting physical type annotations. It rejects the file as invalid at the format level, before any data is read.
Auto Loader never gets a chance to apply rescued data logic because the failure happens in the Parquet reader, not in Spark's type system.
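To make the ordering concrete, here is a toy model in plain Python. It is not Auto Loader or Spark code: `read_file`, `FormatLevelError`, and the `READABLE` set of type pairs are hypothetical names invented for this illustration. It only shows why a failure in the first stage can never reach the rescue logic in the second.

```python
import json

class FormatLevelError(Exception):
    """Hypothetical error: the toy reader rejects the file from metadata alone."""

# Assumption for this sketch: (file type, target type) pairs the toy reader can decode.
# TIMESTAMP -> INT64 is deliberately absent to mimic the format-level rejection.
READABLE = {("STRING", "INTEGER"), ("STRING", "STRING"), ("INT64", "INT64")}

def read_file(file_schema, rows, target_schema):
    # Stage 1 (format level): inspect metadata only; no row has been read yet.
    for col, target in target_schema.items():
        if (file_schema[col], target) not in READABLE:
            raise FormatLevelError(
                f"column {col!r}: cannot decode {file_schema[col]} as {target}"
            )

    # Stage 2 (value level): cast each value; failed casts are rescued, not fatal.
    result = []
    for row in rows:
        parsed, rescued = {}, {}
        for col, target in target_schema.items():
            value = row[col]
            try:
                parsed[col] = int(value) if target == "INTEGER" else value
            except (TypeError, ValueError):
                parsed[col] = None
                rescued[col] = value
        parsed["_rescued_data"] = json.dumps(rescued) if rescued else None
        result.append(parsed)
    return result

# STRING -> INTEGER: the bad value lands in _rescued_data and the read succeeds.
string_rows = read_file(
    {"id": "STRING"}, [{"id": "42"}, {"id": "invalid"}], {"id": "INTEGER"}
)

# TIMESTAMP -> INT64: the read fails in stage 1, so stage 2 never runs.
try:
    read_file({"ts": "TIMESTAMP"}, [{"ts": "2024-01-01"}], {"ts": "INT64"})
    format_error = None
except FormatLevelError as exc:
    format_error = str(exc)
```

In this sketch the second call raises before the row loop starts, which mirrors the point above: rescue can only happen for failures that occur after the reader has accepted the file.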