12-05-2024 09:22 AM
In this case, please make sure you specify the schema explicitly when reading the Parquet files, and do not set any schema inference options.
Something like:
spark.readStream.format("cloudFiles").schema(schema)...
If you want an easy way to get the schema, you can read the same files once with the batch reader and capture the schema from that:
schema = spark.read.parquet("/your/path/here").schema
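Putting the two steps together, here is a minimal sketch of how this could look. The helper names `capture_schema` and `build_autoloader_stream` are hypothetical, and the path is a placeholder. It assumes a Databricks runtime where `spark` is already defined and the `cloudFiles` source is available, and that your files are Parquet (hence `cloudFiles.format` set to `parquet`).

```python
def capture_schema(spark, path):
    """Read the Parquet files once with the batch reader and return the schema."""
    # Batch read; only the schema (a StructType) is kept.
    return spark.read.parquet(path).schema


def build_autoloader_stream(spark, schema, path):
    """Build an Auto Loader stream with an explicit schema.

    No inference-related options are set here, so the reader
    uses only the schema that is passed in.
    """
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")  # assumption: source files are Parquet
        .schema(schema)  # explicit schema instead of inference
        .load(path)
    )
```

In a notebook you would then call something like `schema = capture_schema(spark, "/your/path/here")` followed by `df = build_autoloader_stream(spark, schema, "/your/path/here")`. Note that the captured schema is a snapshot: if the files change shape later, you would need to capture it again.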