Re: Autoloader issue

cgrant · 12-05-2024

In this case, please make sure you specify the schema explicitly when reading the Parquet files, and do not set any schema inference options.

Something like:

spark.readStream.format("cloudFiles").schema(schema)...

If you want an easy way to get the schema, you can read the same files once with the batch reader and capture the schema from the result:

schema = spark.read.parquet("/your/path/here").schema
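
Putting the two pieces together, here is a minimal sketch of how that might look. The helper names capture_schema and explicit_schema_stream are just made up for illustration, and I'm assuming the source is Parquet (hence cloudFiles.format set to parquet) and that you pass in your own paths. No inference options are set.

```python
def capture_schema(spark, sample_path):
    """Read the Parquet files once in batch mode and return their schema.

    `sample_path` is a placeholder; point it at your own data.
    """
    return spark.read.parquet(sample_path).schema


def explicit_schema_stream(spark, source_path, schema):
    """Build an Auto Loader stream that uses a fixed, user-supplied schema.

    Only the file format is configured; no schema inference options are set.
    """
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")  # assumption: Parquet source
        .schema(schema)
        .load(source_path)
    )
```

In a notebook, where spark is already defined, you would call capture_schema(spark, "/your/path/here") first and then hand the result to explicit_schema_stream along with the directory Auto Loader should watch.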