Unable to infer schema for Parquet at
01-29-2020 10:43 AM
I have this code in a notebook:
    val streamingDataFrame = incomingStream
      .selectExpr("cast (body as string) AS Content")
      .withColumn("Sentiment", toSentiment($"Content"))

    import org.apache.spark.sql.streaming.Trigger.ProcessingTime

    val result = streamingDataFrame
      .writeStream
      .format("parquet")
      .option("path", "/mnt/TwitterSentiment")
      .option("checkpointLocation", "/mnt/temp/check")
      .start()

...and it always results in this error. Am stumped, any advice?

    org.apache.spark.sql.AnalysisException: Unable to infer schema for Parquet at . It must be specified manually;
Labels: Data Ingestion & connectivity
09-05-2021 11:30 PM
This looks like an invalid Parquet file. My guess is that the incoming data has mixed types for the same column, or a different or otherwise invalid structure.
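One way to test that guess is to do what the error message asks and specify the schema manually when reading the output directory, so Spark does not have to infer it from the files. This is only a rough sketch: the column names and the path are taken from the question, but the column types are assumptions (in particular, the return type of `toSentiment` is not shown, so `StringType` for `Sentiment` is a guess), and `sentimentSchema` is a made-up name.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// In a notebook a session usually already exists; getOrCreate() reuses it.
val spark = SparkSession.builder().getOrCreate()

// Assumed schema: names come from the question, types are guesses.
// Adjust the Sentiment type to whatever toSentiment actually returns.
val sentimentSchema = StructType(Seq(
  StructField("Content", StringType, nullable = true),
  StructField("Sentiment", StringType, nullable = true)
))

// Supplying the schema up front means Spark does not need to infer it
// from the Parquet files under the path.
val df = spark.read
  .schema(sentimentSchema)
  .parquet("/mnt/TwitterSentiment")

df.printSchema()
```

If this read succeeds but returns no rows, the directory may simply not contain any Parquet data files yet. If it fails with a type or decoding error, that would point toward the mixed-type explanation above.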