Hi @ChristianRRL
This is a common issue with Spark Structured Streaming and the display() function.
The error occurs because you're trying to display a streaming DataFrame, which requires special handling. Here are two options:
1. Use writeStream instead of display()
For streaming DataFrames, use writeStream to output the data:
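# Instead of display(df)
query = (df.writeStream
    .format("console")      # or "memory", "delta", etc.
    .outputMode("append")   # or "complete", "update"
    .trigger(once=True)     # process the available data once, then stop (on Spark 3.3+, availableNow=True is the newer equivalent)
    .start())
# Block until the one-time trigger has finished
query.awaitTermination()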
2. Use a memory sink for testing
Write the stream to an in-memory table, then query that table with SQL:
# Start the stream writing to memory
query = (df.writeStream
    .format("memory")
    .queryName("temp_table")
    .outputMode("append")
    .start())
# Wait a moment for data to be processed
import time
time.sleep(10)
# Now you can query the in-memory table
display(spark.sql("SELECT * FROM temp_table LIMIT 10"))
# Don't forget to stop the query
query.stop()
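The fixed time.sleep(10) above is only a guess at how long the first micro-batch takes. A minimal sketch of a polling alternative follows; wait_until is a hypothetical helper (not part of PySpark), and the commented usage assumes the query and temp_table names from the snippet above. If your source has a bounded backlog, query.processAllAvailable() is another option, though it can block indefinitely when data keeps arriving.

```python
import time


def wait_until(condition, timeout_s=30.0, poll_s=0.5):
    """Poll condition() until it returns True or timeout_s elapses.

    Hypothetical helper for illustration; returns True if the condition
    was met and False if the timeout was reached first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Usage in a notebook, replacing time.sleep(10)
# (assumes `spark` and the memory-sink query named "temp_table" exist):
#
#   ready = wait_until(lambda: spark.table("temp_table").count() > 0)
#   if ready:
#       display(spark.sql("SELECT * FROM temp_table LIMIT 10"))
#   query.stop()
```

Polling returns as soon as rows show up and gives you an explicit signal when nothing arrived within the timeout, instead of silently displaying an empty table.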
The key issue is that display() isn't working with your streaming DataFrame directly, so use writeStream to materialize the data first and then display the result.
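If the same cell sometimes receives a batch DataFrame and sometimes a streaming one, you can branch on df.isStreaming. The sketch below is only an illustration: preview and its parameters are hypothetical names, it assumes a query that supports append mode (for example, no aggregation without a watermark), and it relies on the memory sink's temp view staying queryable in the session after the query is stopped.

```python
def preview(df, spark, table_name="preview_tmp", limit=10):
    """Return a batch DataFrame with up to `limit` rows of `df`.

    Hypothetical helper: batch inputs are limited directly, streaming
    inputs are first materialized through a memory sink.
    """
    if not df.isStreaming:
        # Batch DataFrame: nothing special is needed
        return df.limit(limit)

    # Streaming DataFrame: write to an in-memory table first
    query = (df.writeStream
        .format("memory")
        .queryName(table_name)
        .outputMode("append")
        .start())
    try:
        # Blocks until the data currently available has been processed;
        # may block indefinitely if the source never stops producing data
        query.processAllAvailable()
    finally:
        query.stop()

    # Assumption: the temp view created by the memory sink is still readable here
    return spark.table(table_name).limit(limit)


# Usage in a notebook (assumes `spark` and `df` exist):
#   display(preview(df, spark))
```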
LR