Tuesday
Hi!
I'm trying to stream some files using readStream with format("cloudFiles") (Auto Loader). However, when new files arrive, the downstream SQL query and the monitoring graphs are not getting updated. Please suggest a fix.
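For context, my setup looks roughly like this (the paths and file format below are simplified placeholders, not my exact values):

f_super = (spark.readStream
    .format("cloudFiles")                                          # Auto Loader source
    .option("cloudFiles.format", "csv")                            # placeholder: format of the incoming files
    .option("cloudFiles.schemaLocation", "/Volumes/dlt/default/data/schema/")  # placeholder path
    .load("/Volumes/dlt/default/data/input/"))                     # placeholder: directory watched for new files
f_super.createOrReplaceTempView("vw_superstore")                   # the view my SQL query reads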
Tuesday
Hi @AanchalSoni ,
How have you set up your stream? Could you provide the code? 😊 Perhaps you haven't set up the stream trigger to behave the way you want: https://docs.databricks.com/aws/en/structured-streaming/triggers
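For example, the trigger is configured on the write side of the stream. A rough sketch (the DataFrame, path, and table names here are just placeholders):

query = (df.writeStream
    .format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/example/")  # placeholder path
    .trigger(processingTime="1 minute")    # run a micro-batch every minute
    # .trigger(availableNow=True)          # or: process all available data, then stop
    .toTable("target_table"))              # placeholder table name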
All the best,
BS
Tuesday
Please check the attachments
Tuesday
It seems there has been a problem with attachments on the community in recent days. All of them are stuck at "Virus scan in progress". Could you try copying those images directly into the text box?
Hi, sorry for pinging you directly @Advika, @Sujitha - but maybe you know if there have been any recent changes to the attachment upload system?
Tuesday
Hi @AanchalSoni, thanks for sharing your code. Your job is also not scheduled and is running continuously, right?
Also, you need to write the streaming DataFrame to a table or view for vw_superstore to update.
Example:
f_super.writeStream \
    .format("delta") \
    .option("checkpointLocation", "/Volumes/dlt/default/data/Check_super13/") \
    .outputMode("append") \
    .toTable("vw_superstore")  # toTable() starts the streaming query; no separate .start() needed
Tuesday
Thanks for tagging, @szymon_dybczak! We’ll check and get back on this.
Tuesday
Hi @AanchalSoni, if you are using .readStream, make sure you have set a trigger interval (e.g., .trigger(processingTime='1 minute')).
Tuesday
If you don't define a trigger, by default it will run a micro-batch every 0.5 seconds, so I guess this is not the issue here.
Tuesday
Hi Saurabh!
If I'm not explicitly setting the trigger, then by default it should fire every 500 ms and there should be a quick check for new files. However, even after a few minutes there is no expected activity.