fperry
New Contributor III
since 06-18-2024
02-03-2025

User Stats

  • 6 Posts
  • 0 Solutions
  • 0 Kudos given
  • 0 Kudos received

User Activity

Hi everyone, I'm working with Databricks structured streaming and have encountered an issue with stateful operations. Below is my pseudo-code: df = df.withWatermark("timestamp", "1 second") df_header = df.withColumn("message_id", F.col("payload.id"))...
I'm experiencing an issue that I don't understand. I am using Python's arbitrary stateful processing with structured streaming to calculate metrics for each item/ID. A timeout is set, after which I clear the state for that item/ID and display each ID...
For a data processing pipeline I use structured streaming and arbitrary stateful processing. I was wondering if the partitioning over several worker nodes and thus updating the state from different worker nodes has to be considered (e.g. using a lock...