Databricks Community

patojo94 · 11-28-2023

Hi all! I am having the following issue with a couple of PySpark streams. I have several notebooks, each running an independent file-based Structured Streaming job using a Delta bronze table (gzip Parquet files) dumped from Kinesis to S3 by a previous job....

patojo94 · 05-07-2022

Hi everyone, I have a PySpark streaming job reading from AWS Kinesis that suddenly failed for no apparent reason (we have not made any changes recently). It is giving the following error: ERROR MicroBatchExecution: Query kinesis_events_prod_bronz...

patojo94 · 05-04-2022

Hi everyone, I am having trouble adding a deduplication step to a file stream that is already running. The code I am trying to add is this: df = df.withWatermark("arrival_time", "20 minutes").dropDuplicates(["event_id", "arrival_time"])...

User Activity

Stream failure JsonParseException

pyspark streaming failed for no reason

Adding deduplication method to spark streaming