Hi,
I have a Delta table that is loaded by a Structured Streaming job. When I read this Delta table as a stream and do a MERGE inside foreachBatch, I sometimes see a long gap between the stream starting and the MERGE actually running, as if Spark is waiting for something.
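Roughly what the job looks like (a minimal sketch; the table names, join key, and paths are placeholders for my real ones):

from delta.tables import DeltaTable

def upsert_to_target(batch_df, batch_id):
    # MERGE each micro-batch into the target Delta table
    target = DeltaTable.forName(batch_df.sparkSession, "target_table")
    (target.alias("t")
        .merge(batch_df.alias("s"), "t.id = s.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())

(spark.readStream
    .format("delta")
    .load("/path/to/source_delta_table")
    .writeStream
    .foreachBatch(upsert_to_target)
    .option("checkpointLocation", "/path/to/checkpoint")
    .start())

From the log I can see: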
INFO ExecuteGrpcResponseSender: Starting for opId=5ef071b7-xxx, reattachable=true, lastConsumedStreamIndex=0
...
INFO SessionHolder: Session SessionKey(69xxx,04470efa-xxxx) accessed, time 1728792222507.
...
INFO ExecuteGrpcResponseSender: Deadline reached, shutting down stream for opId=5ef071b7-xxx after index 0. totalTime=120001284340ns waitingForResults=120001197790ns waitingForSend=0ns
INFO SessionHolder: Session SessionKey(69xxx,04470efa-xxxx) accessed, time 1728792342527.
INFO ExecuteGrpcResponseSender: Starting for opId=5ef071b7-xxx, reattachable=true, lastConsumedStreamIndex=0
...
There are many of these "INFO ExecuteGrpcResponseSender: Deadline reached, shutting down stream..." lines, and it looks like something is timing out after 120 s. I tried setting
spark.network.timeout: 800s
spark.streaming.backpressure.enabled: true
but I can still see those deadline messages in the log.
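For reference, this is how those settings would look when applied at session creation (a sketch; the app name is a placeholder, and in my environment they are set in the cluster's Spark config):

from pyspark.sql import SparkSession

# Apply the timeout/backpressure settings when building the session
spark = (SparkSession.builder
    .appName("delta-merge-stream")
    .config("spark.network.timeout", "800s")
    .config("spark.streaming.backpressure.enabled", "true")
    .getOrCreate())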
What is happening here? Is there a config I can change to get rid of this? It seems to slow down the job.
Thanks