Data Engineering

messages from event hub does not flow after a time

Jreco
Contributor

Hi Team,

I'm trying to build a real-time solution using Databricks and Event Hubs.

Something weird happens after the process has been running for a while.

At the beginning, messages flow through the pipeline at the expected rate:

[Image: streaming metrics; note that the last updated time is 50 seconds.]

However, after a while, the messages stop flowing:

[Image: streaming metrics; note that the last updated time is 11 hours.]

If I restart the job, the messages flow again as expected (even recovering the messages that were not processed during the last 11 hours, in this case).

This graph illustrates the issue:

[Image: throughput graph; the last peak is when I restarted the job.]

Any idea what could be happening?

1 ACCEPTED SOLUTION


Hubert-Dudek
Esteemed Contributor III
  • Please check whether .option("checkpointLocation", "/mnt/your_storage/") is specified for Structured Streaming.
  • It can also depend on what is then done with the stream (the writeStream part).
  • The connection to Event Hubs is quite straightforward, so please also verify the flow on the Azure side, where you can watch streaming messages in real time (go to Entities, select the Event Hub, then under "Features" select "Process data", then "Explore", then "Create").
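The first two bullets can be sketched as follows. This is a minimal, hypothetical example (placeholder connection string and paths, assuming the azure-eventhubs-spark connector on a Databricks cluster); the key point is that checkpointLocation belongs to the sink side of the query:

```python
# Hypothetical sketch of an Event Hubs -> Delta stream on Databricks.
# Assumes the com.microsoft.azure:azure-eventhubs-spark connector is installed;
# the connection string and storage paths below are placeholders.
conn_str = "Endpoint=sb://<namespace>.servicebus.windows.net/;..."  # placeholder

eh_conf = {
    # the connector expects the connection string encrypted via its helper
    "eventhubs.connectionString":
        sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(conn_str),
}

df = (spark.readStream
      .format("eventhubs")
      .options(**eh_conf)
      .load())

(df.writeStream
   .format("delta")
   # checkpointLocation is set on writeStream: it is what lets a restarted
   # job resume from the last committed offsets, as observed in this thread
   .option("checkpointLocation", "/mnt/your_storage/checkpoints/eh-demo")
   .start("/mnt/your_storage/tables/eh-demo"))
```

This only runs on a cluster with the connector installed; it is a sketch of the configuration shape, not a drop-in script.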


6 REPLIES


Jreco
Contributor

Thanks for your answer @Hubert Dudek​,

  • It is already specified.
  • What do you mean by this?
  • This is the weird part: the data flows fine, but at some point it is as if the job stops reading, or something like that. If I restart the job, everything continues working well.

Hubert-Dudek
Esteemed Contributor III

I mean that you read the stream for some purpose, usually to transform it and write it somewhere. So the problem may be not with the reading but with the writing part.

I'm assuming the issue is not the writing part, because the database does not show any kind of blockers or conflicts.

Hi @Jhonatan Reyes​,

Do you control/limit the maximum number of events processed per trigger in your Event Hubs stream? Check "maxEventsPerTrigger". What is your trigger interval? Also, how many partitions are you reading from, and what is your sink?
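For reference, a hedged sketch of how those two knobs are typically set with the azure-eventhubs-spark connector (option names taken from the connector's PySpark docs; verify them against your version, and all paths/variables are placeholders):

```python
# Hypothetical sketch: throttling the Event Hubs source and fixing the
# trigger interval. Requires a Databricks cluster with the
# azure-eventhubs-spark connector; not runnable standalone.
eh_conf = {
    "eventhubs.connectionString": encrypted_conn_str,  # assumed defined elsewhere
    # cap how many events each micro-batch pulls, so a single batch
    # cannot balloon while a backlog accumulates
    "maxEventsPerTrigger": 100000,
}

query = (spark.readStream
         .format("eventhubs")
         .options(**eh_conf)
         .load()
         .writeStream
         .format("delta")
         .option("checkpointLocation", "/mnt/your_storage/checkpoints/eh-throttled")
         # explicit trigger interval: start a new micro-batch at most
         # every 50 seconds instead of as fast as possible
         .trigger(processingTime="50 seconds")
         .start("/mnt/your_storage/tables/eh-throttled"))
```

Bounding the batch size and making the trigger interval explicit is a common way to diagnose a stream that silently stalls: if the stall disappears, the problem was likely an oversized micro-batch at the sink rather than the Event Hubs connection.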

@Jhonatan Reyes​ 

Do you still need help with this, or has the issue been mitigated/solved?
