
Setting the right processingTime

Ogi
New Contributor II

How do I set the right processingTime for readStream to maximize performance? Which factors does it depend on, and is there a way to measure it?

4 REPLIES

Ajay-Pandey
Esteemed Contributor III

When you specify a trigger interval that is too small (less than tens of seconds), the system may perform unnecessary checks to see whether new data has arrived. Configure your processing time to balance your latency requirements against the rate at which data arrives in the source.

There is no single recommended value, because the right interval depends entirely on your use case.
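To make this concrete, here is a minimal PySpark sketch showing where the trigger interval is set. The rate source, console sink, and 30-second interval are illustrative assumptions, not values from this thread:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("trigger-demo").getOrCreate()

# Built-in "rate" test source: emits rows at a fixed rate, handy for experiments.
stream_df = (
    spark.readStream
    .format("rate")
    .option("rowsPerSecond", 100)
    .load()
)

# Note: the trigger interval is set on the write side, not on readStream itself.
query = (
    stream_df.writeStream
    .format("console")  # illustrative sink; use delta/kafka/etc. in practice
    .outputMode("append")
    .trigger(processingTime="30 seconds")  # a new micro-batch is scheduled every
    .start()                               # 30s (or immediately if a batch overruns)
)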

NandiniN
Valued Contributor II

Anonymous
Not applicable

Hi @Ognjen Grubac

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 

Ogi
New Contributor II

Thanks @Ajay Pandey and @Nandini N for your answers. I wanted to know more about how to do this properly. Should I try different processing times (1, 5, 10, 30, 60 seconds) and compare how each affects the running job in terms of duration and CPU/memory usage? Or is there a more refined way to do it?
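For reference, one pragmatic way to run that experiment is to watch the query's progress metrics while it runs. Below is a hedged sketch using StreamingQuery.lastProgress; the helper name sample_progress and the sampling cadence are made up for illustration, and it assumes `query` is an active streaming query like the one in the earlier sketch:

import time

def sample_progress(query, samples=5, sleep_s=30):
    """Print recent micro-batch metrics for an active StreamingQuery."""
    for _ in range(samples):
        time.sleep(sleep_s)
        p = query.lastProgress  # dict form of the latest StreamingQueryProgress
        if p is None:  # no batch has completed yet
            continue
        print(
            "batch", p["batchId"],
            "| triggerExecution(ms):", p["durationMs"].get("triggerExecution"),
            "| inputRows/s:", p.get("inputRowsPerSecond"),
            "| processedRows/s:", p.get("processedRowsPerSecond"),
        )

A common rule of thumb: if triggerExecution regularly approaches or exceeds the configured interval, the batches cannot keep up and the interval (or the cluster) needs to grow; if it stays far below the interval and your latency budget allows, you can lengthen the interval to reduce scheduling overhead.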
