willam45
New Contributor II

Hello, 
Here is what I have written up on this:

The behavior you're describing, where data streamed at 10 AM IST is logged at 10 PM IST, points to a lag of about 12 hours in your PySpark streaming application. The PySpark StreamingListener interface does not cause this lag; it only monitors and collects information about the progress of your streaming application.

The PySpark StreamingListener interface allows you to create custom listeners that can capture events and metrics related to the execution of your streaming application. This can include events like batch processing times, number of records processed, and other execution statistics. However, the listener itself does not affect the timing or behavior of your streaming application.
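To see where the time is going, a listener can record the per-batch delays. The following is only a minimal sketch, assuming the DStream-based API (`pyspark.streaming`); `LagListener` and `summarize_batch` are hypothetical names I made up for illustration, and the import fallback is there only so the snippet can be read and tried without Spark installed.

```python
try:
    from pyspark.streaming.listener import StreamingListener
except ImportError:
    # Assumption: fall back to a plain base class when Spark is not installed,
    # so the sketch can still be imported and exercised with fake events.
    StreamingListener = object


def summarize_batch(batch_time_ms, scheduling_delay_ms, processing_delay_ms, num_records):
    """Return a dict describing where one batch spent its time (values in ms)."""
    return {
        "batch_time_ms": batch_time_ms,
        "scheduling_delay_ms": scheduling_delay_ms,
        "processing_delay_ms": processing_delay_ms,
        "total_delay_ms": scheduling_delay_ms + processing_delay_ms,
        "num_records": num_records,
    }


class LagListener(StreamingListener):
    """Hypothetical listener: it only observes batches, it does not change timing."""

    def __init__(self):
        super().__init__()
        self.batches = []

    def onBatchCompleted(self, batchCompleted):
        # batchInfo() exposes the batch time, delays and record count
        # reported by Spark for the batch that just finished.
        info = batchCompleted.batchInfo()
        self.batches.append(
            summarize_batch(
                info.batchTime().milliseconds(),
                info.schedulingDelay(),
                info.processingDelay(),
                info.numRecords(),
            )
        )
```

With a `StreamingContext` named `ssc` (an assumed variable name), the listener would be registered with `ssc.addStreamingListener(LagListener())` before `ssc.start()`. A scheduling delay that keeps growing from batch to batch usually means batches are queuing up faster than they can be processed.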

If you're facing a significant lag in your streaming application's processing time, you should investigate other aspects of your setup to identify the cause.