Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

StreamingQueryListener metrics strange behaviour (inputRowsPerSecond metric is set to 0)

YuriS
New Contributor II

After implementing a StreamingQueryListener to integrate with our monitoring solution (based on https://learn.microsoft.com/en-us/azure/databricks/structured-streaming/stream-monitoring), we have noticed some strange metrics for our DeltaSource streams.
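For context, the monitoring integration consumes the progress payload that the listener receives on each update. A simplified sketch (no Spark required; the field names match the progress JSON that Structured Streaming emits, but the sample values below are made up):

```python
import json

# Hypothetical, trimmed-down sample of a StreamingQueryProgress payload,
# limited to the fields a monitoring integration typically forwards.
# The values here are invented for illustration.
sample_progress = json.loads("""
{
  "batchId": 42,
  "numInputRows": 1000,
  "inputRowsPerSecond": 0.0,
  "processedRowsPerSecond": 512.8,
  "durationMs": {"triggerExecution": 1950, "addBatch": 1800}
}
""")

def extract_metrics(progress: dict) -> dict:
    """Pull out the fields worth shipping to a monitoring backend."""
    return {
        "batch_id": progress["batchId"],
        "num_input_rows": progress["numInputRows"],
        "input_rows_per_second": progress["inputRowsPerSecond"],
        "trigger_execution_ms": progress["durationMs"]["triggerExecution"],
    }

metrics = extract_metrics(sample_progress)
print(metrics)
```

Note how this sample reproduces the oddity: numInputRows is non-zero while inputRowsPerSecond is 0.0.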

In some cases, the inputRowsPerSecond metric for DeltaSource streams is reported as 0:

YuriS_0-1769419735190.png

For the particular event (the same is visible in the Spark UI):

YuriS_1-1769419836870.png

Also, it would be good to understand the difference between a batch and a trigger - are these the same, or is the difference only visible when batches are restarted?

Thank you

 

1 REPLY

hasnat_unifeye
New Contributor III

Firstly, let's talk about batch vs trigger.

A trigger is the scheduling event that tells Spark when to check for new data (e.g. processingTime, availableNow, once). A batch (micro-batch) is the actual unit of work: it reads input, processes the data, and commits results. In many cases there is a 1:1 relationship, so they appear the same, but they are conceptually different. The difference becomes visible during restarts, backlog processing, or when a trigger fires but no data is available.
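The distinction above can be sketched in plain Python, with no Spark involved: a trigger fires on every scheduling tick, but a micro-batch only runs when the source has new data. The per-tick data availability below is made up for illustration.

```python
def run_ticks(data_available_per_tick):
    """Simulate a processingTime trigger over a series of ticks.

    Each tick represents the trigger firing on schedule; a batch is
    counted only when the (simulated) source reports new data.
    """
    triggers_fired = 0
    batches_run = 0
    for has_data in data_available_per_tick:
        triggers_fired += 1      # the trigger always fires on schedule
        if has_data:
            batches_run += 1     # a batch executes only if there is work
    return triggers_fired, batches_run

# Five scheduled ticks, but new data arrived during only three of them.
triggers, batches = run_ticks([True, False, True, True, False])
print(triggers, batches)  # 5 3
```

When data arrives on every tick, the two counts coincide, which is why the terms often look interchangeable in practice.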

This video gives a clear explanation of trigger behaviour in Structured Streaming:
https://www.youtube.com/watch?v=t7cRAIgVduQ

Regarding the metrics shown (batchDuration and triggerExecution being equal):
this looks strange, but it is expected for micro-batch streaming when a single batch fully occupies the trigger window. The trigger execution time often includes Delta metadata work and waiting, so both values can collapse to the same duration.

This also explains why inputRowsPerSecond can be reported as 0.0 for DeltaSource streams. The metric is derived from numInputRows divided by the elapsed trigger time, which is admittedly a little counter-intuitive. When most of the trigger time is spent waiting or doing metadata operations rather than actively reading rows, Spark may report an effective input rate of zero even though rows were processed. For monitoring, I would say numInputRows is the more reliable metric.
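A small sketch of that arithmetic, assuming the rate is roughly numInputRows divided by the elapsed seconds for the trigger (the exact internal formula is Spark's; this is an illustration, not its source code):

```python
def input_rate(num_input_rows: int, elapsed_ms: float) -> float:
    """Approximate rows/second from a row count and an elapsed duration."""
    if elapsed_ms <= 0:
        return 0.0
    return num_input_rows / (elapsed_ms / 1000.0)

# 1,000 rows read in a 2-second trigger: a healthy rate.
print(input_rate(1000, 2_000))    # 500.0

# The same 1,000 rows, but the trigger window ballooned to 10 minutes
# of metadata work and waiting: the reported rate collapses toward zero.
print(round(input_rate(1000, 600_000), 1))  # 1.7
```

So the rows were genuinely processed; it is the denominator (mostly wait/metadata time) that drags the reported rate toward zero.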


So - are they the same? No. A trigger defines when Spark checks for work (e.g. an interval). A batch runs if and only if data is available when the trigger fires.
Is the difference only visible when batches are restarted? No - restarts are just one case; it also shows up during backlog processing or when a trigger fires and no new data is available.
