What does durationMs.commitBatch measure?
01-05-2025 05:50 AM
With a Structured Streaming job reading from Kafka, we have a metric in durationMs called commitBatch. There is also an example of this in this Databricks documentation. I cannot find any description of what it measures or how it relates to the other metrics.
01-05-2025 06:10 AM
The commitBatch metric in the durationMs object measures the time taken to commit the batch of data being processed. This includes the time required to write the batch data to the sink and update the offsets to reflect the processed data.
01-05-2025 01:30 PM
Have I understood correctly that it is the time to write the data to the sink and also to update the checkpoint location?
How does it relate to e.g. addBatch, which is "The time taken to execute the microbatch."? In the example I linked to we have "addBatch" : 5397, "commitBatch" : 4429.
Does that mean that computing the actual microbatch took about 5.4 s, and writing it out and committing it took about 4.4 s, for a total of about 9.8 s?
And why is it not always present? E.g. in this example with a Delta sink, this example with Kafka-to-Kafka, or this Delta-to-Delta one?
01-06-2025 06:37 AM
The commitBatch metric is part of the overall triggerExecution time, which encompasses all stages of planning and executing the microbatch, including committing the batch data and updating offsets.
The commitBatch metric may not always be present in every example. Its presence depends on the specific implementation and the metrics that are being tracked for that particular streaming query. For instance, in the examples you mentioned:
- The rate source to Delta Lake example does not include commitBatch because it may not be relevant or tracked for that specific query.
- The Kafka-to-Kafka example also does not include commitBatch, possibly due to differences in how metrics are collected or reported for Kafka sinks.
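
To make the arithmetic in the question concrete, here is a minimal Python sketch that adds up addBatch and commitBatch from a durationMs mapping and, when triggerExecution is available, reports their combined share of it. The function name phase_breakdown is hypothetical, and treating the two phases as separate, non-overlapping intervals is an assumption that this thread does not confirm. Only the two values quoted in the question (5397 and 4429) are used as input. A missing commitBatch key is reported as absent rather than zero, since not every query emits it.

```python
from typing import Mapping, Optional


def phase_breakdown(duration_ms: Mapping[str, int]) -> dict:
    """Summarise addBatch and commitBatch from a durationMs mapping.

    Hypothetical helper: assumes the two phases do not overlap in time,
    which the thread does not confirm.
    """
    add_batch: Optional[int] = duration_ms.get("addBatch")
    commit_batch: Optional[int] = duration_ms.get("commitBatch")  # not reported by every query
    trigger: Optional[int] = duration_ms.get("triggerExecution")

    # Sum only the phases that were actually reported.
    combined = sum(v for v in (add_batch, commit_batch) if v is not None)

    return {
        "addBatch": add_batch,
        "commitBatch": commit_batch,
        "combined": combined,
        # Share of the whole trigger, only if triggerExecution was reported.
        "share_of_trigger": combined / trigger if trigger else None,
    }


# The two values quoted in the question.
summary = phase_breakdown({"addBatch": 5397, "commitBatch": 4429})
print(summary["combined"])  # 9826 ms, i.e. about 9.8 s

# A query that does not report commitBatch at all.
print(phase_breakdown({"addBatch": 5397})["commitBatch"])  # None
```

In PySpark the mapping would typically come from something like query.lastProgress["durationMs"] on a running StreamingQuery; that access pattern is an assumption here and may differ between Spark versions.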

