<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: foreachBatch doesn't work in structured streaming in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/150538#M53466</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110502"&gt;@szymon_dybczak&lt;/a&gt;&amp;nbsp;in my testing, the print output does not appear anywhere. There is no trace of it, neither in the notebook nor in the driver logs.&lt;/P&gt;</description>
    <pubDate>Wed, 11 Mar 2026 07:47:13 GMT</pubDate>
    <dc:creator>Malthe</dc:creator>
    <dc:date>2026-03-11T07:47:13Z</dc:date>
    <item>
      <title>foreachBatch doesn't work in structured streaming</title>
      <link>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/84409#M37189</link>
      <description>&lt;P&gt;I'm trying to print the number of rows in each batch, but it doesn't seem to work properly. I have a single-node, compute-optimized cluster and run this code in a notebook:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;# Logging the row count using a streaming-friendly approach
def log_row_count(batch_df, batch_id):
  display(batch_df)
  row_count = 0
  if not batch_df.isEmpty():
    row_count = batch_df.count()
  print(f"{row_count} rows have been appended to {FULL_TABLE_NAME}")
  LOGGER.info(f"{row_count} rows have been appended to {FULL_TABLE_NAME}")

# Configure Auto Loader to ingest Parquet data to a Delta table
ptv = spark.readStream \
    .format("cloudFiles") \
    .option("cloudFiles.format", "parquet") \
    .option("cloudFiles.schemaLocation", checkpoint_path) \
    .option("cloudFiles.inferColumnTypes", "true") \
    .option("cloudFiles.schemaEvolutionMode", "addNewColumns") \
    .load(file_path)

ptv \
  .writeStream \
  .option("checkpointLocation", checkpoint_path) \
  .outputMode("append") \
  .trigger(availableNow=True) \
  .foreachBatch(log_row_count) \
  .toTable(FULL_TABLE_NAME) \
  .awaitTermination()&lt;/LI-CODE&gt;&lt;P&gt;The only output I get is:&lt;BR /&gt;&amp;nbsp;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/10641iAB925334618B75AE/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999" role="button" title="Capture.PNG" alt="Capture.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;What should I do to print the count in &lt;STRONG&gt;foreachBatch()&lt;/STRONG&gt;?&lt;/P&gt;</description>
      <pubDate>Tue, 27 Aug 2024 14:05:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/84409#M37189</guid>
      <dc:creator>drag7ter</dc:creator>
      <dc:date>2024-08-27T14:05:24Z</dc:date>
    </item>
    <item>
      <title>Re: foreachBatch doesn't work in structured streaming</title>
      <link>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/101437#M40664</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;Can you try this:&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;def log_row_count(batch_df, batch_id):
    row_count = batch_df.count()
    print(f"Batch ID {batch_id}: {row_count} rows have been processed")
    LOGGER.info(f"Batch ID {batch_id}: {row_count} rows have been processed")

ptv.writeStream \
    .option("checkpointLocation", checkpoint_path) \
    .outputMode("append") \
    .trigger(availableNow=True) \
    .foreachBatch(log_row_count) \
    .start() \
    .awaitTermination()

# Write to table separately if needed
ptv.writeStream \
    .option("checkpointLocation", f"{checkpoint_path}_table") \
    .outputMode("append") \
    .trigger(availableNow=True) \
    .toTable(FULL_TABLE_NAME) \
    .awaitTermination()&lt;/LI-CODE&gt;</description>
      <pubDate>Mon, 09 Dec 2024 08:56:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/101437#M40664</guid>
      <dc:creator>Sidhant07</dc:creator>
      <dc:date>2024-12-09T08:56:53Z</dc:date>
    </item>
    <item>
      <title>Re: foreachBatch doesn't work in structured streaming</title>
      <link>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/128755#M48343</link>
      <description>&lt;P&gt;Hi, I am facing the exact same issue. The function that I'm passing to foreachBatch is just a very simple print statement that tests whether the function is called or not, and nothing is printed. Here's a code snippet:&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;def debug_batch(batch_df, batch_id):
    print(f"Batch {batch_id} started, row count = {batch_df.count()}")

query = (
    df.writeStream
        .option("checkpointLocation", checkpoint_path)
        .trigger(availableNow=True)
        .foreachBatch(debug_batch)
        .start()
        .awaitTermination()
)&lt;/LI-CODE&gt;</description>
      <pubDate>Mon, 18 Aug 2025 15:22:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/128755#M48343</guid>
      <dc:creator>saffovski</dc:creator>
      <dc:date>2025-08-18T15:22:14Z</dc:date>
    </item>
    <item>
      <title>Re: foreachBatch doesn't work in structured streaming</title>
      <link>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/128758#M48344</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/179919"&gt;@saffovski&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;This is expected behavior. By default, if you use print in foreachBatch, the output goes to the driver log. So, check your driver logs &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 18 Aug 2025 15:48:01 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/128758#M48344</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-08-18T15:48:01Z</dc:date>
    </item>
    <item>
      <title>Re: foreachBatch doesn't work in structured streaming</title>
      <link>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/128828#M48351</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/152834"&gt;@Advika&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Could you mark the above reply as the answer to the thread? This question keeps popping up and it would be good to have a solution for it. The behavior that I mentioned above is described here:&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/aws/en/structured-streaming/foreach#behavior-changes-for-foreachbatch-in-databricks-runtime-140" target="_blank"&gt;Use foreachBatch to write to arbitrary data sinks | Databricks Documentation&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 19 Aug 2025 09:29:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/128828#M48351</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-08-19T09:29:51Z</dc:date>
    </item>
    <item>
      <title>Re: foreachBatch doesn't work in structured streaming</title>
      <link>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/128832#M48353</link>
      <description>&lt;P&gt;Thanks for clarifying and sharing the doc&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110502"&gt;@szymon_dybczak&lt;/a&gt;!&lt;BR /&gt;I’ve marked your reply as a solution.&lt;/P&gt;</description>
      <pubDate>Tue, 19 Aug 2025 09:57:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/128832#M48353</guid>
      <dc:creator>Advika</dc:creator>
      <dc:date>2025-08-19T09:57:17Z</dc:date>
    </item>
    <item>
      <title>Re: foreachBatch doesn't work in structured streaming</title>
      <link>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/128833#M48354</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/152834"&gt;@Advika&lt;/a&gt;&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":thumbs_up:"&gt;👍&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 19 Aug 2025 09:59:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/128833#M48354</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-08-19T09:59:25Z</dc:date>
    </item>
    <item>
      <title>Re: foreachBatch doesn't work in structured streaming</title>
      <link>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/129633#M48581</link>
      <description>&lt;P&gt;Hi Szymon, thank you for your answer. However, when I use foreachBatch for my actual work (processing each micro-batch while reading a stream of files), it clearly shows that the foreachBatch part is not executed. I read that this might be related to serverless compute, but I wasn't able to confirm it in the official Databricks documentation.&lt;BR /&gt;Is this the reason?&lt;/P&gt;</description>
      <pubDate>Mon, 25 Aug 2025 14:30:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/129633#M48581</guid>
      <dc:creator>saffovski</dc:creator>
      <dc:date>2025-08-25T14:30:54Z</dc:date>
    </item>
    <item>
      <title>Re: foreachBatch doesn't work in structured streaming</title>
      <link>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/150538#M53466</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110502"&gt;@szymon_dybczak&lt;/a&gt;&amp;nbsp;in my testing, the print output does not appear anywhere. There is no trace of it, neither in the notebook nor in the driver logs.&lt;/P&gt;</description>
      <pubDate>Wed, 11 Mar 2026 07:47:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/foreachbatch-doesn-t-work-in-structured-streaming/m-p/150538#M53466</guid>
      <dc:creator>Malthe</dc:creator>
      <dc:date>2026-03-11T07:47:13Z</dc:date>
    </item>
  </channel>
</rss>

