<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Is Zerobus the Future of Ingestion on Databricks? Lessons from a 7B+ Transaction Platform in Community Articles</title>
    <link>https://community.databricks.com/t5/community-articles/is-zerobus-the-future-of-ingestion-on-databricks-lessons-from-a/m-p/148909#M1020</link>
    <description>&lt;P&gt;Between 2019 and 2021, we built a multi-market payments data platform on Databricks that now processes more than &lt;STRONG&gt;7 billion transactions per year&lt;/STRONG&gt; across seven markets.&lt;/P&gt;&lt;P&gt;Ingestion was by far the most operationally complex layer.&lt;/P&gt;&lt;P&gt;To support MongoDB CDC streams, we engineered:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;A custom Python CDC publisher&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Azure Event Hubs as the message backbone&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Avro files landing in the raw layer&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;A generic Spark Structured Streaming framework to load Bronze (Delta)&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;The architecture worked and scaled, but it required significant custom engineering, careful orchestration, and continuous operational attention as data volume and dataset count grew.&lt;/P&gt;&lt;P&gt;Looking at the capabilities available today, especially Zerobus, it is hard not to see how much simpler this ingestion layer could become. While still in preview, Zerobus represents a shift away from message-bus dependencies, custom streaming frameworks, and ingestion-specific infrastructure.&lt;/P&gt;&lt;P&gt;If it matures as expected, it has strong potential to become the default solution for near real-time ingestion on Databricks.&lt;/P&gt;&lt;P&gt;I wrote a detailed breakdown of the original architecture, the scaling challenges we encountered, and why Zerobus may fundamentally change how ingestion is designed going forward.&lt;BR /&gt;&lt;BR /&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://medium.com/@wesley.felipe/databricks-lakehouse-without-the-workarounds-part-1-0d2101fb9f34" target="_blank" rel="noopener"&gt;[Medium]&amp;nbsp;Databricks Lakehouse Without the Workarounds — Part 1: Ingestion&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 20 Feb 2026 15:49:08 GMT</pubDate>
    <dc:creator>wesleyfelipe</dc:creator>
    <dc:date>2026-02-20T15:49:08Z</dc:date>
    <item>
      <title>Is Zerobus the Future of Ingestion on Databricks? Lessons from a 7B+ Transaction Platform</title>
      <link>https://community.databricks.com/t5/community-articles/is-zerobus-the-future-of-ingestion-on-databricks-lessons-from-a/m-p/148909#M1020</link>
      <description>&lt;P&gt;Between 2019 and 2021, we built a multi-market payments data platform on Databricks that now processes more than &lt;STRONG&gt;7 billion transactions per year&lt;/STRONG&gt; across seven markets.&lt;/P&gt;&lt;P&gt;Ingestion was by far the most operationally complex layer.&lt;/P&gt;&lt;P&gt;To support MongoDB CDC streams, we engineered:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;A custom Python CDC publisher&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Azure Event Hubs as the message backbone&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Avro files landing in the raw layer&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;A generic Spark Structured Streaming framework to load Bronze (Delta)&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;The architecture worked and scaled, but it required significant custom engineering, careful orchestration, and continuous operational attention as data volume and dataset count grew.&lt;/P&gt;&lt;P&gt;Looking at the capabilities available today, especially Zerobus, it is hard not to see how much simpler this ingestion layer could become. While still in preview, Zerobus represents a shift away from message-bus dependencies, custom streaming frameworks, and ingestion-specific infrastructure.&lt;/P&gt;&lt;P&gt;If it matures as expected, it has strong potential to become the default solution for near real-time ingestion on Databricks.&lt;/P&gt;&lt;P&gt;I wrote a detailed breakdown of the original architecture, the scaling challenges we encountered, and why Zerobus may fundamentally change how ingestion is designed going forward.&lt;BR /&gt;&lt;BR /&gt;&lt;span class="lia-unicode-emoji" title=":link:"&gt;🔗&lt;/span&gt; &lt;A href="https://medium.com/@wesley.felipe/databricks-lakehouse-without-the-workarounds-part-1-0d2101fb9f34" target="_blank" rel="noopener"&gt;[Medium]&amp;nbsp;Databricks Lakehouse Without the Workarounds — Part 1: Ingestion&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 20 Feb 2026 15:49:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/community-articles/is-zerobus-the-future-of-ingestion-on-databricks-lessons-from-a/m-p/148909#M1020</guid>
      <dc:creator>wesleyfelipe</dc:creator>
      <dc:date>2026-02-20T15:49:08Z</dc:date>
    </item>
  </channel>
</rss>