<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: What's the best architecture for Structured Streaming and why? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/what-s-the-best-architecture-for-structured-streaming-and-why/m-p/23467#M16195</link>
    <description>&lt;P&gt;@John Constantine, "Bronze Layer -&amp;gt; which has raw Kafka data"&lt;/P&gt;&lt;P&gt;If you use &lt;A href="https://confluent.io" alt="https://confluent.io" target="_blank"&gt;confluent.io&lt;/A&gt;, you can also use a direct sink to Data Lake Storage as the bronze layer.&lt;/P&gt;&lt;P&gt;"Silver Layer -&amp;gt; which has deserialized data"&lt;/P&gt;&lt;P&gt;Then use Delta Live Tables to process it into a Delta silver table (Auto Loader file notification mode is recommended).&lt;/P&gt;</description>
    <pubDate>Thu, 07 Apr 2022 10:15:43 GMT</pubDate>
    <dc:creator>Hubert-Dudek</dc:creator>
    <dc:date>2022-04-07T10:15:43Z</dc:date>
    <item>
      <title>What's the best architecture for Structured Streaming and why?</title>
      <link>https://community.databricks.com/t5/data-engineering/what-s-the-best-architecture-for-structured-streaming-and-why/m-p/23466#M16194</link>
      <description>&lt;P&gt;I am building an ETL pipeline that reads data from a Kafka topic (the data is serialized in Thrift format) and writes it to a Delta table in Databricks. I want to have two layers:&lt;/P&gt;&lt;P&gt;Bronze Layer -&amp;gt; which has raw Kafka data&lt;/P&gt;&lt;P&gt;Silver Layer -&amp;gt; which has deserialized data&lt;/P&gt;&lt;P&gt;I can think of two ways to do it.&lt;/P&gt;&lt;P&gt;The first way is to read data from Kafka, write the raw data to bronze, then read the data from bronze, decode it, and write it to silver.&lt;/P&gt;&lt;P&gt;The second way is to read data from Kafka, write it to bronze, and simultaneously decode it and write it to silver.&lt;/P&gt;&lt;P&gt;I am trying to understand the advantages &amp;amp; disadvantages of each solution. Solution two is much easier to implement, but it feels like solution one is more fault tolerant.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Apr 2022 22:47:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-s-the-best-architecture-for-structured-streaming-and-why/m-p/23466#M16194</guid>
      <dc:creator>Constantine</dc:creator>
      <dc:date>2022-04-06T22:47:43Z</dc:date>
    </item>
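    <!--
Editorial sketch for the question above. It illustrates why the first option
(Kafka to bronze, then bronze to silver) can feel more fault tolerant: once the
raw payloads are stored in bronze, silver can be rebuilt from bronze after a
decoder fix without going back to Kafka. Everything here is an assumption made
for illustration: plain Python lists stand in for the Kafka topic and the Delta
tables, and decode_v1 / decode_v2 are hypothetical stand-ins for Thrift
deserialization. It is a toy model, not Spark Structured Streaming code.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]


def ingest_to_bronze(kafka_batch: List[bytes], bronze: List[bytes]) -> None:
    """Stage 1: append raw payloads to bronze without decoding them."""
    bronze.extend(kafka_batch)


def build_silver(
    bronze: List[bytes], decode: Callable[[bytes], Row]
) -> Tuple[List[Row], List[bytes]]:
    """Stage 2: decode every bronze record; collect undecodable ones separately."""
    silver: List[Row] = []
    failed: List[bytes] = []
    for raw in bronze:
        try:
            silver.append(decode(raw))
        except ValueError:
            failed.append(raw)
    return silver, failed


def decode_v1(raw: bytes) -> Row:
    """Hypothetical decoder that only understands 'key=value' payloads."""
    key, sep, value = raw.decode("utf-8").partition("=")
    if not sep:
        raise ValueError(f"cannot decode {raw!r}")
    return {"key": key, "value": value}


def decode_v2(raw: bytes) -> Row:
    """Hypothetical fixed decoder that also accepts 'key:value' payloads."""
    text = raw.decode("utf-8")
    for sep in ("=", ":"):
        if sep in text:
            key, value = text.split(sep, 1)
            return {"key": key, "value": value}
    raise ValueError(f"cannot decode {raw!r}")


# The in-memory "bronze table" receives the raw batch exactly once.
bronze: List[bytes] = []
ingest_to_bronze([b"a=1", b"b:2"], bronze)

# First attempt: the decoder rejects one record, but bronze still holds it.
silver, failed = build_silver(bronze, decode_v1)

# After fixing the decoder, silver is rebuilt from bronze alone.
replayed_silver, replayed_failed = build_silver(bronze, decode_v2)
```

In the second option, where decoding happens in the same pass as the bronze
write, a decoder bug is harder to isolate from ingestion; whether that matters
in practice depends on Kafka retention and on how the writes are made
idempotent, which this toy model does not cover.
    -->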
    <item>
      <title>Re: What's the best architecture for Structured Streaming and why?</title>
      <link>https://community.databricks.com/t5/data-engineering/what-s-the-best-architecture-for-structured-streaming-and-why/m-p/23467#M16195</link>
      <description>&lt;P&gt;@John Constantine, "Bronze Layer -&amp;gt; which has raw Kafka data"&lt;/P&gt;&lt;P&gt;If you use &lt;A href="https://confluent.io" alt="https://confluent.io" target="_blank"&gt;confluent.io&lt;/A&gt;, you can also use a direct sink to Data Lake Storage as the bronze layer.&lt;/P&gt;&lt;P&gt;"Silver Layer -&amp;gt; which has deserialized data"&lt;/P&gt;&lt;P&gt;Then use Delta Live Tables to process it into a Delta silver table (Auto Loader file notification mode is recommended).&lt;/P&gt;</description>
      <pubDate>Thu, 07 Apr 2022 10:15:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-s-the-best-architecture-for-structured-streaming-and-why/m-p/23467#M16195</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-04-07T10:15:43Z</dc:date>
    </item>
  </channel>
</rss>

