<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Optimized option to write updates to Aurora PostgresDB from Databricks/spark in Warehousing &amp; Analytics</title>
    <link>https://community.databricks.com/t5/warehousing-analytics/optimized-option-to-write-updates-to-aurora-postgresdb-from/m-p/116086#M2027</link>
    <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If near real-time latency is critical, focus on &lt;STRONG&gt;optimizing parallel writes with batch updates&lt;/STRONG&gt; (Option 1).&lt;BR /&gt;&lt;BR /&gt;If you prioritize transactional stability and can tolerate slightly higher latency due to staging, continue refining your &lt;STRONG&gt;temporary table with merge triggers&lt;/STRONG&gt; (Option 2).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 21 Apr 2025 14:40:05 GMT</pubDate>
    <dc:creator>Walter_C</dc:creator>
    <dc:date>2025-04-21T14:40:05Z</dc:date>
    <item>
      <title>Optimized option to write updates to Aurora PostgresDB from Databricks/spark</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/optimized-option-to-write-updates-to-aurora-postgresdb-from/m-p/115659#M2015</link>
      <description>&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;P&gt;Hello All,&lt;/P&gt;&lt;P&gt;We want to update our Postgres tables from our Spark Structured Streaming workflow on Databricks. We are using the foreachBatch utility to write to this sink. I want to understand an optimized way to do this at near real-time latency while avoiding deadlocks and improving concurrency and parallelism. Right now we are considering 2 options:&lt;/P&gt;&lt;P&gt;1. JDBC connector/psycopg2: we run updates directly on the main table in Postgres, but we are not using parallelism effectively here and are inserting one record at a time (batchsize=1) for fear of losing or delaying the other records in a batch when one record in it fails. This also increases latency, which we do not want.&lt;/P&gt;&lt;P&gt;2. Append to temp table: in this approach we create a temp table on Postgres (one for every table we want to update) and then merge into the actual table via a trigger on the Postgres end.&lt;/P&gt;&lt;P&gt;Option 2 has been working well for us so far, but I want to reach out to others here and to the experts on the forum to understand whether there is a better approach, or any suggestions for optimizing ours, to achieve real-time streaming frequency.&lt;/P&gt;&lt;P&gt;Any information is highly appreciated.&lt;/P&gt;&lt;P&gt;Thanks in advance for your response.&lt;/P&gt;&lt;P&gt;Sweta&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 16 Apr 2025 14:33:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/optimized-option-to-write-updates-to-aurora-postgresdb-from/m-p/115659#M2015</guid>
      <dc:creator>Sweta</dc:creator>
      <dc:date>2025-04-16T14:33:49Z</dc:date>
    </item>
    <item>
      <title>Re: Optimized option to write updates to Aurora PostgresDB from Databricks/spark</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/optimized-option-to-write-updates-to-aurora-postgresdb-from/m-p/116086#M2027</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If near real-time latency is critical, focus on &lt;STRONG&gt;optimizing parallel writes with batch updates&lt;/STRONG&gt; (Option 1).&lt;BR /&gt;&lt;BR /&gt;If you prioritize transactional stability and can tolerate slightly higher latency due to staging, continue refining your &lt;STRONG&gt;temporary table with merge triggers&lt;/STRONG&gt; (Option 2).&lt;/P&gt;
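&lt;P&gt;As a rough illustration of batch updates for Option 1, here is a minimal Python sketch, not a tested production pattern. It assumes the target table has a unique constraint on its key columns, that each row is a tuple with the key columns first, and that psycopg2 is installed on the cluster. The names build_upsert_sql, write_with_fallback, upsert_partition, n_keys, dsn and the page size of 500 are hypothetical choices for this example. The idea is to send the whole batch as one upsert and to retry row by row only when the batch fails, so a single bad record does not hold back the rest.&lt;/P&gt;

```python
def build_upsert_sql(table, key_cols, value_cols):
    """Build an INSERT ... ON CONFLICT statement shaped for psycopg2's execute_values."""
    cols = list(key_cols) + list(value_cols)
    updates = ", ".join(f"{c} = EXCLUDED.{c}" for c in value_cols)
    return (
        f"INSERT INTO {table} ({', '.join(cols)}) VALUES %s "
        f"ON CONFLICT ({', '.join(key_cols)}) DO UPDATE SET {updates}"
    )


def write_with_fallback(rows, write_many, write_one, n_keys=1):
    """Try one batched write; if it fails, retry row by row and return the failed rows."""
    # Sorting by key makes concurrent writers take row locks in the same order,
    # which can reduce (but does not eliminate) the chance of deadlocks.
    ordered = sorted(rows, key=lambda r: r[:n_keys])
    try:
        write_many(ordered)
        return []
    except Exception:
        failed = []
        for row in ordered:
            try:
                write_one(row)
            except Exception:
                failed.append(row)
        return failed


def upsert_partition(rows, dsn, sql):
    """Hypothetical wiring: upsert one partition's rows over a single connection."""
    # Imported lazily so the helpers above can be used without a database driver.
    import psycopg2
    from psycopg2.extras import execute_values

    conn = psycopg2.connect(dsn)
    try:
        def write_many(batch):
            # "with conn" commits on success and rolls back if an error is raised.
            with conn, conn.cursor() as cur:
                execute_values(cur, sql, batch, page_size=500)

        return write_with_fallback(rows, write_many, lambda r: write_many([r]))
    finally:
        conn.close()
```

&lt;P&gt;One way to use this would be to call upsert_partition from foreachPartition on the micro-batch DataFrame inside foreachBatch, so that each partition writes in parallel over its own connection, and to route the returned failed rows to a separate table or log for inspection. Rows should be deduplicated by key within a batch first, because a single INSERT ... ON CONFLICT DO UPDATE statement cannot update the same row twice.&lt;/P&gt;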
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 21 Apr 2025 14:40:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/optimized-option-to-write-updates-to-aurora-postgresdb-from/m-p/116086#M2027</guid>
      <dc:creator>Walter_C</dc:creator>
      <dc:date>2025-04-21T14:40:05Z</dc:date>
    </item>
  </channel>
</rss>

