<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Pushing data from databricks (cloud) to Oracle (on-prem) instance? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140010#M51328</link>
    <description>Replies to the topic "Pushing data from databricks (cloud) to Oracle (on-prem) instance?" in Data Engineering.</description>
    <pubDate>Sat, 22 Nov 2025 20:05:12 GMT</pubDate>
    <dc:creator>Louis_Frolio</dc:creator>
    <dc:date>2025-11-22T20:05:12Z</dc:date>
    <item>
      <title>Pushing data from databricks (cloud) to Oracle (on-prem) instance?</title>
      <link>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140008#M51327</link>
      <description>&lt;P&gt;Pushing data from databricks (cloud) to Oracle (on-prem) instance?&lt;/P&gt;&lt;P&gt;===================================================&lt;/P&gt;&lt;P&gt;Thanks for reviewing my thread. I found some threads on this subject from 2022 by &lt;A href="https://community.databricks.com/t5/user/viewprofilepage/user-id/71565" target="_self"&gt;Ajay-Pandey&lt;/A&gt;:&lt;/P&gt;&lt;P&gt;&lt;A title="Databricks to Oracle" href="https://community.databricks.com/t5/data-engineering/databricks-to-oracle/m-p/16765#M10883" target="_self"&gt;Databricks to Oracle&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Many options have been introduced in Databricks since then, so I am opening this thread to get some new insights.&lt;/P&gt;&lt;P&gt;We have a situation where Databricks tables (large volumes) sit in the cloud. We would like to push/ingest these data sets to an Oracle (on-prem) instance. I saw some posts suggesting the use of JDBC in a Databricks notebook with a Spark write to the Oracle table.&lt;/P&gt;&lt;P&gt;Is this the best option for large-volume data?&lt;/P&gt;&lt;P&gt;I learned about a partnership between Oracle and Databricks. It gives an option to connect to Databricks from an Oracle (on-prem) instance and read Databricks data from Oracle via that connectivity. I did not see much info on it.&lt;/P&gt;&lt;P&gt;I see the recent Databricks roadmap adds much more functionality. Is there functionality in Lakebridge or Lakeflow that provides a solution?&lt;/P&gt;&lt;P&gt;Are there any docs/whitepapers on this subject?&lt;/P&gt;&lt;P&gt;Thanks for your insights.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Nov 2025 19:52:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140008#M51327</guid>
      <dc:creator>RIDBX</dc:creator>
      <dc:date>2025-11-22T19:52:09Z</dc:date>
    </item>
    <item>
      <title>Re: Pushing data from databricks (cloud) to Oracle (on-prem) instance?</title>
      <link>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140010#M51328</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/103045"&gt;@RIDBX&lt;/a&gt;&amp;nbsp;, right now there isn’t a dedicated Databricks connector or library for writing directly into on-prem Oracle. We don’t see this pattern very often, so there’s no built-in proprietary option today. That said, I dug through our docs and pulled together the options that do exist so you at least have a few concrete paths to consider.&lt;/P&gt;
&lt;H3 class="_7uu25p0 qt3gz9c _7pq7t612 heading3 _7uu25p1"&gt;What’s available today&lt;/H3&gt;
&lt;UL class="qt3gz97 qt3gz92"&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;&lt;STRONG&gt;Spark JDBC (Databricks → Oracle on‑prem):&lt;/STRONG&gt; You can write from Databricks to Oracle using the generic Spark JDBC sink. For large volumes, it’s important to control write parallelism (for example, &lt;CODE class="qt3gz9f"&gt;df.repartition(n)&lt;/CODE&gt; before &lt;CODE class="qt3gz9f"&gt;.write.format("jdbc")&lt;/CODE&gt;) so you don’t overwhelm the Oracle instance and network path, and to size partitions to what Oracle can ingest efficiently.&lt;/P&gt;
&lt;BR /&gt;Example (Python):
&lt;DIV class="go8b9g1 _7pq7t6cl" data-ui-element="code-block-container"&gt;
&lt;PRE&gt;&lt;CODE class="markdown-code-python qt3gz9e hljs language-python _1ymogdh2"&gt;(df.repartition(&lt;SPAN class="hljs-number"&gt;8&lt;/SPAN&gt;)  &lt;SPAN class="hljs-comment"&gt;# tune to match Oracle capacity and network&lt;/SPAN&gt;
   .write
   .&lt;SPAN class="hljs-built_in"&gt;format&lt;/SPAN&gt;(&lt;SPAN class="hljs-string"&gt;"jdbc"&lt;/SPAN&gt;)
   .option(&lt;SPAN class="hljs-string"&gt;"url"&lt;/SPAN&gt;, &lt;SPAN class="hljs-string"&gt;"jdbc:oracle:thin:@//host:1521/service_name"&lt;/SPAN&gt;)
   .option(&lt;SPAN class="hljs-string"&gt;"dbtable"&lt;/SPAN&gt;, &lt;SPAN class="hljs-string"&gt;"SCHEMA.TARGET_TABLE"&lt;/SPAN&gt;)
   .option(&lt;SPAN class="hljs-string"&gt;"user"&lt;/SPAN&gt;, &lt;SPAN class="hljs-string"&gt;"&amp;lt;user&amp;gt;"&lt;/SPAN&gt;)
   .option(&lt;SPAN class="hljs-string"&gt;"password"&lt;/SPAN&gt;, &lt;SPAN class="hljs-string"&gt;"&amp;lt;password&amp;gt;"&lt;/SPAN&gt;)
   .mode(&lt;SPAN class="hljs-string"&gt;"append"&lt;/SPAN&gt;)
   .save())&lt;/CODE&gt;&lt;/PRE&gt;
&lt;DIV class="go8b9g3 _7pq7t62y _7pq7t6cm _7pq7t6ay _7pq7t6bo"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;&lt;STRONG&gt;Lakehouse Federation (Oracle → Databricks, read‑only):&lt;/STRONG&gt; Databricks supports a managed way to run federated queries against Oracle without moving data into the lakehouse. You create a Unity Catalog connection + foreign catalog and then query Oracle tables from Databricks. Note: federation is read‑only (no writes back to Oracle from federation).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;&lt;STRONG&gt;Oracle reading Databricks tables (zero‑copy):&lt;/STRONG&gt; If what you want is to access Databricks tables from Oracle, Oracle is a supported consumer of &lt;STRONG&gt;Delta Sharing&lt;/STRONG&gt; for read‑only access to Unity Catalog tables (zero‑copy). This lets Oracle read Databricks data without you pushing rows over JDBC in bulk.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;&lt;STRONG&gt;Networking considerations (on‑prem Oracle):&lt;/STRONG&gt; Regardless of the approach, Databricks compute must be able to reach the on‑prem listener (VPN/Direct Connect/ExpressRoute or equivalent). This is an explicit prerequisite for both JDBC and Lakehouse Federation connections.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
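&lt;P&gt;To illustrate the federation path, here is a minimal read sketch. It assumes a foreign catalog (hypothetically named &lt;CODE&gt;oracle_cat&lt;/CODE&gt;) has already been created per the federation docs; the schema and table names are placeholders.&lt;/P&gt;
&lt;DIV data-ui-element="code-block-container"&gt;
&lt;PRE&gt;# Foreign catalog tables are addressed with Unity Catalog's three-level namespace.
# "oracle_cat" is a hypothetical foreign catalog backed by an Oracle connection.
df = spark.read.table("oracle_cat.hr.employees")  # read-only; federation cannot write back
df.show(10)&lt;/PRE&gt;
&lt;/DIV&gt;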
&lt;H3 class="_7uu25p0 qt3gz9c _7pq7t612 heading3 _7uu25p1"&gt;Is JDBC the “best” option for large volumes?&lt;/H3&gt;
&lt;P class="qt3gz91 paragraph"&gt;It’s a solid and common option, but “best” depends on constraints:&lt;/P&gt;
&lt;UL class="qt3gz97 qt3gz92"&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;If you truly need to copy large volumes into Oracle and keep doing it regularly, &lt;STRONG&gt;JDBC works&lt;/STRONG&gt; but you must throttle/consolidate partitions and batch the writes so the Oracle server and network can keep up, otherwise throughput will collapse or you’ll see back‑pressure/timeouts.&lt;/P&gt;
&lt;P class="qt3gz91 paragraph"&gt;If your requirement is only for Oracle to consume the data (not to “own” it), strongly consider &lt;STRONG&gt;Delta Sharing to Oracle&lt;/STRONG&gt; (read‑only, zero‑copy) to avoid expensive copy jobs and eliminate write bottlenecks on Oracle.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;For ad hoc or operational reporting where you don’t want to ingest into Databricks first, &lt;STRONG&gt;Lakehouse Federation&lt;/STRONG&gt; is ideal (Databricks reads from Oracle on demand), but again this is read‑only and not a solution for pushing data into Oracle.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3 class="_7uu25p0 qt3gz9c _7pq7t612 heading3 _7uu25p1"&gt;About “Lakebridge” vs. Lakeflow&lt;/H3&gt;
&lt;UL class="qt3gz97 qt3gz92"&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;There isn’t a Databricks product called “Lakebridge.” You might be thinking of &lt;STRONG&gt;Lakehouse Federation&lt;/STRONG&gt; (read‑only query over JDBC) or &lt;STRONG&gt;Lakeflow&lt;/STRONG&gt; (Databricks’ unified ingestion/orchestration offering).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;&lt;STRONG&gt;Lakeflow Connect&lt;/STRONG&gt; provides managed ingestion connectors (GA for several sources like SQL Server, Salesforce, etc.). The Lakeflow roadmap explicitly includes &lt;STRONG&gt;Oracle&lt;/STRONG&gt; database connectors, with Oracle highlighted in the Lakeflow launch/blog as an upcoming source; timing/features evolve over time.&lt;/P&gt;
&lt;BR /&gt;Today, if you need Databricks → Oracle loads, Lakeflow doesn’t provide a managed “reverse ETL to Oracle” sink; you’d use Spark JDBC as shown above.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3 class="_7uu25p0 qt3gz9c _7pq7t612 heading3 _7uu25p1"&gt;Practical guidance (large volume to Oracle)&lt;/H3&gt;
&lt;UL class="qt3gz97 qt3gz92"&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;Start with Spark JDBC write and tune:&lt;/P&gt;
&lt;UL class="qt3gz98 qt3gz92"&gt;
&lt;LI class="qt3gz9a"&gt;Repartition to a small, fixed number that Oracle can handle (e.g., 4–16), and size clusters accordingly.&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;If you currently have very fine‑grained partitions, coalesce before writing to reduce concurrent connections/inserts.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;Validate network throughput and stability first (on‑prem paths can be the bottleneck even when Oracle is sized correctly).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;If the goal is for Oracle‑side consumers to see the data, evaluate &lt;STRONG&gt;Delta Sharing&lt;/STRONG&gt; so Oracle can read your Unity Catalog tables directly without copying, which often outperforms bulk copy jobs operationally and cost‑wise (read‑only).&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
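&lt;P&gt;A minimal tuning sketch for the write path. &lt;CODE&gt;batchsize&lt;/CODE&gt; and &lt;CODE&gt;numPartitions&lt;/CODE&gt; are standard Spark JDBC options; the values here are illustrative starting points to benchmark against your own Oracle instance, not recommendations.&lt;/P&gt;
&lt;DIV data-ui-element="code-block-container"&gt;
&lt;PRE&gt;(df.coalesce(8)                      # cap concurrent Oracle connections/inserts
   .write
   .format("jdbc")
   .option("url", "jdbc:oracle:thin:@//host:1521/service_name")
   .option("dbtable", "SCHEMA.TARGET_TABLE")
   .option("driver", "oracle.jdbc.OracleDriver")
   .option("batchsize", "10000")     # rows per JDBC batch insert (Spark default: 1000)
   .option("numPartitions", "8")     # upper bound on write parallelism
   .option("user", "&amp;lt;user&amp;gt;")
   .option("password", "&amp;lt;password&amp;gt;")
   .mode("append")
   .save())&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;P&gt;Benchmark a representative slice first and adjust partitions/batch size until the Oracle side stays healthy; the network path usually saturates before Spark does.&lt;/P&gt;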
&lt;H3 class="_7uu25p0 qt3gz9c _7pq7t612 heading3 _7uu25p1"&gt;Good docs to bookmark&lt;/H3&gt;
&lt;UL class="qt3gz97 qt3gz92"&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;&lt;A href="https://docs.databricks.com/aws/en/query-federation/oracle" target="_self"&gt;Run federated queries on Oracle&lt;/A&gt; (Lakehouse Federation setup, read‑only).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;&lt;A href="https://docs.databricks.com/aws/en/archive/connectors/jdbc" target="_self"&gt;JDBC read/write patterns&lt;/A&gt; and performance tuning from Databricks.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;&lt;A href="https://docs.databricks.com/aws/en/archive/connectors/external-systems" target="_self"&gt;Connect to external systems:&lt;/A&gt; approach comparison (federation vs. drivers vs. managed ingestion).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;&lt;A href="https://docs.databricks.com/aws/en/external-access/" target="_self"&gt;Access Databricks data from external systems&lt;/A&gt; (Delta Sharing; Oracle supported as a consumer).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;&lt;A href="https://www.databricks.com/blog/introducing-databricks-lakeflow" target="_self"&gt;Lakeflow overview and launch blog&lt;/A&gt; (managed connectors and roadmap mentioning Oracle).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Hope this helps, Louis.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Nov 2025 20:05:12 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140010#M51328</guid>
      <dc:creator>Louis_Frolio</dc:creator>
      <dc:date>2025-11-22T20:05:12Z</dc:date>
    </item>
    <item>
      <title>Re: Pushing data from databricks (cloud) to Oracle (on-prem) instance?</title>
      <link>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140026#M51334</link>
      <description>&lt;P&gt;Thanks for weighing in and providing interesting insights. Here are some questions that came to mind while reviewing this thread.&lt;/P&gt;&lt;P&gt;1) .write.format("jdbc") has .option("dbtable", "SCHEMA.TARGET_TABLE"). Where do we specify the column mapping between the Databricks table and SCHEMA.TARGET_TABLE? Do we need to make the Databricks table and Oracle SCHEMA.TARGET_TABLE structures mirror each other?&lt;/P&gt;&lt;P&gt;2) If we have many tables to process, do we need to repeat this for each table in a separate notebook cell, or is there a way to write one generic script and pass the table name/structure as a parameter? If we can, how do we do that?&lt;/P&gt;&lt;P&gt;3) .option("user", "&amp;lt;user&amp;gt;") and .option("password", "&amp;lt;password&amp;gt;") work for one’s own account. If we use a service principal account in the prod environment, how do we configure these credentials so they stay hidden?&lt;/P&gt;&lt;P&gt;4) I was referring to &lt;A title="Lakebridge" href="https://www.databricks.com/solutions/migration/lakebridge" target="_self"&gt;&lt;STRONG&gt;“Lakebridge.”&lt;/STRONG&gt;&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Thanks for your guidance.&lt;/P&gt;</description>
      <pubDate>Sun, 23 Nov 2025 21:05:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140026#M51334</guid>
      <dc:creator>RIDBX</dc:creator>
      <dc:date>2025-11-23T21:05:28Z</dc:date>
    </item>
    <item>
      <title>Re: Pushing data from databricks (cloud) to Oracle (on-prem) instance?</title>
      <link>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140450#M51433</link>
      <description>&lt;H3&gt;Option 1: Spark JDBC write from Databricks to Oracle (recommended for “push”/ingestion)&lt;/H3&gt;
&lt;P&gt;Use the built‑in Spark JDBC writer with Oracle’s JDBC driver. It’s the most direct path for writing into on‑prem Oracle and gives you control over batching, parallelism, and commit semantics.&lt;/P&gt;
&lt;DIV data-ui-element="code-block-container"&gt;
&lt;PRE&gt;df = spark.read.table("your_source_table")  # or a Delta/Parquet source

jdbc_url = "jdbc:oracle:thin:@//&amp;lt;host&amp;gt;:&amp;lt;port&amp;gt;/&amp;lt;service_name&amp;gt;"
props = {
  "user": "&amp;lt;oracle_user&amp;gt;",
  "password": "&amp;lt;oracle_password&amp;gt;",
  "driver": "oracle.jdbc.OracleDriver"
}

# Tune parallelism before the write (example: 16 partitions). Don’t set it too high.
(df.repartition(16)
   .write
   .format("jdbc")
   .option("url", jdbc_url)
   .option("dbtable", "SCHEMA.TARGET_TABLE")
   .options(**props)
   .mode("append")
   .save())&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;P class="qt3gz91 paragraph"&gt;Key tuning guidance for large volumes:&lt;/P&gt;
&lt;UL class="qt3gz97 qt3gz92"&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;Control write parallelism by &lt;CODE class="qt3gz9f"&gt;repartition(N)&lt;/CODE&gt; before the write; keep N moderate (e.g., 8–32) so you don’t overwhelm Oracle with too many concurrent inserts.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;For reliability with constraints, write into a staging table and then MERGE on Oracle to avoid duplication or primary key violations in case of partial failures; speculative execution should be disabled by default.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;Network matters: ensure private connectivity from Databricks compute to your on‑prem Oracle (VPN/ExpressRoute/Direct Connect) and allow the relevant ports; this same prerequisite is documented for Lakehouse Federation but applies equally to JDBC writes.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
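&lt;P&gt;A minimal sketch of the staging‑then‑MERGE pattern. It assumes the &lt;CODE&gt;oracledb&lt;/CODE&gt; (python‑oracledb) package is installed on the cluster to run the MERGE from the driver; the staging/target table names and the ID/VAL columns are hypothetical placeholders.&lt;/P&gt;
&lt;DIV data-ui-element="code-block-container"&gt;
&lt;PRE&gt;import oracledb  # assumes python-oracledb is installed on the cluster

# 1) Bulk-load into an unconstrained staging table over Spark JDBC.
(df.repartition(16)
   .write.format("jdbc")
   .option("url", jdbc_url)
   .option("dbtable", "SCHEMA.TARGET_TABLE_STG")
   .options(**props)
   .mode("overwrite")           # replace staging contents each run
   .option("truncate", "true")  # truncate instead of drop/recreate
   .save())

# 2) MERGE staging into the constrained target in one Oracle transaction.
#    Column names (ID, VAL) are hypothetical placeholders.
with oracledb.connect(user=props["user"], password=props["password"],
                      dsn="&amp;lt;host&amp;gt;:&amp;lt;port&amp;gt;/&amp;lt;service_name&amp;gt;") as conn:
    with conn.cursor() as cur:
        cur.execute("""
            MERGE INTO SCHEMA.TARGET_TABLE t
            USING SCHEMA.TARGET_TABLE_STG s ON (t.ID = s.ID)
            WHEN MATCHED THEN UPDATE SET t.VAL = s.VAL
            WHEN NOT MATCHED THEN INSERT (ID, VAL) VALUES (s.ID, s.VAL)
        """)
    conn.commit()&lt;/PRE&gt;
&lt;/DIV&gt;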
&lt;P class="qt3gz91 paragraph"&gt;Notes:&lt;/P&gt;
&lt;UL class="qt3gz97 qt3gz92"&gt;
&lt;LI class="qt3gz9a"&gt;Fetch size tuning is for reads, not writes. However, if you perform any JDBC reads from Oracle before transforming/writing, Oracle’s default fetchSize is small (10). Increasing it (e.g., to 100–1000, workload‑dependent) improves read throughput.&lt;/LI&gt;
&lt;/UL&gt;
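&lt;P&gt;For completeness, a minimal read sketch using the standard Spark JDBC &lt;CODE&gt;fetchsize&lt;/CODE&gt; option (the value shown is an illustrative starting point):&lt;/P&gt;
&lt;DIV data-ui-element="code-block-container"&gt;
&lt;PRE&gt;src = (spark.read
       .format("jdbc")
       .option("url", jdbc_url)
       .option("dbtable", "SCHEMA.SOURCE_TABLE")
       .option("fetchsize", "1000")  # rows fetched per round trip (Oracle default: 10)
       .options(**props)
       .load())&lt;/PRE&gt;
&lt;/DIV&gt;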
&lt;H3 class="_7uu25p0 qt3gz9c _7pq7t612 heading3 _7uu25p1"&gt;Option 2: Lakehouse Federation with Oracle (read-only; not a sink)&lt;/H3&gt;
&lt;P class="qt3gz91 paragraph"&gt;Lakehouse Federation lets Databricks query Oracle without migrating data, by creating a Unity Catalog connection and a foreign catalog that mirrors Oracle. This is excellent for ad‑hoc analytics or POCs, but it is read‑only for Oracle—so it does not push data into Oracle tables. Use JDBC writes for ingestion into Oracle as in Option 1.&lt;/P&gt;
&lt;UL class="qt3gz97 qt3gz92"&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;Oracle Federation requires network connectivity from Databricks compute and uses TLS for Oracle Cloud; other Oracle databases use Oracle’s Native Network Encryption (NNE).&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="qt3gz9a"&gt;
&lt;P class="qt3gz91 paragraph"&gt;Oracle Federation is supported and was announced in public preview earlier and later GA across clouds; use the docs to set up connections and foreign catalogs in Unity Catalog.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
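&lt;P&gt;A minimal setup sketch, run from a notebook. The statement shapes (&lt;CODE&gt;CREATE CONNECTION&lt;/CODE&gt;, &lt;CODE&gt;CREATE FOREIGN CATALOG&lt;/CODE&gt;) are the standard Lakehouse Federation SQL, but the exact OPTIONS keys for Oracle should be taken from the federation docs linked in this thread; every name here is a placeholder.&lt;/P&gt;
&lt;DIV data-ui-element="code-block-container"&gt;
&lt;PRE&gt;# Connection object holding Oracle credentials (verify option keys against the Oracle federation docs)
spark.sql("""
  CREATE CONNECTION IF NOT EXISTS oracle_conn TYPE oracle
  OPTIONS (host '&amp;lt;host&amp;gt;', port '1521', user '&amp;lt;user&amp;gt;', password '&amp;lt;password&amp;gt;')
""")

# Foreign catalog that mirrors the Oracle database in Unity Catalog
spark.sql("""
  CREATE FOREIGN CATALOG IF NOT EXISTS oracle_cat
  USING CONNECTION oracle_conn
  OPTIONS (service_name '&amp;lt;service_name&amp;gt;')
""")

# Read-only queries via the three-level namespace
spark.table("oracle_cat.&amp;lt;schema&amp;gt;.&amp;lt;table&amp;gt;").show()&lt;/PRE&gt;
&lt;/DIV&gt;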
&lt;H3 class="_7uu25p0 qt3gz9c _7pq7t612 heading3 _7uu25p1"&gt;Option 3: Lakeflow (Connect/Jobs) and how it relates&lt;/H3&gt;
&lt;P class="qt3gz91 paragraph"&gt;&lt;STRONG&gt;Lakeflow Connect&lt;/STRONG&gt; is Databricks’ ingestion (into the Lakehouse) and workflow layer; where both Federation and Lakeflow Connect exist for a source, we recommend Lakeflow Connect for higher volumes and lower latency ingestion into Databricks—but Lakeflow is oriented to bringing data in, not writing out to Oracle as a sink.&lt;/P&gt;</description>
      <pubDate>Wed, 26 Nov 2025 18:10:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140450#M51433</guid>
      <dc:creator>iyashk-DB</dc:creator>
      <dc:date>2025-11-26T18:10:03Z</dc:date>
    </item>
    <item>
      <title>Re: Pushing data from databricks (cloud) to Oracle (on-prem) instance?</title>
      <link>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140456#M51437</link>
      <description>&lt;P&gt;Thanks for weighing in with guidance. I did not see your response to the following:&lt;/P&gt;&lt;P&gt;2) If we have many tables to process, do we need to repeat this for each table in a separate notebook cell, or is there a way to write one generic script and pass the table name/structure as a parameter? If we can, how do we do that?&lt;/P&gt;&lt;P&gt;3) .option("user", "&amp;lt;user&amp;gt;") and .option("password", "&amp;lt;password&amp;gt;") work for one’s own account. If we use a service principal account in the prod environment, &lt;STRONG&gt;how do we configure these credentials so they stay hidden?&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Thanks for your guidance.&lt;/P&gt;</description>
      <pubDate>Wed, 26 Nov 2025 20:12:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140456#M51437</guid>
      <dc:creator>RIDBX</dc:creator>
      <dc:date>2025-11-26T20:12:47Z</dc:date>
    </item>
    <item>
      <title>Re: Pushing data from databricks (cloud) to Oracle (on-prem) instance?</title>
      <link>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140512#M51449</link>
      <description>&lt;H3&gt;2) Run many tables with a generic script (parameterized)&lt;/H3&gt;
&lt;P&gt;We suggest you define a list that maps each Databricks source to its corresponding Oracle target and per-table options (partitions, mode, and pre/post SQL), then loop over it in one notebook or job.&lt;/P&gt;
&lt;P&gt;For example, something like the following (Python, single script):&lt;/P&gt;
&lt;DIV data-ui-element="code-block-container"&gt;
&lt;PRE&gt;# 1) Table manifest (could also come from a Delta table)
manifest = [
  {
    "source_table": "catalog.schema.table1",
    "target_table": "SCHEMA.TABLE1",
    "partitions": 16,
    "mode": "append"
  },
  {
    "source_table": "catalog.schema.table2",
    "target_table": "SCHEMA.TABLE2",
    "partitions": 8,
    "mode": "overwrite"  # or "append"
  },
  # ...
]

# 2) Secrets-backed credentials (see section 3)
oracle_user = dbutils.secrets.get("oracle-prod", "username")
oracle_pwd  = dbutils.secrets.get("oracle-prod", "password")

jdbc_url = "jdbc:oracle:thin:@//&amp;lt;host&amp;gt;:&amp;lt;port&amp;gt;/&amp;lt;service_name&amp;gt;"
common_props = {
  "user": oracle_user,
  "password": oracle_pwd,
  "driver": "oracle.jdbc.OracleDriver",
  # Optional write tuning:
  # "batchsize": "1000",
}

def write_one(spec):
  df = spark.read.table(spec["source_table"])
  n = int(spec.get("partitions", 8))
  mode = spec.get("mode", "append")
  (df.repartition(n)               # control parallelism responsibly
     .write
     .format("jdbc")
     .option("url", jdbc_url)
     .option("dbtable", spec["target_table"])
     .options(**common_props)
     .mode(mode)
     .save())

for spec in manifest:
  write_one(spec)&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;P&gt;Why this works at scale:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P&gt;You control write parallelism per table using repartition(N); keep N moderate (e.g., 8–32) to avoid overloading Oracle.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;One notebook can process all tables; use Databricks Jobs to pass parameters (e.g., a manifest path or table filter) and to run subsets in parallel with controlled concurrency (see the widget sketch after this list).&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
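&lt;P&gt;One way to parameterize the run, sketched with notebook widgets. &lt;CODE&gt;dbutils.widgets&lt;/CODE&gt; is the standard mechanism for job/notebook parameters; the widget name and the comma‑separated convention are just one possible design, and the snippet builds on the &lt;CODE&gt;manifest&lt;/CODE&gt; and &lt;CODE&gt;write_one&lt;/CODE&gt; definitions above.&lt;/P&gt;
&lt;DIV data-ui-element="code-block-container"&gt;
&lt;PRE&gt;# Job/notebook parameter: comma-separated list of source tables to process ("" = all)
dbutils.widgets.text("tables", "")
selected = [t.strip() for t in dbutils.widgets.get("tables").split(",") if t.strip()]

for spec in manifest:
    # Run everything when no filter is given, otherwise only the requested tables
    if not selected or spec["source_table"] in selected:
        write_one(spec)&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;P&gt;A Databricks Job can then invoke this notebook several times with different "tables" values to process subsets in parallel with controlled concurrency.&lt;/P&gt;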
&lt;H3&gt;3) Hide credentials (service account / non‑human account)&lt;/H3&gt;
&lt;P&gt;Use Databricks Secrets so credentials never appear in code, logs, or UI:&lt;/P&gt;
&lt;DIV data-ui-element="code-block-container"&gt;
&lt;PRE&gt;# Retrieve at runtime from a secret scope (created once, e.g., backed by a cloud key vault)
oracle_user = dbutils.secrets.get(scope="oracle-prod", key="username")
oracle_pwd  = dbutils.secrets.get(scope="oracle-prod", key="password")

props = {
  "user": oracle_user,
  "password": oracle_pwd,
  "driver": "oracle.jdbc.OracleDriver"
}&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;P&gt;Databricks explicitly recommends storing JDBC usernames/passwords in secrets rather than embedding them in URLs or notebooks; retrieve them at runtime with dbutils.secrets.get(...). This works for both human users and service accounts created in Oracle for production use.&lt;/P&gt;</description>
      <pubDate>Thu, 27 Nov 2025 11:14:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/140512#M51449</guid>
      <dc:creator>iyashk-DB</dc:creator>
      <dc:date>2025-11-27T11:14:44Z</dc:date>
    </item>
    <item>
      <title>Re: Pushing data from databricks (cloud) to Oracle (on-prem) instance?</title>
      <link>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/146946#M52740</link>
      <description>&lt;P&gt;Thanks for weighing in and providing interesting insights. Here is another question that came to mind while reviewing this thread.&lt;/P&gt;&lt;P&gt;I would like to see a way to write to an on-prem Linux folder directly from Databricks (DBX) without going through S3. Given the many features being added to DBX, is there a way to do this?&lt;/P&gt;&lt;P&gt;Thanks for your guidance.&lt;/P&gt;</description>
      <pubDate>Thu, 05 Feb 2026 23:10:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pushing-data-from-databricks-cloud-to-oracle-on-prem-instance/m-p/146946#M52740</guid>
      <dc:creator>RIDBX</dc:creator>
      <dc:date>2026-02-05T23:10:29Z</dc:date>
    </item>
  </channel>
</rss>

