<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Is there a Databricks spark connector for java? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/is-there-a-databricks-spark-connector-for-java/m-p/119299#M45832</link>
    <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/34815"&gt;@Louis_Frolio&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;Just wanted to clarify my use case: I want to run the code locally, but the data insertion should happen in a remote Databricks workspace. I tried using JDBC, but its write performance is low even after tuning the batch size and the number of partitions.&lt;BR /&gt;&lt;BR /&gt;Is there an alternative for my use case? Also, I am using Java for the current implementation.&lt;/P&gt;</description>
    <pubDate>Thu, 15 May 2025 10:10:42 GMT</pubDate>
    <dc:creator>I-am-Biplab</dc:creator>
    <dc:date>2025-05-15T10:10:42Z</dc:date>
    <item>
      <title>Is there a Databricks spark connector for java?</title>
      <link>https://community.databricks.com/t5/data-engineering/is-there-a-databricks-spark-connector-for-java/m-p/119122#M45800</link>
      <description>&lt;P&gt;Is there a Databricks Spark connector for Java, just like the one for Snowflake (see the Snowflake Spark connector:&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;A href="https://docs.snowflake.com/en/user-guide/spark-connector-use" target="_blank" rel="nofollow noopener noreferrer"&gt;https://docs.snowflake.com/en/user-guide/spark-connector-use&lt;/A&gt;)?&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Essentially, the use case is to transfer data from S3 to a Databricks table. In the current implementation, I am using Spark to read data from S3 and JDBC to write data to Databricks. However, I want to use Spark instead to write data to Databricks.&lt;/P&gt;</description>
      <pubDate>Wed, 14 May 2025 06:30:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/is-there-a-databricks-spark-connector-for-java/m-p/119122#M45800</guid>
      <dc:creator>I-am-Biplab</dc:creator>
      <dc:date>2025-05-14T06:30:33Z</dc:date>
    </item>
    <item>
      <title>Re: Is there a Databricks spark connector for java?</title>
      <link>https://community.databricks.com/t5/data-engineering/is-there-a-databricks-spark-connector-for-java/m-p/119187#M45814</link>
      <description>&lt;DIV class="paragraph"&gt;Databricks does not offer a specific Spark connector for Java comparable to the Snowflake Spark connector mentioned in the provided URL. However, Databricks supports directly writing data to Databricks tables using Spark APIs. In your use case of transferring data from S3 to a Databricks table, you can achieve this fully using Spark without relying on JDBC.&lt;/DIV&gt;
&lt;DIV class="paragraph"&gt;Here’s a streamlined approach to replace the JDBC write operation with Spark-based writes: 1. &lt;STRONG&gt;Reading Data from S3&lt;/STRONG&gt;: Use the Spark &lt;CODE&gt;read&lt;/CODE&gt; function with the appropriate format based on your data (e.g., &lt;CODE&gt;csv&lt;/CODE&gt;, &lt;CODE&gt;parquet&lt;/CODE&gt;, etc.) and specify the S3 path. &lt;CODE&gt;scala
   val data = spark.read.format("parquet").load("s3://bucket-name/folder-name")
   &lt;/CODE&gt; Ensure you configure your AWS credentials for accessing S3.&lt;/DIV&gt;
&lt;OL start="2"&gt;
&lt;LI&gt;&lt;STRONG&gt;Writing Data to Databricks Table&lt;/STRONG&gt;: Use the Delta format or another supported format to write data directly to a Databricks table: &lt;CODE&gt;data.write.format("delta").save("/mnt/databricks-table-path")&lt;/CODE&gt; If the table is pre-defined, you can use the &lt;CODE&gt;saveAsTable&lt;/CODE&gt; method instead: &lt;CODE&gt;data.write.format("delta").mode("overwrite").saveAsTable("database.table_name")&lt;/CODE&gt;&lt;/LI&gt;
&lt;/OL&gt;
&lt;DIV class="paragraph"&gt;This approach eliminates the need for JDBC and integrates seamlessly with Databricks' native capabilities. However, if Java compatibility is an absolute requirement, these Spark APIs can still be invoked via Java bindings provided by Apache Spark. Concepts like &lt;CODE&gt;DataStreamReader&lt;/CODE&gt; and &lt;CODE&gt;DataStreamWriter&lt;/CODE&gt; in Java mirror their Scala equivalents.&lt;/DIV&gt;
&lt;DIV class="paragraph"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="paragraph"&gt;Hope this helps, Lou.&lt;/DIV&gt;</description>
      <pubDate>Wed, 14 May 2025 13:42:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/is-there-a-databricks-spark-connector-for-java/m-p/119187#M45814</guid>
      <dc:creator>Louis_Frolio</dc:creator>
      <dc:date>2025-05-14T13:42:52Z</dc:date>
    </item>
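The Scala snippets in the reply above translate directly to Spark's Java Dataset API. A minimal sketch (the S3 path and table name are placeholders, and it assumes a Spark environment with Delta Lake support, e.g. a Databricks cluster):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class S3ToDelta {
    public static void main(String[] args) {
        // On Databricks a session already exists; getOrCreate() picks it up
        SparkSession spark = SparkSession.builder()
                .appName("s3-to-delta")
                .getOrCreate();

        // Read Parquet files from S3 (bucket and prefix are placeholders)
        Dataset<Row> data = spark.read()
                .format("parquet")
                .load("s3://bucket-name/folder-name");

        // Write directly to a Delta table ("database.table_name" is a placeholder)
        data.write()
                .format("delta")
                .mode(SaveMode.Overwrite)
                .saveAsTable("database.table_name");
    }
}
```

Note that `spark.read()` and `data.write()` return the same `DataFrameReader`/`DataFrameWriter` objects the Scala examples use, so no separate connector is involved.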
    <item>
      <title>Re: Is there a Databricks spark connector for java?</title>
      <link>https://community.databricks.com/t5/data-engineering/is-there-a-databricks-spark-connector-for-java/m-p/119299#M45832</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/34815"&gt;@Louis_Frolio&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;Just wanted to clarify my use case: I want to run the code locally, but the data insertion should happen in a remote Databricks workspace. I tried using JDBC, but its write performance is low even after tuning the batch size and the number of partitions.&lt;BR /&gt;&lt;BR /&gt;Is there an alternative for my use case? Also, I am using Java for the current implementation.&lt;/P&gt;</description>
      <pubDate>Thu, 15 May 2025 10:10:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/is-there-a-databricks-spark-connector-for-java/m-p/119299#M45832</guid>
      <dc:creator>I-am-Biplab</dc:creator>
      <dc:date>2025-05-15T10:10:42Z</dc:date>
    </item>
    <item>
      <title>Re: Is there a Databricks spark connector for java?</title>
      <link>https://community.databricks.com/t5/data-engineering/is-there-a-databricks-spark-connector-for-java/m-p/119364#M45854</link>
      <description>&lt;P&gt;We have native connectivity with VS Code. Check it out here:&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/dev-tools/vscode-ext/" target="_blank"&gt;https://docs.databricks.com/aws/en/dev-tools/vscode-ext/&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;You may also want to dig into Databricks Connect. Check it out here:&amp;nbsp;&lt;A href="https://docs.databricks.com/aws/en/release-notes/dbconnect/" target="_blank"&gt;https://docs.databricks.com/aws/en/release-notes/dbconnect/&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 15 May 2025 16:28:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/is-there-a-databricks-spark-connector-for-java/m-p/119364#M45854</guid>
      <dc:creator>Louis_Frolio</dc:creator>
      <dc:date>2025-05-15T16:28:16Z</dc:date>
    </item>
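For the run-locally, write-remotely case above, Databricks Connect is the relevant tool: its JVM client hands back a Spark session whose operations execute on the remote cluster rather than over JDBC. A rough sketch, assuming the Databricks Connect client library is on the classpath and the connection is configured via environment variables or a `.databrickscfg` profile (class and builder names follow the documented JVM client; the path and table name are placeholders):

```java
import com.databricks.connect.DatabricksSession;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class RemoteDeltaWrite {
    public static void main(String[] args) {
        // Resolves workspace URL, token, and cluster from the environment
        // or a configuration profile (details vary by Databricks Connect
        // version; consult the release notes linked above)
        SparkSession spark = DatabricksSession.builder().getOrCreate();

        // The read and write both run on the remote cluster, not the local JVM
        Dataset<Row> data = spark.read()
                .format("parquet")
                .load("s3://bucket-name/folder-name"); // placeholder path

        data.write()
                .format("delta")
                .mode("overwrite")
                .saveAsTable("database.table_name"); // placeholder table
    }
}
```

Because the heavy lifting happens cluster-side, this avoids the per-row overhead that makes JDBC writes slow from a local process.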
    <item>
      <title>Re: Is there a Databricks spark connector for java?</title>
      <link>https://community.databricks.com/t5/data-engineering/is-there-a-databricks-spark-connector-for-java/m-p/119410#M45866</link>
      <description>&lt;P&gt;You don't need a separate Spark connector; Databricks natively supports writing to Delta tables through the standard Spark APIs. Instead of using JDBC, you can use &lt;CODE&gt;df.write().format("delta")&lt;/CODE&gt; to write data from S3 to Databricks tables efficiently.&lt;/P&gt;</description>
      <pubDate>Fri, 16 May 2025 03:49:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/is-there-a-databricks-spark-connector-for-java/m-p/119410#M45866</guid>
      <dc:creator>sandeepmankikar</dc:creator>
      <dc:date>2025-05-16T03:49:46Z</dc:date>
    </item>
  </channel>
</rss>

