Hi — welcome to Databricks! Unfortunately, Databricks Connect v2 (DBR 13.3+) does not support Java — it only supports Python, Scala, and R. The legacy v1 did support Java, but it has been deprecated and reached end of support.
That said, here are your options as a Java developer:
Option 1: Use Scala with Databricks Connect (JVM interop)
Since Scala runs on the JVM, you can call the Databricks Connect Scala APIs from Java. This gives you full DataFrame read/write support:
// Scala — callable from Java via JVM interop
import com.databricks.connect.DatabricksSession
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._
import scala.jdk.CollectionConverters._

val spark = DatabricksSession.builder().getOrCreate()

// Read a table and show a few rows
val trips = spark.read.table("samples.nyctaxi.trips")
trips.limit(5).show()

// Create and write your own DataFrame
// (Spark Connect exposes no sparkContext, so pass a java.util.List of rows
// instead of parallelizing an RDD)
val schema = StructType(Seq(
  StructField("id", IntegerType, false),
  StructField("name", StringType, false)
))
val data = Seq(Row(1, "Alice"), Row(2, "Bob"))
val df = spark.createDataFrame(data.asJava, schema)
df.write.saveAsTable("my_catalog.my_schema.my_table")
Add the Maven dependency:
<dependency>
<groupId>com.databricks</groupId>
<artifactId>databricks-connect</artifactId>
<version>15.4.0</version> <!-- match your DBR version -->
</dependency>
See: Databricks Connect Scala Examples
Option 2: Databricks SDK for Java + SQL (Pure Java, no Spark dependency)
If you want to stay in pure Java, the Databricks SDK for Java lets you:
- Upload parquet files to Unity Catalog Volumes via the Files API
- Execute SQL via the Statement Execution API to register/query tables
This is closer to the Iceberg pattern you described (write files, then register):
import com.databricks.sdk.WorkspaceClient;
import java.io.FileInputStream;
import java.io.InputStream;

WorkspaceClient w = new WorkspaceClient();

// Upload a parquet file to a Volume
try (InputStream in = new FileInputStream("data.parquet")) {
    w.files().upload("/Volumes/my_catalog/my_schema/my_volume/data.parquet", in);
}

// Then run SQL to create a table from the file
// (via the Statement Execution API or JDBC for the SQL part)
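The registration step could look like the sketch below. The helper just builds the SQL string; `my_catalog.my_schema.trips` and the volume path are placeholder names, and the actual Statement Execution call is shown in comments because it needs a live workspace and a SQL warehouse ID:

```java
public class RegisterParquetTable {

    /**
     * Builds a CREATE TABLE statement that registers a parquet file
     * sitting in a Unity Catalog Volume as a table.
     */
    static String buildCreateTableSql(String tableName, String volumePath) {
        return "CREATE TABLE IF NOT EXISTS " + tableName
                + " AS SELECT * FROM parquet.`" + volumePath + "`";
    }

    public static void main(String[] args) {
        String sql = buildCreateTableSql(
                "my_catalog.my_schema.trips",
                "/Volumes/my_catalog/my_schema/my_volume/data.parquet");
        System.out.println(sql);

        // Execute it via the Statement Execution API (needs a workspace):
        // WorkspaceClient w = new WorkspaceClient();
        // w.statementExecution().executeStatement(
        //     new ExecuteStatementRequest()
        //         .setWarehouseId("<your-warehouse-id>")
        //         .setStatement(sql));
    }
}
```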
Maven dependency:
<dependency>
<groupId>com.databricks</groupId>
<artifactId>databricks-sdk-java</artifactId>
<version>0.2.0</version> <!-- use latest from Maven Central -->
</dependency>
Option 3: JDBC with Bulk Ingestion
I know you want to avoid JDBC, but it's worth noting that the Databricks JDBC driver supports Arrow-based bulk ingestion, which significantly reduces overhead compared to traditional row-by-row JDBC inserts. It may be faster than you expect.
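Even with plain JDBC, batching rows through a single PreparedStatement cuts the round trips dramatically. A minimal sketch, assuming placeholder table/column names and a hypothetical connection URL (the Databricks JDBC driver must be on your classpath for the commented part):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class JdbcBulkInsert {

    /** Builds "INSERT INTO t (a, b) VALUES (?, ?)" for the given columns. */
    static String buildInsertSql(String table, List<String> columns) {
        String cols = String.join(", ", columns);
        String params = String.join(", ", columns.stream().map(c -> "?").toList());
        return "INSERT INTO " + table + " (" + cols + ") VALUES (" + params + ")";
    }

    /** Batched insert: one executeBatch per batchSize rows instead of one statement per row. */
    static void bulkInsert(Connection conn, String table, List<String> columns,
                           List<Object[]> rows, int batchSize) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(buildInsertSql(table, columns))) {
            int pending = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]);
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch();  // flush the final partial batch
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(buildInsertSql("my_catalog.my_schema.my_table",
                List.of("id", "name")));
        // Against Databricks, open the connection with the JDBC driver, e.g.:
        // Connection conn = DriverManager.getConnection(
        //     "jdbc:databricks://<workspace-host>:443;httpPath=<http-path>", props);
        // bulkInsert(conn, "my_catalog.my_schema.my_table",
        //         List.of("id", "name"), rows, 10_000);
    }
}
```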
A Note on Free Edition
Databricks Connect requires a cluster or serverless compute with Spark Connect enabled. The Free Edition (Community Edition) has limited compute options, so Databricks Connect may not work there. The SDK + SQL approach (Option 2) or JDBC (Option 3) are more likely to work on the free tier.
Hope that helps point you in the right direction!
Anuj Lathi
Solutions Engineer @ Databricks