Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

jdbc integration returning header as data for read operation

Pingleinferyx
New Contributor
package com.example.databricks;
 
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
 
public class DatabricksJDBCApp {
 
    public static void main(String[] args) {
        // Initialize Spark Session
        SparkSession spark = SparkSession.builder()
                .appName("Databricks JDBC Example")
                .master("local")
                .getOrCreate();
 
        String pwd = "XXXXXXXXXXXXXXXXXXXXXXXX";
        String warehouseId = "XXXXXXXXXXXXXXX";
        String host = "XXXXXXXXXXXXXXXXXXX";

        // JDBC URL to connect to Databricks
        String url = "jdbc:databricks://" + host + ":443;" +
                     "transportMode=http;ssl=1;" +
                     "HttpPath=/sql/1.0/warehouses/" + warehouseId + ";" +
                     "UID=token;PWD=" + pwd;
 
        // Specify schema and table
        String dbTable = "framework_databricks.test_table3";  
        // JDBC Driver Class
        String driver = "com.databricks.client.jdbc.Driver";
        
        Dataset<Row> databricksDF = spark.read()
                .format("jdbc")
                .option("url", url)
                .option("dbtable", dbTable)
                .option("driver", driver)
                .load();
 
        // Show schema and data
        databricksDF.printSchema();
        System.out.println("Row Count: " + databricksDF.count());
        databricksDF.show();
 
        // Stop Spark session
        spark.stop();
    }
}
 
 
Spark version: 3.5.0
Databricks JDBC driver version: 2.6.40
 
Actual output:
Row Count: 2
+---+----+
| id|name|
+---+----+
| id|name|
| id|name|
+---+----+
 
 
 
Expected output:
Row Count: 2
+----+-----+
| id |name |
+----+-----+
| one|Alice|
| two|Bob  |
+----+-----+

I was trying to integrate Databricks into my Java code; the snippet above is what I use for the JDBC connection. While reading data from a Databricks table, the DataFrame returns the column headers as row data (expected and actual outputs are shown above). After searching a little, I tried the dbtable name both with and without the schema prefix, but the issue persists.
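A likely cause (not confirmed in this thread): Spark's generic JDBC reader quotes column names with ANSI double quotes, generating something like SELECT "id","name" FROM ..., and Spark SQL on the warehouse treats double-quoted tokens as string literals, so every row echoes the column names. The Databricks (Simba) JDBC driver has a UseNativeQuery connection parameter; setting UseNativeQuery=0 asks the driver to translate the statement into the server's dialect before it runs. A minimal sketch of appending it to a URL of the shape used above (Python here only for brevity; the helper name and placeholder values are mine, not from the thread):

```python
def with_use_native_query(url: str, enabled: bool = False) -> str:
    """Append the driver-level UseNativeQuery parameter to a Databricks
    JDBC URL. UseNativeQuery=0 lets the driver rewrite ANSI-quoted
    identifiers before the statement reaches the warehouse."""
    flag = "1" if enabled else "0"
    separator = "" if url.endswith(";") else ";"
    return f"{url}{separator}UseNativeQuery={flag}"

# Placeholder URL, same shape as the snippet in the question.
url = ("jdbc:databricks://example-host:443;"
       "transportMode=http;ssl=1;"
       "HttpPath=/sql/1.0/warehouses/abc123;"
       "UID=token;PWD=secret")
url = with_use_native_query(url)
```

Whether this resolves the header-as-data symptom would need to be verified against an actual warehouse.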

4 REPLIES

VZLA
Databricks Employee

Can you please try it this way:

package com.example.databricks;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DatabricksJDBCApp {

    public static void main(String[] args) {
        // Initialize Spark Session
        SparkSession spark = SparkSession.builder()
                .appName("Databricks JDBC Example")
                .master("local")
                .getOrCreate();

        String pwd = "XXXXXXXXXXXXXXXXXXXXXXXX";
        String warehouseId = "XXXXXXXXXXXXXXX";
        String host = "XXXXXXXXXXXXXXXXXXX";

        // JDBC URL to connect to Databricks
        String url = "jdbc:databricks://" + host + ":443;" +
                     "transportMode=http;ssl=1;" +
                     "HttpPath=/sql/1.0/warehouses/" + warehouseId + ";" +
                     "UID=token;PWD=" + pwd;

        // Use a SQL query to fetch the data
        String query = "(SELECT * FROM framework_databricks.test_table3) AS temp";

        // JDBC Driver Class
        String driver = "com.databricks.client.jdbc.Driver";
        
        Dataset<Row> databricksDF = spark.read()
            .format("jdbc")
            .option("url", url)
            .option("dbtable", query)  // Use SQL query as dbtable
            .option("driver", driver)
            .load();

        // Show schema and data
        databricksDF.printSchema();
        System.out.println("Row Count: " + databricksDF.count());
        databricksDF.show();

        // Stop Spark session
        spark.stop();
    }
}

dixonantony
New Contributor II

I am facing the same issue. Can you please help?

from pyspark.sql import SparkSession

jdbc_url = "jdbc:databricks://adb-xxxxx.1.azuredatabricks.net:443/default;transportMode=http;ssl=1;AuthMech=3;httpPath=/sql/1.0/warehouses/xxxx;"
username = "token"
password = "dapxxxxxx15f12"

# Use a SQL query to fetch the data
query = "(SELECT * FROM test_catalog.ams.sample) AS temp"
driver = "com.databricks.client.jdbc.Driver"

spark = SparkSession \
    .builder \
    .appName("Databricks JDBC Read") \
    .config("spark.jars", "/home/spark/shared/user-libs/spark/DatabricksJDBC42.jar") \
    .config("spark.sql.sources.jdbc.useNativeQuery", "false") \
    .getOrCreate()


# Read data from Databricks SQL endpoint
df = spark.read \
    .format("jdbc") \
    .option("url", jdbc_url) \
    .option("dbtable", query) \
    .option("user", username) \
    .option("password", password) \
    .option("driver", driver) \
    .load()

df.printSchema()
df.show()

# stop the session
spark.stop()

Result:-

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
root
|-- first_name: string (nullable = true)

+----------+
|first_name|
+----------+
|first_name|
|first_name|
|first_name|
|first_name|
|first_name|
|first_name|
|first_name|
|first_name|
|first_name|
+----------+

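One detail worth checking in the snippet above: as far as I can tell, spark.sql.sources.jdbc.useNativeQuery is not a Spark SQL configuration, so setting it via .config(...) likely has no effect; UseNativeQuery is a driver-level parameter that belongs in the JDBC URL itself. Spark's JDBC source also accepts a query option directly, which avoids the "(...) AS temp" wrapper. A hedged sketch of the resulting options (placeholder host, warehouse, and token; no live endpoint is assumed):

```python
# Sketch only: "query", "user", "password", and "driver" are documented
# Spark JDBC data source options; the URL values are placeholders.
jdbc_url = (
    "jdbc:databricks://adb-example.azuredatabricks.net:443/default;"
    "transportMode=http;ssl=1;AuthMech=3;"
    "httpPath=/sql/1.0/warehouses/example;"
    "UseNativeQuery=0;"  # driver-level parameter, set in the URL, not via spark.conf
)
options = {
    "url": jdbc_url,
    "query": "SELECT * FROM test_catalog.ams.sample",  # no "(...) AS temp" wrapper needed
    "user": "token",
    "password": "dap-example",
    "driver": "com.databricks.client.jdbc.Driver",
}
# Would be used as: spark.read.format("jdbc").options(**options).load()
```

Note that query and dbtable are mutually exclusive in the Spark JDBC source, so only one should be supplied.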

VZLA
Databricks Employee

@Pingleinferyx @dixonantony My apologies, I misread the problem. Can you try setting .option("header", "true") in the spark.read, or explicitly specify the schema, and see if that helps?
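For the JDBC source, the closest documented knob to "explicitly mention the schema" is the customSchema option, which takes a DDL-formatted column list (it overrides the inferred column types, though it may not by itself cure the header-as-data symptom). A hedged sketch, with the column list assumed from the expected output earlier in the thread and placeholder connection values:

```python
# Sketch only: customSchema is a documented Spark JDBC option; the
# column list below is assumed from the expected output in the question.
custom_schema = "id STRING, name STRING"
read_options = {
    "url": ("jdbc:databricks://example-host:443;transportMode=http;ssl=1;"
            "HttpPath=/sql/1.0/warehouses/example;UID=token;PWD=secret"),
    "dbtable": "framework_databricks.test_table3",
    "driver": "com.databricks.client.jdbc.Driver",
    "customSchema": custom_schema,
}
# Would be used as: spark.read.format("jdbc").options(**read_options).load()
```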

ReubenGreen
New Contributor II

Any update?
