<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: jdbc integration returning header as data for read operation in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/100437#M40300</link>
    <description>&lt;P&gt;Neither of those options works for me:&lt;/P&gt;&lt;P&gt;Defining .schema(explicit_schema) returns the same result, with the column headers as the data in every row.&lt;/P&gt;&lt;P&gt;Setting .option("header", "true") fails with the following error:&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;[Databricks][JDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: null, Query: SELECT * FROM edl.test_table WHERE 1=0, Error message from Server: Configuration header is not available..&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 29 Nov 2024 15:42:08 GMT</pubDate>
    <dc:creator>Dengineer</dc:creator>
    <dc:date>2024-11-29T15:42:08Z</dc:date>
    <item>
      <title>jdbc integration returning header as data for read operation</title>
      <link>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/93240#M38661</link>
      <description>&lt;LI-CODE lang="markup"&gt;package com.example.databricks;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DatabricksJDBCApp {

    public static void main(String[] args) {
        // Initialize Spark Session
        SparkSession spark = SparkSession.builder()
                .appName("Databricks JDBC Example")
                .master("local")
                .getOrCreate();

        // Redacted credentials and connection details
        String pwd = "XXXXXXXXXXXXXXXXXXXXXXXX";
        String warehouseId = "XXXXXXXXXXXXXXX";
        String host = "XXXXXXXXXXXXXXXXXXX";

        // JDBC URL to connect to Databricks
        String url = "jdbc:databricks://" + host + ":443;" +
                     "transportMode=http;ssl=1;" +
                     "HttpPath=/sql/1.0/warehouses/" + warehouseId + ";" +
                     "UID=token;PWD=" + pwd;

        // Specify schema and table
        String dbTable = "framework_databricks.test_table3";

        // JDBC Driver Class
        String driver = "com.databricks.client.jdbc.Driver";

        Dataset&amp;lt;Row&amp;gt; databricksDF = spark.read()
                .format("jdbc")
                .option("url", url)
                .option("dbtable", dbTable)
                .option("driver", driver)
                .load();

        // Show schema and data
        databricksDF.printSchema();
        System.out.println("Row Count: " + databricksDF.count());
        databricksDF.show();

        // Stop Spark session
        spark.stop();
    }
}
&lt;/LI-CODE&gt;&lt;P&gt;Spark version: 3.5.0&lt;BR /&gt;Databricks JDBC version: 2.6.40&lt;/P&gt;&lt;P&gt;Actual output:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Row Count: 2
+---+----+
| id|name|
+---+----+
| id|name|
| id|name|
+---+----+
&lt;/LI-CODE&gt;&lt;P&gt;Expected output:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Row Count: 2
+----+-----+
| id |name |
+----+-----+
| one|Alice|
| two|Bob  |
+----+-----+
&lt;/LI-CODE&gt;&lt;P&gt;I am trying to integrate Databricks into my Java code. The snippet above is the code I use for the JDBC connection. When I read data from a Databricks table, the DataFrame returns the column names as the data in every row; the actual and expected outputs are shown above. After some searching, I tried changing the dbtable name with and without the schema name, but the issue persists.&lt;/P&gt;</description>
      <pubDate>Wed, 09 Oct 2024 08:47:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/93240#M38661</guid>
      <dc:creator>Pingleinferyx</dc:creator>
      <dc:date>2024-10-09T08:47:09Z</dc:date>
    </item>
    <item>
      <title>Re: jdbc integration returning header as data for read operation</title>
      <link>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/97232#M39458</link>
      <description>&lt;P&gt;Can you please try it this way:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;package com.example.databricks;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DatabricksJDBCApp {

    public static void main(String[] args) {
        // Initialize Spark Session
        SparkSession spark = SparkSession.builder()
                .appName("Databricks JDBC Example")
                .master("local")
                .getOrCreate();

        String pwd = "XXXXXXXXXXXXXXXXXXXXXXXX";
        String warehouseId = "XXXXXXXXXXXXXXX";
        String host = "XXXXXXXXXXXXXXXXXXX";

        // JDBC URL to connect to Databricks
        String url = "jdbc:databricks://" + host + ":443;" +
                     "transportMode=http;ssl=1;" +
                     "HttpPath=/sql/1.0/warehouses/" + warehouseId + ";" +
                     "UID=token;PWD=" + pwd;

        // Use a SQL query to fetch the data
        String query = "(SELECT * FROM framework_databricks.test_table3) AS temp";

        // JDBC Driver Class
        String driver = "com.databricks.client.jdbc.Driver";
        
        Dataset&amp;lt;Row&amp;gt; databricksDF = spark.read()
            .format("jdbc")
            .option("url", url)
            .option("dbtable", query)  // Use SQL query as dbtable
            .option("driver", driver)
            .load();

        // Show schema and data
        databricksDF.printSchema();
        System.out.println("Row Count: " + databricksDF.count());
        databricksDF.show();

        // Stop Spark session
        spark.stop();
    }
}
&lt;/LI-CODE&gt;</description>
      <pubDate>Fri, 01 Nov 2024 14:02:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/97232#M39458</guid>
      <dc:creator>VZLA</dc:creator>
      <dc:date>2024-11-01T14:02:51Z</dc:date>
    </item>
    <item>
      <title>Re: jdbc integration returning header as data for read operation</title>
      <link>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/97749#M39530</link>
      <description>&lt;P&gt;I have the same issue. Can you please help?&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;from pyspark.sql import SparkSession
import pyspark
import os
from datetime import datetime
from subprocess import PIPE, run

jdbc_url = "jdbc:databricks://adb-xxxxx.1.azuredatabricks.net:443/default;transportMode=http;ssl=1;AuthMech=3;httpPath=/sql/1.0/warehouses/xxxx;"
username = "token"
password = "dapxxxxxx15f12"

# Use a SQL query to fetch the data
query = "(SELECT * FROM test_catalog.ams.sample) AS temp"
driver = "com.databricks.client.jdbc.Driver"

spark = SparkSession \
    .builder \
    .appName("Databricks JDBC Read") \
    .config("spark.jars", "/home/spark/shared/user-libs/spark/DatabricksJDBC42.jar") \
    .config("spark.sql.sources.jdbc.useNativeQuery", "false") \
    .getOrCreate()

# Read data from Databricks SQL endpoint
df = spark.read \
    .format("jdbc") \
    .option("url", jdbc_url) \
    .option("dbtable", query) \
    .option("user", username) \
    .option("password", password) \
    .option("driver", driver) \
    .load()

df.printSchema()
df.show()

# Stop the session
spark.stop()
&lt;/LI-CODE&gt;&lt;P&gt;Result:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
root
 |-- first_name: string (nullable = true)

+----------+
|first_name|
+----------+
|first_name|
|first_name|
|first_name|
|first_name|
|first_name|
|first_name|
|first_name|
|first_name|
|first_name|
+----------+
&lt;/LI-CODE&gt;</description>
      <pubDate>Tue, 05 Nov 2024 13:32:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/97749#M39530</guid>
      <dc:creator>dixonantony</dc:creator>
      <dc:date>2024-11-05T13:32:22Z</dc:date>
    </item>
    <item>
      <title>Re: jdbc integration returning header as data for read operation</title>
      <link>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/97785#M39544</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/125572"&gt;@Pingleinferyx&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/131185"&gt;@dixonantony&lt;/a&gt;&amp;nbsp;My apologies, I misread the problem. Can you try setting&amp;nbsp;&lt;SPAN&gt;.option("header", "true") on the spark.read call, or explicitly specifying the schema, and see if that helps?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 05 Nov 2024 15:50:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/97785#M39544</guid>
      <dc:creator>VZLA</dc:creator>
      <dc:date>2024-11-05T15:50:52Z</dc:date>
    </item>
    <item>
      <title>Re: jdbc integration returning header as data for read operation</title>
      <link>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/98645#M39777</link>
      <description>&lt;P&gt;Any update?&lt;/P&gt;</description>
      <pubDate>Wed, 13 Nov 2024 11:04:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/98645#M39777</guid>
      <dc:creator>ReubenGreen</dc:creator>
      <dc:date>2024-11-13T11:04:56Z</dc:date>
    </item>
    <item>
      <title>Re: jdbc integration returning header as data for read operation</title>
      <link>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/100434#M40298</link>
      <description>&lt;P&gt;I'm encountering the same issue - has there been an update on this?&lt;/P&gt;</description>
      <pubDate>Fri, 29 Nov 2024 14:54:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/100434#M40298</guid>
      <dc:creator>Dengineer</dc:creator>
      <dc:date>2024-11-29T14:54:23Z</dc:date>
    </item>
    <item>
      <title>Re: jdbc integration returning header as data for read operation</title>
      <link>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/100437#M40300</link>
      <description>&lt;P&gt;Neither of those options works for me:&lt;/P&gt;&lt;P&gt;Defining .schema(explicit_schema) returns the same result, with the column headers as the data in every row.&lt;/P&gt;&lt;P&gt;Setting .option("header", "true") fails with the following error:&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;[Databricks][JDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: null, Query: SELECT * FROM edl.test_table WHERE 1=0, Error message from Server: Configuration header is not available..&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 29 Nov 2024 15:42:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/100437#M40300</guid>
      <dc:creator>Dengineer</dc:creator>
      <dc:date>2024-11-29T15:42:08Z</dc:date>
    </item>
    <item>
      <title>Re: jdbc integration returning header as data for read operation</title>
      <link>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/100456#M40306</link>
      <description>&lt;P&gt;After reading through the driver documentation, I finally found a solution that appears to work for me: I added .option("UseNativeQuery", 0) to my JDBC connection. The query passed from the Databricks driver to the Databricks cluster was being altered to select the column names from my subquery rather than the data values. With .option("UseNativeQuery", 0), the driver now rewrites my query to select the correct values (by transforming it into HiveQL syntax). My understanding is that transforming the query adds slightly more overhead, but at least it works, unlike "UseNativeQuery"=1 or "UseNativeQuery"=2 (which is what it defaults to if not specified in the .options()).&lt;/P&gt;</description>
      <pubDate>Fri, 29 Nov 2024 17:46:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jdbc-integration-returning-header-as-data-for-read-operation/m-p/100456#M40306</guid>
      <dc:creator>Dengineer</dc:creator>
      <dc:date>2024-11-29T17:46:41Z</dc:date>
    </item>
  </channel>
</rss>

