Problem with Databricks JDBC connection: Error occured while deserializing arrow data

tz1
New Contributor III

I have a Java program like this to test out the Databricks JDBC connection with the Databricks JDBC driver.

        Connection connection = null;
        try {
            Class.forName(driver);
            // DriverManager.getConnection throws on failure, so no null check is needed
            connection = DriverManager.getConnection(url, username, password);
            System.out.println("Connection Established");
            try (Statement statement = connection.createStatement();
                 ResultSet rs = statement.executeQuery(
                         "select * from standard_info_service.daily_transactions")) {
                while (rs.next()) {
                    System.out.print("created_date: " + rs.getInt("created_date") + ", ");
                    System.out.println("daily_transactions: " + rs.getInt("daily_transactions"));
                }
            }
        } catch (Exception e) {
            e.printStackTrace(); // print the full stack trace, not just the message
        } finally {
            if (connection != null) {
                try { connection.close(); } catch (Exception ignore) { }
            }
        }

This program, however, throws an error like this:

Connection Established
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
java.sql.SQLException: [Simba][SparkJDBCDriver](500618) Error occured while deserializing arrow data: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available

What would be the solution?

1 ACCEPTED SOLUTION


Alice__Caterpil
New Contributor III

Hi @Jose Gonzalez​ ,

A similar issue with the Snowflake JDBC driver is a good reference. I was able to get this working on Java OpenJDK 17 by specifying this JVM option:

--add-opens=java.base/java.nio=ALL-UNNAMED
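For example, the option can be passed directly on the `java` command line or set for every JVM started in a shell session via `JAVA_TOOL_OPTIONS` (the jar name below is a placeholder for your own build artifact):

```shell
# Pass the flag when launching the application
java --add-opens=java.base/java.nio=ALL-UNNAMED -jar app.jar

# Or set it once for all JVMs started from this shell
export JAVA_TOOL_OPTIONS="--add-opens=java.base/java.nio=ALL-UNNAMED"
```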

Although I then came across another issue when using Apache Commons DBCP to connect to a Databricks SQL endpoint:

Caused by: java.sql.SQLFeatureNotSupportedException: [Simba][JDBC](10220) Driver does not support this optional feature.
	at com.simba.spark.exceptions.ExceptionConverter.toSQLException(Unknown Source)
	at com.simba.spark.jdbc.common.SConnection.setAutoCommit(Unknown Source)
	at com.simba.spark.jdbc.jdbc42.DSS42Connection.setAutoCommit(Unknown Source)
	at org.apache.commons.dbcp2.DelegatingConnection.setAutoCommit(DelegatingConnection.java:801)
	at org.apache.commons.dbcp2.DelegatingConnection.setAutoCommit(DelegatingConnection.java:801)

The same problem occurred after I switched to HikariCP.

Finally, I got it working by using a plain BasicDataSource with auto-commit set to false. BasicDataSource is not suitable for production, though; would there be a new driver release that can handle this better?
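For reference, a minimal sketch of that workaround, assuming commons-dbcp2 and the Databricks JDBC driver are on the classpath; the driver class name, URL, and credentials below are placeholders and may differ for your driver version:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import org.apache.commons.dbcp2.BasicDataSource;

public class DbcpWorkaround {
    public static void main(String[] args) throws Exception {
        BasicDataSource ds = new BasicDataSource();
        ds.setDriverClassName("com.simba.spark.jdbc.Driver"); // class name for driver 2.6.x; may differ
        ds.setUrl("jdbc:spark://<host>:443/default;transportMode=http"); // placeholder URL
        ds.setUsername("token");
        ds.setPassword("<personal-access-token>"); // placeholder
        // Key part: give the pool a default auto-commit of false so it does not
        // try to flip auto-commit on, which the driver rejects with
        // SQLFeatureNotSupportedException (error 10220 above).
        ds.setDefaultAutoCommit(false);

        try (Connection conn = ds.getConnection();
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("select 1")) {
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        }
    }
}
```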


13 REPLIES

Kaniz
Community Manager

Hi @Tony Zhou​ , Can you specify the versions which you're using?

tz1
New Contributor III

Thanks @Kaniz Fatma​ 

openjdk-17.0.2

Databricks JDBC Driver 2.6.22

Alice__Caterpil
New Contributor III

This error is mentioned in the Spark documentation (https://spark.apache.org/docs/latest/). It looks like it is specific to the Java version and can be avoided by setting the mentioned properties.

Hi @Alice Hung​ , Thank you for your contribution.

Kaniz
Community Manager

Hi @Tony Zhou​ , It's mentioned in the doc.

Spark runs on Java 8/11, Scala 2.12/2.13, Python 3.6+ and R 3.5+.

Python 3.6 support is deprecated as of Spark 3.2.0. Java 8 prior to version 8u201 support is deprecated as of Spark 3.2.0.

For the Scala API, Spark 3.2.1 uses Scala 2.12.

You will need to use a compatible Scala version (2.12.x).

Note:-

For Python 3.9, Arrow optimisation and pandas UDFs might not work due to the supported Python versions in Apache Arrow.

Please refer to the latest Python Compatibility page.

For Java 11, -Dio.netty.tryReflectionSetAccessible=true is required additionally for Apache Arrow library.

This prevents java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available when Apache Arrow uses Netty internally.

Hi @Tony Zhou​ ,

Just a friendly follow-up. Did @Kaniz Fatma's response help you resolve this issue? If not, please share more details, like the full error stack trace and some code snippets.


Thanks a lot @Alice Hung, your suggestion works. I am really grateful to you for sharing it; there was absolutely no help available elsewhere.

Kaniz
Community Manager

Hi @Amey Joshi, happy to know that it helped and you're able to resolve it. Would you like to mark @Alice Hung's answer as the best answer?

AmeyJoshi
New Contributor III

Definitely, I would want to. But I can't find an option to mark it as the best.

Kaniz
Community Manager

Hi @Amey Joshi​ , Here is the option to select the Best Answer.

(screenshot attached: Screenshot 2022-03-24 at 10.25.54 PM)

AmeyJoshi
New Contributor III

I am sorry @Kaniz Fatma​ but I don't see that option available to me. If you see it, kindly use it on my behalf.

Kaniz
Community Manager

Hi @Amey Joshi, it's weird that you cannot see that option. Let me get back to you on this. In the meantime, I'll select the best answer for you 😊.
