Data Engineering

Problem with Databricks JDBC connection: Error occured while deserializing arrow data

tz1
New Contributor III

I have a Java program like this to test out the Databricks JDBC connection with the Databricks JDBC driver.

        Connection connection = null;
        try {
            Class.forName(driver);
            connection = DriverManager.getConnection(url, username, password);
            if (connection != null) {
                System.out.println("Connection Established");
            } else {
                System.out.println("Connection Failed");
            }
            Statement statement = connection.createStatement();
            ResultSet rs = statement.executeQuery("select * from standard_info_service.daily_transactions"); 
            while (rs.next()) {
                System.out.print("created_date: " + rs.getInt("created_date") + ", ");
                System.out.println("daily_transactions: " + rs.getInt("daily_transactions"));
            }
        } catch (Exception e) {
            System.out.println(e);
        }

This program, however, throws an error like this:

Connection Established
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
java.sql.SQLException: [Simba][SparkJDBCDriver](500618) Error occured while deserializing arrow data: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available

What is the solution?

Accepted Solution

Alice__Caterpil
New Contributor III

Hi @Jose Gonzalez,

This similar issue with the Snowflake JDBC driver is a good reference. I was able to get this working on OpenJDK 17 by specifying this JVM option:

--add-opens=java.base/java.nio=ALL-UNNAMED
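
To confirm that the flag actually reached the JVM, a small check like the one below can help. It is only a sketch: the class name `NioOpenCheck` is made up for illustration, it relies on the `java.lang.Module` API (so Java 9 or later), and it assumes the code runs from the classpath, i.e. in the unnamed module.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of the Databricks/Simba driver.
final class NioOpenCheck {

    /**
     * Returns true if java.base opens the java.nio package to this class's
     * module (the unnamed module when launched from the classpath).
     */
    static boolean javaNioIsOpen() {
        Module javaBase = ByteBuffer.class.getModule();
        Module caller = NioOpenCheck.class.getModule();
        return javaBase.isOpen("java.nio", caller);
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + Runtime.version());
        System.out.println("java.nio open to this module: " + javaNioIsOpen());
    }
}
```

Running it once without and once with `--add-opens=java.base/java.nio=ALL-UNNAMED` should show whether the option is being applied by the launcher, IDE, or build tool in use.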

However, I came across another issue when using Apache DBCP to connect to a Databricks SQL endpoint:

Caused by: java.sql.SQLFeatureNotSupportedException: [Simba][JDBC](10220) Driver does not support this optional feature.
    at com.simba.spark.exceptions.ExceptionConverter.toSQLException(Unknown Source)
    at com.simba.spark.jdbc.common.SConnection.setAutoCommit(Unknown Source)
    at com.simba.spark.jdbc.jdbc42.DSS42Connection.setAutoCommit(Unknown Source)
    at org.apache.commons.dbcp2.DelegatingConnection.setAutoCommit(DelegatingConnection.java:801)
    at org.apache.commons.dbcp2.DelegatingConnection.setAutoCommit(DelegatingConnection.java:801)

The same problem occurred after I switched to Hikari.

Finally, I got it working by just using BasicDataSource and setting auto-commit to false. BasicDataSource is not suitable for production, though. Will there be a new driver release that handles this better?
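
For anyone hitting the same `setAutoCommit` failure, one possible alternative (not what was done above, and untested against the Databricks driver) is to wrap the driver's `Connection` in a standard-library dynamic proxy that ignores the unsupported call. The class name `AutoCommitTolerantConnections` and its `wrap` method are hypothetical; whether silently skipping the auto-commit change is acceptable depends on the application, and a pool would still need to be fed wrapped connections through its own `DataSource`.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLFeatureNotSupportedException;

// Hypothetical helper: a sketch only, not part of any driver or pool library.
final class AutoCommitTolerantConnections {

    /**
     * Returns a Connection proxy that forwards every call to the delegate,
     * but treats setAutoCommit as a no-op when the driver rejects it with
     * SQLFeatureNotSupportedException. All other exceptions are rethrown.
     */
    static Connection wrap(Connection delegate) {
        InvocationHandler handler = (proxy, method, args) -> {
            try {
                return method.invoke(delegate, args);
            } catch (InvocationTargetException e) {
                Throwable cause = e.getCause();
                if ("setAutoCommit".equals(method.getName())
                        && cause instanceof SQLFeatureNotSupportedException) {
                    return null; // setAutoCommit returns void, so null is fine here
                }
                throw cause; // surface the driver's original exception
            }
        };
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                handler);
    }
}
```

A caller would use it as `Connection c = AutoCommitTolerantConnections.wrap(DriverManager.getConnection(url, username, password));`, reusing the variable names from the original question.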

Kaniz_Fatma
Community Manager

Hi @Tony Zhou, can you specify which versions you're using?

tz1
New Contributor III

Thanks @Kaniz Fatma​ 

openjdk-17.0.2

Databricks JDBC Driver 2.6.22

Alice__Caterpil
New Contributor III

This error is mentioned in the Spark documentation (https://spark.apache.org/docs/latest/). It looks like it is specific to the Java version and can be avoided by setting the properties mentioned there.

Hi @Alice Hung​ , Thank you for your contribution.

Kaniz_Fatma
Community Manager

Hi @Tony Zhou, this is mentioned in the Spark documentation:

Spark runs on Java 8/11, Scala 2.12/2.13, Python 3.6+ and R 3.5+.

Python 3.6 support is deprecated as of Spark 3.2.0. Java 8 prior to version 8u201 support is deprecated as of Spark 3.2.0.

For the Scala API, Spark 3.2.1 uses Scala 2.12.

You will need to use a compatible Scala version (2.12.x).

Note:-

For Python 3.9, Arrow optimisation and pandas UDFs might not work due to the supported Python versions in Apache Arrow.

Please refer to the latest Python Compatibility page.

For Java 11, -Dio.netty.tryReflectionSetAccessible=true is required additionally for Apache Arrow library.

This prevents java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available when Apache Arrow uses Netty internally.
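
For the Java 11 case quoted above, a quick way to see whether the property was picked up is to read it back at startup. This is a minimal sketch under stated assumptions: the class name `ArrowJvmFlagCheck` is hypothetical, only the property name comes from the Spark documentation, and `Runtime.version().feature()` needs Java 10 or later.

```java
// Hypothetical helper for checking the JVM setup; not part of Spark or the JDBC driver.
final class ArrowJvmFlagCheck {

    // Property name as given in the Spark documentation quoted in this thread.
    static final String NETTY_FLAG = "io.netty.tryReflectionSetAccessible";

    /** True only if the system property is set to "true" (case-insensitive). */
    static boolean nettyReflectionEnabled() {
        return Boolean.parseBoolean(System.getProperty(NETTY_FLAG, "false"));
    }

    /** Major Java version of the running JVM, e.g. 11 or 17. */
    static int javaMajorVersion() {
        return Runtime.version().feature();
    }

    public static void main(String[] args) {
        System.out.println("Java major version: " + javaMajorVersion());
        System.out.println(NETTY_FLAG + " enabled: " + nettyReflectionEnabled());
    }
}
```

Launching it as `java -Dio.netty.tryReflectionSetAccessible=true ArrowJvmFlagCheck` should report the property as enabled. If it does not, the option is probably being dropped somewhere between the build tool and the JVM.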

Hi @Tony Zhou​ ,

Just a friendly follow-up. Did @Kaniz Fatma's response help you resolve this issue? If not, please share more details, such as the full error stack trace and some code snippets.

Thanks a lot, @Alice Hung, your suggestion works. I am really grateful to you for sharing it. There is absolutely no help available elsewhere.

Hi @Amey Joshi, happy to know that it helped and that you were able to resolve it. Would you like to mark @Alice Hung's answer as the best answer?

AmeyJoshi
New Contributor III

Definitely, I would want to. But I can't find an option to mark it as the best.

Hi @Amey Joshi​ , Here is the option to select the Best Answer.

[Image: Screenshot 2022-03-24 at 10.25.54 PM]

AmeyJoshi
New Contributor III

I am sorry @Kaniz Fatma​ but I don't see that option available to me. If you see it, kindly use it on my behalf.

Hi @Amey Joshi, it's weird that you cannot see that option. Let me get back to you on this. In the meantime, I'll select the best answer for you 😊.
