Facing Issues with Databricks JDBC Connectivity after Idle time

AbhiJ
New Contributor III

Hello team, 

I am using a commons-dbcp2 DataSource, which provides default connection pooling, in a Spring Java application (REST services that fetch Databricks data via JdbcTemplate).

Initially everything works fine and I can get the data from Databricks via JdbcTemplate, using a Databricks connection that was already created in the connection pool.

The Databricks JDBC URL format is:

jdbc:databricks://dbc-f2536520-fc66.cloud.databricks.com:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/7781542073893090/0329-163852-fxzadblp;AuthMech=3;UID=token;PWD=<PWDTOKEN>

The Databricks JDBC dependency I am using for the datasource connection is:

<dependency>
    <groupId>com.databricks</groupId>
    <artifactId>databricks-jdbc</artifactId>
    <version>2.6.33</version>
</dependency>
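
For context, here is a minimal sketch of the kind of pooled setup described above, assuming com.databricks.client.jdbc.Driver as the driver class; the workspace host, httpPath, and token are placeholders, not real values:

```java
import javax.sql.DataSource;

import org.apache.commons.dbcp2.BasicDataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;

@Configuration
public class DatabricksDataSourceConfig {

    // Pooled DataSource backed by commons-dbcp2; connections are created up
    // front and reused across requests instead of being opened per request.
    @Bean
    public DataSource databricksDataSource() {
        BasicDataSource ds = new BasicDataSource();
        ds.setDriverClassName("com.databricks.client.jdbc.Driver"); // assumed driver class for databricks-jdbc 2.6.x
        ds.setUrl("jdbc:databricks://<workspace-host>:443/default;"
                + "transportMode=http;ssl=1;httpPath=<http-path>;"
                + "AuthMech=3;UID=token;PWD=<PWDTOKEN>");
        ds.setInitialSize(2);
        ds.setMaxTotal(10);
        return ds;
    }

    @Bean
    public JdbcTemplate databricksJdbcTemplate(DataSource databricksDataSource) {
        return new JdbcTemplate(databricksDataSource);
    }
}
```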

The issue is that when I leave the running server idle for more than 20 minutes, I start getting the error below from Databricks.

[Databricks][DatabricksJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Invalid SessionHandle: SessionHandle [56eea0f9-f7ee-400e-b2f6-5e9cec51c2a4]:62:61, org.apache.hive.service.cli.session.SessionManager:getSession:SessionManager.java:349, org.apache.spark.sql.hive.thriftserver.SparkSQLSessionManager:getSession:SparkSQLSessionManager.scala:179, com.databricks.sql.hive.thriftserver.thrift.ThriftHandlerUtils$:getSessionHandle:ThriftHandlerUtils.scala:55,

 

Please let me know if anyone has faced this issue and what is the resolution for it.


4 REPLIES

Kaniz
Community Manager

Hi @AbhiJ, the error message indicates an issue with the session handle. This could be due to various reasons, such as an expired session or a problem with the session manager. The following steps can be taken to troubleshoot the issue:

1. Check if the session handle is valid and active. If the session has expired, a new session needs to be created.

2. Check if there are any issues with the session manager. This can be done by checking the logs of the session manager and verifying if it is running correctly.

3. Verify if the JDBC driver is compatible with the Databricks version. If not, update the driver to a compatible version.

4. Check if there are any network connectivity issues between the client and the Databricks cluster. This can be done by running a ping test or using a network monitoring tool.

AbhiJ
New Contributor III

@Kaniz,

The links you pasted do not seem to work.

Anyway, please find my responses to your suggestions below.

1. Check if the session handle is valid and active. If the session has expired, a new session needs to be created.

I am using commons connection pooling, which creates a singleton datasource object with connections already created in advance.

Please note that initially everything works fine and I am able to get the data from Databricks.

The issue arises when, after the initial fetch, I don't do any activity for about 20 minutes (with the server running) and then try to fetch the data from Databricks again via the same datasource (pooled connection) that was created initially.

I was expecting the Databricks JDBC connector jar to handle this error and create a new connection if the existing connection became stale.

2. Check if there are any issues with the session manager. This can be done by checking the logs of the session manager and verifying if it is running correctly.

I see the same error in the Databricks log4j logs (Driver logs tab of the compute cluster section).

3. Verify if the JDBC driver is compatible with the Databricks version. If not, update the driver to a compatible version.

I am using 12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12), and since I am using the latest Databricks JDBC version from Maven (<version>2.6.33</version>), it should support the 12.2 LTS runtime.
 

4. Check if there are any network connectivity issues between the client and the Databricks cluster. This can be done by running a ping test or using a network monitoring tool.

There are no network issues. As I mentioned initially, the first fetch works fine, and only after some idle time does Databricks throw the invalid session handle error.

 

Also, I would like to mention that if I open a JDBC connection and close it for every request, it works fine irrespective of idle time, but I want to leverage the benefit of connection pooling (already created connections) instead of creating and closing connections for every request.

I think the error needs to be handled inside the Databricks connector jar.
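
As a possible workaround that keeps the pooling benefit, commons-dbcp2 can be told to validate connections before handing them out and to evict idle ones, so a stale session is replaced transparently. A minimal sketch, assuming the same BasicDataSource as above (the timeout values are only illustrative):

```java
import org.apache.commons.dbcp2.BasicDataSource;

public class PooledDatabricksDataSource {

    // Assumed workaround sketch: validate pooled connections so stale Databricks
    // sessions are discarded and replaced instead of being handed to callers.
    public static BasicDataSource withValidation(BasicDataSource ds) {
        // Run a cheap query before a connection is borrowed; connections that
        // fail validation are dropped from the pool and a fresh one is created.
        ds.setValidationQuery("SELECT 1");
        ds.setTestOnBorrow(true);

        // Also test and evict connections that sit idle in the pool, so they
        // are removed before the server-side session timeout invalidates them.
        ds.setTestWhileIdle(true);
        ds.setTimeBetweenEvictionRunsMillis(5 * 60 * 1000L);  // evictor runs every 5 minutes
        ds.setMinEvictableIdleTimeMillis(10 * 60 * 1000L);    // drop connections idle for 10+ minutes
        return ds;
    }
}
```

The trade-off with testOnBorrow is one extra SELECT 1 round trip per borrowed connection, in exchange for never handing an invalid session to the application.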


 

NandiniN
Valued Contributor III

Hi @AbhiJ ,

Looking at the error, it appears the session is invalid/expired.

The error usually occurs when the JDBC/ODBC client re-accesses the cluster with the same session ID after the idle session timeout threshold has passed. Restarting the client could be a workaround. You can also increase the following property in the Spark config of the cluster: ```spark.hadoop.hive.server2.idle.session.timeout```
The default value is 900000 ms, i.e. 15 minutes (and you mentioned the issue appears after roughly 20 minutes of idle time).
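
For example, to raise the timeout above the application's idle periods, a line like the following can be added to the cluster's Spark config (the 2-hour value is only illustrative):

```
spark.hadoop.hive.server2.idle.session.timeout 7200000
```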

Restarting the client will help.

Please let me know if that helps. 

Thanks!

ash42
New Contributor II

I am seeing the same issue with Hikari. When a pooled connection is created and the Databricks cluster is then terminated (or restarted), the HikariDataSource retains a stale session handle.

Why does connection.isValid() return true when executing any query on the same connection will fail with an invalid SessionHandle? I was expecting connection.isValid() to return false in this case.

To reproduce (a code sketch of these steps follows the list):

  1. Create a HikariDataSource referencing a Databricks cluster.
  2. Terminate the Databricks cluster.
  3. Get a connection from the datasource, e.g. HikariDataSource.getConnection().
  4. connection.isValid(1) returns true.
  5. connection.createStatement().executeQuery("SELECT 1") blocks until the cluster is restarted and then fails with java.sql.SQLException: [Databricks][JDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: 08000, Query: SELECT 1, Error message from Server: Invalid SessionHandle.
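
For reference, a minimal, self-contained sketch of the reproduction described above, assuming a placeholder JDBC URL (the class name and URL are illustrative, not from the original post):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class StaleSessionRepro {

    public static void main(String[] args) throws Exception {
        // 1. Create a HikariDataSource referencing a Databricks cluster
        //    (placeholder URL; substitute a real workspace host, httpPath and token).
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:databricks://<workspace-host>:443/default;"
                + "transportMode=http;ssl=1;httpPath=<http-path>;"
                + "AuthMech=3;UID=token;PWD=<PWDTOKEN>");
        config.setMaximumPoolSize(2);

        try (HikariDataSource ds = new HikariDataSource(config)) {
            // Warm the pool so a session handle is created on the cluster.
            try (Connection warmup = ds.getConnection()) {
                System.out.println("pooled connection established");
            }

            // 2. Terminate (or restart) the Databricks cluster manually here,
            //    then continue.

            // 3. Borrow the (now stale) pooled connection again.
            try (Connection conn = ds.getConnection()) {
                // 4. isValid() still reports true even though the server-side
                //    session handle is gone.
                System.out.println("isValid: " + conn.isValid(1));

                // 5. Executing a query blocks until the cluster is back up and
                //    then fails with "Invalid SessionHandle" (SQL state 08000).
                try (Statement stmt = conn.createStatement();
                     ResultSet rs = stmt.executeQuery("SELECT 1")) {
                    rs.next();
                    System.out.println("query result: " + rs.getInt(1));
                }
            }
        }
    }
}
```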