cancel
Showing results for 
Search instead for 
Did you mean: 
Warehousing & Analytics
Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Community. Share insights, tips, and best practices for leveraging data for informed decision-making.
cancel
Showing results for 
Search instead for 
Did you mean: 

Databricks JDBC driver fails with socket read timeout

Kyle2
New Contributor II

We work with a application that connects to our Databricks serverless SQL warehouse via Databricks JDBC driver. It runs a few thousand SQL select statements everyday, and a small percentage of them will fail with the following error details:

 

java.sql.SQLException: [Databricks][JDBCDriver](500157) Socket timeout expired when executing query: SELECT COA***.; caused by com.databricks.client.support.exceptions.GeneralException: [Databricks][JDBCDriver](500157) Socket timeout expired when executing query: SELECT COA***.; caused by com.databricks.client.jdbc42.internal.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out; caused by java.net.SocketTimeoutException: Read timed out

 

A full stack trace can be provided upon request.

On the SQL warehouse side the failed queries are showing this as the error: Query has been timed out due to inactivity., and upon reviewing the query profile the biggest standout metric is the Result fetching by client metric. That metric ranges from 6-20 minutes on those failed queries, which is dramatically longer compared to the usual fetching time of around ~225ms on successful ones.

 

Observations around failed statements:
  • The statements fail typically after 10 seconds but there were several failures also after 15 or 16 seconds.
  • Execution never gets into fetching phase. 
  • Failed statements continue execution on server side. Statements fail on server-side eventually, but it takes even 10 minutes or more. That time is reported as fetching time.
  • When the failed statement is executed from sql console, it pass and data are fetched quickly as most of the queries returns low number of records.
  • Failures happen during high load. serverless SQL warehouse is on its maximal configuration, i.e. no scale down or up during failures. If the HW is switched to more powerful nodes, it has no effect.

JDBC driver configuration:

  • Driver version 2.6.38, happens also with 2.6.36
  • Authentication by PAT
  • SocketTimeOut either 0 or 300, happens for both values
  • Preset catalog and schema
  • Connection initialized with "SET ansi_mode = false"

Connection pool settings:

  • HikariCP version 5.1.0
  • Maximally 30 parallel connections to the SQL warehouse

We're looking for any advice on how to mitigate the failures.

2 REPLIES 2

Rafael-Sousa
Contributor

Did you tried to increase the socket timeout?

Rafael Sousa

Kyle2
New Contributor II

Yeah :/, the SocketTimeOut has been toyed with, no value that has been tried has made any difference. 

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group