Data Engineering

How to fix intermittent 503 errors in 10.4 LTS

ebyhr
New Contributor II

I have recently started getting the error below intermittently in version 10.4 LTS. Is there a way to fix this intermittent failure? I added retry logic to our code, but the Databricks query had actually succeeded even though the driver threw an exception, so the retry leaves the table in an unexpected state.

The error message:

[Databricks][DatabricksJDBCDriver](500593) Communication link failure. Failed to connect to server. Reason: HTTP retry after response received with no Retry-After header, error: HTTP Response code: 503, Error message: Unknown.

The full stacktrace:

io.trino.tempto.query.QueryExecutionException: java.sql.SQLException: [Databricks][DatabricksJDBCDriver](500593) Communication link failure. Failed to connect to server. Reason: HTTP retry after response received with no Retry-After header, error: HTTP Response code: 503, Error message: Unknown.
at io.trino.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:119)
at io.trino.tempto.query.JdbcQueryExecutor.executeQuery(JdbcQueryExecutor.java:84)
at io.trino.tests.product.utils.QueryExecutors$3.lambda$executeQuery$0(QueryExecutors.java:149)
at net.jodah.failsafe.Functions.lambda$get$0(Functions.java:48)
at net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:62)
at net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:62)
at net.jodah.failsafe.Execution.executeSync(Execution.java:129)
at net.jodah.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:376)
at net.jodah.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:67)
at io.trino.tests.product.utils.QueryExecutors$3.executeQuery(QueryExecutors.java:149)
at io.trino.tests.product.deltalake.TestDeltaLakeWriteDatabricksCompatibility$CaseTestTable.<init>(TestDeltaLakeWriteDatabricksCompatibility.java:366)
at io.trino.tests.product.deltalake.TestDeltaLakeWriteDatabricksCompatibility.testCaseUpdateInPartition(TestDeltaLakeWriteDatabricksCompatibility.java:160)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:104)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:645)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:851)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1177)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:129)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:112)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.sql.SQLException: [Databricks][DatabricksJDBCDriver](500593) Communication link failure. Failed to connect to server. Reason: HTTP retry after response received with no Retry-After header, error: HTTP Response code: 503, Error message: Unknown.
at com.databricks.client.hivecommon.api.HS2Client.handleTTransportException(Unknown Source)
at com.databricks.client.spark.jdbc.DowloadableFetchClient.handleTTransportException(Unknown Source)
at com.databricks.client.hivecommon.api.HS2Client.executeStatementInternal(Unknown Source)
at com.databricks.client.hivecommon.api.HS2Client.executeStatement(Unknown Source)
at com.databricks.client.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeRowCountQueryHelper(Unknown Source)
at com.databricks.client.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.execute(Unknown Source)
at com.databricks.client.jdbc.common.SStatement.executeNoParams(Unknown Source)
at com.databricks.client.jdbc.common.BaseStatement.execute(Unknown Source)
at com.databricks.client.hivecommon.jdbc42.Hive42Statement.execute(Unknown Source)
at io.trino.tempto.query.JdbcQueryExecutor.executeQueryNoParams(JdbcQueryExecutor.java:128)
at io.trino.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:112)
... 24 more
Suppressed: java.lang.Exception: Query: INSERT INTO default.update_case_compat_zk3lu03mfzd5 VALUES (1, 1, 0), (2, 2, 0), (3, 3, 1)
at io.trino.tempto.query.JdbcQueryExecutor.executeQueryNoParams(JdbcQueryExecutor.java:136)
... 25 more
Caused by: com.databricks.client.support.exceptions.ErrorException: [Databricks][DatabricksJDBCDriver](500593) Communication link failure. Failed to connect to server. Reason: HTTP retry after response received with no Retry-After header, error: HTTP Response code: 503, Error message: Unknown.
... 35 more
Caused by: com.databricks.client.jdbc42.internal.apache.thrift.transport.TTransportException: HTTP retry after response received with no Retry-After header, error: HTTP Response code: 503, Error message: Unknown
at com.databricks.client.hivecommon.HttpRetrySettings.shouldRetry(Unknown Source)
at com.databricks.client.hivecommon.api.HS2ClientWrapper.shouldReexecuteRequest(Unknown Source)
at com.databricks.client.hivecommon.api.HS2ClientWrapper.executeWithRetry(Unknown Source)
at com.databricks.client.hivecommon.api.HS2ClientWrapper.ExecuteStatement(Unknown Source)
... 33 more

5 REPLIES

Hubert-Dudek
Esteemed Contributor III

Maybe you can add extra validation of the output (for example, checking that the object exists). You could also share your code.

ebyhr
New Contributor II

Unfortunately, no, we can't. There is a lot of code, and the place where it fails isn't deterministic. See https://github.com/trinodb/trino/issues/14391

The code is https://github.com/trinodb/tempto/blob/a3f013ae9faae1848972a25db40ba041c83b69d7/tempto-core/src/main.... It simply executes the query, nothing special.

findinpath
Contributor

I am experiencing the same problem.

Caused by: java.sql.SQLException: [Databricks][DatabricksJDBCDriver](500593) Communication link failure. Failed to connect to server. Reason: HTTP retry after response received with no Retry-After header, error: HTTP Response code: 503, Error message: Unknown.

I tried retrying on the client side with the failsafe library, but when the failure happens on an `INSERT` statement, the retry runs the `INSERT` a second time and duplicates the rows.

It seems that error code 500593 really signals that the operation took longer than expected.

Can this situation be avoided by specifying a longer timeout?
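
One way to avoid the duplicate `INSERT` on retry is to check whether the failed statement took effect anyway before running it again. The following is only a minimal sketch of that idea, not a confirmed fix for the 503 itself. `VerifiedRetry`, `runOnce`, and the `alreadyApplied` check are hypothetical names introduced here for illustration; they are not part of the Databricks JDBC driver or of failsafe. The caller is assumed to supply a check that can tell whether the write landed (for example, a `SELECT` looking for the inserted keys).

```java
import java.util.concurrent.Callable;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: re-run a non-idempotent statement only if a check shows it did not land. */
final class VerifiedRetry {
    private VerifiedRetry() {}

    /**
     * Runs {@code action}. If it throws, asks {@code alreadyApplied} whether the
     * effect is visible anyway (the server may have completed the statement even
     * though the client saw an error) before trying again.
     *
     * @return the number of times {@code action} was invoked
     * @throws Exception the last failure, if every attempt failed and the check never passed
     */
    static int runOnce(Callable<?> action, BooleanSupplier alreadyApplied, int maxAttempts)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.call();
                return attempt; // normal success
            } catch (Exception e) {
                last = e;
                // Assumption: the check reflects the server-side state reliably.
                if (alreadyApplied.getAsBoolean()) {
                    return attempt; // the statement landed despite the client-side error
                }
            }
        }
        throw last; // never null here: the loop ran at least once and did not return
    }
}
```

With JDBC, `action` could wrap `statement.execute(sql)` and `alreadyApplied` could run a count query for the rows the `INSERT` was supposed to write. Note the limitation: if the statement is still running on the server when the check executes, the check cannot distinguish "not applied yet" from "never applied", so a delay before the check, or a unique marker value per attempt, may still be needed.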

Anonymous
Not applicable

Hi @Yuya Ebihara

Hope all is well! I just wanted to check whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!

ebyhr
New Contributor II

The issue still happens.
