
Error while establishing JDBC connection to Azure Databricks via HTTP proxy

jonasmin
New Contributor III

I am using the Databricks JDBC driver (https://databricks.com/spark/jdbc-drivers-download) to connect to Azure Databricks.

The connection needs to be routed through an HTTP proxy. I found parameters that can be configured for using the HTTP proxy in the Databricks JDBC documentation. By passing invalid parameters, I confirmed that the parameters are parsed by the driver.
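For readers following along, here is a minimal sketch of how proxy settings might be appended to the connection URL. The property names `UseProxy`, `ProxyHost`, and `ProxyPort` are assumptions based on the driver's configuration guide and should be checked against the guide that ships with your driver version. The host names, the port, and the `httpPath` value are placeholders, not values from this thread. The sketch only builds the URL string and does not open a connection.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class ProxyUrlSketch {

    // Builds a semicolon-separated JDBC URL. All arguments are placeholders.
    static String buildUrl(String host, String httpPath, String proxyHost, int proxyPort) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("transportMode", "http");
        props.put("ssl", "1");
        props.put("httpPath", httpPath);
        // Assumed proxy property names; verify them against the driver's guide.
        props.put("UseProxy", "1");
        props.put("ProxyHost", proxyHost);
        props.put("ProxyPort", Integer.toString(proxyPort));

        StringJoiner url = new StringJoiner(";");
        url.add("jdbc:databricks://" + host + ":443");
        props.forEach((key, value) -> url.add(key + "=" + value));
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical workspace and proxy names, for illustration only.
        System.out.println(buildUrl(
                "example.azuredatabricks.net",
                "/sql/1.0/placeholder",
                "proxy.example.internal",
                8080));
    }
}
```

Authentication properties are left out on purpose; they would be appended in the same way.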

Still I see the error:

java.sql.SQLException: [Databricks][DatabricksJDBCDriver](700120) Host ....azuredatabricks.net cannot be resolved through DnsResolver com.databricks.client.jdbc.rpc.InternalDnsResolver. Error Message: No such host is known (....azuredatabricks.net)
       at com.databricks.client.jdbc.rpc.CustomDnsResolverLoader.getResolvedHost(Unknown Source)
       at com.databricks.client.hivecommon.api.HiveServer2ClientFactory.createTransport(Unknown Source)
       at com.databricks.client.spark.jdbc.DownloadableFetchClientFactory.createClient(Unknown Source)
       at com.databricks.client.hivecommon.core.HiveJDBCCommonConnection.connectToServer(Unknown Source)
       at com.databricks.client.spark.core.SparkJDBCConnection.connectToServer(Unknown Source)
       at com.databricks.client.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source)
       at com.databricks.client.spark.core.SparkJDBCConnection.establishConnection(Unknown Source)
       at com.databricks.client.jdbc.core.LoginTimeoutConnection.connect(Unknown Source)
       at com.databricks.client.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
       at com.databricks.client.jdbc.common.AbstractDriver.connect(Unknown Source)
       at org.apache.commons.dbcp2.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:55)
       at org.apache.commons.dbcp2.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:355)
       at org.apache.commons.dbcp2.BasicDataSource.validateConnectionFactory(BasicDataSource.java:115)
       at org.apache.commons.dbcp2.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:665)
       at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:544)
       at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:753)
       at cloud.celonis.connector.jdbc.services.DatabaseConnectionService.lambda$getConnection$0(DatabaseConnectionService.java:47)
       at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)

So, it seems that the proxy is not used after all.

Do you have any suggestions, what to check and how to make the driver use the proxy? Thank you!

7 REPLIES

Prabakar
Databricks Employee

@Jonas Minning the error message says there is a problem with the DNS resolver. Are you using the default DNS or a custom DNS? I suspect the issue is with DNS resolution, so it would be worth checking with your networking team. If it's a custom DNS, you can try changing it to the default and check whether that works.

jonasmin
New Contributor III

Thanks for your answer @Prabakar Ammeappin,

the thing is that I need to use the proxy, so I would expect the driver to connect to the proxy first, i.e. to resolve the hostname of the proxy instead of the hostname of the Databricks host.

User16764241763
Honored Contributor

We don't think DNS resolution traffic goes through the proxy first. Can you try adding the entry below to the hosts file and see if it connects?

<IP address> xxxxxxxxxx.azuredatabricks.net

Ultimately you have to ensure your custom DNS servers are able to resolve the host names without any issues.
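One way to confirm that a hosts entry or DNS change is visible to the JVM is to resolve the name from Java directly. This is a minimal sketch using the standard `java.net.InetAddress` API; the class name `DnsCheckSketch` is hypothetical, and the host to check is passed on the command line.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Optional;

public class DnsCheckSketch {

    // Returns the resolved IP address, or empty if the JVM cannot resolve the name.
    static Optional<String> resolve(String host) {
        try {
            return Optional.of(InetAddress.getByName(host).getHostAddress());
        } catch (UnknownHostException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Pass the workspace hostname as the first argument; defaults to localhost.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + " -> " + resolve(host).orElse("cannot be resolved"));
    }
}
```

If this prints an address for the workspace hostname, name resolution on the client is working, and any remaining failure lies further along the connection path.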

Thanks for the answer @Arvind Ravish,

We added the IP to the local hosts file.

Then the DNS resolver error disappeared and we got a timeout error instead.

This timeout is probably due to a firewall that does not allow a direct connection from our client to Azure. So I think we are still facing the issue that the driver does not seem to use the proxy and tries a direct connection instead.

So I was wondering which conditions need to be fulfilled for the driver to use the proxy. Do you know anything about this?
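To separate "the firewall blocks the direct path" from "the proxy itself is unreachable", a plain TCP probe from the client machine can help. The sketch below uses only `java.net.Socket`; the class name `ReachabilitySketch` is hypothetical, and the hosts and ports to probe are supplied by the caller. It only checks that a TCP connection can be opened and says nothing about whether the driver actually routes traffic through the proxy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ReachabilitySketch {

    // Attempts a plain TCP connection and reports whether it succeeded within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers timeouts, refused connections, and unresolved host names.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: ReachabilitySketch <host> <port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 5000));
    }
}
```

Running it once against the workspace host on port 443 and once against the proxy host and port shows which of the two paths the client network permits.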

Ravikant
New Contributor II

Hi @Jonas Minning, were you able to find a solution for this issue? We are also facing a similar issue.

jonasmin
New Contributor III

Hi Ravikant,

unfortunately, we have not found any solution other than not using a proxy.

Best, Jonas

MS_Varma
New Contributor II

Hi @Jonas Minning, I am having the same issue. When I looked into the driver documentation, I found that the driver currently only supports SOCKS proxies, and I believe this is the reason why we are getting this error. So I wanted to check whether there are any other drivers that support HTTP proxies. If anyone has information on such a JDBC driver, could you please let me know?

The driver that I am currently using is DatabricksJDBC42-2.6.29

Best, Varma
