Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Connect to Databricks using Java SDK through proxy

Nagasundaram
New Contributor II

I'm trying to connect to Databricks from Java using the Java SDK to get the cluster/SQL warehouse state. I'm able to connect and get the cluster state from my local machine, but once I deploy it to the server, my company's network does not allow the connection. We need to use a proxy here, but I'm not sure how to configure a proxy with the Databricks Java SDK.

Below is the code that works in the local env:

DatabricksConfig config = new DatabricksConfig()
        .setHost("https://name.databricks.com")
        .setToken("myToken")
        .resolve();

WorkspaceClient wc = new WorkspaceClient(config);
wc.clusters().get("myClusterId").getState().toString();

 

Any hint or suggestion would be very helpful.
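One avenue worth sketching: the JVM's standard proxy system properties can be set programmatically before the client is built. Whether the Databricks Java SDK's HTTP layer honors these properties is an assumption to verify in your environment; the property names themselves are standard JDK, and the proxy host/port values below are placeholders.

```java
// Sketch: set the standard JVM proxy properties before building the
// Databricks client. Assumption: the SDK's underlying HTTP client honors
// these standard properties -- verify in your environment.
public class ProxySetup {
    public static void configureProxy(String host, String port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", port);
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", port);
        // Hosts that should bypass the proxy (pipe-separated patterns):
        System.setProperty("http.nonProxyHosts", "localhost|127.*|[::1]");
    }

    public static void main(String[] args) {
        configureProxy("proxy.mycompany.example", "8080"); // placeholder values
        // Then build the client as usual:
        // DatabricksConfig config = new DatabricksConfig()
        //         .setHost("https://name.databricks.com")
        //         .setToken("myToken")
        //         .resolve();
        // WorkspaceClient wc = new WorkspaceClient(config);
    }
}
```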

2 REPLIES

Hi @Retired_mod 

Thanks for the reply. I tried adding the arguments in the manifest.yml file like this:

JAVA_OPTS: "-Dhttp.proxyHost=your_proxy_host -Dhttp.proxyPort=your_proxy_port -Dhttps.proxyHost=your_proxy_host -Dhttps.proxyPort=your_proxy_port"

 

Still, I'm getting "connection refused" to Databricks when I deploy this to PCF.

Is using WorkspaceClient the right way, or do I need to use AccountClient?
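To narrow down whether the manifest's JAVA_OPTS actually reached the JVM on PCF, a minimal startup check like the following (standard JDK only, no Databricks dependencies) can be logged; if it prints nulls, the proxy settings never made it into the process:

```java
// Quick startup diagnostic: report which proxy settings the JVM actually
// received. If these are null on PCF, JAVA_OPTS was not applied.
public class ProxyDebug {
    public static String describe() {
        return String.format(
            "http.proxyHost=%s http.proxyPort=%s https.proxyHost=%s https.proxyPort=%s",
            System.getProperty("http.proxyHost"),
            System.getProperty("http.proxyPort"),
            System.getProperty("https.proxyHost"),
            System.getProperty("https.proxyPort"));
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```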

AlliaKhosla
Databricks Employee

Hi  @Nagasundaram 

 

You can use the init script below in order to use a proxy server with a Databricks cluster. The content of the init script can be added at "Workspace/shared/setproxy.sh":

==================================================

val proxy = "http://localhost:8888" // set this to your actual proxy
val proxy_host = "localhost"
val proxy_port = "8888"
// Make sure to update no_proxy as needed (e.g. for the S3 region or any other internal domains).
val no_proxy = "127.0.0.1,.local,169.254.169.254,s3.amazonaws.com,s3.us-east-1.amazonaws.com"
// Replace 10.* with your cluster IP range!
val java_no_proxy = "localhost|127.*|[::1]|169.254.169.254|s3.amazonaws.com|*.s3.amazonaws.com|s3.us-east-1.amazonaws.com|*.s3.us-east-1.amazonaws.com|10.*"

dbutils.fs.put("Workspace/shared/setproxy.sh", s"""#!/bin/bash
echo "export http_proxy=$proxy" >> /databricks/spark/conf/spark-env.sh
echo "export https_proxy=$proxy" >> /databricks/spark/conf/spark-env.sh
echo "export no_proxy=$no_proxy" >> /databricks/spark/conf/spark-env.sh
echo "export HTTP_PROXY=$proxy" >> /databricks/spark/conf/spark-env.sh
echo "export HTTPS_PROXY=$proxy" >> /databricks/spark/conf/spark-env.sh
echo "export NO_PROXY=$no_proxy" >> /databricks/spark/conf/spark-env.sh
echo "export _JAVA_OPTIONS=\"-Dhttps.proxyHost=${proxy_host} -Dhttps.proxyPort=${proxy_port} -Dhttp.proxyHost=${proxy_host} -Dhttp.proxyPort=${proxy_port} -Dhttp.nonProxyHosts=${java_no_proxy}\"" >> /databricks/spark/conf/spark-env.sh

echo "http_proxy=$proxy" >> /etc/environment
echo "https_proxy=$proxy" >> /etc/environment
echo "no_proxy=$no_proxy" >> /etc/environment
echo "HTTP_PROXY=$proxy" >> /etc/environment
echo "HTTPS_PROXY=$proxy" >> /etc/environment
echo "NO_PROXY=$no_proxy" >> /etc/environment

cat >> /etc/R/Renviron << EOF
http_proxy=$proxy
https_proxy=$proxy
no_proxy=$no_proxy
EOF
""", true)

==================================================

Please test it carefully in your environment.
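As a side check on the nonProxyHosts patterns used above: the JDK's default ProxySelector interprets the same pipe-separated syntax, so you can verify locally which hosts would bypass the proxy. A small sketch (the proxy hostname below is illustrative):

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

// Sketch: check how the JVM applies http.proxyHost / http.nonProxyHosts.
// The proxy host is illustrative; the patterns mirror the init script's
// java_no_proxy value.
public class NonProxyCheck {
    public static Proxy.Type proxyTypeFor(String url) {
        List<Proxy> proxies = ProxySelector.getDefault().select(URI.create(url));
        return proxies.get(0).type();
    }

    public static void main(String[] args) {
        System.setProperty("http.proxyHost", "proxy.internal.example"); // illustrative
        System.setProperty("http.proxyPort", "8888");
        System.setProperty("http.nonProxyHosts", "localhost|127.*|10.*");

        System.out.println(proxyTypeFor("http://example.com/"));    // proxied -> HTTP
        System.out.println(proxyTypeFor("http://localhost:8080/")); // bypassed -> DIRECT
    }
}
```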

 

Let me know if that helps. 

 
