I'm trying to access a Databricks SQL Warehouse from Python. I can connect with a token from a compute instance on Azure Machine Learning (a VM with conda installed, in which I created a Python 3.10 environment):
from databricks import sql as dbsql
dbsql.connect(
    server_hostname="databricks_address",
    http_path="http_path",
    access_token="dapi....",
)
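To see whether the proxy settings are actually being picked up by the Python process, I run this stdlib-only check first (the `https://databricks_address/` URL is the same placeholder as above; `urllib` reads `https_proxy`/`no_proxy` the way most Python HTTP stacks do):

```python
import os
import urllib.request

# Show which proxy variables the process actually sees.
for var in ("https_proxy", "no_proxy", "HTTPS_PROXY", "NO_PROXY"):
    print(var, "=", os.environ.get(var))

# What urllib derives from the environment.
print("detected proxies:", urllib.request.getproxies())

# A HEAD request against the warehouse endpoint; a timeout here points at
# the proxy/network path rather than at the connector itself.
req = urllib.request.Request("https://databricks_address/", method="HEAD")
try:
    with urllib.request.urlopen(req, timeout=10) as resp:
        print("reachable, HTTP", resp.status)
except Exception as exc:
    print("unreachable:", exc)
```

On the compute instance this prints the proxy variables and reaches the host; I use the same script inside the job to compare.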
But once I create a job and launch it on a compute cluster with a custom Dockerfile:
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu22.04:latest
ENV https_proxy http://xxxxxx:yyyy
ENV no_proxy xxxxxx
RUN mkdir -p /usr/share/man/man1
RUN wget https://download.java.net/java/GA/jdk19.0.1/afdd2e245b014143b62ccb916125e3ce/10/GPL/openjdk-19.0.1_linux-x64_bin.tar.gz \
&& tar xvf openjdk-19.0.1_linux-x64_bin.tar.gz \
&& mv jdk-19.0.1 /opt/
ENV JAVA_HOME /opt/jdk-19.0.1
ENV PATH="${PATH}:$JAVA_HOME/bin"
# Install requirements with pip conf for Jfrog
COPY pip.conf pip.conf
ENV PIP_CONFIG_FILE pip.conf
# python installs (python 3.10 inside all azure ubuntu images)
COPY requirements.txt .
RUN pip install -r requirements.txt && rm requirements.txt
# set command
CMD ["bash"]
My image is built and starts running my code, but it fails on the code sample above. I am using the same values of https_proxy and no_proxy on my compute instance and my compute cluster.
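One thing I also checked: whether the `no_proxy` value accidentally matches the warehouse host, which would make Python bypass the proxy for it (on a network where the proxy is mandatory, that would produce exactly this kind of retry timeout). A minimal check, with `databricks_address` again as a placeholder:

```python
import urllib.request

host = "databricks_address"  # placeholder for the real warehouse hostname
# proxy_bypass() applies the no_proxy rules to a given host.
print("bypassed for", host, ":", bool(urllib.request.proxy_bypass(host)))
```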
2024-01-22 13:30:13,520 - thrift_backend - Error during request to server: {"method": "OpenSession", "session-id": null, "query-id": null, "http-code": null, "error-message": "", "original-exception": "Retry request would exceed Retry policy max retry duration of 900.0 seconds", "no-retry-reason": "non-retryable error", "bounded-retry-delay": null, "attempt": "1/30", "elapsed-seconds": "846.7684090137482/900.0"}
Traceback (most recent call last):
  File "/mnt/azureml/cr/j/67f1e8c93a8942d582fb7babc030101b/exe/wd/main.py", line 198, in <module>
    main()
  File "/mnt/azureml/cr/j/67f1e8c93a8942d582fb7babc030101b/exe/wd/main.py", line 31, in main
    return dbsql.connect(
  File "/opt/miniconda/lib/python3.10/site-packages/databricks/sql/__init__.py", line 51, in connect
    return Connection(server_hostname, http_path, access_token, **kwargs)
  File "/opt/miniconda/lib/python3.10/site-packages/databricks/sql/client.py", line 235, in __init__
    self._open_session_resp = self.thrift_backend.open_session(
  File "/opt/miniconda/lib/python3.10/site-packages/databricks/sql/thrift_backend.py", line 576, in open_session
    response = self.make_request(self._client.OpenSession, open_session_req)
  File "/opt/miniconda/lib/python3.10/site-packages/databricks/sql/thrift_backend.py", line 505, in make_request
    self._handle_request_error(error_info, attempt, elapsed)
  File "/opt/miniconda/lib/python3.10/site-packages/databricks/sql/thrift_backend.py", line 335, in _handle_request_error
    raise network_request_error
databricks.sql.exc.RequestError: Error during request to server
In both environments I am using the latest version of databricks-sql-connector (3.0.1).