Error in databricks-sql-connector

BorislavBlagoev
Valued Contributor III
from databricks import sql

hostname = '<name>.databricks.com'
http_path = '/sql/1.0/endpoints/<endpoint_id>'
access_token = '<personal_token>'

connection = sql.connect(
    server_hostname=hostname,
    http_path=http_path,
    access_token=access_token,
)

cursor = connection.cursor()
cursor.execute('test_query')  # placeholder query
result = cursor.fetchall()

for row in result:
    print(row)

cursor.close()
connection.close()  # close the connection as well as the cursor

I get the following error when I execute the code above in a notebook.

I have permissions for the endpoint and the access token that I use; I have redacted them in the question!

Error during OpenSession; Request: TOpenSessionReq(client_protocol=5, username=None, password=None, configuration=None) Error: None Bounded-Retry-Delay: None Attempt: 1/30 Elapsed-Seconds: 1.3954582214355469/900.0
 
 
----> 7 connection = sql.connect(server_hostname=hostname, 
      8                          http_path=http_path,
      9                          access_token=access_token)
 
/databricks/python/lib/python3.8/site-packages/databricks/sql/__init__.py in connect(server_hostname, http_path, access_token, **kwargs)
     26     """
     27     from databricks.sql.client import Connection
---> 28     return Connection(server_hostname, http_path, access_token, **kwargs)
 
/databricks/python/lib/python3.8/site-packages/databricks/sql/client.py in __init__(self, server_hostname, http_path, access_token, **kwargs)
    261                 client_protocol=protocol_version
    262             )
--> 263             response = self._make_request(self._client.OpenSession, open_session_req)
    264             _check_status(response)
    265             assert response.sessionHandle is not None, "Expected a session from OpenSession"
 
/databricks/python/lib/python3.8/site-packages/databricks/sql/client.py in _make_request(self, method, request)
    372                     or elapsed + retry_delay > self._retry_stop_after_attempts_duration):
    373                 _logger.error("Error during " + log_base)
--> 374                 raise OperationalError("Error during Thrift request" +
    375                     (": " + error_message) if error_message else "",
    376                     error)
 
OperationalError: ('', EOFError())

1 ACCEPTED SOLUTION (by NiallEgan__Data; the full reply appears in the thread below)

16 REPLIES

BilalAslamDbrx
Honored Contributor II

@Borislav Blagoev​ I see the same problem. I've flagged this to our engineering team, and they will investigate.

Sounds great! Thank you!

ben_fleis
New Contributor II

Hi @Borislav Blagoev​ , we're looking into it and will respond here when we know more.

BorislavBlagoev
Valued Contributor III

Thank you!

NiallEgan__Data
New Contributor III

Hi,

We were unable to reproduce the issue in a notebook. We plan to release databricks-sql-connector 0.9.4 soon, which will have improved logging to help us get to the bottom of the issue.

In the meantime, could you double-check that the token is correct? Unfortunately, that is the error message you currently get for an incorrect token (this will also be improved in 0.9.4).

I tried several times and it is the same!

@Borislav Blagoev​ I have a repro of the bug. @Niall Egan​ shared it with you on our bug tracker.

BilalAslamDbrx
Honored Contributor II

@Borislav Blagoev​ do you by any chance have IP access lists enabled on your workspace?

NiallEgan__Data
New Contributor III

Hi @Borislav Blagoev​ ,

We just released v0.9.4, which has improved logging and error messages. Can you please run with v0.9.4 and increase the logging level as shown:

import databricks.sql
import logging

logging.basicConfig(level=logging.INFO)
databricks.sql.connect(...)

I will try it. I will post the error under your comment.

INFO:databricks.sql.client:Error during request to server: IpAclValidation: Method: OpenSession; Session-id: None; Query-id: None; HTTP-code: 403; Error-message: IpAclValidation; Original-exception: ; No-retry-reason: non-retryable error
INFO:py4j.java_gateway:Received command c on object id p0

logging.basicConfig(level=logging.INFO)
     10 
---> 11 connection = sql.connect(server_hostname=hostname, 
     12                          http_path=http_path,
     13                          access_token=access_token)
 
/databricks/python/lib/python3.8/site-packages/databricks/sql/__init__.py in connect(server_hostname, http_path, access_token, **kwargs)
     26     """
     27     from databricks.sql.client import Connection
---> 28     return Connection(server_hostname, http_path, access_token, **kwargs)
 
/databricks/python/lib/python3.8/site-packages/databricks/sql/client.py in __init__(self, server_hostname, http_path, access_token, **kwargs)
    266                 client_protocol=protocol_version
    267             )
--> 268             response = self._make_request(self._client.OpenSession, open_session_req)
    269             _check_status(response)
    270             assert response.sessionHandle is not None, "Expected a session from OpenSession"
 
/databricks/python/lib/python3.8/site-packages/databricks/sql/client.py in _make_request(self, method, request)
    417             error_info = response_or_error_info
    418             # The error handler will either sleep or throw an exception
--> 419             self._handle_request_error(error_info, attempt, elapsed)
    420 
    421 
 
/databricks/python/lib/python3.8/site-packages/databricks/sql/client.py in _handle_request_error(self, error_info, attempt, elapsed)
    338             _logger.info("{}: {}".format(user_friendly_error_message, full_error_info_str))
    339 
--> 340             raise OperationalError(user_friendly_error_message, error_info.error)
    341 
    342         _logger.info("Retrying request after error in {} seconds: {}".format(
 
OperationalError: ('Error during request to server: IpAclValidation', EOFError())

@Niall Egan​ 

NiallEgan__Data
New Contributor III

Hi @Borislav Blagoev​ ,

Thanks very much for taking the time to collect these logs.

The problem here (as indicated by the `IpAclValidation` message) is that IP allow listing (enabled for your workspace) does not allow arbitrary connections from Spark clusters back to the control plane by default. The subnet(s) for the data plane need to be added to the IP allow list.

Here is a link to the docs on IP ACL validation: https://docs.databricks.com/security/network/ip-access-list.html

Here is a link to docs on managing subnets in the data plane: https://docs.databricks.com/administration-guide/cloud-configurations/aws/customer-managed-vpc.html#...
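If you manage the allow list through the REST API rather than the UI, a hypothetical request body for `POST /api/2.0/ip-access-lists` might look like this (the label and CIDR range are placeholders for your own data plane subnets):

```json
{
  "label": "data-plane-subnets",
  "list_type": "ALLOW",
  "ip_addresses": ["10.0.0.0/16"]
}
```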

Not sure what your end goal is with this, but it's probably also worth mentioning that there are better alternatives to using the `databricks-sql-connector` in Databricks notebooks. For example, in a Python notebook you can just use `spark.sql(...)` to execute SQL commands.
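A minimal sketch of that approach (the query is a placeholder; `spark` is the SparkSession that Databricks notebooks predefine, so the guard below only matters outside a notebook):

```python
# In a Databricks notebook, `spark` (a SparkSession) is predefined;
# no connector, endpoint, or token is needed.
query = "SELECT 1 AS id"  # placeholder -- substitute your own SQL

try:
    df = spark.sql(query)  # returns a Spark DataFrame
    df.show()              # print the first rows
except NameError:
    # No SparkSession here; run this inside a Databricks notebook.
    print("no SparkSession in this environment")
```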

@Niall Egan​ I want to execute a query against Databricks SQL from a notebook or job.
