cancel
Showing results for 
Search instead for 
Did you mean: 
Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.
cancel
Showing results for 
Search instead for 
Did you mean: 

Databricks SQL connectivity in Python with Service Principals

harripy
New Contributor III

Tried to use M2M OAuth connectivity on Databricks SQL Warehouse in Python:

from databricks.sdk.core import Config, oauth_service_principal
from databricks import sql

....
config = Config(host=f"https://{host}",
                                client_id=db_user,
                                client_secret=passwd)
oauth_sp = oauth_service_principal(config)
return sql.connect(server_hostname=host,
                   http_path=http_path,
                   catalog=catalog,
                   schema=schema,
                   credentials_provider=oauth_sp)

  The params mentioned are properly assigned, but this fails to an error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/venv/lib/python3.11/site-packages/databricks/sql/__init__.py", line 50, in connect
    return Connection(server_hostname, http_path, access_token, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/venv/lib/python3.11/site-packages/databricks/sql/client.py", line 189, in __init__
    self._session_handle = self.thrift_backend.open_session(
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/venv/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 464, in open_session
    response = self.make_request(self._client.OpenSession, open_session_req)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/venv/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 393, in make_request
    self._handle_request_error(error_info, attempt, elapsed)
  File "/path/venv/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 261, in _handle_request_error
    raise network_request_error
databricks.sql.exc.RequestError: Error during request to server

The credentials provider seems to be set correctly, yet the connection is tried to be established with the access_token - what am I missing?

8 REPLIES 8

Yeshwanth
Honored Contributor

@harripy , could you please brief us about your use case? So that we can understand the issue better.

harripy
New Contributor III

We have configured a service principal for Databricks SQL Warehouse and we want to do DML calls on the database through Python connection (mostly INSERT-SELECT-clauses). I found documentation stating that it SHOULD be possible to get the connection to the SQL Warehouse just by providing the m2m-auth configuration (having the databricks sp client id and databricks sp client secret) as the credentials_provider. During execution the error message though indicates that the connection is tried to be opened with access token instead of the m2m authentication. I might be missing out some crucial configuration somewhere, but it would be nice to get this working.

Current working solution is for us:

  1. connect to workspace with sp client/secret
  2. request token from workspace
  3. connect using the given token
  4. on connection close delete the token

daniel_sahal
Esteemed Contributor

@harripy 
I've just looked through the Databricks SDK code and it looks like m2m auth is generating oauth bearer token to authenticate, that's why you see "access_token", but it's different from PAT (https://github.com/databricks/databricks-sdk-py/blob/68feadf82f12d387621704f2957d743bfc86f4bd/databr...).

So it looks like it's able to authenticate with Databricks through M2M and get access token. 
It's hard to guess what's the issue with the logs that you've provided.

What i suggest is to enable debug logging and paste the logs.
https://github.com/databricks/databricks-sdk-py/tree/68feadf82f12d387621704f2957d743bfc86f4bd?tab=re... 

harripy
New Contributor III

alright, made following test call:

dbConfig = {'targetDvUser':db_user,
            'targetDvPassword':password, 
            'targetDvHost':'dbc-098946cd-8b3a.cloud.databricks.com',
            'targetDvCatalog':'ade_test',
            'targetHttpPath':'/sql/1.0/warehouses/7b15eacc3fef2e55',
            'targetSchema':'manager_test_rdv'}

import logging,sys
from databricks.sdk.core import Config, oauth_service_principal
from databricks import sql
from databricks.sql.exc import Error, OperationalError

logging.basicConfig(stream=sys.stderr,
                    level=logging.INFO,
                    format='%(asctime)s [%(name)s][%(levelname)s] %(message)s')
logging.getLogger('databricks.sdk').setLevel(logging.DEBUG)

config = Config(host=f"https://{dbConfig['targetDvHost']}",
                                client_id=dbConfig['targetDvUser'],
                                client_secret=dbConfig['targetDvPassword'])
oauth_sp = oauth_service_principal(config)
conn = sql.connect(server_hostname=dbConfig['targetDvHost'],
                   http_path=dbConfig['targetHttpPath'],
                   catalog=dbConfig['targetDvCatalog'],
                   schema=dbConfig['targetSchema'],
                   credentials_provider=oauth_sp)

And the response was as follows:

>>> dbConfig = {'targetDvUser':db_user,
...             'targetDvPassword':password,
...             'targetDvHost':'dbc-098946cd-8b3a.cloud.databricks.com',
...             'targetDvCatalog':'ade_test',
...             'targetHttpPath':'/sql/1.0/warehouses/7b15eacc3fef2e55',
...             'targetSchema':'manager_test_rdv'}
>>> import logging,sys
>>> from databricks.sdk.core import Config, oauth_service_principal
>>> from databricks import sql
>>> from databricks.sql.exc import Error, OperationalError
>>> logging.basicConfig(stream=sys.stderr,
...                     level=logging.INFO,
...                     format='%(asctime)s [%(name)s][%(levelname)s] %(message)s')
>>> logging.getLogger('databricks.sdk').setLevel(logging.DEBUG)
>>> config = Config(host=f"https://{dbConfig['targetDvHost']}",
...                                 client_id=dbConfig['targetDvUser'],
...                                 client_secret=dbConfig['targetDvPassword'])
2024-04-26 12:37:43,611 [databricks.sdk][DEBUG] Attempting to configure auth: pat
2024-04-26 12:37:43,611 [databricks.sdk][DEBUG] Attempting to configure auth: basic
2024-04-26 12:37:43,611 [databricks.sdk][DEBUG] Attempting to configure auth: metadata-service
2024-04-26 12:37:43,611 [databricks.sdk][DEBUG] Attempting to configure auth: oauth-m2m
>>> oauth_sp = oauth_service_principal(config)
>>> conn = sql.connect(server_hostname=dbConfig['targetDvHost'],
...                    http_path=dbConfig['targetHttpPath'],
...                    catalog=dbConfig['targetDvCatalog'],
...                    schema=dbConfig['targetSchema'],
...                    credentials_provider=oauth_sp)
2024-04-26 12:38:22,632 [databricks.sdk.oauth][DEBUG] Retrieving token for f15f66cb-d173-432b-b530-896d2834d4c7
2024-04-26 12:38:22,967 [databricks.sql.thrift_backend][INFO] Error during request to server: {"method": "OpenSession", "session-id": null, "query-id": null, "http-code": null, "error-message": "", "original-exception": "'dict' object is not callable", "no-retry-reason": "non-retryable error", "bounded-retry-delay": null, "attempt": "1/30", "elapsed-seconds": "0.00017309188842773438/900.0"}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/harri.pylkkanen/workspace/dv-executor/venv/lib/python3.11/site-packages/databricks/sql/__init__.py", line 50, in connect
    return Connection(server_hostname, http_path, access_token, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/harri.pylkkanen/workspace/dv-executor/venv/lib/python3.11/site-packages/databricks/sql/client.py", line 189, in __init__
    self._session_handle = self.thrift_backend.open_session(
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/harri.pylkkanen/workspace/dv-executor/venv/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 464, in open_session
    response = self.make_request(self._client.OpenSession, open_session_req)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/harri.pylkkanen/workspace/dv-executor/venv/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 393, in make_request
    self._handle_request_error(error_info, attempt, elapsed)
  File "/Users/harri.pylkkanen/workspace/dv-executor/venv/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 261, in _handle_request_error
    raise network_request_error
databricks.sql.exc.RequestError: Error during request to server

So indeed the oauth bearer token is generated (or being generated), but then the open session crashes - is this a service principal permissions related thing or some configuration missed somewhere?

daniel_sahal
Esteemed Contributor

@harripy 
Looking at the SDK logs - i've got no idea what could happen here. I think that SQL logs could be handy. Could you set databricks.sql logging level to DEBUG as well?

harripy
New Contributor III

These log lines will be added with databricks.sql DEBUG:

2024-04-29 08:47:10,276 [databricks.sql.thrift_backend][DEBUG] retry parameter: _retry_delay_min given_or_default 1.0
2024-04-29 08:47:10,277 [databricks.sql.thrift_backend][DEBUG] retry parameter: _retry_delay_max given_or_default 60.0
2024-04-29 08:47:10,277 [databricks.sql.thrift_backend][DEBUG] retry parameter: _retry_stop_after_attempts_count given_or_default 30
2024-04-29 08:47:10,277 [databricks.sql.thrift_backend][DEBUG] retry parameter: _retry_stop_after_attempts_duration given_or_default 900.0
2024-04-29 08:47:10,277 [databricks.sql.thrift_backend][DEBUG] retry parameter: _retry_delay_default given_or_default 5.0
2024-04-29 08:47:10,309 [databricks.sql.thrift_backend][DEBUG] Sending request: TOpenSessionReq(client_protocol=None, username=None, password=None, configuration={'spark.thriftserver.arrowBasedRowSet.timestampAsString': 'false'}, getInfos=None, client_protocol_i64=42247, connectionProperties=None, initialNamespace=TNamespace(catalogName='ade_test', schemaName='manager_test_rdv'), canUseMultipleCatalogs=True, sessionId=None)
2024-04-29 08:47:10,310 [databricks.sql.thrift_backend][INFO] Error during request to server: {"method": "OpenSession", "session-id": null, "query-id": null, "http-code": null, "error-message": "", "original-exception": "'dict' object is not callable", "no-retry-reason": "non-retryable error", "bounded-retry-delay": null, "attempt": "1/30", "elapsed-seconds": "0.00023293495178222656/900.0"}
Traceback (most recent call last):

kavyakavuri
New Contributor II

I am facing the same issue with the same error logs as @harripy. Can you please help @Yeshwanth @Dani ?

 

Mat_Conquest
New Contributor II

Did anyone get this to work? I have tried the code above but I get a slightly different error but I don't see the same level of details from the logs

2024-10-04 14:59:25,508 [databricks.sdk][DEBUG] Attempting to configure auth: pat
2024-10-04 14:59:25,508 [databricks.sdk][DEBUG] Attempting to configure auth: basic
2024-10-04 14:59:25,509 [databricks.sdk][DEBUG] Attempting to configure auth: metadata-service
2024-10-04 14:59:25,509 [databricks.sdk][DEBUG] Attempting to configure auth: oauth-m2m
2024-10-04 14:59:25,523 [urllib3.connectionpool][DEBUG] Starting new HTTPS connection (1): adb-xx.azuredatabricks.net:443
2024-10-04 14:59:25,706 [urllib3.connectionpool][DEBUG] https://adb-xx.azuredatabricks.net:443 "GET /oidc/.well-known/oauth-authorization-server HTTP/11" 200 None
2024-10-04 14:59:25,714 [urllib3.connectionpool][DEBUG] Starting new HTTPS connection (1): adb-xx.azuredatabricks.net:443
2024-10-04 14:59:25,871 [urllib3.connectionpool][DEBUG] https://adb-xx.azuredatabricks.net:443 "GET /oidc/.well-known/oauth-authorization-server HTTP/11" 200 None
Traceback (most recent call last):
File "H:\Documents\Git_New\DataHubUserAccess\source\Python\take3.py", line 28, in <module>
conn = sql.connect(server_hostname=dbConfig['targetDvHost'],
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\matcon\Python\Lib\site-packages\databricks\sql\__init__.py", line 50, in connect
return Connection(server_hostname, http_path, access_token, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\matcon\Python\Lib\site-packages\databricks\sql\client.py", line 157, in __init__
auth_provider = get_python_sql_connector_auth_provider(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\matcon\Python\Lib\site-packages\databricks\sql\auth\auth.py", line 96, in get_python_sql_connector_auth_provider
return get_auth_provider(cfg)
^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\matcon\Python\Lib\site-packages\databricks\sql\auth\auth.py", line 68, in get_auth_provider
raise RuntimeError("No valid authentication settings!")
RuntimeError: No valid authentication settings!

 

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group