Databricks SQL connectivity in Python with Service Principals
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-23-2024 04:46 AM
Tried to use M2M OAuth connectivity on Databricks SQL Warehouse in Python:
from databricks.sdk.core import Config, oauth_service_principal
from databricks import sql
....
config = Config(host=f"https://{host}",
client_id=db_user,
client_secret=passwd)
oauth_sp = oauth_service_principal(config)
return sql.connect(server_hostname=host,
http_path=http_path,
catalog=catalog,
schema=schema,
credentials_provider=oauth_sp)
The params mentioned are properly assigned, but this fails to an error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/path/venv/lib/python3.11/site-packages/databricks/sql/__init__.py", line 50, in connect
return Connection(server_hostname, http_path, access_token, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/venv/lib/python3.11/site-packages/databricks/sql/client.py", line 189, in __init__
self._session_handle = self.thrift_backend.open_session(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/venv/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 464, in open_session
response = self.make_request(self._client.OpenSession, open_session_req)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/venv/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 393, in make_request
self._handle_request_error(error_info, attempt, elapsed)
File "/path/venv/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 261, in _handle_request_error
raise network_request_error
databricks.sql.exc.RequestError: Error during request to server
The credentials provider seems to be set correctly, yet the connection is tried to be established with the access_token - what am I missing?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-24-2024 11:23 PM
@harripy , could you please brief us about your use case? So that we can understand the issue better.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-24-2024 11:40 PM
We have configured a service principal for Databricks SQL Warehouse and we want to do DML calls on the database through Python connection (mostly INSERT-SELECT-clauses). I found documentation stating that it SHOULD be possible to get the connection to the SQL Warehouse just by providing the m2m-auth configuration (having the databricks sp client id and databricks sp client secret) as the credentials_provider. During execution the error message though indicates that the connection is tried to be opened with access token instead of the m2m authentication. I might be missing out some crucial configuration somewhere, but it would be nice to get this working.
Current working solution is for us:
- connect to workspace with sp client/secret
- request token from workspace
- connect using the given token
- on connection close delete the token
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-25-2024 01:45 AM
@harripy
I've just looked through the Databricks SDK code and it looks like m2m auth is generating oauth bearer token to authenticate, that's why you see "access_token", but it's different from PAT (https://github.com/databricks/databricks-sdk-py/blob/68feadf82f12d387621704f2957d743bfc86f4bd/databr...).
So it looks like it's able to authenticate with Databricks through M2M and get access token.
It's hard to guess what's the issue with the logs that you've provided.
What i suggest is to enable debug logging and paste the logs.
https://github.com/databricks/databricks-sdk-py/tree/68feadf82f12d387621704f2957d743bfc86f4bd?tab=re...
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-26-2024 02:43 AM
alright, made following test call:
dbConfig = {'targetDvUser':db_user,
'targetDvPassword':password,
'targetDvHost':'dbc-098946cd-8b3a.cloud.databricks.com',
'targetDvCatalog':'ade_test',
'targetHttpPath':'/sql/1.0/warehouses/7b15eacc3fef2e55',
'targetSchema':'manager_test_rdv'}
import logging,sys
from databricks.sdk.core import Config, oauth_service_principal
from databricks import sql
from databricks.sql.exc import Error, OperationalError
logging.basicConfig(stream=sys.stderr,
level=logging.INFO,
format='%(asctime)s [%(name)s][%(levelname)s] %(message)s')
logging.getLogger('databricks.sdk').setLevel(logging.DEBUG)
config = Config(host=f"https://{dbConfig['targetDvHost']}",
client_id=dbConfig['targetDvUser'],
client_secret=dbConfig['targetDvPassword'])
oauth_sp = oauth_service_principal(config)
conn = sql.connect(server_hostname=dbConfig['targetDvHost'],
http_path=dbConfig['targetHttpPath'],
catalog=dbConfig['targetDvCatalog'],
schema=dbConfig['targetSchema'],
credentials_provider=oauth_sp)
And the response was as follows:
>>> dbConfig = {'targetDvUser':db_user,
... 'targetDvPassword':password,
... 'targetDvHost':'dbc-098946cd-8b3a.cloud.databricks.com',
... 'targetDvCatalog':'ade_test',
... 'targetHttpPath':'/sql/1.0/warehouses/7b15eacc3fef2e55',
... 'targetSchema':'manager_test_rdv'}
>>> import logging,sys
>>> from databricks.sdk.core import Config, oauth_service_principal
>>> from databricks import sql
>>> from databricks.sql.exc import Error, OperationalError
>>> logging.basicConfig(stream=sys.stderr,
... level=logging.INFO,
... format='%(asctime)s [%(name)s][%(levelname)s] %(message)s')
>>> logging.getLogger('databricks.sdk').setLevel(logging.DEBUG)
>>> config = Config(host=f"https://{dbConfig['targetDvHost']}",
... client_id=dbConfig['targetDvUser'],
... client_secret=dbConfig['targetDvPassword'])
2024-04-26 12:37:43,611 [databricks.sdk][DEBUG] Attempting to configure auth: pat
2024-04-26 12:37:43,611 [databricks.sdk][DEBUG] Attempting to configure auth: basic
2024-04-26 12:37:43,611 [databricks.sdk][DEBUG] Attempting to configure auth: metadata-service
2024-04-26 12:37:43,611 [databricks.sdk][DEBUG] Attempting to configure auth: oauth-m2m
>>> oauth_sp = oauth_service_principal(config)
>>> conn = sql.connect(server_hostname=dbConfig['targetDvHost'],
... http_path=dbConfig['targetHttpPath'],
... catalog=dbConfig['targetDvCatalog'],
... schema=dbConfig['targetSchema'],
... credentials_provider=oauth_sp)
2024-04-26 12:38:22,632 [databricks.sdk.oauth][DEBUG] Retrieving token for f15f66cb-d173-432b-b530-896d2834d4c7
2024-04-26 12:38:22,967 [databricks.sql.thrift_backend][INFO] Error during request to server: {"method": "OpenSession", "session-id": null, "query-id": null, "http-code": null, "error-message": "", "original-exception": "'dict' object is not callable", "no-retry-reason": "non-retryable error", "bounded-retry-delay": null, "attempt": "1/30", "elapsed-seconds": "0.00017309188842773438/900.0"}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/harri.pylkkanen/workspace/dv-executor/venv/lib/python3.11/site-packages/databricks/sql/__init__.py", line 50, in connect
return Connection(server_hostname, http_path, access_token, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/harri.pylkkanen/workspace/dv-executor/venv/lib/python3.11/site-packages/databricks/sql/client.py", line 189, in __init__
self._session_handle = self.thrift_backend.open_session(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/harri.pylkkanen/workspace/dv-executor/venv/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 464, in open_session
response = self.make_request(self._client.OpenSession, open_session_req)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/harri.pylkkanen/workspace/dv-executor/venv/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 393, in make_request
self._handle_request_error(error_info, attempt, elapsed)
File "/Users/harri.pylkkanen/workspace/dv-executor/venv/lib/python3.11/site-packages/databricks/sql/thrift_backend.py", line 261, in _handle_request_error
raise network_request_error
databricks.sql.exc.RequestError: Error during request to server
So indeed the oauth bearer token is generated (or being generated), but then the open session crashes - is this a service principal permissions related thing or some configuration missed somewhere?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-26-2024 03:24 AM
@harripy
Looking at the SDK logs - i've got no idea what could happen here. I think that SQL logs could be handy. Could you set databricks.sql logging level to DEBUG as well?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-28-2024 10:49 PM
These log lines will be added with databricks.sql DEBUG:
2024-04-29 08:47:10,276 [databricks.sql.thrift_backend][DEBUG] retry parameter: _retry_delay_min given_or_default 1.0
2024-04-29 08:47:10,277 [databricks.sql.thrift_backend][DEBUG] retry parameter: _retry_delay_max given_or_default 60.0
2024-04-29 08:47:10,277 [databricks.sql.thrift_backend][DEBUG] retry parameter: _retry_stop_after_attempts_count given_or_default 30
2024-04-29 08:47:10,277 [databricks.sql.thrift_backend][DEBUG] retry parameter: _retry_stop_after_attempts_duration given_or_default 900.0
2024-04-29 08:47:10,277 [databricks.sql.thrift_backend][DEBUG] retry parameter: _retry_delay_default given_or_default 5.0
2024-04-29 08:47:10,309 [databricks.sql.thrift_backend][DEBUG] Sending request: TOpenSessionReq(client_protocol=None, username=None, password=None, configuration={'spark.thriftserver.arrowBasedRowSet.timestampAsString': 'false'}, getInfos=None, client_protocol_i64=42247, connectionProperties=None, initialNamespace=TNamespace(catalogName='ade_test', schemaName='manager_test_rdv'), canUseMultipleCatalogs=True, sessionId=None)
2024-04-29 08:47:10,310 [databricks.sql.thrift_backend][INFO] Error during request to server: {"method": "OpenSession", "session-id": null, "query-id": null, "http-code": null, "error-message": "", "original-exception": "'dict' object is not callable", "no-retry-reason": "non-retryable error", "bounded-retry-delay": null, "attempt": "1/30", "elapsed-seconds": "0.00023293495178222656/900.0"}
Traceback (most recent call last):
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
05-15-2024 10:02 AM
I am facing the same issue with the same error logs as @harripy. Can you please help @Yeshwanth @Dani ?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-03-2024 10:04 PM
Did anyone get this to work? I have tried the code above but I get a slightly different error but I don't see the same level of details from the logs
2024-10-04 14:59:25,508 [databricks.sdk][DEBUG] Attempting to configure auth: pat
2024-10-04 14:59:25,508 [databricks.sdk][DEBUG] Attempting to configure auth: basic
2024-10-04 14:59:25,509 [databricks.sdk][DEBUG] Attempting to configure auth: metadata-service
2024-10-04 14:59:25,509 [databricks.sdk][DEBUG] Attempting to configure auth: oauth-m2m
2024-10-04 14:59:25,523 [urllib3.connectionpool][DEBUG] Starting new HTTPS connection (1): adb-xx.azuredatabricks.net:443
2024-10-04 14:59:25,706 [urllib3.connectionpool][DEBUG] https://adb-xx.azuredatabricks.net:443 "GET /oidc/.well-known/oauth-authorization-server HTTP/11" 200 None
2024-10-04 14:59:25,714 [urllib3.connectionpool][DEBUG] Starting new HTTPS connection (1): adb-xx.azuredatabricks.net:443
2024-10-04 14:59:25,871 [urllib3.connectionpool][DEBUG] https://adb-xx.azuredatabricks.net:443 "GET /oidc/.well-known/oauth-authorization-server HTTP/11" 200 None
Traceback (most recent call last):
File "H:\Documents\Git_New\DataHubUserAccess\source\Python\take3.py", line 28, in <module>
conn = sql.connect(server_hostname=dbConfig['targetDvHost'],
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\matcon\Python\Lib\site-packages\databricks\sql\__init__.py", line 50, in connect
return Connection(server_hostname, http_path, access_token, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\matcon\Python\Lib\site-packages\databricks\sql\client.py", line 157, in __init__
auth_provider = get_python_sql_connector_auth_provider(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\matcon\Python\Lib\site-packages\databricks\sql\auth\auth.py", line 96, in get_python_sql_connector_auth_provider
return get_auth_provider(cfg)
^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\matcon\Python\Lib\site-packages\databricks\sql\auth\auth.py", line 68, in get_auth_provider
raise RuntimeError("No valid authentication settings!")
RuntimeError: No valid authentication settings!

