SQL error on MultiNode cluster, but fine on SingleNode
10-17-2024 10:43 AM
If I run the following code on a cluster in SingleNode mode it works fine, but if I run the exact same cell on a MultiNode cluster it throws:
SparkConnectGrpcException: (java.sql.SQLTransientConnectionException) Could not connect to address=(host=HOSTm)(port=PORT)(type=master) : (conn=715518) Connections using insecure transport are prohibited while --require_secure_transport=ON.
Code:
secret_scope="myScope"
user = dbutils.secrets.get(secret_scope, "MyUser")
password = dbutils.secrets.get(secret_scope, "MyPass")
url = dbutils.secrets.get(secret_scope, "MyURL") #format: jdbc:mysql://DOMAIN:PORT/SCHEMA
host = dbutils.secrets.get(secret_scope, "MyHost")
options = {
"url": url,
"query": "select 1 as id",
"user": user,
"password": password,
"useSSL": "True",
"sslmode": "required",
"ssl" : "{ \"ca\" = \"/dbfs/databricks/certs/aws-global-bundle.pem\"}",
"serverSslCert": "dbfs:/databricks/certs/aws-global-bundle.pem",
"isolationLevel":"READ_UNCOMMITTED",
"enabledSslProtocolSuites":"TLSv1.2",
}
df = spark.read.format('JDBC').options(**options).load()
df.display()
Any ideas? It seems like maybe it's some Spark setting I'm missing.
10-17-2024 07:26 PM
Try moving the .pem files from DBFS to WSFS or a Volume.
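For example, a minimal sketch of copying the bundle from DBFS into a Unity Catalog Volume (the catalog/schema/volume names are placeholders) and pointing serverSslCert at the new path:

# Copy the CA bundle from DBFS into a UC Volume (placeholder Volume path)
dbutils.fs.cp(
    "dbfs:/databricks/certs/aws-global-bundle.pem",
    "/Volumes/catalog/schema/volume/aws-global-bundle.pem",
)

# Then reference the Volume path in the JDBC options
options["serverSslCert"] = "/Volumes/catalog/schema/volume/aws-global-bundle.pem"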
10-18-2024 08:14 AM
I tried both and got the same error.
10-18-2024 08:33 AM
I changed the path and got the same error.
To confirm it's not an issue with the .pem file, I was able to view the contents of the file with:
dbutils.fs.head("/Volumes/catalog/schema/volume/aws-global-bundle.pem")
10-18-2024 08:48 AM
Try this code from Shared Compute:
dbutils.fs.ls("/dbfs/databricks/")
I see you are using /dbfs/databricks under the SSL key.
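As a quick sanity check (a sketch; the path assumes the cert location from your post), you can also test whether the local FUSE path resolves at all on the shared cluster:

import os

# On a USER_ISOLATION (shared) cluster the /dbfs FUSE mount is restricted,
# so this may print False even though dbfs:/databricks/certs/ exists.
print(os.path.exists("/dbfs/databricks/certs/aws-global-bundle.pem"))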
10-18-2024 09:19 AM
Thanks for the quick reply.
dbutils.fs.ls("dbfs:/databricks/certs/")
[FileInfo(path='dbfs:/databricks/certs/aws-global-bundle.pem', name='aws-global-bundle.pem', size=152872, modificationTime=1728053118000)]
secret_scope="myScope"
user = dbutils.secrets.get(secret_scope, "MyUser")
password = dbutils.secrets.get(secret_scope, "MyPass")
url = dbutils.secrets.get(secret_scope, "MyURL") #format: jdbc:mysql://DOMAIN:PORT/SCHEMA
host = dbutils.secrets.get(secret_scope, "MyHost")
options = {
"url": url,
"query": "select 1 as id",
"user": user,
"password": password,
"useSSL": "True",
"sslmode": "required",
"serverSslCert": "dbfs:/databricks/certs/aws-global-bundle.pem",
"isolationLevel":"READ_UNCOMMITTED",
"enabledSslProtocolSuites":"TLSv1.2",
}
df = spark.read.format('JDBC').options(**options).load()
df.display()
Same error:
SparkConnectGrpcException: (java.sql.SQLTransientConnectionException) Could not connect to address=(host=HOSTm)(port=PORT)(type=master) : (conn=715518) Connections using insecure transport are prohibited while --require_secure_transport=ON.
10-21-2024 08:48 PM
I see you gave a thumbs up; if the solution worked, can you please Accept it?
10-22-2024 07:48 AM
Sorry, I was just trying to get your attention back on this post. No, the issue is not resolved.
10-22-2024 08:09 AM
Have you granted ANY FILE access?
https://docs.databricks.com/en/dbfs/unity-catalog.html#how-does-dbfs-work-in-shared-access-mode
GRANT SELECT ON ANY FILE TO `<user@domain-name>`
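For example, from a notebook attached to the shared cluster (the principal below is a placeholder), something like:

# Grant legacy ANY FILE access to the user running the JDBC read (placeholder principal)
spark.sql("GRANT SELECT ON ANY FILE TO `user@domain-name`")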
10-22-2024 08:33 AM
Thanks for looking into this.
I was able to grant myself SELECT ON ANY FILE, but it did not resolve the issue.
10-22-2024 12:01 PM
Would it be possible to paste the cluster config screenshots of the one that works and the one that fails?
10-22-2024 12:34 PM - edited 10-22-2024 12:37 PM
Can you try with this config?
options = {
    "url": url,
    "query": "select 1 as id",
    "user": user,
    "password": password,
    "useSSL": "true",  # use lowercase 'true'
    "sslmode": "VERIFY_CA",
    "serverSslCert": "/dbfs/databricks/certs/aws-global-bundle.pem",
    "isolationLevel": "READ_UNCOMMITTED",
    "enabledSslProtocolSuites": "TLSv1.2",
}
There are some minor changes; please see whether this works in both cluster modes.
10-22-2024 12:45 PM
Unfortunately, same error.
I emailed the cluster config JSON files to our SA, whom you are working with.
The difference is:
Works:
"data_security_mode": "NONE",
"spark_conf": {
    "spark.master": "local[*, 4]",
    "spark.databricks.cluster.profile": "singleNode"
},
Doesn't work:
"spark_conf": {},
"data_security_mode": "USER_ISOLATION"
10-22-2024 01:12 PM
data_security_mode": "NONE": This is a non-Unity Catalog Cluster. No Governance enforced.
"data_security_mode": "USER_ISOLATION": This is a UC Shared Compute cluster that has certain limitations when accessing Low-Level APIs, RDDs, and dbfs/data bricks folders.
If the .pem files are copied under /Workspace/Shared or /Volumes you should be able to access them via
/Workspace/Shared/file.pem
/Volumes/path/file.pem
Please make sure READ access to these folders is available.
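For the Volume case, that might look like the sketch below (the catalog/schema/volume names and the principal are placeholders, and it assumes the cert lives in a Unity Catalog Volume):

# Grant read access on the UC Volume that holds the cert (placeholder names)
spark.sql("GRANT READ VOLUME ON VOLUME catalog.schema.volume TO `user@domain-name`")

# Confirm the file is readable from the shared cluster
dbutils.fs.head("/Volumes/catalog/schema/volume/aws-global-bundle.pem")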
10-22-2024 02:26 PM
So I copied the .pem file into a Volume:
/Volumes/catalog/schema/volume/aws-global-bundle.pem
and into the workspace:
/Workspace/Shared/global-bundle.pem
Data security mode NONE cannot read this file and so throws the error:
Failed to find serverSslCert file. serverSslCert=/Volumes/catalog/schema/volume.....
This is expected, since UC is basically not enabled in this mode. I get the same thing if I try to reference it at the workspace location: /Workspace/Shared/global-bundle.pem.
But really I want it to work with data security mode USER_ISOLATION; there the result is the same error as before, using both locations:
SparkConnectGrpcException: (java.sql.SQLTransientConnectionException) Could not connect to address=(host=HOSTm)(port=PORT)(type=master) : (conn=715518) Connections using insecure transport are prohibited while --require_secure_transport=ON.
What is interesting to me is that it's either not getting to the part where it looks for this .pem file, OR it's getting past it but erroring out afterwards. To test this I tried a bogus file location in the volume:
/Volumes/catalog/schema/volume/bad.pem
Same error, which makes me think it's erroring before it even goes to look for that file....
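One way to probe that theory (a sketch, not a fix: it assumes the MySQL/MariaDB connector on the cluster honors SSL parameters embedded in the JDBC URL, which depends on the driver version, so treat the parameter names as placeholders) is to fold the SSL settings into the URL instead of passing them as separate options:

# Hypothetical: embed the SSL settings directly in the JDBC URL
ssl_url = url + "?sslMode=verify-ca&serverSslCert=/Volumes/catalog/schema/volume/aws-global-bundle.pem"

df = (
    spark.read.format("jdbc")
    .option("url", ssl_url)
    .option("query", "select 1 as id")
    .option("user", user)
    .option("password", password)
    .load()
)
df.display()

If that connects, the options dict is probably not reaching the driver on the shared cluster; if it fails the same way, the cert path is likely not the problem.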