Data Engineering
Databricks-connect OpenSSL Handshake failed on WSL2

vpacik
New Contributor

When trying to set up databricks-connect on WSL2 against a 13.3 cluster, I receive the following OpenSSL error: CERTIFICATE_VERIFY_FAILED.

Authentication is done via the SPARK_REMOTE environment variable.
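For context, the SPARK_REMOTE connection string for Databricks Connect typically has the following shape; the workspace host, token, and cluster ID below are placeholders, not values from this thread:

```shell
# Hypothetical values; substitute your own workspace URL, PAT, and cluster ID.
export SPARK_REMOTE="sc://<workspace-instance>:443/;token=<personal-access-token>;x-databricks-cluster-id=<cluster-id>"
```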

E0415 11:24:26.646129568  142172 ssl_transport_security.cc:1519]       Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED.
Traceback (most recent call last):
  File "/home/vpacik/Codes/spark-test.py", line 5, in <module>
    spark.range(10).show()
  File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/dataframe.py", line 996, in show
    print(self._show_string(n, truncate, vertical))
  File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/dataframe.py", line 753, in _show_string
    ).toPandas()
  File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/dataframe.py", line 1655, in toPandas
    return self._session.client.to_pandas(query)
  File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 798, in to_pandas
    table, schema, metrics, observed_metrics, _ = self._execute_and_fetch(req)
  File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 1172, in _execute_and_fetch
    for response in self._execute_and_fetch_as_iterator(req):
  File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 1153, in _execute_and_fetch_as_iterator
    self._handle_error(error)
  File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 1308, in _handle_error
    self._handle_rpc_error(error)
  File "/home/vpacik/Codes/.venv/lib/python3.10/site-packages/pyspark/sql/connect/client/core.py", line 1348, in _handle_rpc_error
    raise SparkConnectGrpcException(str(rpc_error)) from None
pyspark.errors.exceptions.connect.SparkConnectGrpcException: <_MultiThreadedRendezvous of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:20.42.4.211:443: Ssl handshake failed: SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED"
        debug_error_string = "UNKNOWN:Error received from peer  {created_time:"2024-04-15T11:24:26.646729568+02:00", grpc_status:14, grpc_message:"failed to connect to all addresses; last error: UNKNOWN: ipv4:20.42.4.211:443: Ssl handshake failed: SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED"}"

The script used:

from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.getOrCreate()
spark.range(10).show()
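Since CERTIFICATE_VERIFY_FAILED usually means the client cannot validate the server's certificate chain against its local trust store, a quick diagnostic inside WSL2 is to check which CA bundle Python/OpenSSL is actually using. This is a generic sketch, not part of the original script:

```python
# Diagnostic sketch: inspect the CA bundle visible to Python's OpenSSL on WSL2.
import ssl

paths = ssl.get_default_verify_paths()
print("cafile:", paths.cafile)    # e.g. /etc/ssl/certs/ca-certificates.crt on Debian/Ubuntu
print("capath:", paths.capath)

ctx = ssl.create_default_context()
print("loaded CA certs:", len(ctx.get_ca_certs()))
```

If `cafile` is None or no CA certificates load, the WSL2 distribution's `ca-certificates` package is likely missing or stale, which would produce exactly this handshake failure.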

 

1 REPLY

Kaniz
Community Manager

Hi @vpacik, 

  • One approach to resolve this is to disable SSL certificate verification. However, keep in mind that this approach may compromise security.
  • In your Databricks configuration file (usually located at ~/.databrickscfg), add the following line to the relevant profile:
    insecure = True
    
  • Save the changes and try running your Databricks Connect setup again.
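Putting the suggestion above together, a `~/.databrickscfg` profile with verification disabled might look like the following; the host, token, and cluster ID are placeholders, and whether the `insecure` key is honored depends on your databricks-connect/SDK version:

```ini
[DEFAULT]
host       = https://<workspace-instance>
token      = <personal-access-token>
cluster_id = <cluster-id>
insecure   = True
```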
  • To use TLS 1.3 on your clusters, you can enable OpenJSSE in the cluster’s Apache Spark configuration.
  • Add the following line to your Spark Config (e.g., in spark-defaults.conf):
    spark.driver.extraJavaOptions -XX:+UseOpenJSSE
    
  • Restart your cluster to apply the changes. TLS 1.3 should now be enabled and usable in your notebooks.