<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic JDBC / SSH-tunnel to connect to PostgreSQL not working on multi-node clusters in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/jdbc-ssh-tunnel-to-connect-to-postgresql-not-working-on-multi/m-p/66309#M33082</link>
    <description>&lt;P&gt;Hi everybody,&lt;BR /&gt;&lt;BR /&gt;I'm trying to set up a connection between Databricks' Notebooks and an external PostgreSQL database through an SSH tunnel. On a single-node cluster, this is working perfectly fine. However, when this is run on a multi-node cluster, this code fails. I assume it must be because of the localhost reference in the JDBC url which is interpreted by the executors on a multi-node cluster. However, inserting the hostname of the driver in that part of the URL with something like socket.getHostname() is not working either, as was explained here:&amp;nbsp;&lt;A href="https://stackoverflow.com/questions/68278369/how-to-use-pyspark-to-read-a-mysql-database-using-a-ssh-tunnel" target="_blank" rel="noopener"&gt;https://stackoverflow.com/questions/68278369/how-to-use-pyspark-to-read-a-mysql-database-using-a-ssh-tunnel&lt;/A&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;Does anyone know how to make this work on multi-node clusters as well?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;# Set up SSH tunnel
with SSHTunnelForwarder(
    ("data.host", 22),
    ssh_username="databricks",
    ssh_pkey=paramiko.RSAKey.from_private_key(private_key),
    remote_bind_address=("127.0.0.1", 5432),
    local_bind_address=("localhost", 5432)
) as tunnel:
    print(tunnel.is_active)
    jdbc_url = f"jdbc:postgresql://localhost:{tunnel.local_bind_port}/{database}"
    print(jdbc_url)
    df = spark.read \
        .format("jdbc") \
        .option("url", jdbc_url) \
        .option("user", username) \
        .option("password", password) \
        .option("driver", driver) \
        .option("dbtable", f"(SELECT * FROM schema_test.table_test) as tmp") \
        .load()
    display(df)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 16 Apr 2024 07:24:27 GMT</pubDate>
    <dc:creator>laurenskuiper97</dc:creator>
    <dc:date>2024-04-16T07:24:27Z</dc:date>
    <item>
      <title>JDBC / SSH-tunnel to connect to PostgreSQL not working on multi-node clusters</title>
      <link>https://community.databricks.com/t5/data-engineering/jdbc-ssh-tunnel-to-connect-to-postgresql-not-working-on-multi/m-p/66309#M33082</link>
      <description>&lt;P&gt;Hi everybody,&lt;BR /&gt;&lt;BR /&gt;I'm trying to set up a connection between Databricks' Notebooks and an external PostgreSQL database through an SSH tunnel. On a single-node cluster, this is working perfectly fine. However, when this is run on a multi-node cluster, this code fails. I assume it must be because of the localhost reference in the JDBC url which is interpreted by the executors on a multi-node cluster. However, inserting the hostname of the driver in that part of the URL with something like socket.getHostname() is not working either, as was explained here:&amp;nbsp;&lt;A href="https://stackoverflow.com/questions/68278369/how-to-use-pyspark-to-read-a-mysql-database-using-a-ssh-tunnel" target="_blank" rel="noopener"&gt;https://stackoverflow.com/questions/68278369/how-to-use-pyspark-to-read-a-mysql-database-using-a-ssh-tunnel&lt;/A&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;Does anyone know how to make this work on multi-node clusters as well?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;# Set up SSH tunnel
with SSHTunnelForwarder(
    ("data.host", 22),
    ssh_username="databricks",
    ssh_pkey=paramiko.RSAKey.from_private_key(private_key),
    remote_bind_address=("127.0.0.1", 5432),
    local_bind_address=("localhost", 5432)
) as tunnel:
    print(tunnel.is_active)
    jdbc_url = f"jdbc:postgresql://localhost:{tunnel.local_bind_port}/{database}"
    print(jdbc_url)
    df = spark.read \
        .format("jdbc") \
        .option("url", jdbc_url) \
        .option("user", username) \
        .option("password", password) \
        .option("driver", driver) \
        .option("dbtable", f"(SELECT * FROM schema_test.table_test) as tmp") \
        .load()
    display(df)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 16 Apr 2024 07:24:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jdbc-ssh-tunnel-to-connect-to-postgresql-not-working-on-multi/m-p/66309#M33082</guid>
      <dc:creator>laurenskuiper97</dc:creator>
      <dc:date>2024-04-16T07:24:27Z</dc:date>
    </item>
    <item>
      <title>Re: JDBC / SSH-tunnel to connect to PostgreSQL not working on multi-node clusters</title>
      <link>https://community.databricks.com/t5/data-engineering/jdbc-ssh-tunnel-to-connect-to-postgresql-not-working-on-multi/m-p/66400#M33102</link>
      <description>&lt;P&gt;I doubt it is possible.&lt;BR /&gt;The driver runs the program, and sends tasks to the executors.&amp;nbsp; But since creating the SSH tunnel is not a Spark task, I don't think it will be established on any executor.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Apr 2024 07:33:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/jdbc-ssh-tunnel-to-connect-to-postgresql-not-working-on-multi/m-p/66400#M33102</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2024-04-17T07:33:03Z</dc:date>
    </item>
  </channel>
</rss>

