01-31-2024 05:55 AM
The story is that I can access a service listening on port 80 using a single node cluster, but can't do the same using a shared node cluster.
I read about `spark.databricks.pyspark.iptable.outbound.whitelisted.ports`; however, setting it:
`spark.databricks.pyspark.iptable.outbound.whitelisted.ports 587,9100,9243,443,22,80`
does not make it work.
I would like to know if there are settings I'm missing, or if the setting above is the only one supposed to be used in this case.
01-31-2024 07:34 AM
Hi, I hope you are doing well.
Can you confirm whether connectivity works while using the "Single User" and "No-Isolation Shared" clusters? Shared clusters block outbound traffic to some ports by default.
Workaround 1:
Add this Spark property on the shared cluster and re-run the same connectivity tests (a sample check follows the block):
====
spark.databricks.pyspark.iptable.outbound.whitelisted.ports 4554
====
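For the connectivity test itself, here is a minimal sketch you could run from a notebook %sh cell (the host name is a placeholder for your service, and it assumes nc is available on the cluster image):
====
# Placeholder host; exits 0 only if the outbound TCP connection succeeds
nc -zv my-service.example.com 4554
====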
If this fails, please configure an init script on the shared cluster where this issue is observed. The init script uses the iptables firewall to open INPUT/OUTPUT TCP connections to port 4554 for the cluster.
Workaround 2:
Create an init script with the script below, attach it to the cluster, and run the connectivity test.
====
#!/bin/bash
# Allow outbound TCP connections to port 4554
iptables -A OUTPUT -p tcp --dport 4554 -j ACCEPT
# Allow inbound TCP connections to port 4554
iptables -A INPUT -p tcp --dport 4554 -j ACCEPT
====
Please try both workarounds and keep us posted on the progress. Also, do not hesitate to reach out to us if you need any help.
Note: I have taken port 4554 as an example. Please change it as per your use case.
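To confirm the init script's rules were actually applied, here is a quick check you could run from a notebook %sh cell (a sketch; it assumes the cell runs with root privileges, which shared clusters may restrict):
====
# List the current rules and look for the opened port (4554 here)
iptables -S | grep 4554
====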
02-01-2024 01:33 AM
spark.databricks.pyspark.iptable.outbound.whitelisted.ports <-- this is not working
This rule is supposed to accept incoming connections to port 80; however, my problem is the outgoing connection.
Are you sure I need to add this?
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
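For an outbound connection I would expect only the OUTPUT rule to be needed, something like the sketch below (assuming the default chains already accept ESTABLISHED/RELATED return traffic):
====
# Outbound leg only; return packets are presumed to be matched by an
# existing ESTABLISHED/RELATED conntrack rule (assumption)
iptables -A OUTPUT -p tcp --dport 80 -j ACCEPT
====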
02-01-2024 03:32 AM
The init.sh can't be tested:
INVALID_PARAMETER_VALUE: Attempting to install the following init scripts that are not in the allowlist. /Volumes/main/default/datalake/libs/init.sh: PERMISSION_DENIED: '/Volumes/main/default/datalake/libs/init.sh' is not in the artifact allowlist
I'll be back as soon as possible.
02-01-2024 11:05 PM
@6502 Please try placing the init script in an S3 or Workspace location and share the results here.
09-17-2024 11:32 AM
@6502 You first need to allow the script in the metastore configuration (navigate to Catalog, click the small cogwheel at the top of the page, then click your metastore name) and add the script's path to the artifact allowlist there.
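If the UI path is hard to locate, the allowlist can also be managed through the Unity Catalog artifact-allowlists REST API; a sketch, with the workspace URL and token as placeholders:
====
# Add the Volumes folder holding init.sh to the init-script allowlist
# (<workspace-url> and <token> are placeholders)
curl -X PUT "https://<workspace-url>/api/2.1/unity-catalog/artifact-allowlists/INIT_SCRIPT" \
  -H "Authorization: Bearer <token>" \
  -d '{"artifact_matchers": [{"artifact": "/Volumes/main/default/datalake/libs/", "match_type": "PREFIX_MATCH"}]}'
====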