Single-Node cluster works but Multi-Node clusters do not read data.

TheDataDexter
New Contributor III

I am currently working with a VNET-injected Databricks workspace. At the moment I have an ADLS Gen2 resource mounted on the Databricks cluster.

When running notebooks on a single-node cluster that read, transform, and write data, we do not encounter any problems. However, when I run the same notebook on a multi-node cluster, the Spark job hangs in a waiting stage indefinitely.

Looking deeper into the Spark cluster UI, I find an active stage: "load at NativeMethodAccessorImpl.java:0 (Stage 2.0)". When opening the stage page, we get a 403 error: Invalid or missing CSRF token.

Interestingly, for testing purposes we've deployed a Databricks workspace without VNET injection. There we use multi-node clusters without any issues.

-werners-
Esteemed Contributor III

Could it be your whitelisted firewall IP ranges? A wrong CIDR range?

Every node will have a separate IP address, so if your firewall is not configured correctly, the nodes cannot communicate.
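
As a rough way to check this, here is a minimal sketch using Python's standard `ipaddress` module. It assumes you can collect the private IPs of the driver and worker nodes yourself and that you know which CIDR ranges your firewall allows. The function name `nodes_outside_allowlist` and all addresses below are hypothetical placeholders, not values from this workspace.

```python
import ipaddress


def nodes_outside_allowlist(node_ips, allowed_cidrs):
    """Return the node IPs that are not covered by any allowed CIDR range.

    strict=True makes ip_network raise ValueError for a malformed range
    such as "10.0.1.5/24" (host bits set), which helps catch typos.
    """
    networks = [ipaddress.ip_network(cidr, strict=True) for cidr in allowed_cidrs]
    uncovered = []
    for ip in node_ips:
        address = ipaddress.ip_address(ip)
        if not any(address in network for network in networks):
            uncovered.append(ip)
    return uncovered


# Hypothetical example: the firewall allows a /26 (10.0.1.0 to 10.0.1.63),
# but one worker node received an address outside that range.
allowed = ["10.0.1.0/26"]
nodes = ["10.0.1.4", "10.0.1.70"]
print(nodes_outside_allowlist(nodes, allowed))  # ['10.0.1.70']
```

If the result is non-empty, the listed nodes sit outside the whitelisted ranges, which could explain why a single-node cluster works while a multi-node cluster stalls.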


TheDataDexter
New Contributor III

@Werner Stinckens thank you for your reply. I will take a look at the network configuration today.

ellafj
New Contributor II

@TheDataDexter Did you find a solution to your problem? I am facing the same issue.