10-06-2024 11:45 PM
Hi,
Our data workspace architecture consists of a collection of discrete workspaces segregated according to business function and environment. Moreover, they are not all deployed to the same region: dev and staging are deployed to Southeast Asia, whereas production workspaces are deployed to Northern Europe. Each workspace is configured to access isolated resources (storage account, key vault, etc.) and we also enabled Azure Private Link as a simplified deployment according to this document. We allow public access to the workspaces since there are other security provisions in place.
We have a scenario where a process on workspace A needs to invoke a job on workspace B. This should theoretically be possible via REST or the SDK. However, when making a REST call (via the SDK) from A which should trigger the job on B, we get the following error:
Cert validation failed. Cross workspace access is denied due to network policies. Config: host=https://adb-***.azuredatabricks.net, client_id=***, client_secret=***, auth_type=oauth-m2m
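For reference, the kind of call described above can be sketched without the SDK, using only the Python standard library. This is a minimal sketch, not the poster's actual code: it assumes the OAuth M2M flow shown in the error (`auth_type=oauth-m2m`), the workspace token endpoint `/oidc/v1/token`, and the Jobs API endpoint `/api/2.1/jobs/run-now`. The function names, and whatever host, credentials, and job ID you pass in, are illustrative.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(host, client_id, client_secret):
    """Build an OAuth M2M (client credentials) token request for workspace `host`."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode()
    return urllib.request.Request(
        f"{host.rstrip('/')}/oidc/v1/token",
        data=body,
        headers={"Authorization": f"Basic {basic}"},
        method="POST",
    )


def build_run_now_request(host, access_token, job_id):
    """Build a Jobs API run-now request that triggers `job_id` on workspace `host`."""
    body = json.dumps({"job_id": job_id}).encode()
    return urllib.request.Request(
        f"{host.rstrip('/')}/api/2.1/jobs/run-now",
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Run from a notebook on workspace A, one would call `send(build_token_request(...))` with workspace B's host and the service principal's credentials, read `access_token` from the result, then call `send(build_run_now_request(...))`. Calling the REST endpoints directly like this can help separate SDK behaviour from network behaviour when chasing the error above.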
For the record, we're able to authenticate with a service principal and access the workspace's own REST endpoints without issues. Moreover - and this struck me as particularly odd, but bear in mind our limited networking knowledge - we were able to invoke the SDK on all external workspaces which are deployed in any region other than the "calling" workspace's region.
E.g. if workspace A is deployed in Southeast Asia, we are able to use the SDK to, say, list all jobs created in workspace B deployed in Northern Europe, and vice versa. Additionally, we are also able to invoke the API on any workspace which does not have a private link configured, irrespective of the region in which the workspace was deployed.
Other than disabling the private link, is there any network configuration we can apply to allow cross-workspace access? Any pointers are hugely appreciated!
P.S. I also have a secondary question: with the advent of serverless compute (both SQL and notebook/workflow), is there still value in deploying Azure Private Link between the control and data planes?
Thank you.
10-07-2024 12:27 AM - edited 10-07-2024 12:27 AM
Hi @james_farrugia,
To enable cross-workspace REST API access, you can configure VNet peering between the VNets that host your Databricks workspaces. This will allow them to communicate with each other directly.
Steps:
1. Configure VNet peering:
   - Navigate to each VNet in Azure and configure the necessary peering connections to connect the VNets hosting your Databricks workspaces.
   - Ensure that the peering is correctly set up to allow traffic between the VNets.
2. Check Network Security Groups (NSGs):
   - Identify the subnets associated with each VNet, as there may be NSGs configured on them.
   - If an NSG is present, you'll need to modify the inbound and outbound rules to allow traffic between the VNets, as NSGs control network traffic. Make sure to allow the appropriate ports and protocols for cross-workspace communication.
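After changing peering or NSG rules, a quick way to confirm that traffic actually gets through is a plain TCP connection test from a notebook on the calling workspace. The sketch below assumes the REST API is reached over HTTPS on port 443; the helper name `can_connect` is illustrative, and the host you pass in would be the target workspace's hostname.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

A `True` result only shows that the network path is open at the TCP level. The request can still be rejected afterwards by TLS or workspace-level access checks.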
10-07-2024 12:49 AM
Hi @filipniziol ,
Thanks for the tip. I might post further queries here, especially with regard to NSGs, as I've never configured these manually before.
One question: why is it that VNet peering and NSG changes are not required when invoking a service on a host deployed in a region other than the calling workspace's?
Regards,
James
10-07-2024 11:57 PM
Hi @james_farrugia ,
"Additionally, we are also able to invoke the API on any workspace which does not have a private link configured, irrespective of the region in which the workspace was deployed."
Regarding your observation that you can invoke the API on any workspace without Private Link configured:
If Private Link is not configured, then the traffic is routed via the public internet, which allows cross-region access without specific network restrictions.
Have you encountered any situation where you had two workspaces with Private Link configured and were able to invoke an API call from one workspace with Private Link to the other workspace, also with Private Link enabled?
If yes, this could indicate a misconfiguration, where both Private Link and public access are enabled simultaneously on the workspace. This setup would allow API calls via the public endpoint, bypassing the restrictions intended by Private Link.
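One way to see which path a call is likely to take is to check what the target workspace hostname resolves to from the calling cluster. The sketch below is only a diagnostic aid under the assumption that a private endpoint shows up as a private IP address in DNS, whereas the public endpoint resolves to a public one. The function names are illustrative.

```python
import ipaddress
import socket


def classify_address(ip: str) -> str:
    """Label an IP address as 'private' or 'public'.

    Note: is_private covers the RFC 1918 ranges as well as loopback
    and link-local addresses.
    """
    return "private" if ipaddress.ip_address(ip).is_private else "public"


def resolve_workspace(hostname: str) -> dict:
    """Resolve `hostname` and classify every address it maps to."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    addresses = sorted({info[4][0] for info in infos})
    return {ip: classify_address(ip) for ip in addresses}
```

Running `resolve_workspace` against the same target hostname from a same-region workspace and from a cross-region workspace, and comparing the results, would show whether the two callers are being directed to different endpoints.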
Let me know your thoughts on this or if you need further help!
10-08-2024 03:19 AM
Hi @filipniziol ,
"Have you encountered any situation where you had two workspaces with Private Link configured and were able to invoke an API call from one workspace with Private Link to the other workspace, also with Private Link enabled?"
All our production workspaces are deployed with Private Link. Production workspaces are deployed in Northern Europe. We are able to access a REST endpoint hosted on any production workspace from a workspace deployed in Southeast Asia.
Therefore, in the cross-region scenario, cross-workspace access is not an issue, which leads me to conclude that VNet peering is necessary only when accessing workspaces deployed in the same region as the caller.
"If yes, this could indicate a misconfiguration, where both Private Link and public access are enabled simultaneously on the workspace. This setup would allow API calls via the public endpoint, bypassing the restrictions intended by Private Link."
We've implemented the scenario in the red box below and it seems to be working well. This means that classic compute in the data plane can interact with notebooks, jobs, etc. in the control plane. At this point, we're fine with workspaces being accessed through the public IP - my objective was to eliminate public IPs between the control and data planes.
What is odd is that cross-workspace access poses no problem when bridging regions but throws an error when we try to invoke a call on a host which is in the same region as the calling workspace (as mentioned above).
Regards,
James