Wednesday
Hi everyone,
I have a question regarding networking.
A bit of background first: For security reasons, the current allow-policy from GCP to our on-prem infrastructure is being replaced by a deny-policy for traffic originating from GCP. Therefore, access needs to be granted explicitly for each on-prem data source/service that is going to be used from GCP.
Our setup: We have 4 DBX workspaces and the corresponding 4 subnets in the GCP customer-managed VPC. Our compute clusters now use Google Compute Engine and start in the corresponding subnet.
Problem: It seems that traffic originating from GCP clusters when connecting to on-prem data sources comes from all over the place: sometimes from the node subnets and sometimes from the pod subnets.
My question is: What can I do to pinpoint exactly where the traffic originates when we connect to an on-prem data source? Which subnets (node, pod, service) does Databricks use for clusters, SQL warehouses and so on?
Looking forward to our discussions. Thank you!
Best Regards
Wednesday
Hi @KLin,
Adding some comments:
Subnets in GCP VPC:
Traffic Origin:
Identifying Traffic Origin:
Wednesday
Thanks for the timely reply.
With regard to your reply:
A follow-up question from me: How does DBX decide whether to use node subnets or pod subnets? And since DBX already has an incentive to move from GKE to GCE (we have already made the switch but, correct me if I am wrong, it seems like the workspaces are still hosted on GKE), do you know how the switch impacts the workspaces and the subnets?
Thanks a lot!
Wednesday
Hi @KLin - no problem! You can share your workspace ID via a DM, and I can try getting more details if need be.
Databricks decides whether to use node subnets or pod subnets based on the specific network configuration and the type of traffic. The node subnets are used for the Google Compute Engine (GCE) virtual machines that host the nodes, while the pod subnets are used for the individual pods within the Google Kubernetes Engine (GKE) clusters.
Regarding the switch from GKE to GCE, the workspaces are still hosted on GKE, which means that the network configurations involving node subnets and pod subnets remain relevant. The switch to GCE primarily impacts the underlying infrastructure but does not change the way workspaces and subnets are managed within the GKE clusters. The workspaces will continue to use the same subnet configurations for nodes and pods as defined during their creation.
Wednesday
Hi @Alberto_Umana, thank you for the detailed explanation. I figured out that for clusters/SQL warehouses that do not have the x-databricks-nextgen-cluster tag, i.e. those still using GKE, the traffic originates from the pod subnet. If the clusters have the GCE tag, the traffic originates from the node subnet. Is there a reason why it is done this way?
Much appreciated!
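The rule observed above can be written down as a small lookup. This is only a sketch of the observation in this post: the function name `expected_source_subnet` and the plain-dictionary shape of the tags are assumptions for illustration, not a Databricks API, and the check looks only at whether the tag key is present.

```python
# Tag named in this thread; clusters carrying it run on GCE.
NEXTGEN_TAG = "x-databricks-nextgen-cluster"


def expected_source_subnet(tags: dict) -> str:
    """Return the subnet type that egress traffic is expected to come from.

    Hypothetical helper: encodes only the observation from this thread
    (tag present -> node subnet, tag absent -> pod subnet).
    """
    return "node" if NEXTGEN_TAG in tags else "pod"
```

For example, a cluster whose tags include `x-databricks-nextgen-cluster` would be expected to show up with a node-subnet address, and one without it with a pod-subnet address.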
Wednesday
Hi @KLin, happy to help!
The reason why traffic originates from the pods subnet for clusters/SQL warehouses without the x-databricks-nextgen-cluster tag (still using GKE) and from the node subnet for clusters with the GCE tag is due to the underlying infrastructure differences between Google Kubernetes Engine (GKE) and Google Compute Engine (GCE).
In GKE, Databricks clusters are implemented as Kubernetes namespaces, and the traffic is managed at the pod level. Each pod within the GKE cluster is assigned an IP address from the pod subnet, which is why traffic originates from the pods subnet.
On the other hand, when using GCE, the clusters are hosted on virtual machines (VMs) rather than Kubernetes pods. These VMs are assigned IP addresses from the node subnet. Therefore, traffic for clusters with the GCE tag originates from the node subnet.
This distinction is made to align with the network configurations and resource management specific to each type of infrastructure (GKE for Kubernetes-based deployments and GCE for VM-based deployments).
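To check which subnet a given connection actually came from, one option is to take the source IPs seen on the on-prem side (for example in firewall logs) and match them against the subnet ranges. The sketch below uses Python's standard `ipaddress` module; the CIDR ranges and the name `classify_source_ip` are placeholders chosen for illustration and must be replaced with the ranges of your own workspace subnets.

```python
import ipaddress

# Hypothetical CIDR ranges, for illustration only.
# Replace with the node, pod and service ranges of your own VPC subnets.
SUBNETS = {
    "node": ipaddress.ip_network("10.0.0.0/24"),
    "pod": ipaddress.ip_network("10.1.0.0/16"),
    "service": ipaddress.ip_network("10.2.0.0/20"),
}


def classify_source_ip(ip: str) -> str:
    """Return the name of the configured subnet containing ip, or 'unknown'."""
    addr = ipaddress.ip_address(ip)
    for name, network in SUBNETS.items():
        if addr in network:
            return name
    return "unknown"


if __name__ == "__main__":
    # Example source IPs as they might appear in an on-prem log.
    for source in ["10.0.0.15", "10.1.42.7", "192.168.1.1"]:
        print(source, "->", classify_source_ip(source))
```

With one set of ranges per workspace, the same check shows whether a given cluster's traffic is coming from the node or the pod range, which is the information needed to write the per-source allow rules.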
Wednesday
Awesome! Thanks a lot for the clear and detailed info. Have a nice day!