VM bootstrap and authentication
When a VM boots, it automatically authenticates with the Databricks control plane using Managed Identity (MI), a per-VM credential signed by Azure AD. Once authenticated, the VM fetches secrets from the control plane, including the auth token required to initiate the relay connection. All communication during this step is TLS-encrypted using Databricks' server certificate, which Databricks rotates regularly.
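The bootstrap exchange described above can be sketched roughly as follows. This is a minimal illustration, not Databricks' actual implementation: the Azure Instance Metadata Service (IMDS) token endpoint and the `Metadata: true` header are standard Azure, but the control plane URL, the resource identifier, the `/bootstrap/secrets` path, and the shape of the returned secrets are all hypothetical. The HTTP call is passed in as a function so the flow can be exercised without a network.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict

# Azure Instance Metadata Service endpoint for managed identity tokens.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"

# Hypothetical values, for illustration only.
CONTROL_PLANE_URL = "https://control-plane.example.com"
CONTROL_PLANE_RESOURCE = "https://control-plane.example.com/"

HttpGet = Callable[[str, Dict[str, str]], bytes]


def default_http_get(url: str, headers: Dict[str, str]) -> bytes:
    """GET a URL; for https URLs urllib verifies the server certificate."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def fetch_mi_token(http_get: HttpGet = default_http_get) -> str:
    """Ask IMDS for a managed identity token scoped to the control plane."""
    query = urllib.parse.urlencode(
        {"api-version": "2018-02-01", "resource": CONTROL_PLANE_RESOURCE}
    )
    body = http_get(f"{IMDS_TOKEN_URL}?{query}", {"Metadata": "true"})
    return json.loads(body)["access_token"]


def fetch_bootstrap_secrets(http_get: HttpGet = default_http_get) -> dict:
    """Present the MI token to the control plane and return the secrets."""
    mi_token = fetch_mi_token(http_get)
    body = http_get(
        f"{CONTROL_PLANE_URL}/bootstrap/secrets",  # hypothetical path
        {"Authorization": f"Bearer {mi_token}"},
    )
    return json.loads(body)
```

On a real VM, `fetch_bootstrap_secrets()` would be called with the default HTTP function; in a test, a stub that returns canned JSON for each URL is enough to check that the MI token is forwarded as a bearer credential.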
During normal operation, the Databricks control plane sends HTTP-based RPC requests to each Spark worker to submit commands, check execution status, monitor node health, and so on. All traffic between the control plane and Spark workers is proxied through the Databricks relay. The main idea is that both the RPC server (the listener) and the RPC client (the sender) open outbound connections to the relay service, so neither side has to accept inbound connections. The relay server forwards requests and responses between senders and listeners.
Listeners authenticate with the relay server using per-workspace auth tokens. The Databricks control plane sends the token to worker VMs during bootstrap, as described above. All communication with the Databricks relay service is TLS-encrypted.
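The relay pattern and the listener token check can be modeled in a few lines. The sketch below is an in-process stand-in, assuming queues in place of the TLS connections that each side opens outbound to the relay; the class and function names (`Relay`, `ListenerChannel`, `serve`) and the workspace and worker identifiers are hypothetical, and sender authentication is left out because the text does not describe it.

```python
import hmac
import queue
import threading
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class ListenerChannel:
    """Stands in for the outbound connection a listener opened to the relay."""
    requests: queue.Queue = field(default_factory=queue.Queue)


class Relay:
    """Forwards requests from senders to registered listeners and back."""

    def __init__(self, workspace_tokens: Dict[str, str]) -> None:
        self._workspace_tokens = workspace_tokens
        self._listeners: Dict[Tuple[str, str], ListenerChannel] = {}

    def register_listener(
        self, workspace_id: str, worker_id: str, token: str
    ) -> ListenerChannel:
        """Accept a listener only if it presents its workspace's token."""
        expected = self._workspace_tokens.get(workspace_id, "")
        # Constant-time comparison avoids leaking token contents via timing.
        if not expected or not hmac.compare_digest(
            expected.encode(), token.encode()
        ):
            raise PermissionError("invalid relay auth token")
        channel = ListenerChannel()
        self._listeners[(workspace_id, worker_id)] = channel
        return channel

    def send(self, workspace_id: str, worker_id: str, request: str) -> str:
        """Forward a sender's request to the listener and return its reply."""
        channel = self._listeners[(workspace_id, worker_id)]
        reply_box: queue.Queue = queue.Queue(maxsize=1)
        channel.requests.put((request, reply_box))
        return reply_box.get(timeout=5)


def serve(channel: ListenerChannel, handler: Callable[[str], str]) -> None:
    """Listener loop: pull forwarded requests and push replies back."""
    while True:
        request, reply_box = channel.requests.get()
        if request is None:  # sentinel used to stop the loop
            return
        reply_box.put(handler(request))
```

A worker would call `register_listener` with the token it received at bootstrap and then run `serve` in a thread (for example `threading.Thread(target=serve, args=(channel, handler), daemon=True)`), while the control plane side calls `send`. The point the model captures is that the sender never connects to the listener directly: every request and response passes through the relay's per-listener channel.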