With Databricks serverless networking, our goal is to make connectivity secure and simple, with minimal configuration, so you can focus on the data and AI use cases that matter most to you. One request we hear often is to keep a workspace locked down from unauthorized resources while still enabling secure in-cloud, and sometimes even cross-cloud, connectivity to sanctioned resources.
In this blog post, we address one of the most popular customer requests for such a cross-cloud scenario: locking down internet access from your serverless workloads on AWS while enabling access to Azure OpenAI through a dedicated, per-customer connection.
To control internet access from your serverless workloads, we are enhancing our egress control capabilities. Please use this form to join our previews, or contact your account team to learn more.
To establish a dedicated connection through the model serving endpoint to Azure OpenAI, two critical connections are required: the connection between the model serving endpoint and the customer's VPC in AWS, and the connection from the customer's VPC to Azure OpenAI.
Databricks' serverless compute plane networking is managed by Network Connectivity Configuration (NCC) objects. Each NCC container currently offers two options: stable public egress IP addresses that you can allowlist on destination firewalls, and private endpoints powered by AWS PrivateLink.
For maximum security, we recommend using AWS PrivateLink from serverless, which is also the approach we adopt in our example configuration.
Let’s take a closer look at the architecture:
At the heart of the setup, we deploy HAProxy on EC2 instances as a Layer 4 forwarding mechanism. Requests from the model serving nodes are routed through PrivateLink to the HAProxy servers, which then forward these requests directly to Azure OpenAI. To make the solution enterprise-ready, we implement several features:
On AWS:
On Azure:
For those wishing to eliminate public access entirely, it is also possible to construct a VPN connection between the Amazon VPC and the Azure VNet. Please reach out to your account team for guidance if you would like to go this route.
This section describes the detailed steps to configure a dedicated, secured connection from your workspace to the Azure OpenAI service.
Follow the Azure documentation to create an Azure OpenAI service and deploy a model.
⚠️NOTE: Once the Azure OpenAI service is available, please note down the deployment name and endpoint. These will be used later to configure the proxy server backend and to construct the API URL when registering your MLflow model in the Databricks model registry.
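As a quick illustration, the completions API URL is composed from the endpoint and the deployment name. Below is a minimal Python sketch; the endpoint and deployment name match the example used throughout this post, while the api-version value is an illustrative assumption, not taken from this setup:
import os

# Compose the Azure OpenAI completions URL from the values noted above.
endpoint = "https://dais24-aoai-demo.openai.azure.com"  # your service endpoint
deployment = "gpt35-demo"                               # your deployment name
api_version = "2024-02-01"                              # example api-version (assumption)

api_url = f"{endpoint}/openai/deployments/{deployment}/completions?api-version={api_version}"
print(api_url)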
The following AWS resources are required to build the VPC endpoint service:
⚠️NOTE: Please tag the EIPs properly, as the tag key (not the tag value) will be used to identify the EIP pool in the Lambda function. In our example, only the EIPs with the tag key "dais24_eip" will be assigned to the proxy servers.
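For reference, here is a minimal boto3 sketch that allocates an EIP with the expected tag key; the tag value and region are arbitrary placeholders:
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# Allocate an Elastic IP and tag it so the Lambda function can find it.
# The Lambda filters on the tag KEY "dais24_eip"; the tag value is not used.
addr = ec2.allocate_address(
    Domain='vpc',
    TagSpecifications=[{
        'ResourceType': 'elastic-ip',
        'Tags': [{'Key': 'dais24_eip', 'Value': 'proxy-pool'}]
    }]
)
print(addr['AllocationId'], addr['PublicIp'])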
⚠️NOTE: Optionally, you can write a shell script that installs the proxy server and put it in the User data field under the Advanced details section of the Launch Template.
Below is a sample user data shell script:
#!/bin/bash
# Fetch the token required for IMDSv2
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
# Fetch the public IP address of the instance using the token
PUBLIC_IP=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/public-ipv4)
# Only install HAProxy if a public IP address is assigned
if [ -n "$PUBLIC_IP" ]; then
  echo "Public IP found: $PUBLIC_IP"
  echo "Installing HAProxy..."
  # Update the package repository and install HAProxy
  sudo yum update -y
  sudo yum install haproxy -y
  # Leave HAProxy stopped for now; enable and start it only after the
  # configuration file is in place (see the haproxy.cfg sample below)
  # systemctl enable haproxy
  # systemctl start haproxy
  echo "HAProxy installation completed."
else
  echo "No public IP assigned to this instance. Skipping HAProxy installation."
fi
⚠️NOTE: When creating the NLB, please select TCP port 443 in the Basic configuration section and the TCP protocol in the Health checks section.
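If you prefer to script this step rather than use the console, here is a hedged boto3 sketch of the equivalent NLB, target group, and listener settings; the names, subnet IDs, and VPC ID are placeholders:
import boto3

elbv2 = boto3.client('elbv2', region_name='us-east-1')

# Internal NLB in the proxy subnets (placeholder IDs)
nlb = elbv2.create_load_balancer(
    Name='dais24-proxy-nlb',
    Type='network',
    Scheme='internal',
    Subnets=['subnet-aaaa1111', 'subnet-bbbb2222']
)

# TCP:443 target group with TCP health checks, matching the console settings
tg = elbv2.create_target_group(
    Name='dais24-proxy-tg',
    Protocol='TCP',
    Port=443,
    VpcId='vpc-cccc3333',
    HealthCheckProtocol='TCP'
)

# Listener forwarding TCP:443 to the target group
elbv2.create_listener(
    LoadBalancerArn=nlb['LoadBalancers'][0]['LoadBalancerArn'],
    Protocol='TCP',
    Port=443,
    DefaultActions=[{'Type': 'forward',
                     'TargetGroupArn': tg['TargetGroups'][0]['TargetGroupArn']}]
)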
⚠️NOTE: Please set the initial desired capacity and minimum capacity to 0 when creating the Auto Scaling group. If these two parameters are not set to 0, the EC2 proxy servers will launch immediately but no EIPs will be assigned, because the lifecycle hook does not exist yet and the Lambda function that assigns EIPs will not be triggered. You need to manually update them to the actual values once the lifecycle hook is created, as sketched below.
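A hedged boto3 sketch of that ordering; the hook name, group name, and capacities are placeholders:
import boto3

asg = boto3.client('autoscaling', region_name='us-east-1')

# 1) Create the launch lifecycle hook so new instances pause in
#    Pending:Wait until the Lambda function assigns an EIP.
asg.put_lifecycle_hook(
    LifecycleHookName='dais24-eip-hook',
    AutoScalingGroupName='dais24-proxy-asg',
    LifecycleTransition='autoscaling:EC2_INSTANCE_LAUNCHING',
    HeartbeatTimeout=300,
    DefaultResult='ABANDON'  # abandon the launch if no EIP gets assigned
)

# 2) Only now raise the capacities from 0 to the actual values.
asg.update_auto_scaling_group(
    AutoScalingGroupName='dais24-proxy-asg',
    MinSize=2,
    DesiredCapacity=2
)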
The following IAM permissions need to be granted to the Lambda function's execution role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "autoscaling:CompleteLifecycleAction"
      ],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Action": [
        "ec2:AssociateAddress",
        "ec2:DisassociateAddress",
        "ec2:DescribeInstances",
        "ec2:DescribeAddresses",
        "ec2:CreateTags"
      ],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Action": "logs:CreateLogGroup",
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Action": [
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
Below is sample Python code for the Lambda function:
import boto3

def lambda_handler(event, context):
    # Create EC2 and Auto Scaling clients
    ec2_client = boto3.client('ec2')
    as_client = boto3.client('autoscaling')

    # Get all Elastic IPs from the EIP pool; the tag-key filter must match
    # the tag key used when creating the EIPs ("dais24_eip" in our example)
    eips = ec2_client.describe_addresses(Filters=[{'Name': 'tag-key', 'Values': ['dais24_eip']}])
    aval_eips = [eip for eip in eips['Addresses'] if 'AssociationId' not in eip]
    if not aval_eips:
        raise Exception('No free EIPs available')

    instance_id = event['detail']['EC2InstanceId']
    eip = aval_eips[0]['AllocationId']

    # Associate the first free EIP with the newly launched instance
    ec2_client.associate_address(AllocationId=eip, InstanceId=instance_id)

    # Complete the lifecycle action so the instance can leave Pending:Wait
    response = as_client.complete_lifecycle_action(
        LifecycleHookName=event['detail']['LifecycleHookName'],
        AutoScalingGroupName=event['detail']['AutoScalingGroupName'],
        LifecycleActionToken=event['detail']['LifecycleActionToken'],
        LifecycleActionResult='CONTINUE',
        InstanceId=instance_id
    )
⚠️NOTE: The Lambda function's timeout should be set to 10 seconds.
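The Lambda function is triggered by the Auto Scaling instance-launch lifecycle event. Below is a hedged boto3 sketch of the EventBridge wiring; the rule name, group name, and Lambda ARN are placeholders:
import json
import boto3

events = boto3.client('events')
lambda_client = boto3.client('lambda')
lambda_arn = 'arn:aws:lambda:us-east-1:111122223333:function:dais24-assign-eip'

# Rule matching the instance-launch lifecycle action of our Auto Scaling group
rule = events.put_rule(
    Name='dais24-eip-rule',
    EventPattern=json.dumps({
        'source': ['aws.autoscaling'],
        'detail-type': ['EC2 Instance-launch Lifecycle Action'],
        'detail': {'AutoScalingGroupName': ['dais24-proxy-asg']}
    })
)

# Point the rule at the Lambda function and allow EventBridge to invoke it
events.put_targets(Rule='dais24-eip-rule', Targets=[{'Id': '1', 'Arn': lambda_arn}])
lambda_client.add_permission(
    FunctionName='dais24-assign-eip',
    StatementId='dais24-eip-rule-invoke',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule['RuleArn']
)
With the Lambda function wired up, configure HAProxy on each proxy server. Below is a sample /etc/haproxy/haproxy.cfg: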
global
    log /dev/log local0
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats
    # utilize system-wide crypto-policies
    ssl-default-bind-ciphers PROFILE=SYSTEM
    ssl-default-server-ciphers PROFILE=SYSTEM

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000

# Layer 4 (TCP) pass-through from the NLB targets to Azure OpenAI
frontend main
    bind *:443
    mode tcp
    option tcplog
    default_backend aoai

backend aoai
    mode tcp
    server azure_openai dais24-aoai-demo.openai.azure.com:443 check
Now you can start the HAProxy service on each EC2 instance (for example, sudo systemctl enable haproxy followed by sudo systemctl start haproxy), and the instances should show a healthy status on the target group page:
When creating the VPC endpoint service, allowlist the following Databricks principal:
arn:aws:iam::565502421330:role/private-connectivity-role-<region>
For example, if your VPC endpoint service is in region us-east-1, allowlist:
arn:aws:iam::565502421330:role/private-connectivity-role-us-east-1
Alternatively, you could also allowlist *, since the network security of your VPC endpoint service is also guaranteed by manually accepting only the VPC endpoint Databricks created for your VPC endpoint service.
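A hedged boto3 sketch of creating the endpoint service and allowlisting the Databricks principal; the NLB ARN is a placeholder:
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# Create the VPC endpoint service in front of the NLB; require manual
# acceptance so only the Databricks-created VPC endpoint can connect.
svc = ec2.create_vpc_endpoint_service_configuration(
    AcceptanceRequired=True,
    NetworkLoadBalancerArns=['arn:aws:elasticloadbalancing:us-east-1:111122223333:loadbalancer/net/dais24-proxy-nlb/abcdef0123456789']
)
service_id = svc['ServiceConfiguration']['ServiceId']

# Allowlist the regional Databricks private connectivity role
ec2.modify_vpc_endpoint_service_permissions(
    ServiceId=service_id,
    AddAllowedPrincipals=['arn:aws:iam::565502421330:role/private-connectivity-role-us-east-1']
)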
You can skip this step if there is an existing NCC object and a Network Policy (with restricted access) object that you wish to use for your workspace.
Please contact your account team to be enrolled in both previews. Once enrolled, you can:
Log in as a Databricks account admin. On the left pane of the Accounts console, navigate to Cloud resources -> Network -> Network Connectivity Configurations, click "Add Network Connectivity Configuration," enter the NCC name and region, and click "Add" to create the NCC.
Go back to Cloud resources -> Network -> Network Policies, click "Add Network Policy" to open the Create new network policy page, enter the policy name, select "Restricted access" for Serverless Internet Access, and click the Create button to create the network policy.
Select the NCC you created in Step 5, navigate to Private endpoint rules, click "Add private endpoint rule", enter the Endpoint service and Domain names, and click "Add" to create the private endpoint rule in the NCC.
The Endpoint service is the service name of the VPC endpoint service that you created in Step 4. In our case, it is com.amazonaws.vpce.us-east-1.vpce-svc-090fa8dfc6922d838.
The Domain names value is the FQDN of the destination resource. In our case, it is dais24-aoai-demo.openai.azure.com, the Azure OpenAI service you created in Step 1. Please note that it does not include the "https://" prefix of the endpoint.
Now the private endpoint rule shows PENDING status.
Go to the VPC endpoint service you created in Step 4, navigate to Endpoint connections, confirm that the Endpoint ID matches the VPC endpoint that Databricks created for you in Step 6, click the Actions drop-down menu, select Accept endpoint connection request, and click the Accept button in the pop-up window to approve the connection request.
Go back to the Private endpoint rules page in the Databricks Accounts console, wait a minute, refresh the page, and the private endpoint rule now shows an ESTABLISHED status.
In the Databricks Accounts console, navigate to "Workspaces" on the left pane, select an existing workspace, click Update workspace to open the Update workspace page, click the Network Connectivity Configuration drop-down menu, select the NCC you created in Step 5, and click the Update button to attach the NCC object to the workspace.
On the workspace configuration tab, click the "Update network policy" button in the Network Policy box to open the "Update workspace network policy" pop-up window, select the Network Policy you created in Step 5, and click the Apply policy button to attach the Network Policy object to the workspace.
On the Azure OpenAI service you created in Step 1, navigate to "Networking" on the left pane, select "Selected Networks and Private Endpoints", enter the EIPs you created in Step 2, and click the Save button.
Log in to the workspace as a workspace admin and verify that the NCC and Network Policy are applied properly. Then log and register a sample MLflow model that queries Azure OpenAI:
import mlflow
import mlflow.pyfunc
import requests
import json

class TestSEGModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        pass

    def predict(self, _, model_input):
        first_row = model_input.iloc[0]
        api_key = "xxx"  # Please store the API key in a Databricks secret and reference it from the notebook using dbutils.secrets.get
        api_url = "https://dais24-aoai-demo.openai.azure.com/openai/deployments/gpt35-demo/completions?api-version=2024..."
        prompt = first_row['prompt']
        headers = {'api-key': f'{api_key}', 'Content-Type': 'application/json'}
        json_data = {
            "prompt": prompt,
            "max_tokens": 128
        }
        try:
            response = requests.post(api_url, json=json_data, headers=headers)
        except requests.exceptions.RequestException as e:
            # Return the error details as text
            return f"Error: An error occurred - {e}"
        return [response.json()]

with mlflow.start_run(run_name='dais24-aoai-run'):
    wrappedModel = TestSEGModel()
    mlflow.pyfunc.log_model(
        artifact_path="dais24-aoai",
        python_model=wrappedModel,
        registered_model_name="dais24-aoai-model"
    )
Request body:
{
  "dataframe_records": [
    {
      "prompt": "Write 3 reasons why you should train an AI model on domain specific data sets?"
    }
  ]
}
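To send this request, deploy the registered model to a model serving endpoint and query it over the REST API. Below is a minimal sketch, assuming an endpoint named dais24-aoai-endpoint, your workspace URL, and a personal access token (both placeholders here; store the token in a Databricks secret in practice):
import requests

# Placeholders: replace with your workspace URL, endpoint name, and token
workspace_url = "https://<your-workspace>.cloud.databricks.com"
endpoint_name = "dais24-aoai-endpoint"
token = "<dapi-token>"  # fetch from a Databricks secret rather than hard-coding

payload = {
    "dataframe_records": [
        {"prompt": "Write 3 reasons why you should train an AI model on domain specific data sets?"}
    ]
}
resp = requests.post(
    f"{workspace_url}/serving-endpoints/{endpoint_name}/invocations",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
print(resp.status_code, resp.json())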
Since the Azure OpenAI firewall allowlists all the EIPs attached to the proxy servers, the query should succeed with the following response:
Now, intentionally change the allowed IPs in the Azure OpenAI firewall to incorrect ones:
Query the endpoint again, and it should respond with a 403 access denied error:
We provide Terraform code to help you quickly deploy all the AWS resources mentioned in Steps 2 through 4. You just need to adjust the environment variables in myvars.auto.tfvars and run "terraform apply --auto-approve".
The Terraform code is provided as a sample for reference and testing purposes only. Please review and modify it according to your needs, and fully test it before using it in your production environment.
Please keep in mind the following notes for the Terraform code:
In this post, we presented a sample solution for establishing a secure and dedicated connection between Databricks' serverless model serving endpoint and the Azure OpenAI service. We explored the key design principles underpinning this solution and provided a Terraform template to facilitate immediate testing.