Databricks Community

sfibich1 · ‎10-28-2025

From my reading of the documentation, the /api/2.0/serving-endpoints/{name}/ai-gateway endpoint supports a "tokens" and a "principals" attribute in the JSON payload.

Documentation link: Update AI Gateway of a serving endpoint | Serving endpoints API | REST API reference | Azure Databri...

When I call the API, I get the following output as part of a 200 response. Is this supported, or am I making an incorrect call somehow?

jeffreyaven · ‎10-28-2025

I have dug a bit deeper on this. These properties are supported, but not as top-level request body fields; instead, they are fields of the objects in the `rate_limits` array. The actual payload looks like:

```
{
"guardrails": { /* ... */ },
"inference_table_config": { /* ... */ },
"rate_limits": [
{
"renewal_period": "MINUTE|HOUR|DAY",
"calls": 100,
"tokens": 1000, // ← tokens supported HERE (in rate_limits)
"principal": "user@company.com", // ← principals supported HERE
"key": "USER|ENDPOINT"
}
],
"usage_tracking_config": { /* ... */ },
"fallback_config": { /* ... */ }
}
```

For example, to update the config for an ai-gateway resource, you would use:

```
curl -X PUT \
"https://<deployment url>/api/2.0/serving-endpoints/{name}/ai-gateway" \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"rate_limits": [
{
"renewal_period": "HOUR",
"calls": 100,
"tokens": 1000,
"principal": "user@company.com"
}
]
}'
```

Let me know how this goes

sfibich1 · ‎10-29-2025

Thank you for the help! From the result of the curl command, it looks like each `rate_limits` entry has to have either calls or tokens; you can't have both in the same entry. (I think the docs are wrong, or I misread them and thought you could pass both at once.)

sfibich1 · ‎10-29-2025

Here is the code that works, based on the above:

```
curl -X PUT \
  "${DATABRICKS_HOST}/api/2.0/serving-endpoints/databricks-claude-opus-4-1/ai-gateway" \
  -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "rate_limits": [
      {
        "key": "user",
        "renewal_period": "minute",
        "tokens": 99999
      },
      {
        "key": "user",
        "renewal_period": "minute",
        "calls": 9
      }
    ],
    "usage_tracking_config": { "enabled": true }
  }'
```
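Since each `rate_limits` entry seems to accept only one of calls or tokens, a small helper can generate the entries so the two never end up in the same object. This is only a rough sketch: the function names `rate_limit_entry` and `gateway_payload` are made up for illustration and are not part of any Databricks tooling, and the field values are just the ones from the working call above.

```shell
#!/bin/sh
# Hypothetical helper (not a Databricks tool): print one rate_limits entry
# that carries exactly one limit type, either "calls" or "tokens".
rate_limit_entry() {
  limit_type="$1"; limit_value="$2"; period="$3"; key="$4"
  case "$limit_type" in
    calls|tokens) ;;
    *) echo "limit type must be calls or tokens" >&2; return 1 ;;
  esac
  printf '{"key":"%s","renewal_period":"%s","%s":%s}' \
    "$key" "$period" "$limit_type" "$limit_value"
}

# Build a payload with one entry per limit type, mirroring the working call.
# $1 = token limit, $2 = call limit (both per user, per minute).
gateway_payload() {
  tokens_entry=$(rate_limit_entry tokens "$1" minute user) || return 1
  calls_entry=$(rate_limit_entry calls "$2" minute user) || return 1
  printf '{"rate_limits":[%s,%s],"usage_tracking_config":{"enabled":true}}' \
    "$tokens_entry" "$calls_entry"
}

gateway_payload 99999 9
```

The output could then be passed to the PUT call with `-d "$(gateway_payload 99999 9)"` instead of a hand-written JSON string.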

To get `principal` to work, the call should look like this, based on my experimentation:

```
curl -X PUT \
  "${DATABRICKS_HOST}/api/2.0/serving-endpoints/databricks-claude-opus-4-1/ai-gateway" \
  -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "rate_limits": [
      {
        "key": "user",
        "principal": "sfibich1@xyz.com",
        "renewal_period": "minute",
        "tokens": 99999
      },
      {
        "key": "user",
        "renewal_period": "minute",
        "calls": 9
      }
    ],
    "usage_tracking_config": { "enabled": true }
  }'
```
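To check what was actually applied after a PUT like this, one option is to read the endpoint back. The sketch below assumes `DATABRICKS_HOST` and `DATABRICKS_TOKEN` are exported as in the calls above; `get_endpoint` is a made-up wrapper name, and I am assuming (not confirmed in this thread) that the response includes the gateway settings under an `ai_gateway` field.

```shell
#!/bin/sh
# Hypothetical wrapper: fetch a serving endpoint definition so the applied
# gateway settings can be inspected by eye.
# Assumes DATABRICKS_HOST and DATABRICKS_TOKEN are set in the environment.
get_endpoint() {
  endpoint_name="$1"
  curl -s -X GET \
    "${DATABRICKS_HOST}/api/2.0/serving-endpoints/${endpoint_name}" \
    -H "Authorization: Bearer ${DATABRICKS_TOKEN}"
}
```

Calling `get_endpoint databricks-claude-opus-4-1` and looking at the rate limit entries in the response should show whether the `principal` entry was stored the way it was sent.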

API call to /api/2.0/serving-endpoints/{name}/ai-gateway does not support tokens or principals
