<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic API call to /api/2.0/serving-endpoints/{name}/ai-gateway does not support tokens or principals in Administration &amp; Architecture</title>
    <link>https://community.databricks.com/t5/administration-architecture/api-call-to-api-2-0-serving-endpoints-name-ai-gateway-does-not/m-p/136468#M4295</link>
    <description>&lt;P&gt;From my reading of the documentation, the&amp;nbsp;&lt;SPAN&gt;/api/2.0/serving-endpoints/{name}/ai-gateway endpoint supports a "tokens" and a "principals" attribute in the JSON payload.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Documentation link:&amp;nbsp;&lt;A href="https://docs.databricks.com/api/azure/workspace/servingendpoints/putaigateway" target="_blank"&gt;Update AI Gateway of a serving endpoint | Serving endpoints API | REST API reference | Azure Databricks&lt;/A&gt;&lt;/P&gt;&lt;P&gt;When I call the API, I get the following output as part of a 200 response. Is this supported, or am I making an incorrect call somehow?&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="sfibich1_0-1761683609470.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/21138i79F48646D97D1010/image-size/medium?v=v2&amp;amp;px=400" role="button" title="sfibich1_0-1761683609470.png" alt="sfibich1_0-1761683609470.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 28 Oct 2025 20:34:45 GMT</pubDate>
    <dc:creator>sfibich1</dc:creator>
    <dc:date>2025-10-28T20:34:45Z</dc:date>
    <item>
      <title>API call to /api/2.0/serving-endpoints/{name}/ai-gateway does not support tokens or principals</title>
      <link>https://community.databricks.com/t5/administration-architecture/api-call-to-api-2-0-serving-endpoints-name-ai-gateway-does-not/m-p/136468#M4295</link>
      <description>&lt;P&gt;From my reading of the documentation, the&amp;nbsp;&lt;SPAN&gt;/api/2.0/serving-endpoints/{name}/ai-gateway endpoint supports a "tokens" and a "principals" attribute in the JSON payload.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Documentation link:&amp;nbsp;&lt;A href="https://docs.databricks.com/api/azure/workspace/servingendpoints/putaigateway" target="_blank"&gt;Update AI Gateway of a serving endpoint | Serving endpoints API | REST API reference | Azure Databricks&lt;/A&gt;&lt;/P&gt;&lt;P&gt;When I call the API, I get the following output as part of a 200 response. Is this supported, or am I making an incorrect call somehow?&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="sfibich1_0-1761683609470.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/21138i79F48646D97D1010/image-size/medium?v=v2&amp;amp;px=400" role="button" title="sfibich1_0-1761683609470.png" alt="sfibich1_0-1761683609470.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 28 Oct 2025 20:34:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/api-call-to-api-2-0-serving-endpoints-name-ai-gateway-does-not/m-p/136468#M4295</guid>
      <dc:creator>sfibich1</dc:creator>
      <dc:date>2025-10-28T20:34:45Z</dc:date>
    </item>
    <item>
      <title>Re: API call to /api/2.0/serving-endpoints/{name}/ai-gateway does not support tokens or principals</title>
      <link>https://community.databricks.com/t5/administration-architecture/api-call-to-api-2-0-serving-endpoints-name-ai-gateway-does-not/m-p/136496#M4297</link>
      <description>&lt;P&gt;I dug a bit deeper into this. These properties are supported, but not as top-level request body fields; instead, they are fields of the objects in the `rate_limits` array. The actual payload looks like this:&lt;/P&gt;
&lt;P&gt;```&lt;BR /&gt;{&lt;BR /&gt;&amp;nbsp; &amp;nbsp; "guardrails": { /* ... */ },&lt;BR /&gt;&amp;nbsp; &amp;nbsp; "inference_table_config": { /* ... */ },&lt;BR /&gt;&amp;nbsp; &amp;nbsp; "rate_limits": [&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; {&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "renewal_period": "MINUTE|HOUR|DAY",&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "calls": 100,&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "tokens": 1000, &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // ← tokens supported HERE (in rate_limits)&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "principal": "user@company.com", // ← principals supported HERE &amp;nbsp;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "key": "USER|ENDPOINT"&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; }&lt;BR /&gt;&amp;nbsp; &amp;nbsp; ],&lt;BR /&gt;&amp;nbsp; &amp;nbsp; "usage_tracking_config": { /* ... */ },&lt;BR /&gt;&amp;nbsp; &amp;nbsp; "fallback_config": { /* ... */ }&lt;BR /&gt;&amp;nbsp; }&lt;BR /&gt;```&lt;/P&gt;
&lt;P&gt;For example, to update the AI Gateway configuration of a serving endpoint, you would use:&lt;/P&gt;
&lt;P&gt;```&lt;BR /&gt;curl -X PUT \&lt;BR /&gt;&amp;nbsp; "https://&amp;lt;deployment url&amp;gt;/api/2.0/serving-endpoints/{name}/ai-gateway" \&lt;BR /&gt;&amp;nbsp; -H "Authorization: Bearer &amp;lt;token&amp;gt;" \&lt;BR /&gt;&amp;nbsp; -H "Content-Type: application/json" \&lt;BR /&gt;&amp;nbsp; -d '{&lt;BR /&gt;&amp;nbsp; &amp;nbsp; "rate_limits": [&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; {&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "renewal_period": "HOUR",&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "calls": 100,&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "tokens": 1000,&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; "principal": "user@company.com"&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; }&lt;BR /&gt;&amp;nbsp; &amp;nbsp; ]&lt;BR /&gt;&amp;nbsp; }'&lt;BR /&gt;```&lt;/P&gt;
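&lt;P&gt;If you would rather assemble the body in code, here is a minimal Python sketch of the same nesting. The helper name `build_ai_gateway_body` is hypothetical (it is not part of any Databricks SDK), and the sketch only builds and prints the JSON; it does not call the API:&lt;/P&gt;

```python
import json


def build_ai_gateway_body(renewal_period, principal=None, calls=None, tokens=None):
    """Assemble a request body with limits nested under "rate_limits".

    Hypothetical helper for illustration only; the field names mirror the
    payload shown above, and nothing here contacts the API.
    """
    entry = {"renewal_period": renewal_period}
    # Only include the optional fields that were actually supplied.
    if calls is not None:
        entry["calls"] = calls
    if tokens is not None:
        entry["tokens"] = tokens
    if principal is not None:
        entry["principal"] = principal
    # "tokens" and "principal" live inside the entry, never at the top level.
    return {"rate_limits": [entry]}


body = build_ai_gateway_body("HOUR", principal="user@company.com", tokens=1000)
print(json.dumps(body, indent=2))
```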
&lt;P&gt;Let me know how this goes.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Oct 2025 04:25:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/api-call-to-api-2-0-serving-endpoints-name-ai-gateway-does-not/m-p/136496#M4297</guid>
      <dc:creator>jeffreyaven</dc:creator>
      <dc:date>2025-10-29T04:25:51Z</dc:date>
    </item>
    <item>
      <title>Re: API call to /api/2.0/serving-endpoints/{name}/ai-gateway does not support tokens or principals</title>
      <link>https://community.databricks.com/t5/administration-architecture/api-call-to-api-2-0-serving-endpoints-name-ai-gateway-does-not/m-p/136555#M4299</link>
      <description>&lt;P&gt;Thank you for the help! From the result of the curl command, it looks like each rate_limits entry has to specify either calls or tokens; you can't have both in the same entry. (I think the docs are wrong, or I misread them and thought you could pass both at once.)&lt;/P&gt;</description>
      <pubDate>Wed, 29 Oct 2025 14:14:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/api-call-to-api-2-0-serving-endpoints-name-ai-gateway-does-not/m-p/136555#M4299</guid>
      <dc:creator>sfibich1</dc:creator>
      <dc:date>2025-10-29T14:14:41Z</dc:date>
    </item>
    <item>
      <title>Re: API call to /api/2.0/serving-endpoints/{name}/ai-gateway does not support tokens or principals</title>
      <link>https://community.databricks.com/t5/administration-architecture/api-call-to-api-2-0-serving-endpoints-name-ai-gateway-does-not/m-p/136571#M4302</link>
      <description>&lt;P&gt;Here is the code that works, based on the above:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;curl -X PUT \
  "${DATABRICKS_HOST}/api/2.0/serving-endpoints/databricks-claude-opus-4-1/ai-gateway" \
  -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "rate_limits": [
      {
        "key": "user",
        "renewal_period": "minute",
        "tokens": 99999
      },
      {
        "key": "user",
        "renewal_period": "minute",
        "calls": 9
      }
    ],
    "usage_tracking_config": { "enabled": true }
  }'&lt;/LI-CODE&gt;&lt;P&gt;To get principal to work, the call should look like this, based on my experimentation:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;curl -X PUT \
  "${DATABRICKS_HOST}/api/2.0/serving-endpoints/databricks-claude-opus-4-1/ai-gateway" \
  -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "rate_limits": [
      {
        "key": "user",
        "principal": "sfibich1@xyz.com",
        "renewal_period": "minute",
        "tokens": 99999
      },
      {
        "key": "user",
        "renewal_period": "minute",
        "calls": 9
      }
    ],
    "usage_tracking_config": { "enabled": true }
  }'&lt;/LI-CODE&gt;</description>
      <pubDate>Wed, 29 Oct 2025 15:48:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/api-call-to-api-2-0-serving-endpoints-name-ai-gateway-does-not/m-p/136571#M4302</guid>
      <dc:creator>sfibich1</dc:creator>
      <dc:date>2025-10-29T15:48:16Z</dc:date>
    </item>
  </channel>
</rss>

