<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic FMAPI Anthropic endpoint rejects requests with trailing assistant message — known limitation? in Generative AI</title>
    <link>https://community.databricks.com/t5/generative-ai/fmapi-anthropic-endpoint-rejects-requests-with-trailing/m-p/156535#M1801</link>
    <description>&lt;P&gt;Hey all — looking for confirmation on a behavior I'm hitting on the Foundation Model API (pay-per-token) Anthropic-compatible endpoint, in case anyone else has worked around it.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;What I'm doing:&lt;/STRONG&gt; serving Claude models through /serving-endpoints/anthropic/v1/messages on the FMAPI pay-per-token tier. AAD bearer auth, U2M flow.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;What fails:&lt;/STRONG&gt; any request where the messages array ends with a turn of role: "assistant". The endpoint returns:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;HTTP 400 BAD_REQUEST
{
  "error_code": "BAD_REQUEST",
  "message": "This model does not support assistant message prefill. The conversation must end with a user message."
}&lt;/LI-CODE&gt;&lt;P&gt;&lt;STRONG&gt;Minimal repro shape:&lt;/STRONG&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;{
  "model": "databricks-claude-opus-4-7",
  "max_tokens": 256,
  "messages": [
    {"role": "user", "content": "Complete the sentence:"},
    {"role": "assistant", "content": "The capital of France is "}
  ]
}&lt;/LI-CODE&gt;&lt;P&gt;Native Anthropic accepts this request. It is the documented "assistant prefill" pattern, where the model continues from where the partial assistant text leaves off. Common uses: forcing output formats, resuming after interruption, and certain tool-loop continuations.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Why this is broader than one client:&lt;/STRONG&gt; prefill is foundational in the Anthropic ecosystem. The Anthropic Python/TypeScript SDKs, LangChain's Anthropic provider, AutoGen, and most agent frameworks built on the Anthropic API treat it as a primitive. Any request routed to the FMAPI Anthropic endpoint that uses prefill gets a 400.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;What I'm doing today:&lt;/STRONG&gt; running a small proxy in front of FMAPI that strips trailing assistant messages before forwarding. This works when prefill is incidental, but it silently degrades any client that actually relies on prefill semantics (output-shaping flows especially).&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Questions:&lt;/STRONG&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Is this a known/documented limitation of the FMAPI Anthropic endpoint?&lt;/LI&gt;&lt;LI&gt;Is parity with native Anthropic on this feature planned?&lt;/LI&gt;&lt;LI&gt;Has anyone found an official workaround other than client-side rewriting?&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;EM&gt;Thanks!&lt;/EM&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 11 May 2026 08:50:03 GMT</pubDate>
    <dc:creator>cormierjohn</dc:creator>
    <dc:date>2026-05-11T08:50:03Z</dc:date>
    <item>
      <title>FMAPI Anthropic endpoint rejects requests with trailing assistant message — known limitation?</title>
      <link>https://community.databricks.com/t5/generative-ai/fmapi-anthropic-endpoint-rejects-requests-with-trailing/m-p/156535#M1801</link>
      <description>&lt;P&gt;Hey all — looking for confirmation on a behavior I'm hitting on the Foundation Model API (pay-per-token) Anthropic-compatible endpoint, in case anyone else has worked around it.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;What I'm doing:&lt;/STRONG&gt; serving Claude models through /serving-endpoints/anthropic/v1/messages on the FMAPI pay-per-token tier. AAD bearer auth, U2M flow.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;What fails:&lt;/STRONG&gt; any request where the messages array ends with a turn of role: "assistant". The endpoint returns:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;HTTP 400 BAD_REQUEST
{
  "error_code": "BAD_REQUEST",
  "message": "This model does not support assistant message prefill. The conversation must end with a user message."
}&lt;/LI-CODE&gt;&lt;P&gt;&lt;STRONG&gt;Minimal repro shape:&lt;/STRONG&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;{
  "model": "databricks-claude-opus-4-7",
  "max_tokens": 256,
  "messages": [
    {"role": "user", "content": "Complete the sentence:"},
    {"role": "assistant", "content": "The capital of France is "}
  ]
}&lt;/LI-CODE&gt;&lt;P&gt;Native Anthropic accepts this request. It is the documented "assistant prefill" pattern, where the model continues from where the partial assistant text leaves off. Common uses: forcing output formats, resuming after interruption, and certain tool-loop continuations.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Why this is broader than one client:&lt;/STRONG&gt; prefill is foundational in the Anthropic ecosystem. The Anthropic Python/TypeScript SDKs, LangChain's Anthropic provider, AutoGen, and most agent frameworks built on the Anthropic API treat it as a primitive. Any request routed to the FMAPI Anthropic endpoint that uses prefill gets a 400.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;What I'm doing today:&lt;/STRONG&gt; running a small proxy in front of FMAPI that strips trailing assistant messages before forwarding. This works when prefill is incidental, but it silently degrades any client that actually relies on prefill semantics (output-shaping flows especially).&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Questions:&lt;/STRONG&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Is this a known/documented limitation of the FMAPI Anthropic endpoint?&lt;/LI&gt;&lt;LI&gt;Is parity with native Anthropic on this feature planned?&lt;/LI&gt;&lt;LI&gt;Has anyone found an official workaround other than client-side rewriting?&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;EM&gt;Thanks!&lt;/EM&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 11 May 2026 08:50:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/fmapi-anthropic-endpoint-rejects-requests-with-trailing/m-p/156535#M1801</guid>
      <dc:creator>cormierjohn</dc:creator>
      <dc:date>2026-05-11T08:50:03Z</dc:date>
    </item>
  </channel>
</rss>

