
Ai Query Prompt Token and Completion Token

Andreyai — Wed, 27 Aug 2025 04:27:37 GMT

How can I get the prompt token and completion token counts when using ai_query?

Thanks

Re: Ai Query Prompt Token and Completion Token

Khaja_Zaffer — Wed, 27 Aug 2025 08:32:13 GMT

Hello @Andreyai

good day!!

Databricks has documentation for ai_query:

https://docs.databricks.com/aws/en/sql/language-manual/functions/ai_query

I'm sure you will get better insight from the documentation.

But here is something I found on the internet:

Estimating Token Counts (Without Running the Query)

You can use a tokenizer to approximate prompt and completion tokens based on your input text and expected output.

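For Databricks foundation models like DBRX or the Meta Llama series, you can use the cl100k_base encoding from OpenAI's tiktoken library as an approximation. Not every model uses this tokenizer (the Llama models ship their own), so treat the result as a rough estimate rather than the count the endpoint actually records.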

Install tiktoken in a Databricks notebook (via %pip install tiktoken).

Example Python code to estimate:

python

import tiktoken


def count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Approximate the number of tokens in text using a tiktoken encoding."""
    encoding = tiktoken.get_encoding(encoding_name)
    return len(encoding.encode(text))

# Example usage
prompt = "Your prompt text here"  # Replace with your actual prompt
estimated_prompt_tokens = count_tokens(prompt)
print(f"Estimated prompt tokens: {estimated_prompt_tokens}")

# For completion, estimate based on expected output length (e.g., max_tokens param)
example_completion = "Sample generated response"  # Simulate or use a sample
estimated_completion_tokens = count_tokens(example_completion)
print(f"Estimated completion tokens: {estimated_completion_tokens}")
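Building on the estimate above, here is a minimal sketch for a whole batch of prompts. It assumes you pass a max_tokens limit to the model, so each completion can be bounded from above by that value. The function name estimate_batch_tokens and the whitespace counter are made up for illustration; in a notebook you could pass the count_tokens function from the example above instead. Keep in mind that a text tokenizer only counts the text part of a prompt, not image inputs.

```python
from typing import Callable, Sequence, Tuple


def estimate_batch_tokens(
    prompts: Sequence[str],
    count_tokens: Callable[[str], int],
    max_tokens: int,
) -> Tuple[int, int]:
    """Return (estimated prompt tokens, upper bound on completion tokens)."""
    # Prompt side: sum the tokenizer's count over every prompt in the batch.
    prompt_total = sum(count_tokens(p) for p in prompts)
    # Completion side: assume each response is capped at max_tokens.
    completion_upper_bound = len(prompts) * max_tokens
    return prompt_total, completion_upper_bound


# Crude whitespace counter standing in for a real tokenizer (illustration only).
def whitespace_count(text: str) -> int:
    return len(text.split())


prompts = ["Describe this image in one sentence", "List the objects you see"]
prompt_tokens, completion_cap = estimate_batch_tokens(
    prompts, whitespace_count, max_tokens=256
)
print(f"Estimated prompt tokens: {prompt_tokens}")
print(f"Completion tokens (upper bound): {completion_cap}")
```

The completion figure is a ceiling, not a prediction; actual responses are usually shorter than the limit.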

Re: Ai Query Prompt Token and Completion Token

Andreyai — Wed, 27 Aug 2025 09:41:15 GMT

Thank you for your response.
But I was expecting ai_query to return usage information along with the response, like a completion.create call does in OpenAI. Is that possible? That is, a single call would return both the response and the usage.

In my case I have a set of images. For each image I make one ai_query call whose prompt consists of text instructions plus the image, and it returns a description of the image. I would like to get the token counts for those calls so I can infer the cost of the operation. I am using Llama 4 Maverick and Claude 3.7 Sonnet.

OpenAI link: https://platform.openai.com/docs/api-reference/chat/list

Thanks

Re: Ai Query Prompt Token and Completion Token

Krishna_S — Sat, 04 Oct 2025 04:05:02 GMT

Hi @Andreyai

Batch inference requests from ai_query hit a model serving endpoint. As long as inference tables and usage tracking are enabled on that endpoint, the requests are logged regardless of how they were submitted to the endpoint.

See the schemas for the endpoint usage table and the inference table at the links below. They include both input token and output token information.

https://docs.databricks.com/aws/en/ai-gateway/inference-tables#query-and-analyze-results-in-the-inference-table

https://docs.databricks.com/aws/en/ai-gateway/configure-ai-gateway-endpoints#systemservingendpoint_usage-usage-tracking-table-schema

https://docs.databricks.com/aws/en/ai-gateway/inference-tables#ai-gateway-enabled-inference-table-schema
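As a rough sketch of how the usage tracking table could be used to infer cost: the query below sums tokens per endpoint, and a small helper turns the totals into a cost. The table name system.serving.endpoint_usage comes from the link above; the column names (input_token_count, output_token_count, request_time, served_entity_id), the join to system.serving.served_entities, and the prices are assumptions, so please check them against the linked schema and your own pricing before relying on this.

```python
# Assumed schema: verify table and column names against the linked docs.
USAGE_QUERY = """
SELECT
  e.endpoint_name,
  SUM(u.input_token_count)  AS input_tokens,
  SUM(u.output_token_count) AS output_tokens
FROM system.serving.endpoint_usage AS u
JOIN system.serving.served_entities AS e
  ON u.served_entity_id = e.served_entity_id
WHERE e.endpoint_name = :endpoint_name
  AND u.request_time >= :since
GROUP BY e.endpoint_name
"""


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Cost = tokens / 1,000,000 * price per million, summed for input and output."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# In a Databricks notebook (where `spark` is available) the query could be run
# with named parameters, for example:
#   row = spark.sql(
#       USAGE_QUERY,
#       args={"endpoint_name": "my-endpoint", "since": "2025-08-01"},
#   ).first()
#   cost = estimate_cost(row["input_tokens"], row["output_tokens"], 1.0, 4.0)
# "my-endpoint", the date, and the prices 1.0 / 4.0 are placeholders.

# Standalone example with made-up totals and made-up prices:
example_cost = estimate_cost(1_000_000, 500_000, 1.0, 4.0)
print(f"Estimated cost: {example_cost:.2f}")
```

Since the totals come from logged requests rather than a tokenizer estimate, this should also cover prompts that include images, as long as the endpoint reports token counts for them.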

Hope this helps.