We have built a chat solution based on an LLM RAG model, but we run into an issue when spinning up a serving endpoint to host the model.
According to the documentation, several LLM models should be available as pay-per-token endpoints, for instance DBRX Instruct:
https://learn.microsoft.com/en-us/azure/databricks/machine-learning/foundation-models/supported-mode...
However, in our workspace we only see two available pay-per-token endpoints (see attachment "serving endpoints.png").
When we try to "create a new serving endpoint", it seems we can only spin up provisioned throughput models, which are currently too expensive for our setup (see attachment "issue.png").
Our Databricks environment is in Azure West Europe.
Any suggestions?