06-11-2025 02:13 PM
Hi there,
Just testing the new Databricks Free Edition. I was trying to play around with LLMs, but I'm not able to create serving endpoints with foundation model entities, interact with the pay-per-token foundation model APIs, or use them in Databricks Apps.
With the pay-per-token foundation model APIs I get the error below (for any chosen model, not just llama-4-maverick):
{"error_code":"PERMISSION_DENIED","message":"PERMISSION_DENIED: Endpoint databricks-llama-4-maverick is not allowed to be used by your workspace. Please reach out to Databricks to enable this endpoint for your workspace or upgrade your workspace tier."}
With serving endpoints, I cannot create them because every available entity uses provisioned throughput, which I cannot disable or lower to 0.
Has anyone had any luck building apps or serving endpoints with LLMs on Databricks Free? Nothing on the limitations page suggests there should be an issue.
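For context, pay-per-token endpoints are invoked over plain REST; below is a minimal sketch of the kind of call that produces the error above. The workspace URL and token are placeholders, and `chat_payload` is a hypothetical helper, but the invocation path follows the standard serving-endpoints API:

```python
import requests

# Placeholders -- substitute your workspace URL and a personal access token.
API_ROOT = "https://<workspace-host>"
API_TOKEN = "<personal-access-token>"

def chat_payload(prompt, max_tokens=64):
    # OpenAI-style chat payload accepted by Databricks chat serving endpoints.
    return {"messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens}

def query_endpoint(endpoint_name, prompt):
    # Pay-per-token foundation model endpoints are invoked at
    # /serving-endpoints/<name>/invocations.
    resp = requests.post(
        f"{API_ROOT}/serving-endpoints/{endpoint_name}/invocations",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json=chat_payload(prompt),
    )
    return resp.status_code, resp.json()

# On a Free-edition workspace this call is where the 403 PERMISSION_DENIED body comes back:
# status, body = query_endpoint("databricks-llama-4-maverick", "Hello")
```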
06-12-2025 06:28 AM
I was successful creating an endpoint using the API rather than the UI:
import requests
import json
# Set the name of the MLflow endpoint
endpoint_name = "test-throughput-endpoint"
# Name of the foundation model in Unity Catalog
model_name = "system.ai.llama-4-maverick"
# Get the API endpoint and token for the current notebook context
API_ROOT = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiUrl().get()
API_TOKEN = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()
headers = {"Content-Type": "application/json", "Authorization": f"Bearer {API_TOKEN}"}
# Configuration for Unity Catalog foundation model
data = {
    "name": endpoint_name,
    "config": {
        "served_entities": [
            {
                "entity_name": model_name,
                "entity_version": "1",  # Use string version for UC models
                "workload_size": "Small",  # Foundation models typically need GPU
                "scale_to_zero_enabled": True
            }
        ]
    },
}

response = requests.post(
    url=f"{API_ROOT}/api/2.0/serving-endpoints",
    json=data,
    headers=headers
)
print(f"Status Code: {response.status_code}")
print(json.dumps(response.json(), indent=4))
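A 200 from the creation call only means the request was accepted; the endpoint still has to provision. A hedged sketch of polling its state via `GET /api/2.0/serving-endpoints/{name}` (field names follow the public serving-endpoints API; the URL and token are placeholders):

```python
import time
import requests

API_ROOT = "https://<workspace-host>"      # placeholder
API_TOKEN = "<personal-access-token>"      # placeholder
HEADERS = {"Authorization": f"Bearer {API_TOKEN}"}

def endpoint_ready(endpoint_json):
    # An endpoint is usable once state.ready == "READY" and no config update is running.
    state = endpoint_json.get("state", {})
    return state.get("ready") == "READY" and state.get("config_update") != "IN_PROGRESS"

def wait_for_endpoint(name, timeout_s=1800, poll_s=30):
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        body = requests.get(
            f"{API_ROOT}/api/2.0/serving-endpoints/{name}", headers=HEADERS
        ).json()
        if endpoint_ready(body):
            return body
        if body.get("state", {}).get("config_update") == "UPDATE_FAILED":
            raise RuntimeError(f"Endpoint {name} failed to deploy: {body}")
        time.sleep(poll_s)
    raise TimeoutError(f"Endpoint {name} not ready after {timeout_s}s")

# wait_for_endpoint("test-throughput-endpoint")
```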
a month ago
I cannot get this to work, even via the API that's been suggested.
a month ago
Are you using a work email or a personal email?
4 weeks ago
I'm using my personal email. The error I receive after creating the serving endpoint with the API is:
- Served entity `llama-4-maverick-1` entered DEPLOYMENT_FAILED state: Container creation failed
- Served entity creation failed for served entity `llama-4-maverick-1`, config version 1. Error message: Container creation failed. Please see build logs for more information.
- Endpoint update failed for endpoint `llama-4-maverick`, config version 1.
The build log is empty.
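When the UI build log is empty, the REST API sometimes surfaces more detail. A hedged sketch that pulls the per-entity deployment message and the build logs for one served entity (the build-logs path is assumed from the public serving-endpoints REST API; `llama-4-maverick-1` is the served-entity name from the error above, and the URL/token are placeholders):

```python
import requests

API_ROOT = "https://<workspace-host>"      # placeholder
API_TOKEN = "<personal-access-token>"      # placeholder
HEADERS = {"Authorization": f"Bearer {API_TOKEN}"}

def deployment_messages(endpoint_json):
    # Pull (name, deployment_state_message) for each served entity, looking at
    # the pending config first, since a failed rollout never becomes the active config.
    cfg = endpoint_json.get("pending_config") or endpoint_json.get("config") or {}
    return [
        (e.get("name"), e.get("state", {}).get("deployment_state_message", ""))
        for e in cfg.get("served_entities", [])
    ]

def fetch_build_logs(endpoint_name, served_entity_name):
    # Build logs for one served entity; path assumed from the public REST API.
    resp = requests.get(
        f"{API_ROOT}/api/2.0/serving-endpoints/{endpoint_name}"
        f"/served-models/{served_entity_name}/build-logs",
        headers=HEADERS,
    )
    return resp.status_code, resp.text

# detail = requests.get(f"{API_ROOT}/api/2.0/serving-endpoints/llama-4-maverick",
#                       headers=HEADERS).json()
# print(deployment_messages(detail))
# print(fetch_build_logs("llama-4-maverick", "llama-4-maverick-1"))
```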
3 weeks ago
I have the same problem. I am unable to create a serving endpoint with any of the foundation models in the Databricks Free Edition at the moment. Using the code snippet above, provisioning starts, then either it hangs for hours without progress or, looking at the build logs, some installation scripts fail internally. If the build logs are clean, the deployment hangs in Pending status forever. The same operation takes a few minutes on the Trial version.
The UI is also odd: it only allows Provisioned Throughput, which by definition is not supported by the Free Edition, and then on clicking Create the error message indeed pops up.
My bet: either provisioning foundation model serving endpoints does not currently work, or, due to the heavy internal limitations, it only succeeds once in a while after many attempts (so far, I could not make it work at all). Note: setting up an external model like OpenAI is OK, but that's just a pointer.
Or there are some parameterization tricks that must be used to make it work.
@TeeVanBee Did you manage to make it work? @Renounce3295 Does it still work for you with that script?
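Since the external-model route does work on Free, here is a hedged sketch of such an endpoint config. The schema follows the public external-models feature of the serving-endpoints API; the endpoint name, secret scope, and key are hypothetical, and the API key is referenced via Databricks secrets rather than inlined:

```python
import requests

API_ROOT = "https://<workspace-host>"      # placeholder
API_TOKEN = "<personal-access-token>"      # placeholder

def external_openai_config(endpoint_name, secret_scope, secret_key):
    # Serving-endpoint payload that proxies an OpenAI chat model instead of
    # provisioning throughput -- no GPU container is built for this.
    return {
        "name": endpoint_name,
        "config": {
            "served_entities": [
                {
                    "name": f"{endpoint_name}-entity",
                    "external_model": {
                        "name": "gpt-4o-mini",
                        "provider": "openai",
                        "task": "llm/v1/chat",
                        "openai_config": {
                            # Databricks secret reference, resolved at serving time.
                            "openai_api_key": f"{{{{secrets/{secret_scope}/{secret_key}}}}}"
                        },
                    },
                }
            ]
        },
    }

# payload = external_openai_config("openai-chat", "my-scope", "openai-key")
# requests.post(f"{API_ROOT}/api/2.0/serving-endpoints",
#               headers={"Authorization": f"Bearer {API_TOKEN}"}, json=payload)
```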