07-10-2025 12:37 AM
I'm encountering an issue where a serving endpoint I create disappears from the list of serving endpoints after a day. This has happened both when I created the endpoint from the Databricks UI and using the Databricks SDK.
07-10-2025 12:50 AM
If your endpoint is tied to a model in the MLflow Model Registry and that model is archived or deleted, it could cause the endpoint to disappear. Please check whether the registered model/version still exists.
07-10-2025 01:35 AM
Yes, those models are registered in Unity Catalog and have not been deleted.
10-09-2025 11:10 AM
Hey @prashant_089, what you are experiencing should not happen on its own except in some extremely unusual circumstances.
IF YOU ARE USING Databricks Free Edition, you should ignore everything below.
Here are some troubleshooting suggestions/tips:
Likely causes to check first
Workspace disabled/suspended or canceled: When a workspace is disabled or canceled (including transient off/on patterns), the platform can automatically delete serving endpoints. We’ve seen concrete cases where daily subscription toggling resulted in endpoint deletion, with no user-driven delete in audit logs.
Trial or compliance restrictions: If you are on a time‑based trial or a compliance-restricted workspace (HIPAA/SHIELD/FedRAMP), serving may be disabled via SAFE flags. As part of enforcement, endpoints are deleted by the NOC process when serving is turned off.
“Scale to zero” does not delete endpoints: The scale_to_zero_enabled flag only stops compute when idle; it does not remove the endpoint. Deletion requires an explicit delete, policy enforcement, or workspace lifecycle event.
Some things to test:
Confirm via REST GET whether the endpoint still exists. If it’s truly deleted, GET returns 404; if it’s stopped or updating, you’ll get a response and can start it.
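# Replace the workspace host and endpoint name; export DATABRICKS_TOKEN first.
# -w prints the HTTP status code after the response body.
curl -s -w '\nHTTP %{http_code}\n' -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  "https://<workspace-host>/api/2.0/serving-endpoints/<ENDPOINT_NAME>"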
If this returns 404 Not Found, the endpoint was deleted. If it returns JSON with state, the endpoint still exists (possibly stopped).
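If you want to repeat that check from a script (for example, on a schedule, to narrow down when the endpoint vanishes), here is a minimal Python sketch of the same GET using only the standard library. The function names are my own, and it assumes the host is passed as a bare hostname without `https://`. It reads the `ready` and `config_update` fields from `state` in the response; treat anything other than 200 or 404 as inconclusive rather than as proof of deletion.

```python
import json
import urllib.error
import urllib.request


def classify_endpoint_response(status_code, body):
    """Turn the result of GET /api/2.0/serving-endpoints/<name> into a short verdict."""
    if status_code == 404:
        return "deleted"
    if status_code == 200:
        state = json.loads(body).get("state", {})
        return (
            f"exists (ready={state.get('ready')}, "
            f"config_update={state.get('config_update')})"
        )
    # 401/403/5xx etc. say nothing about whether the endpoint exists.
    return f"inconclusive (HTTP {status_code})"


def check_endpoint(host, token, name):
    """Issue the same GET as the curl command and classify the response."""
    url = f"https://{host}/api/2.0/serving-endpoints/{name}"
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return classify_endpoint_response(response.status, response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; the status code is still what we need.
        return classify_endpoint_response(err.code, err.read().decode("utf-8", "replace"))


# Example call (placeholders, not real values):
# print(check_endpoint("<workspace-host>", "<token>", "<ENDPOINT_NAME>"))
```

Logging the verdict with a timestamp each time it runs gives you a "last seen" and "first missing" time to use when searching the audit logs.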
Query Audit Logs for a deleteServingEndpoint event around the disappearance window. Model Serving writes audit events under serverlessRealTimeInference with the endpoint name in request params.
-- Replace endpoint name and time bounds.
-- Column names follow the system.access.audit schema (event_time, service_name, action_name);
-- request_params is a map, so the key is read with ['name'].
SELECT
  event_time,
  service_name,
  action_name,
  request_params,
  user_identity
FROM system.access.audit
WHERE service_name = 'serverlessRealTimeInference'
  AND action_name = 'deleteServingEndpoint'
  AND (request_params['name'] = '<ENDPOINT_NAME>' OR to_json(request_params) LIKE '%<ENDPOINT_NAME>%')
  AND event_time BETWEEN TIMESTAMP('<START>') AND TIMESTAMP('<END>')
ORDER BY event_time DESC;
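For the `<START>` and `<END>` bounds, one simple approach is to take the last time you saw the endpoint and the first time you noticed it missing, then pad both sides. The helper below is only a sketch of that idea: `audit_window` and `pad_hours` are names I made up, and it assumes you pass both times in UTC, since audit timestamps are, as far as I know, recorded in UTC.

```python
from datetime import datetime, timedelta


def audit_window(last_seen, first_missing, pad_hours=1):
    """Return (start, end) strings for the audit query, padded on both sides."""
    if first_missing < last_seen:
        raise ValueError("first_missing must not be earlier than last_seen")
    pad = timedelta(hours=pad_hours)
    fmt = "%Y-%m-%d %H:%M:%S"
    return (last_seen - pad).strftime(fmt), (first_missing + pad).strftime(fmt)


# Example: endpoint seen on 9 July, found missing about a day later (UTC).
start, end = audit_window(datetime(2025, 7, 9, 0, 37), datetime(2025, 7, 10, 0, 37))
print(start, end)
```

Paste the two printed values into the query in place of `<START>` and `<END>`.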
Correlate workspace state changes (disable/enable/suspend) in system tables to the same time window. This helps confirm if deletion was triggered by workspace lifecycle rather than a user/API call.
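-- Replace <WORKSPACE_ID>. As far as I can tell, workspaces_latest keeps one current row
-- per workspace (no change history), so check the status column rather than sorting by time.
SELECT workspace_id, workspace_name, status, create_time
FROM system.access.workspaces_latest
WHERE workspace_id = '<WORKSPACE_ID>';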
Quick checklist to run now
Run the GET endpoint check to confirm deletion vs. stopped/updating.
Query Audit Logs for deleteServingEndpoint in the relevant window to see if a user/API deletion occurred.
Query workspaces_latest to detect disable/enable/suspend events around the time the endpoint vanished.
Determine whether your workspace is a trial workspace or subject to compliance restrictions that might disable serving; if so, expect enforced deletions until serving is enabled.
Best of luck, Louis.