08-19-2024 01:44 AM
Hi,
I want to access the stage and job information (usually available through the Spark UI) through the REST API provided by Spark: http://<server-url>:18080/api/v1/applications/[app-id]/stages. More information can be found at the following link: https://spark.apache.org/docs/latest/monitoring.html#rest-api
To access this API, we need the server URL, but I am having trouble finding it. A similar discussion on this forum suggested that I can obtain it by copying the URL shown when the Spark UI is opened.
Please let me know how these APIs can be accessed through Databricks. Thanks in advance.
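For reference, the endpoint pattern quoted above can be assembled as in the sketch below. The helper name `spark_rest_url`, the example host, and the example application ID are hypothetical. The sketch only shows the path layout from the Spark monitoring docs and does not say which base URL Databricks expects.

```python
from typing import Optional
from urllib.parse import quote


def spark_rest_url(base_url: str,
                   app_id: Optional[str] = None,
                   resource: Optional[str] = None) -> str:
    """Build a Spark monitoring REST API URL under <base_url>/api/v1.

    base_url is the server root (hypothetical here); app_id and resource
    (for example "stages" or "jobs") are appended when given.
    """
    parts = [base_url.rstrip("/"), "api", "v1", "applications"]
    if app_id is not None:
        # Percent-encode the application ID so it is safe as a path segment.
        parts.append(quote(app_id, safe=""))
        if resource is not None:
            parts.append(resource.strip("/"))
    return "/".join(parts)


# Hypothetical host and application ID, for illustration only.
print(spark_rest_url("http://history-server.example:18080",
                     "app-20240819-0001", "stages"))
```

Calling it with only a base URL gives the `applications` listing endpoint, which is a convenient first request for checking that the base URL is right.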
Accepted Solutions
08-21-2024 08:31 AM
Hi @prathameshJoshi, the URL mentioned by @Retired_mod seems to be the correct one.
You can use this script, run from a notebook attached to the cluster, to interact with the monitoring REST API.
from dbruntime.databricks_repl_context import get_context
import requests

ctx = get_context()
host = ctx.browserHostName
cluster_id = ctx.clusterId
# 40001 is the driver port used for the Spark UI in this thread.
spark_ui_api_url = f"https://{host}/driver-proxy-api/o/0/{cluster_id}/40001/api/v1/"
endpoint = 'applications'
# Authenticate with the notebook's API token as a bearer token.
response = requests.get(spark_ui_api_url + endpoint, headers={"Authorization": f"Bearer {ctx.apiToken}"})
response.json()
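Building on the script above, the sketch below goes one step further and fetches stages or jobs for the first application returned. It is an illustration under assumptions, not a confirmed Databricks API: the `driver-proxy-api` path and port 40001 are taken from this thread, the function names are hypothetical, and it uses the standard library's `urllib` instead of `requests` so that the HTTP call can be swapped out for testing.

```python
import json
import urllib.request


def driver_proxy_api_base(host, cluster_id, port=40001):
    """Base URL as used in this thread (path and port are assumptions)."""
    return f"https://{host}/driver-proxy-api/o/0/{cluster_id}/{port}/api/v1"


def fetch_json(url, token, opener=urllib.request.urlopen):
    """GET a URL with a bearer token and decode the JSON body."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with opener(request) as response:
        return json.load(response)


def list_resource(host, cluster_id, token, resource="stages",
                  opener=urllib.request.urlopen):
    """Return the stages (or jobs) of the first application listed."""
    base = driver_proxy_api_base(host, cluster_id)
    # The Spark REST API lists applications as objects with an "id" field.
    apps = fetch_json(f"{base}/applications", token, opener)
    if not apps:
        return []
    app_id = apps[0]["id"]
    return fetch_json(f"{base}/applications/{app_id}/{resource}", token, opener)
```

Inside a notebook, the host, cluster ID, and token could come from `get_context()` as in the script above, for example `list_resource(ctx.browserHostName, ctx.clusterId, ctx.apiToken, "jobs")`.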
08-19-2024 06:22 AM
Hi @prathameshJoshi,
You can find this kind of information when you go to Compute and click Advanced options:
08-20-2024 02:29 AM
@szymon_dybczak I have tried using it directly, as well as with the HTTP path shown in the image you posted. I have also tried the Spark UI URL and a URL containing only the cluster ID. None of these worked. If possible, could you show me a dummy URL that works with the Spark REST API for accessing jobs and stages?
For example, sending a request to https://adb-1234.0.azuredatabricks.net/api/v1/applications yields an error.
Even if we append the API path to the HTTP path, as in https://adb-1234.azuredatabricks.net/sql/protocolv1/o/4567/cluster_id/api/v1/applications,
we get the error: Path must be of form /sql/protocolv1/o/<orgId>/<clusterIdent>
The errors are clear enough, but we do not know which URL to use to avoid them.
08-20-2024 05:53 AM
Hi @prathameshJoshi,
I was able to build the API URL with the code below, and it works in my browser.
I am not sure how to authenticate when making calls programmatically.
from databricks_api import DatabricksAPI
from dbruntime.databricks_repl_context import get_context

ctx = get_context()
# Client for the Databricks REST API, authenticated with the notebook's token.
databricks_api_instance = DatabricksAPI(
    host=ctx.apiUrl,
    token=ctx.apiToken,
)
host = ctx.browserHostName
cluster_id = ctx.clusterId
# The Spark UI path includes the spark_context_id from the cluster details.
spark_context_id = databricks_api_instance.cluster.get_cluster(cluster_id)['spark_context_id']
spark_ui_api_url = f"https://{host}/sparkui/{cluster_id}/driver-{spark_context_id}/api/v1/"
endpoint = 'applications'
print(spark_ui_api_url + endpoint)
08-20-2024 11:24 PM
Thanks for providing a starting point. I tried the URL you provided, but it does not work when I send a request to it. I passed the access token as a bearer token, but the request returns a login HTML page. Please find the output attached, and let me know if there is any way to fix it.
Thanks in advance.
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="Content-Language" content="en">
<title>Databricks - Sign In</title>
<meta name="viewport" content="width=960">
<link rel="icon" type="image/png" href="https://databricks-ui-assets.azureedge.net/favicon.ico">
<meta http-equiv="content-type" content="text/html; charset=UTF8">
</head>
<body class="light-mode">
<uses-legacy-bootstrap>
<div id="login-page"></div>
</uses-legacy-bootstrap>
</body>
</html>
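A response like the sign-in page above is easy to mistake for a JSON decoding bug, since calling `.json()` on it simply fails. The sketch below checks the response's content type before parsing, so an HTML login page is reported as such. The names `NotJsonResponse` and `parse_api_response` are hypothetical, and it assumes the monitoring API answers with an `application/json` content type when the request is accepted.

```python
import json


class NotJsonResponse(Exception):
    """Raised when the server answered with something other than JSON."""


def parse_api_response(content_type: str, body: bytes):
    """Decode a JSON body, or raise with a short preview of what came back."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/json":
        preview = body[:80].decode("utf-8", errors="replace")
        raise NotJsonResponse(
            f"expected JSON but got {media_type or 'unknown type'}: {preview!r}"
        )
    return json.loads(body)
```

With `requests`, this could be called as `parse_api_response(response.headers.get("Content-Type", ""), response.content)`. A `text/html` result that starts with a sign-in page usually points at the URL or the token rather than at the parsing code.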
08-20-2024 12:12 PM
Hi @prathameshJoshi, Thanks for reaching out! Please review the responses and let us know which best addresses your question. Your feedback is valuable to us and the community.
If the response resolves your issue, kindly mark it as the accepted solution. This will help close the thread and assist others with similar queries.
We appreciate your participation and are here if you need further assistance!
08-20-2024 11:26 PM
Hi Kaniz,
The solution posted by @menotron is giving me some errors. Once those are fixed, I will mark the appropriate response as the accepted solution.
Thank You
08-21-2024 03:41 AM
Hi @prathameshJoshi, try this. It should be something like:
ADMIN_TOKEN="dapi9___________________________"
WORKSPACE="my_workspace.cloud.databricks.com"
CLUSTER="__________"
MPORT="40001"
PREFIX="https://${WORKSPACE}/driver-proxy-api/o/0/${CLUSTER}/${MPORT}"
curl -L -H "Authorization: Bearer ${ADMIN_TOKEN}" -X GET "${PREFIX}"
02-08-2025 06:17 AM
Hi @menotron, this works for clusters that are in a running state. Is there any way to get the same information for terminated clusters?
08-23-2024 02:40 AM
Hi @Retired_mod and @menotron ,
Thanks a lot; your solutions are working. I apologise for the delay, as I had some issue logging in.

