Data Engineering

How to retrieve cluster IDs of deleted All Purpose clusters?

RiyuLite
New Contributor III

I need to retrieve the event logs of deleted All Purpose clusters of a certain workspace.
The Databricks clusters list API ({workspace_url}/api/2.0/clusters/list) returns all active and terminated clusters, but not clusters that have been deleted.

I tried /api/2.0/accounts/{account_id}/usage/download with the root account details, and I am able to fetch them that way.

I need a way to fetch the cluster IDs of the deleted clusters without using the root account details. Please help.

1 REPLY

Kaniz
Community Manager

Hi @RiyuLite, to find deleted All Purpose clusters and their IDs without using the root account details, you can use Databricks audit logs. These logs record the activities in your workspace, allowing you to monitor detailed Databricks usage patterns.

However, audit logging is not enabled by default and requires a few API calls to initialize the feature. Once audit logging is enabled on your workspace, you can use it to find information on who deleted a specific cluster configuration.

Here are the steps to load and query the audit logs:

1. Load audit logs as a DataFrame and register the DataFrame as a temp table. You will need to provide the S3 bucket name, the full path to the audit logs, and a name for the table.

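scala
// Read the delivered audit logs (JSON) from S3 into a DataFrame.
val df = spark.read.format("json").load("s3a://<s3-bucket-name>/<path-to-audit-logs>")
// Register the DataFrame as a temporary view so it can be queried with SQL.
df.createOrReplaceTempView("<audit-logs>")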

2. Once you have the audit logs in a table, you can use SQL to query them. This query returns details on the cluster deletion event such as who deleted it, when it was deleted, and the cluster ID of the deleted cluster.

sql
select
 workspaceId,
 userIdentity.email,
 sourceIPAddress,
 -- the audit log timestamp is in milliseconds since the epoch
 to_timestamp(timestamp / 1000) as eventTimestamp,
 serviceName,
 actionName,
 requestParams.cluster_id as clusterId
from
 <audit-logs>
where
 serviceName = 'clusters'
 AND actionName = 'permanentDelete'
 -- omit the next predicate to list every deleted cluster instead of one known ID
 AND requestParams.cluster_id = '<cluster-id>'

Please note that if your queries do not return any results for a cluster, the likely explanation is that the cluster configuration was unpinned and was automatically deleted more than 30 days ago.
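
Since the original question is about finding cluster IDs that are not known in advance, the same filter can be applied without the `cluster_id` predicate. The following is only a minimal sketch using the DataFrame API. The helper name `deletedClusterIds` is hypothetical, and it assumes the audit logs have the same fields the SQL query above relies on (`workspaceId`, `serviceName`, `actionName`, `requestParams.cluster_id`, and a `timestamp` in milliseconds).

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, max, to_timestamp}

// Hypothetical helper (not part of any Databricks API): returns one row per
// permanently deleted cluster with the time of its latest permanentDelete event.
// Assumes the audit log schema used by the SQL query above.
def deletedClusterIds(auditLogs: DataFrame): DataFrame =
  auditLogs
    // keep only permanent cluster deletions
    .filter(col("serviceName") === "clusters" && col("actionName") === "permanentDelete")
    .select(
      col("workspaceId"),
      col("requestParams.cluster_id").as("clusterId"),
      // timestamp is assumed to be epoch milliseconds, as in the SQL query
      to_timestamp(col("timestamp") / 1000).as("eventTimestamp"))
    // drop events that carry no cluster ID
    .filter(col("clusterId").isNotNull)
    // collapse repeated events for the same cluster into one row
    .groupBy("workspaceId", "clusterId")
    .agg(max("eventTimestamp").as("deletedAt"))
```

In a notebook, calling `deletedClusterIds(df).show(false)` on the DataFrame loaded in step 1 should print the distinct deleted cluster IDs for the period covered by the delivered logs. Grouping by `clusterId` is a design choice to avoid duplicate rows if the same deletion is logged more than once.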
