cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Cluster idle time and usage details

Phani1
Valued Contributor II

How can we find out the usage details of the Databricks cluster? Specifically, we need to know how many nodes are in use, how long the cluster is idle, the time it takes to start up, and the jobs it is running along with their durations. Is there a query we can run on the system table to check if the cluster is over-provisioned or if it is using its configuration effectively? We also need similar information for serverless and SQL warehouse. Please provide any queries that can help us gather these details.

1 REPLY 1

Walter_C
Databricks Employee
Databricks Employee

You can try the following query on your system tables:

WITH cluster_usage AS (
  SELECT 
    u.usage_metadata.cluster_id,
    u.usage_date,
    u.usage_start_time,
    u.usage_end_time,
    DATEDIFF(second, u.usage_start_time, u.usage_end_time) AS duration_seconds,
    u.usage_quantity AS dbus_used,
    CASE 
      WHEN u.usage_metadata.job_id IS NOT NULL THEN 'job'
      WHEN u.usage_metadata.notebook_id IS NOT NULL THEN 'notebook'
      ELSE 'interactive'
    END AS usage_type,
    u.usage_metadata.job_id,
    u.usage_metadata.notebook_id
  FROM system.billing.usage u
  WHERE u.usage_date >= DATEADD(day, -30, CURRENT_DATE)
    AND u.usage_metadata.cluster_id IS NOT NULL
)
SELECT 
  c.cluster_id,
  c.cluster_name,
  c.dbr_version,
  c.worker_count,
  c.auto_termination_minutes,
  cu.usage_date,
  SUM(cu.duration_seconds) AS total_duration_seconds,
  SUM(CASE WHEN cu.usage_type = 'interactive' THEN cu.duration_seconds ELSE 0 END) AS idle_seconds,
  SUM(cu.dbus_used) AS total_dbus_used,
  COUNT(DISTINCT CASE WHEN cu.usage_type = 'job' THEN cu.job_id END) AS unique_jobs_run,
  AVG(CASE WHEN cu.usage_type = 'job' THEN cu.duration_seconds END) AS avg_job_duration_seconds
FROM system.compute.clusters c
JOIN cluster_usage cu ON c.cluster_id = cu.cluster_id
GROUP BY 
  c.cluster_id, 
  c.cluster_name, 
  c.dbr_version, 
  c.worker_count, 
  c.auto_termination_minutes, 
  cu.usage_date
ORDER BY c.cluster_name, cu.usage_date

This query provides:

  1. Cluster configuration details (cluster_id, cluster_name, Databricks Runtime version, worker count, autotermination settings)
  2. Daily usage statistics
  3. Idle time estimation (assuming interactive usage without a job or notebook ID is idle time)
  4. Total DBUs used
  5. Number of unique jobs run
  6. Average job duration

To identify if a cluster is over-provisioned or not using its configuration effectively, look for patterns such as:

  1. High idle time relative to total duration
  2. Low DBU usage compared to the worker count
  3. Few jobs running on clusters with many workers
  4. Long average job durations on clusters with many workers

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group