Databricks Community

pranitha · ‎01-21-2025

Alberto_Umana · ‎01-21-2025

Hi @pranitha,

You can utilize the node_timeline table. https://docs.databricks.com/en/admin/system-tables/compute.html

saurabh18cs · ‎01-21-2025

try this

SELECT

cluster_id,

MAX(max_autoscale_workers) AS max_workers,

DATE_TRUNC('day', change_time) AS day

FROM

system.compute.clusters

WHERE

change_time BETWEEN '2023-01-01' AND '2023-01-31'

GROUP BY

cluster_id,

DATE_TRUNC('day', change_time)

ORDER BY

day;

MadhuB · ‎01-21-2025

Hi @pranitha

Use this query to get the cluster details along with cost info as well.

WITH hourly_metrics AS (

SELECT

date_trunc('hour', usage_start_time) as hour,

usage_metadata.cluster_id,

sku_name,

MAX(usage_quantity) as max_usage,

ROUND(AVG(usage_quantity), 2) as avg_usage,

SUM(usage_quantity) as DBUs_consumed

FROM system.billing.usage

WHERE usage_start_time BETWEEN '2025-01-01' AND '2025-01-21'

GROUP BY

date_trunc('hour', usage_start_time),

usage_metadata.cluster_id,

sku_name

),

cost_calculation AS (

SELECT

h.*,

l.pricing.default AS unitprice,

ROUND((h.DBUs_consumed * l.pricing.default)) AS total_cost

FROM hourly_metrics h

LEFT OUTER JOIN system.billing.list_prices l ON l.sku_name = h.sku_name

)

SELECT

cluster_id,

sku_name,

hour,

max_usage,

avg_usage,

DBUs_consumed,

unitprice,

total_cost

FROM cost_calculation

ORDER BY hour DESC, total_cost DESC;

Databricks Community

How to fetch max or avg worker nodes for a specific time period using system tables?

Photos

Join Us as a Local Community Builder!

Business Intelligence in the Era of AI

🚀 Monthly Databricks Get Started Days – Accelerate Your Learning Journey! 🚀

Databricks Community Champion - March 2025 - Takuya Omi

Get Started With Lakehouse Architecture | Pass a quiz to earn your certificate completion.

Virtual Learning Festival: 9 April - 30 April