instance_id in compute.node_timelines
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
01-20-2025 08:34 AM
I am trying to fetch active worker nodes from system tables using the code like below:
select count(distinct instance_id)
from system.compute.node_timelines where cluster_id = "xx"
groupy by instance_id,start_time,end_times
It gives an output like 20 but in the my cluster configuration I have set max worker nodes is 15. How is this possible. Does instance_id refers to the worker nodes ?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
01-20-2025 09:20 AM
Hi @pranitha,
The instance_id
field refers to the specific instance of a node, which can include both driver and worker nodes. In your query, you are counting distinct instance_id
s, which might include instances that are not currently active or are driver nodes, leading to a higher count than the maximum number of worker nodes set in your cluster configuration.
Additionally, the node_timeline
table captures node-level resource utilization data at minute granularity, and each record contains data for a given minute of time per instance. This means that if nodes are frequently added and removed (e.g., due to autoscaling), the count of distinct instance_id
s over a period of time can exceed the maximum number of worker nodes set for the cluster at any single point in time.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
01-20-2025 09:20 AM
To ensure you are only counting active worker nodes, you can use:
SELECT COUNT(DISTINCT instance_id)
FROM system.compute.node_timelines
WHERE cluster_id = 'xx'
AND driver = false
AND end_time IS NULL
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
01-20-2025 09:30 AM
Hi @Alberto_Umana ,
Thanks for replying.
Even if we add the driver node it should be around 16-17 right, not like 20. I checked for al the clusters, for every cluster there is a difference of 5-7 nodes between max_worker count and count(distinct instance_id) for a specific minute.

