Is there any way to monitor the CPU, disk and memory usage of a cluster while a job is running?
08-23-2018 04:08 AM
I am looking for something similar to Windows Task Manager, which we use to monitor CPU, memory, and disk usage on a local desktop.
02-13-2019 09:17 AM
I would also find this really useful.
05-30-2019 01:11 PM
The Spark UI can give you access to some of this information, just not in real time. It is also intended for Spark-specific performance information such as job and task breakdowns.
Ganglia metrics cover this kind of information both in real time and historically. On the Clusters page for your particular cluster, select the "Metrics" link and you'll have access to the "Ganglia UI" link (for real-time metrics) and the list of historical snapshots.
(Screenshot: screen-shot-2019-05-30-at-40457-pm.png)
You can find out more at the Metrics documentation page:
https://docs.databricks.com/user-guide/clusters/metrics.html
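For a quick Task Manager style reading from inside a notebook, something like the sketch below might be enough. It is not taken from the Metrics documentation: it uses only the Python standard library, it only sees the machine the code runs on (the driver node when run from a notebook, not the workers), and the function name `node_snapshot` and the default path `/` are illustrative assumptions.

```python
import os
import shutil


def node_snapshot(path="/"):
    """Return a coarse CPU/disk reading for the machine running this code."""
    disk = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    snapshot = {
        "cpu_count": os.cpu_count(),
        "disk_used_pct": round(100.0 * disk.used / disk.total, 1),
    }
    # os.getloadavg() exists only on Unix-like systems, so guard the call.
    if hasattr(os, "getloadavg"):
        one_min, _five_min, _fifteen_min = os.getloadavg()
        snapshot["load_1m"] = one_min
    return snapshot


print(node_snapshot())
```

A 1-minute load average well above `cpu_count` suggests the node is CPU-bound, but for cluster-wide numbers the Ganglia UI is still the place to look.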

10-07-2020 11:41 AM
Ganglia metrics are not that helpful, and you lose the old data when the cluster restarts.
The question is how to get live metrics and also view historical data.
The OMS agent is the best option in that case. I used it in Azure Databricks and it works wonderfully.
It should be doable in AWS as well with some modification.
02-04-2023 06:07 AM
Which one gives truly real-time metrics?
10-13-2020 03:17 AM
Ganglia metrics cover this kind of information both in real time and historically.
02-01-2023 02:10 AM
You can use the Ganglia UI to track CPU, network, disk, and memory. Keep in mind that the Ganglia UI is displayed as a snapshot taken every 15 minutes.
02-03-2023 03:35 AM
As mentioned by a few others, the Ganglia UI can be used to track this. We use it in our projects.
02-04-2023 11:57 AM
Some important things to look at in the Ganglia UI's CPU, memory, and server load charts to spot the problem:
CPU chart:
- User %
- Idle %
A high User % indicates heavy CPU usage in the cluster.
Memory chart:
- Use %
- Free %
- Swap %
If you see the purple line above the red line in the memory chart, it indicates memory swapping and also points to high memory usage.
Server Load Distribution chart:
The absence of red squares indicates a balanced load across the cluster. The presence of red squares means there is a hot spot where the load is higher.
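The memory and swap check above can also be approximated numerically on a single Linux node. The sketch below is only an illustration and is not what Ganglia itself runs: it parses text in the `/proc/meminfo` format, the function names `parse_meminfo` and `memory_summary` are made up for this example, and the sample values are invented.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_summary(text):
    """Summarise memory use and whether any swap space is in use."""
    info = parse_meminfo(text)
    total = info["MemTotal"]
    # Fall back to MemFree on older kernels without MemAvailable.
    available = info.get("MemAvailable", info["MemFree"])
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {
        "mem_used_pct": round(100.0 * (total - available) / total, 1),
        "swap_used_kb": swap_used,
        "swap_in_use": swap_used > 0,
    }


# Invented sample data in the /proc/meminfo layout.
SAMPLE = """MemTotal:       16000000 kB
MemFree:         1000000 kB
MemAvailable:    4000000 kB
SwapTotal:       2000000 kB
SwapFree:        1500000 kB
"""

print(memory_summary(SAMPLE))
```

On a real Linux node the same function could be fed `open("/proc/meminfo").read()`. As with any code run from a notebook, that reports on the driver only, so a non-zero `swap_used_kb` there says nothing about the workers.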

