Databricks has added new cluster metrics to its control panel, replacing the outdated Ganglia tool. They make it easy to monitor the following aspects of cluster performance:
- CPU utilization
- Memory usage
- Free filesystem space
- Network traffic (received and transmitted)
- Total number of active nodes
- Total number of tasks (completed, failed, and total)
- Total task duration
- Total shuffle read and write
- GPU usage
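
To make the task-level items in the list above concrete, here is a minimal Python sketch of how totals such as completed, failed, and total tasks, total task duration, and shuffle read and write could be derived from per-task records. It does not use any Databricks API: the `TaskRecord` type, its field names, and the `summarize_tasks` helper are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TaskRecord:
    """Hypothetical per-task record; not a Databricks or Spark type."""
    status: str               # assumed to be "completed" or "failed"
    duration_ms: int          # wall-clock duration of the task
    shuffle_read_bytes: int   # bytes read from shuffle by this task
    shuffle_write_bytes: int  # bytes written to shuffle by this task


def summarize_tasks(tasks: List[TaskRecord]) -> Dict[str, int]:
    """Aggregate per-task records into cluster-level totals."""
    return {
        "completed": sum(1 for t in tasks if t.status == "completed"),
        "failed": sum(1 for t in tasks if t.status == "failed"),
        "total": len(tasks),
        "total_duration_ms": sum(t.duration_ms for t in tasks),
        "shuffle_read_bytes": sum(t.shuffle_read_bytes for t in tasks),
        "shuffle_write_bytes": sum(t.shuffle_write_bytes for t in tasks),
    }


if __name__ == "__main__":
    # Made-up sample values, used only to exercise the helper.
    sample = [
        TaskRecord("completed", 1200, 4096, 2048),
        TaskRecord("completed", 800, 1024, 0),
        TaskRecord("failed", 300, 512, 0),
    ]
    print(summarize_tasks(sample))
```

The sketch only shows what each total means; the metrics in the control panel are computed by Databricks itself and may be defined differently.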