Spark cluster monitoring and visibility

Saurav
New Contributor III

Hey. I'm working on a project where I'd like to view and experiment with Spark cluster metrics. I'd like to know the utilization % and max values for metrics like CPU, memory, and network. I've tried some open-source solutions (https://github.com/mspnp/spark-monitoring), but I'm not really getting what I'm looking for. Ideally, a solution that could give me insights into my Azure Databricks instances to optimize usage would be perfect. Currently, I can see some of these metrics as static images on the Metrics tab of my Spark cluster page, but it would be great if I could export that information to build my own insights or graphs.
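One way to get these numbers programmatically, rather than as static images, is Apache Spark's monitoring REST API, which the driver UI exposes under /api/v1. A minimal sketch, assuming you can reach the driver UI (the host below is a placeholder, and port 4040 is only the open-source default); the utilization math on the returned JSON is the part shown in full:

```python
# Sketch: read executor stats from Spark's monitoring REST API and compute
# memory utilization %. Endpoint paths are from the Apache Spark docs; the
# driver host/port is a placeholder you'd replace for your own cluster.
import json
from urllib.request import urlopen


def fetch_executors(base_url, app_id):
    """Fetch the executor list for one application from the driver UI REST API."""
    with urlopen(f"{base_url}/api/v1/applications/{app_id}/executors") as resp:
        return json.load(resp)


def memory_utilization(executors):
    """Given the JSON list from /api/v1/applications/{app}/executors,
    return {executor_id: percent of maxMemory currently used}."""
    util = {}
    for ex in executors:
        max_mem = ex.get("maxMemory", 0)
        if max_mem:  # skip entries that report no max (avoids divide-by-zero)
            util[ex["id"]] = 100.0 * ex.get("memoryUsed", 0) / max_mem
    return util


# Usage (placeholder host, not runnable as-is):
#   executors = fetch_executors("http://<driver-host>:4040", "<app-id>")
#   print(memory_utilization(executors))
```

From there the per-executor numbers can go into whatever graphing tool you like, instead of the static images on the Metrics tab.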

User16764241763
Databricks Employee

Anonymous
Not applicable

Saurav
New Contributor III

Hey @Kaniz Fatma​, I appreciate the suggestions and will be looking into them. I haven't gotten to them yet, so I didn't want to say whether they worked for me or not. Since I'm looking to avoid solutions like Datadog, I'll be checking out Prometheus and @Arvind Ravish​'s first suggestion. Thanks!
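For the Prometheus route, Spark 3.0+ ships a PrometheusServlet sink that exposes the metrics system at a scrapable endpoint, configured through metrics.properties. A sketch of that config (paths follow the Spark documentation; whether and how you can apply it on a managed Databricks cluster is something to verify for your workspace):

```properties
# Expose driver/executor metrics in Prometheus format via the Spark UI.
*.sink.prometheusServlet.class=org.apache.spark.metrics.sink.PrometheusServlet
*.sink.prometheusServlet.path=/metrics/prometheus
master.sink.prometheusServlet.path=/metrics/master/prometheus
applications.sink.prometheusServlet.path=/metrics/applications/prometheus
```

A Prometheus server can then scrape those endpoints and Grafana (or similar) can graph utilization over time, which avoids a paid agent like Datadog.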

Just a friendly follow-up. Did you have time to check? Do you still need help, or can you mark the response that helped you as best?