cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

SaravananPalani
by New Contributor II
  • 18752 Views
  • 8 replies
  • 9 kudos

Is there any way to monitor the CPU, disk and memory usage of a cluster while a job is running?

I am looking for something preferably similar to Windows task manager which we can use for monitoring the CPU, memory and disk usage for local desktop.

  • 18752 Views
  • 8 replies
  • 9 kudos
Latest Reply
hitech88
New Contributor II
  • 9 kudos

Some important info to look in Gangalia UI in CPU, memory and server load charts to spot the problem:CPU chart :User %Idle %High percentage of user % indicates heavy CPU usage in the cluster.Memory chart : Use %Free %Swap % If you see purple line ove...

  • 9 kudos
7 More Replies
Michael_Galli
by Contributor II
  • 2355 Views
  • 1 replies
  • 1 kudos

Resolved! Pipelines with alot of Spark Caching - best practices for cleanup?

We have the situation where many concurrent Azure Datafactory Notebooks are running in one single Databricks Interactive Cluster (Azure E8 Series Driver, 1-10 E4 Series Drivers autoscaling).Each notebook reads data, does a dataframe.cache(), just to ...

  • 2355 Views
  • 1 replies
  • 1 kudos
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

This cache is dynamically saved to disk if there is no place in memory. So I don't see it as an issue. However, the best practice is to use "unpersist()" method in your code after caching. As in the example below, my answer, the cache/persist method ...

  • 1 kudos
Labels