Determining spill from system tables
01-03-2025 03:37 PM
I'm trying to optimize machine selection (D-, E-, or L-series VMs on Azure) for job clusters and all-purpose compute, and I'm struggling to identify where performance degrades because of disk spill. Spill would indicate that more memory is needed. I can see it in the Spark UI, but I'm looking for historical diagnostics.
As of January 2025, system.compute.node_timeline surfaces useful metrics, but it does not report spill explicitly.
https://docs.databricks.com/en/admin/system-tables/compute.html#node-timeline-table-schema
Help appreciated.
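For reference, this is roughly how I'm pulling memory pressure from node_timeline today (column names per the schema doc linked above; swap usage is the closest proxy for memory pressure I've found, but it isn't spill):

```python
# Per-node memory pressure over the last week from
# system.compute.node_timeline. There is no spill column, so
# mem_swap_percent is only an indirect signal.
df = spark.sql("""
    SELECT
        cluster_id,
        instance_id,
        start_time,
        mem_used_percent,
        mem_swap_percent
    FROM system.compute.node_timeline
    WHERE start_time >= date_sub(current_date(), 7)
    ORDER BY mem_swap_percent DESC
    LIMIT 100
""")
display(df)
```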
01-03-2025 04:42 PM
For historical diagnostics, you may need to set up a custom logging mechanism that captures these metrics over time and stores them in persistent storage, such as a Delta table or a logging service. That way you can query and analyze historical performance data, including disk spill, at any point in the future.
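A minimal sketch of that idea, not a tested solution: Spark's monitoring REST API exposes memoryBytesSpilled and diskBytesSpilled per stage, so you can poll it from the driver and append the results to a table. Assumptions to verify for your workspace: the endpoint is reachable at spark.sparkContext.uiWebUrl (the Spark UI is proxied on Databricks, so this may need adjusting), and the target table name here is hypothetical.

```python
# Poll the Spark REST API for per-stage spill metrics and persist them
# so they can be queried historically alongside node_timeline.
import requests
from pyspark.sql import Row, functions as F

ui_url = spark.sparkContext.uiWebUrl  # e.g. http://<driver-host>:4040
app_id = requests.get(f"{ui_url}/api/v1/applications").json()[0]["id"]
stages = requests.get(f"{ui_url}/api/v1/applications/{app_id}/stages").json()

# clusterUsageTags.clusterId is a Databricks-set Spark conf; capturing it
# lets you join this data back to system.compute.node_timeline later.
cluster_id = spark.conf.get("spark.databricks.clusterUsageTags.clusterId", "unknown")

rows = [
    Row(
        cluster_id=cluster_id,
        app_id=app_id,
        stage_id=s["stageId"],
        attempt_id=s["attemptId"],
        status=s["status"],
        memory_bytes_spilled=s.get("memoryBytesSpilled", 0),
        disk_bytes_spilled=s.get("diskBytesSpilled", 0),
    )
    for s in stages
]

if rows:
    (spark.createDataFrame(rows)
         .withColumn("captured_at", F.current_timestamp())
         .write.mode("append")
         .saveAsTable("main.observability.stage_spill"))  # hypothetical table
```

Running something like this at the end of each job (or on an interval from a small monitoring notebook) builds up a queryable spill history you can correlate with node_timeline by cluster_id and time window when evaluating instance types.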

