Administration & Architecture

How to get cost per job for jobs that run on ALL_PURPOSE_COMPUTE?

KUMAR__111
New Contributor II

With the system.billing.usage table I can get the cost per job for jobs that run on JOB_COMPUTE, but not for jobs that run on ALL_PURPOSE_COMPUTE.
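For reference, a query along these lines is what gives me the per-job cost on JOB_COMPUTE. This is only a sketch against the documented system.billing.usage and system.billing.list_prices schemas; the sku_name filter and the list-price join are assumptions, so adjust them if you have non-list pricing.

```sql
-- Rough sketch: DBU list cost per job for JOB_COMPUTE over the last 30 days.
SELECT
  u.usage_metadata.job_id                        AS job_id,
  u.workspace_id,
  SUM(u.usage_quantity)                          AS total_dbus,
  SUM(u.usage_quantity * lp.pricing.default)     AS approx_list_cost
FROM system.billing.usage u
JOIN system.billing.list_prices lp
  ON  u.sku_name = lp.sku_name
  AND u.cloud    = lp.cloud
  AND u.usage_start_time >= lp.price_start_time
  AND (lp.price_end_time IS NULL OR u.usage_start_time < lp.price_end_time)
WHERE u.sku_name LIKE '%JOBS_COMPUTE%'              -- job compute SKUs only
  AND u.usage_metadata.job_id IS NOT NULL           -- usage attributed to a job
  AND u.usage_date >= DATE_SUB(CURRENT_DATE(), 30)
GROUP BY u.usage_metadata.job_id, u.workspace_id
ORDER BY approx_list_cost DESC;
```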


Brahmareddy
Honored Contributor

Hi Kumar,

How are you? As far as I understand, first check whether your jobs running on ALL_PURPOSE_COMPUTE show up in the system.billing.usage table at all. For ALL_PURPOSE_COMPUTE workloads, billing is aggregated at the interactive cluster level, so costs are not attributed directly to specific jobs and a job-level breakdown is harder to get. You might want to cross-reference cluster usage with job runs using cluster usage metrics or cluster event logs; that lets you map costs from ALL_PURPOSE_COMPUTE clusters to the jobs they were supporting. Alternatively, you can explore Databricks' cost management tools, or integrate with an external billing tool, to get a more granular view of job-level costs on these compute types.
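For example, something along these lines maps all-purpose cluster cost to the jobs that ran on each cluster. This is only a sketch: it assumes the documented system.billing.usage, system.billing.list_prices, and system.lakeflow.job_task_run_timeline tables are enabled in your workspace, and the sku_name filter and list-price join may need adjusting for your pricing.

```sql
-- Rough sketch: list cost per all-purpose cluster, plus the jobs seen on it.
WITH ap_cluster_cost AS (
  SELECT
    u.usage_metadata.cluster_id                AS cluster_id,
    SUM(u.usage_quantity)                      AS total_dbus,
    SUM(u.usage_quantity * lp.pricing.default) AS approx_list_cost
  FROM system.billing.usage u
  JOIN system.billing.list_prices lp
    ON  u.sku_name = lp.sku_name
    AND u.cloud    = lp.cloud
    AND u.usage_start_time >= lp.price_start_time
    AND (lp.price_end_time IS NULL OR u.usage_start_time < lp.price_end_time)
  WHERE u.sku_name LIKE '%ALL_PURPOSE%'              -- all-purpose (interactive) SKUs
    AND u.usage_date >= DATE_SUB(CURRENT_DATE(), 30)
  GROUP BY u.usage_metadata.cluster_id
)
SELECT
  c.cluster_id,
  c.total_dbus,
  c.approx_list_cost,
  COLLECT_SET(t.job_id) AS jobs_seen_on_cluster      -- jobs that shared this cluster
FROM ap_cluster_cost c
LEFT JOIN system.lakeflow.job_task_run_timeline t
  ON  ARRAY_CONTAINS(t.compute_ids, c.cluster_id)
  AND t.period_start_time >= DATE_SUB(CURRENT_DATE(), 30)
GROUP BY c.cluster_id, c.total_dbus, c.approx_list_cost
ORDER BY c.approx_list_cost DESC;
```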

Give it a try and let me know.

Regards,

Brahma

KUMAR__111
New Contributor II

If DBU usage is not captured anywhere per job for ALL_PURPOSE_COMPUTE, then a cost breakdown based on cluster events is very difficult, because two or more jobs can run in parallel on the same cluster. So mapping the cluster cost back to a specific job is very hard.
Let me know if I am missing anything.

Brahmareddy
Honored Contributor

You're right, @KUMAR__111: tracking costs for jobs on ALL_PURPOSE_COMPUTE clusters is tricky because DBU usage isn't directly tied to specific jobs. When multiple jobs run in parallel on the same cluster, it's challenging to allocate costs accurately. Consider using cluster tags to label clusters by job or team, which helps with grouping costs even when jobs share clusters. Running job-specific clusters for key workloads gives the clearest cost attribution. You could also cross-reference job logs with cluster usage metrics, though this can be manual. Leveraging the Databricks REST API can help gather more detailed metrics to better estimate costs per job.
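For what it's worth, a rough proration along these lines is one way to approximate a per-job number when several jobs share a cluster. It is a sketch only: it assumes system.lakeflow.job_task_run_timeline is available and that its compute_ids column lists the all-purpose clusters each task ran on, and it ignores autoscaling and idle cluster time, which are not attributed to any job.

```sql
-- Rough sketch: prorate each all-purpose cluster's cost across the jobs that ran
-- on it, weighted by each job's task runtime on that cluster (an approximation).
WITH task_runs AS (
  SELECT
    t.job_id,
    c.cluster_id,
    unix_timestamp(t.period_end_time) - unix_timestamp(t.period_start_time) AS run_seconds
  FROM system.lakeflow.job_task_run_timeline t
  LATERAL VIEW explode(t.compute_ids) c AS cluster_id
  WHERE t.period_start_time >= DATE_SUB(CURRENT_DATE(), 30)
),
job_seconds AS (
  SELECT cluster_id, job_id, SUM(run_seconds) AS run_seconds
  FROM task_runs
  GROUP BY cluster_id, job_id
),
ap_cluster_cost AS (
  SELECT
    u.usage_metadata.cluster_id                AS cluster_id,
    SUM(u.usage_quantity * lp.pricing.default) AS approx_list_cost
  FROM system.billing.usage u
  JOIN system.billing.list_prices lp
    ON  u.sku_name = lp.sku_name
    AND u.cloud    = lp.cloud
    AND u.usage_start_time >= lp.price_start_time
    AND (lp.price_end_time IS NULL OR u.usage_start_time < lp.price_end_time)
  WHERE u.sku_name LIKE '%ALL_PURPOSE%'
    AND u.usage_date >= DATE_SUB(CURRENT_DATE(), 30)
  GROUP BY u.usage_metadata.cluster_id
)
SELECT
  j.job_id,
  j.cluster_id,
  -- each job gets a share of the cluster cost proportional to its runtime share
  c.approx_list_cost
    * j.run_seconds / SUM(j.run_seconds) OVER (PARTITION BY j.cluster_id) AS approx_job_cost
FROM job_seconds j
JOIN ap_cluster_cost c
  ON c.cluster_id = j.cluster_id
ORDER BY approx_job_cost DESC;
```

If you do tag your all-purpose clusters, a simpler alternative is to group system.billing.usage by a tag key of your choice (for example custom_tags['cost_center'], a hypothetical key) instead of prorating by runtime.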

Just a thought. Give it a try and let me know.

Regards,

Brahma
