
Databricks System Table system.billing.usage Not Capturing Job Data in Real-Time

vamsi_simbus
New Contributor II

We've observed that the system.billing.usage table in Databricks is not capturing job usage data in real-time. There appears to be a noticeable delay between when jobs are executed and when their corresponding usage records appear in the system table.

This is impacting our ability to monitor DBU consumption and perform timely cost tracking and alerting.
Has anyone else encountered this issue recently? If so:

Is there an expected lag for data to be populated in system.billing.usage?

Are there any known limitations or configuration settings that could affect the real-time ingestion of billing data?

Any recommended workaround for real-time usage tracking?

 

Thanks

Vamsi

5 REPLIES

szymon_dybczak
Esteemed Contributor III

 

Hi @vamsi_simbus ,

Yes, this is a known limitation that applies to all system tables. There is no support for real-time monitoring. Data is updated throughout the day, so if you don't see a record for a recent event, check back later.

You can find a description of this behavior in the documentation:

Monitor account activity with system tables | Databricks Documentation

And to be honest, I don't believe there is a known workaround for this.
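To get a feel for how far behind the table is in a given workspace, one option is to compare the newest `usage_end_time` in system.billing.usage with the current time. The sketch below is only an illustration: the helper name `ingestion_lag_minutes` is made up for this example, the `spark` session in the usage comment is assumed to come from a Databricks notebook, and timestamps without timezone information are assumed to be UTC.

```python
from datetime import datetime, timezone

# Query for the newest usage record that has landed in the system table.
LATEST_USAGE_SQL = """
SELECT MAX(usage_end_time) AS latest_usage_end
FROM system.billing.usage
"""


def ingestion_lag_minutes(latest_usage_end, now=None):
    """Return minutes elapsed between the newest usage record and `now`.

    Hypothetical helper for illustration. Naive datetimes are treated
    as UTC, which is an assumption about the session time zone.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    if latest_usage_end.tzinfo is None:
        latest_usage_end = latest_usage_end.replace(tzinfo=timezone.utc)
    if now.tzinfo is None:
        now = now.replace(tzinfo=timezone.utc)
    return (now - latest_usage_end).total_seconds() / 60.0


# In a Databricks notebook (where `spark` is predefined) this could be used as:
#   latest = spark.sql(LATEST_USAGE_SQL).first()["latest_usage_end"]
#   print(f"Newest usage record is {ingestion_lag_minutes(latest):.0f} minutes old")
```

Running this a few times during the day gives a rough picture of the delay to plan alerting around. It does not make the data arrive any faster.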

vamsi_simbus
New Contributor II

Hi @szymon_dybczak ,

Thanks for the information. I've also observed that jobs executed using all-purpose clusters aren't being captured in the system.billing.usage table. Interestingly, for the same job, the runs using serverless compute were recorded.

Do you know how we can track the cost consumption for jobs run via all-purpose clusters?

Thanks
Vamsi.

szymon_dybczak
Esteemed Contributor III

Hi @vamsi_simbus ,

That's weird. Job runs executed on an all-purpose cluster should be visible by default. Maybe try submitting a ticket to Databricks support.

vamsi_simbus
New Contributor II

Hi @szymon_dybczak ,

Is there any alternative approach to find the DBU usage of currently running jobs?

 

szymon_dybczak
Esteemed Contributor III

Hi @vamsi_simbus ,

None that I know of. I've always used system tables for that purpose (but at our client we didn't need to query them in near real-time).

I guess as a workaround you can try to calculate it yourself, following the description in this Reddit thread:

How do I calculate Databricks job costs? : r/databricks
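A rough sketch of what "calculate it yourself" could look like is below. It is not taken from the linked thread. It assumes the usual back-of-the-envelope model: DBUs are roughly runtime in hours times number of nodes times the DBU rate of the instance type, and cost is DBUs times the price per DBU for the SKU. The function name and every number in the example are hypothetical. The real DBU rate and price depend on your instance type, SKU, and contract, and the cloud provider's VM charges are not included.

```python
def estimate_job_cost(runtime_seconds, num_nodes, dbu_per_node_hour, price_per_dbu):
    """Estimate DBUs and DBU cost for a job run (hypothetical helper).

    runtime_seconds   -- elapsed run time so far, in seconds
    num_nodes         -- driver plus workers (assumes all use the same instance type)
    dbu_per_node_hour -- DBU rate of the instance type (look this up; assumed input)
    price_per_dbu     -- price per DBU for the SKU (assumed input)
    """
    hours = runtime_seconds / 3600
    dbus = hours * num_nodes * dbu_per_node_hour
    cost = dbus * price_per_dbu
    return dbus, cost


# Example with made-up numbers: a 30-minute run on 3 nodes,
# each rated at 2.0 DBU/hour, priced at 0.5 per DBU.
dbus, cost = estimate_job_cost(1800, 3, 2.0, 0.5)
print(f"Estimated DBUs: {dbus}, estimated DBU cost: {cost}")
```

For a job that is still running, `runtime_seconds` can be taken as the time since the run started, so the estimate grows as the run continues. Once the records land in system.billing.usage, treat them as the source of truth and use the estimate only to fill the gap.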
