Organizations have expressed the need to see trends across their Databricks accounts and to drill down into Workspaces, SKUs, tags, and users. System Tables provide this visibility with little to no setup: an account admin makes an API call to opt into the relevant schemas. In this blog, we will leverage Databricks System Tables for cost observability by using the billing, access, query, and compute schemas. We include a common template for a Lakeview Dashboard that is versioned and shareable in a public repo.

Sourcing common insights from System Tables 

System Tables provide a centralized way to gain insights into how your Databricks resources are functioning and changing over time. They operate at the schema level and must be enabled by an account admin (via the systemschemas endpoint of the Unity Catalog API).

Core System Tables needed to source cost attribution

Here are the tables that support account-level cost analysis and drill-downs into Workspaces, Warehouses, Clusters, Users, and Tags.

| Table name | Definition |
| --- | --- |
| system.access.audit | Contains entries for events broken out by service and action name. For example, we filter on the notebook service and the runCommand action to determine the duration of a command run. |
| system.billing.usage | Helps us correlate runs from the query history and access audit tables to what was billed and by whom. |
| system.query.history | Captures user-level query execution information. |
| system.compute.clusters | Contains events relating back to the cluster configuration. |
| system.billing.list_prices | Contains the list prices for a period. |
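To make the audit-log example above concrete, here is a minimal sketch of pulling command-run durations from the audit log. Field names follow the cluster attribution query later in this post; runCommand events only appear when verbose audit logs are enabled.

-- Sketch: command-run durations from the audit log.
select
  user_identity.email,
  request_params.clusterId `cluster_id`,
  event_time,
  request_params.executionTime `execution_time`
from
  system.access.audit
where
  service_name = 'notebook'
  and action_name = 'runCommand'
order by
  event_time desc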

 

Building insights into cost attribution 

Cost attribution by tags

The billing usage table provides a custom_tags field that can be used as a parameter to filter appropriately tagged workloads. The column is a map, so Lakeview Parameters can filter on the map's key and value with a simple condition that selects either all tags or a specific key/value pair, as seen below. These parameters can then be wired into the dashboard to filter the visuals:

 

       :tag_name :: string = 'all'

       or u.custom_tags [:tag_name::string] = :tag_value :: string

 

[Image: Parameters for Tag Name and Tag Value in the dashboard]

| column | data type | source | description |
| --- | --- | --- | --- |
| workspace_id | string | system.access.audit | The unique ID of the workspace |
| workspace_name | string | system.access.audit | The name of the workspace |
| usage_date | date | system.billing.usage | Activity date on the workspace |
| cost | decimal | system.billing.list_prices | Total cost on the workspace per day |
| tag | string | system.billing.usage | Tag name |
| value | string | system.billing.usage | Value of the tag |
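A minimal sketch of how these columns can be produced (workspace_name, which the full repo query sources from the audit log, is omitted here for brevity):

select
  u.workspace_id,
  u.usage_date,
  u.custom_tags [:tag_name :: string] `value`,
  round(sum(u.usage_quantity * lp.pricing.default), 2) `cost`
from
  system.billing.usage u
  inner join system.billing.list_prices lp on u.account_id = lp.account_id
  and u.cloud = lp.cloud
  and u.sku_name = lp.sku_name
  and u.usage_start_time between lp.price_start_time
  and coalesce(lp.price_end_time, u.usage_end_time)
where
  :tag_name :: string = 'all'
  or u.custom_tags [:tag_name :: string] = :tag_value :: string
group by
  all
order by
  usage_date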

 

Cost attribution by account

The combo chart below displays a month-over-month analysis of dollars and DBUs consumed for the Databricks account, representing the consumption of all Workspaces provisioned with the cloud provider.

[Image: Account trends in DBUs and $DBUs by month]
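A minimal sketch of the underlying aggregation, reusing the same usage-to-list_prices join pattern as the other queries in this post:

-- Monthly DBUs and $DBUs across the whole account.
select
  date_trunc('month', u.usage_date) `month`,
  round(sum(u.usage_quantity), 2) `DBUs`,
  round(sum(u.usage_quantity * lp.pricing.default), 2) `dollar_dbus`
from
  system.billing.usage u
  inner join system.billing.list_prices lp on u.account_id = lp.account_id
  and u.cloud = lp.cloud
  and u.sku_name = lp.sku_name
  and u.usage_start_time between lp.price_start_time
  and coalesce(lp.price_end_time, u.usage_end_time)
group by
  all
order by
  `month`

Adding workspace_id to the select list (and thus the grouping) yields the per-workspace drill-down shown in the next chart.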

Cost attribution by Workspace 

The bar chart below shows a drill-down into Workspace consumption month over month.

 

[Image: Workspace ID trends by month]

 

Cost attribution by SKU

The bar chart below shows the month-over-month $DBU consumption for the various Databricks SKUs. The SKUs are taken from the sku_name field in the Usage table, and the CASE statement below maps them to the categories of interest. These can also be rolled up to a higher level with the newer fields in the Usage table, product_features and billing_origin_product.

 

[Image: SKU trends by month]

 

You can use this case-when statement to help summarize consumption by product SKU.

 

   case
     when u.sku_name like '%SERVERLESS_SQL%' then 'SQL-Serverless'
     when u.sku_name like '%SQL_PRO%' then 'SQL-Pro'
     when u.sku_name like '%SQL%' then 'SQL-Classic'
     when u.sku_name like '%DLT_ADVANCED_COMPUTE%'
       and u.sku_name like '%PHOTON%' then 'DLT Advanced (PHOTON)'
     when u.sku_name like '%DLT_ADVANCED_COMPUTE%' then 'DLT Advanced'
     when u.sku_name like '%DLT_PRO_COMPUTE%'
       and u.sku_name like '%PHOTON%' then 'DLT Pro (PHOTON)'
     when u.sku_name like '%DLT_PRO_COMPUTE%' then 'DLT Pro'
     when u.sku_name like '%DLT_CORE_COMPUTE%'
       and u.sku_name like '%PHOTON%' then 'DLT Core (PHOTON)'
     when u.sku_name like '%DLT_CORE_COMPUTE%' then 'DLT Core'
     when u.sku_name like '%JOBS%'
       and u.sku_name like '%PHOTON%' then 'AUTOMATED (PHOTON)'
     when u.sku_name like '%JOBS%' then 'AUTOMATED'
     when u.sku_name like '%ALL_PURPOSE%'
       and u.sku_name like '%PHOTON%' then 'INTERACTIVE (PHOTON)'
     when u.sku_name like '%ALL_PURPOSE%' then 'INTERACTIVE'
     -- fall back to the raw SKU name for anything unmapped
     else u.sku_name
   end
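As a usage sketch, the expression can be dropped into a monthly rollup (abbreviated here to two branches; use the full CASE above in practice):

select
  date_trunc('month', u.usage_date) `month`,
  case
    when u.sku_name like '%SERVERLESS_SQL%' then 'SQL-Serverless'
    when u.sku_name like '%ALL_PURPOSE%' then 'INTERACTIVE'
    else u.sku_name
  end `sku_category`,
  round(sum(u.usage_quantity * lp.pricing.default), 2) `dollar_dbus`
from
  system.billing.usage u
  inner join system.billing.list_prices lp on u.account_id = lp.account_id
  and u.cloud = lp.cloud
  and u.sku_name = lp.sku_name
  and u.usage_start_time between lp.price_start_time
  and coalesce(lp.price_end_time, u.usage_end_time)
group by
  all
order by
  `month`,
  `sku_category`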

 

Cost attribution by Warehouse

This widget in the cost attribution dashboard represents the weekly cost for a DBSQL Warehouse. 

[Image: Warehouse cost over the period]
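A minimal sketch of the weekly grain, DBUs only (join to system.billing.list_prices as in the other queries here to convert to dollars):

select
  date_trunc('week', usage_date) `week`,
  usage_metadata.warehouse_id `warehouse_id`,
  round(sum(usage_quantity), 2) `DBUs`
from
  system.billing.usage
where
  usage_metadata.warehouse_id is not null
group by
  all
order by
  `week`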

 

Warehouse insights by user

The table below shows the hourly warehouse consumption pattern and ties the warehouse usage cost back to a user. The query history table is grouped into hours of the day, so the duration of the queries executed within each hour can be matched to the Usage table's cost for that hour. Comparing a user's query duration to the hourly total then gives the percentage of cost attributable to that user; for example, if a user's queries account for 900 of the 3,600 query-seconds in an hour that cost $10, that user is attributed 25%, or $2.50.

*As a disclaimer, these estimates depict the duration of queries relative to the Usage table's billing for that hour. Other factors may be needed for an appropriate chargeback model, but this gives a reasonable view of how users contribute to costs based on query time.

The first image below shows the hourly consumption pattern per user; the second shows the warehouse's top users by duration.

[Image: Cost per user on a warehouse]

 

[Image: Warehouse top users by duration]

 

Below is a snippet of how this is achieved; the full code can be found in the repo.

 

-- Per-user query duration by hour, from the query history table.
with usage_by_user as (
  select
    workspace_id,
    executed_by,
    compute.warehouse_id,
    date(start_time) `event_date`,
    hour(start_time) `hour`,
    sum(total_duration_ms) / 1000 `total_duration_by_user_in_hour`,
    -- Correlated subquery: total duration of all queries on this
    -- warehouse in the same hour (the attribution denominator).
    (
      select
        sum(qhs.total_duration_ms) / 1000
      from
        system.query.history qhs
      where
        date(qhs.start_time) = date(qh.start_time)
        and date(qhs.end_time) = date(qh.end_time)
        and hour(qhs.start_time) = hour(qh.start_time)
        and qhs.workspace_id = qh.workspace_id
        and qhs.workspace_id = :workspace_id::string
        and qhs.compute.warehouse_id = qh.compute.warehouse_id
      group by
        date(qhs.start_time),
        hour(qhs.start_time),
        qhs.workspace_id,
        qhs.compute.warehouse_id
    ) `total_duration_by_hour`
  from
    system.query.history qh
  where
    date(start_time) >= date_add(week, - :duration_in_weeks :: integer, now())
    and qh.workspace_id = :workspace_id::string
  group by
    all
)
select
  u.workspace_id,
  u.executed_by,
  u.warehouse_id,
  u.event_date,
  u.hour,
  round(t.dollar_dbus, 2) `cost_by_hour`,
  round(t.DBUs, 2) `dbus_by_hour`,
  u.total_duration_by_user_in_hour,
  u.total_duration_by_hour,
  t.custom_tags,
  round(
    (
      total_duration_by_user_in_hour / total_duration_by_hour
    ) * 100,
    2
  ) `percent_consumption_by_user_in_hour`,
  round(
    t.dollar_dbus * `percent_consumption_by_user_in_hour` / 100,
    2
  ) `cost_by_user_in_hour`
from
  usage_by_user u
  -- Hourly cost and DBUs per warehouse, from billing usage and list prices.
  inner join (
    select
      date(u.usage_start_time) `event_date`,
      hour(u.usage_start_time) `hour`,
      u.workspace_id,
      u.usage_metadata.warehouse_id `warehouse_id`,
      to_json(u.custom_tags) `custom_tags`,
      round(sum(usage_quantity), 2) `DBUs`,
      sum(usage_quantity * lp.pricing.default) `dollar_dbus`
    from
      system.billing.usage u
      INNER JOIN system.billing.list_prices lp ON u.account_id = lp.account_id
      and u.cloud = lp.cloud
      and u.sku_name = lp.sku_name
      and u.workspace_id = :workspace_id::string
      and u.usage_start_time between lp.price_start_time
      and coalesce(lp.price_end_time, u.usage_end_time)
      and u.usage_date >= date_add(week, - :duration_in_weeks :: integer, now())
      and u.sku_name like '%SQL%'
      and (
        :tag_name :: string = 'all'
        or u.custom_tags [:tag_name::string] = :tag_value :: string
      )
    group by
      all
  ) T ON u.workspace_id = T.workspace_id
  and u.workspace_id = :workspace_id::string
  and u.warehouse_id = t.warehouse_id
  and u.event_date = t.event_date
  and u.hour = t.hour 
order by
  event_date asc,
  hour asc

 


Warehouse insights metrics

| column | data type | source | description |
| --- | --- | --- | --- |
| workspace_id | String | system.query.history | The ID of the workspace this query statement belongs to |
| warehouse_id | String | system.query.history.compute | The ID of the warehouse this query statement runs on |
| total_cost_by_hour | Decimal | system.billing.usage | Usage by SKU (i.e., compute type) per hour, in dollars |
| total_dbus_by_hour | Integer | system.billing.usage | Usage by SKU (i.e., compute type) per hour, in DBUs |
| total_consumption_by_hour | Integer | system.query.history | Total duration (in seconds) of all queries on the warehouse in the hour |
| total_consumption_by_user_in_hour | Integer | system.query.history | User's total query duration (in seconds) in the hour |
| percent_of_consumption_by_user_in_hour | Decimal | derived field | User's total usage percentage per hour on the warehouse |
| cost_by_user_in_hour | Decimal | derived field | User's total cost per hour on the warehouse |
| tags | MAP<STRING, STRING> | system.billing.usage | The user-supplied tags associated with this warehouse |
| executed_by | String | system.query.history | The ID of the principal who ran the query |
| event_date | Date | system.query.history | The day when the query was executed |
| event_hour | Integer | system.query.history | The hour of day when the query was executed |

 

Cost attribution by cluster

The bar chart below looks at cluster costs over a time period. The query uses the audit, usage, and cluster tables to display trends. When working with the clusters table, it is important to take the record with the max change_time so that the latest configuration is retrieved, as seen in the code snippet below. A subquery retrieves the max change record for a given cluster and is joined against the usage table.

[Image: Cluster cost over the period]

 

 -- Latest change record per cluster configuration.
 inner join (
   select
     max(change_time) `change_time`,
     cluster_name,
     cluster_id,
     workspace_id,
     account_id
   from
     system.compute.clusters
   group by
     all
 )
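Note that grouping on cluster_name returns one row per distinct name a cluster has had. If strictly the latest configuration is desired, a variant using max_by (available in Databricks SQL) is a more direct option:

 -- Sketch: one row per cluster with its most recent name.
 inner join (
   select
     cluster_id,
     workspace_id,
     account_id,
     max(change_time) `change_time`,
     max_by(cluster_name, change_time) `cluster_name`
   from
     system.compute.clusters
   group by
     cluster_id,
     workspace_id,
     account_id
 )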

 

Cluster insights by user

The table below shows the hourly cluster consumption and ties the usage back to a user. The audit log's runCommand events are grouped into hours of the day, so the duration of the commands executed within each hour can be matched to the Usage table's cost for that hour. As with warehouses, comparing a user's command duration to the hourly total gives the percentage of cost attributable to that user. Note that because this relies on the audit log table, these insights depend on verbose audit logs being enabled so that notebook commands are captured.

*As a disclaimer, these estimates depict the duration of commands relative to the Usage table's billing for that hour. Other factors may be needed for an appropriate chargeback model, but this gives a reasonable view of how users contribute to costs based on execution time.

The first image below shows the hourly consumption pattern per user; the second shows the cluster's top users by duration.

[Image: Cost per user on a cluster]

 

[Image: Cluster top users by duration]

Below is a snippet of how this is achieved; the full code can be found in the repo.

 

-- Per-user command duration by hour, from audit log runCommand events.
with usage_by_user as (
 select
   a.workspace_id,
   a.user_identity.email,
   a.request_params.clusterId `cluster_id`,
   a.event_date,
   hour(a.event_time) `hour`,
   sum(a.request_params.executionTime) `total_duration_by_user_in_hour`
 from
   system.access.audit a
 where
   a.service_name in ('notebook', 'jobs')
   and a.action_name = 'runCommand'
   and a.event_date >= date_add(week, - :duration_in_weeks :: integer, now())
   and a.request_params.clusterId is not null
   -- and a.workspace_id = :workspace_id :: string
 group by
   all
),
-- Total command duration per cluster per hour (the attribution denominator).
total_cost_by_cluster as (
 select
   a1.event_date `event_date`,
   hour(a1.event_time) `hour`,
   a1.request_params.clusterId `clusterId`,
   sum(a1.request_params.executionTime) `total_duration_by_hour`
 from
   system.access.audit a1
 where
   a1.service_name in ('notebook', 'jobs')
   and a1.action_name = 'runCommand'
   and a1.event_date >= date_add(week, - :duration_in_weeks :: integer, now())
   and a1.request_params.clusterId is not null
   -- and a1.workspace_id = :workspace_id :: string
 group by
   all
)
select
 u.workspace_id,
 u.email `executed_by`,
 c.cluster_name,
 u.cluster_id,
 u.event_date,
 u.hour,
 round(t.dollar_dbus, 2) `cost_by_hour`,
 t.DBUs `dbus_by_hour`,
 u.total_duration_by_user_in_hour,
 tc.total_duration_by_hour,
 t.custom_tags,
 round(
   (
     total_duration_by_user_in_hour / total_duration_by_hour
   ) * 100,
   2
 ) `percent_consumption_by_user_in_hour`,
 round(
   t.dollar_dbus * `percent_consumption_by_user_in_hour` / 100,
   2
 ) `cost_by_user_in_hour`
from
 usage_by_user u
 inner join total_cost_by_cluster tc on u.event_date = tc.event_date
 and u.hour = tc.hour
 and u.cluster_id = tc.clusterId
 -- Hourly cost and DBUs per cluster, from billing usage and list prices.
 inner join (
   select
     date(u.usage_start_time) `event_date`,
     hour(u.usage_start_time) `hour`,
     u.workspace_id,
     u.usage_metadata.cluster_id `cluster_id`,
     to_json(u.custom_tags) `custom_tags`,
     round(sum(usage_quantity), 2) `DBUs`,
     sum(usage_quantity * lp.pricing.default) `dollar_dbus`
   from
     system.billing.usage u
     INNER JOIN system.billing.list_prices lp ON u.account_id = lp.account_id
     and u.cloud = lp.cloud
     and u.sku_name = lp.sku_name
     and u.usage_start_time between lp.price_start_time
     and coalesce(lp.price_end_time, u.usage_end_time)
     and u.usage_date >= date_add(week, - :duration_in_weeks :: integer, now())
     and u.sku_name like '%ALL_PURPOSE%'
     and (
       :tag_name :: string = 'all'
       or u.custom_tags [:tag_name::string] = :tag_value :: string
     )
   group by
     all
 ) T ON u.workspace_id = T.workspace_id
 and u.cluster_id = t.cluster_id
 and u.event_date = t.event_date
 and u.hour = t.hour
 -- Latest change record per cluster configuration.
 inner join (
   select
     max(change_time) `change_time`,
     cluster_name,
     cluster_id,
     workspace_id,
     account_id
   from
     system.compute.clusters
   group by
     all
 ) c on c.cluster_id = u.cluster_id
 and c.workspace_id = u.workspace_id
order by
 event_date asc,
 hour asc

 


Cluster insights metrics

| column | data type | source | description |
| --- | --- | --- | --- |
| workspace_id | String | system.access.audit | The ID of the workspace this command belongs to |
| cluster_id | String | system.billing.usage | The ID of the cluster this command belongs to |
| cluster_name | String | system.compute.clusters | The name of the cluster this command belongs to |
| total_cost_by_hour | Decimal | system.billing.usage | Usage by SKU (i.e., compute type) per hour, in dollars |
| total_dbus_by_hour | Integer | system.billing.usage | Usage by SKU (i.e., compute type) per hour, in DBUs |
| total_consumption_by_hour | Integer | system.access.audit | Total execution time of all commands on the cluster in the hour |
| total_consumption_by_user_in_hour | Integer | system.access.audit | User's total execution time in the hour |
| percent_of_consumption_by_user_in_hour | Decimal | derived field | User's total usage percentage per hour on the cluster |
| cost_by_user_in_hour | Decimal | derived field | User's total cost per hour on the cluster |
| tags | MAP<STRING, STRING> | system.billing.usage | The user-supplied tags associated with this cluster |
| executed_by | String | system.access.audit | The ID of the principal who ran the commands |
| event_date | Date | system.access.audit | The day when the command was executed |
| event_hour | Integer | system.access.audit | The hour of day when the command was executed |


Conclusion

In this blog post, we covered the key tables needed to gather insights for cost attribution. The samples provided detailed cost analyses for DBSQL Warehouses, clusters, Workspaces, SKUs, and users. This allows organizations to see trends in their consumption growth, empowering them to make informed decisions, optimize resource utilization, and effectively manage their Databricks consumption across different teams, workspaces, and use cases. Sharing this Lakeview Dashboard template is the beginning of more templates to come as System Tables evolve. The template can easily be downloaded from the public repo and imported into a Databricks Workspace with Unity Catalog enabled.