<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Collecting Job Usage Metrics Without Unity Catalog in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/collecting-job-usage-metrics-without-unity-catalog/m-p/116417#M45306</link>
    <description>&lt;P&gt;Community thread: how to collect usage and job-execution metrics in Databricks without Unity Catalog, using REST APIs, audit logs, system tables, and built-in monitoring features. See the items below for the original question and replies.&lt;/P&gt;</description>
    <pubDate>Thu, 24 Apr 2025 02:11:24 GMT</pubDate>
    <dc:creator>lingareddy_Alva</dc:creator>
    <dc:date>2025-04-24T02:11:24Z</dc:date>
    <item>
      <title>Collecting Job Usage Metrics Without Unity Catalog</title>
      <link>https://community.databricks.com/t5/data-engineering/collecting-job-usage-metrics-without-unity-catalog/m-p/116391#M45301</link>
      <description>&lt;P class=""&gt;hi,&lt;/P&gt;&lt;P class=""&gt;I would like to request assistance on how to collect usage metrics and job execution data for my Databricks environment. We are currently &lt;SPAN class=""&gt;&lt;STRONG&gt;not using Unity Catalog&lt;/STRONG&gt;&lt;/SPAN&gt;, but I would still like to monitor and analyze usage&lt;/P&gt;&lt;P class=""&gt;Could you please provide guidance or documentation on how to retrieve this information without relying on Unity Catalog?&lt;/P&gt;&lt;P class=""&gt;Any recommendations on APIs, system tables, audit logs, or best practices would be greatly appreciated.&lt;/P&gt;</description>
      <pubDate>Wed, 23 Apr 2025 18:43:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/collecting-job-usage-metrics-without-unity-catalog/m-p/116391#M45301</guid>
      <dc:creator>William_Scardua</dc:creator>
      <dc:date>2025-04-23T18:43:27Z</dc:date>
    </item>
    <item>
      <title>Re: Collecting Job Usage Metrics Without Unity Catalog</title>
      <link>https://community.databricks.com/t5/data-engineering/collecting-job-usage-metrics-without-unity-catalog/m-p/116417#M45306</link>
      <description>&lt;P&gt;hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/18319"&gt;@William_Scardua&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here’s a comprehensive overview of how to collect usage and job-execution metrics in Databricks without Unity Catalog,&lt;BR /&gt;using REST APIs, audit logs, system tables, and built-in monitoring features.&lt;BR /&gt;In summary, you can retrieve:&lt;BR /&gt;1. Job and query history via the Query History API and Jobs API.&lt;BR /&gt;2. Cluster activity and performance via the Clusters API (events) and the compute-metrics UI (or Ganglia charts via REST).&lt;BR /&gt;3. Workspace-level audit events via Premium-tier audit logs delivered as JSON files or to system tables.&lt;BR /&gt;4. Delta Live Tables pipeline metrics via the DLT event log.&lt;BR /&gt;Below are detailed options, with links to documentation and best practices.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;## 1. REST APIs for Jobs, Queries, and Clusters&lt;/STRONG&gt;&lt;BR /&gt;1.1 Query History API (SQL Warehouses)&lt;BR /&gt;Use the Query History API to list SQL queries, their run times, and statuses. 
This works even if you’re not using Unity Catalog.&lt;BR /&gt;Endpoint: GET /api/2.0/sql/history/queries&lt;BR /&gt;&lt;A href="https://learn.microsoft.com/en-us/answers/questions/1180376/azure-databricks-how-to-get-usage-statistics-from" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/en-us/answers/questions/1180376/azure-databricks-how-to-get-usage-statistics-from&lt;/A&gt;&lt;BR /&gt;Usage: Filter by user, time range, or warehouse to gather per-user or per-warehouse query metrics.&lt;BR /&gt;1.2 Jobs API (Databricks Jobs)&lt;BR /&gt;Retrieve job execution details (start/end times, run duration, and task outcomes) via:&lt;BR /&gt;Endpoint: GET /api/2.1/jobs/runs/list&lt;BR /&gt;&lt;A href="https://docs.databricks.com/api/workspace/jobs/runslist" target="_blank" rel="noopener"&gt;https://docs.databricks.com/api/workspace/jobs/runslist&lt;/A&gt;&lt;BR /&gt;Usage: Paginate through runs, then call jobs/runs/get for detailed metrics on each run.&lt;BR /&gt;1.3 Clusters API (Events &amp;amp; Info)&lt;BR /&gt;Collect cluster lifecycle events (start, resize, terminate) and basic stats:&lt;BR /&gt;Events: GET /api/2.0/clusters/events&lt;BR /&gt;&lt;A href="https://docs.databricks.com/api/workspace/clusters/events" target="_blank" rel="noopener"&gt;https://docs.databricks.com/api/workspace/clusters/events&lt;/A&gt;&lt;BR /&gt;Cluster Info: GET /api/2.0/clusters/get?cluster_id=…&lt;BR /&gt;&lt;A href="https://docs.databricks.com/api/workspace/clusters/get" target="_blank" rel="noopener"&gt;https://docs.databricks.com/api/workspace/clusters/get&lt;/A&gt;&lt;BR /&gt;Usage: Build dashboards of cluster uptime, autoscaling events, and node counts over time.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;## 2. 
Built-In Monitoring &amp;amp; Metrics&lt;/STRONG&gt;&lt;BR /&gt;2.1 Compute Metrics UI&lt;BR /&gt;Databricks provides near-real-time hardware &amp;amp; Spark metrics (CPU, memory, tasks) in the Compute UI, even without Ganglia.&lt;BR /&gt;Docs: View compute metrics in the Databricks UI&lt;BR /&gt;&lt;A href="https://docs.databricks.com/aws/en/compute/cluster-metrics" target="_blank" rel="noopener"&gt;https://docs.databricks.com/aws/en/compute/cluster-metrics&lt;/A&gt;&lt;BR /&gt;Tip: Use these charts for ad hoc monitoring; automated scraping of the UI (e.g., via Selenium) is possible but brittle.&lt;BR /&gt;2.2 Ganglia Charts via REST&lt;BR /&gt;If you need historical Ganglia charts (Databricks Runtime below 13.x), some users have scripted calls against the undocumented Ganglia endpoints.&lt;BR /&gt;Community Example: “Get cluster metric (Ganglia charts)”&lt;BR /&gt;&lt;A href="https://stackoverflow.com/questions/73505963/get-cluster-metric-ganglia-charts-of-all-clusters-via-rest-api-in-databricks" target="_blank" rel="noopener"&gt;https://stackoverflow.com/questions/73505963/get-cluster-metric-ganglia-charts-of-all-clusters-via-rest-api-in-databricks&lt;/A&gt;&lt;BR /&gt;Caveat: Not officially supported; prefer the Compute Metrics UI or external exporters.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;## 3. 
Audit Logs &amp;amp; System Tables&lt;/STRONG&gt;&lt;BR /&gt;3.1 Workspace Audit Logs (Premium)&lt;BR /&gt;Enable workspace-level audit logs to capture user actions (table reads, notebook runs, cluster ops).&lt;BR /&gt;Reference: Audit log events list&lt;BR /&gt;&lt;A href="https://docs.databricks.com/aws/en/admin/account-settings/audit-logs" target="_blank" rel="noopener"&gt;https://docs.databricks.com/aws/en/admin/account-settings/audit-logs&lt;/A&gt;&lt;BR /&gt;Delivery:&lt;BR /&gt;System Table: Query system.access.audit directly (public preview; note that system tables require a Unity Catalog-enabled workspace, so prefer file-based log delivery if that is not an option)&lt;BR /&gt;&lt;A href="https://docs.databricks.com/aws/en/admin/system-tables/audit-logs" target="_blank" rel="noopener"&gt;https://docs.databricks.com/aws/en/admin/system-tables/audit-logs&lt;/A&gt;&lt;BR /&gt;S3/Blob: Configure JSON log delivery (low latency) to cloud storage&lt;BR /&gt;&lt;A href="https://docs.databricks.com/aws/en/admin/account-settings/audit-log-delivery" target="_blank" rel="noopener"&gt;https://docs.databricks.com/aws/en/admin/account-settings/audit-log-delivery&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Verbose Mode: Optionally turn on verbose audit logs to record every command/query text&lt;BR /&gt;&lt;A href="https://docs.databricks.com/aws/en/admin/account-settings/verbose-logs" target="_blank" rel="noopener"&gt;https://docs.databricks.com/aws/en/admin/account-settings/verbose-logs&lt;/A&gt;&lt;/P&gt;&lt;P&gt;3.2 Delta Live Tables Event Log&lt;BR /&gt;For DLT pipelines, each pipeline writes an event log (as a Delta table) capturing pipeline progress, data quality checks,&lt;BR /&gt;and audit entries.&lt;BR /&gt;Docs: Monitor DLT pipelines with the event log&lt;BR /&gt;&lt;A href="https://docs.databricks.com/aws/en/delta-live-tables/observability" target="_blank" 
rel="noopener"&gt;https://docs.databricks.com/aws/en/delta-live-tables/observability&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;## 4. Best Practices &amp;amp; Integration&lt;/STRONG&gt;&lt;BR /&gt;4.1 Centralize Logs &amp;amp; Metrics&lt;BR /&gt;Ingest REST API outputs (jobs, clusters, queries) into a dedicated Delta table or an external time-series database.&lt;BR /&gt;Archive audit logs in Parquet/Delta on S3/ADLS and query them via Spark.&lt;BR /&gt;4.2 Dashboards &amp;amp; Alerting&lt;BR /&gt;Use Databricks SQL or external BI tools (Tableau, Power BI) on your metrics tables.&lt;BR /&gt;For real-time alerts, stream critical events (e.g., job failures) into Slack/MS Teams via webhooks.&lt;BR /&gt;4.3 Security &amp;amp; Access&lt;BR /&gt;Limit REST API tokens to read-only scopes.&lt;BR /&gt;Ensure audit-log storage buckets use least-privilege access and encryption at rest.&lt;/P&gt;</description>
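The Jobs API pattern in section 1.2 can be sketched in Python roughly as follows. This is a minimal sketch, not production code: it assumes DATABRICKS_HOST and DATABRICKS_TOKEN environment variables, and the response field names used here (runs, has_more, next_page_token, start_time and end_time in epoch milliseconds, state.result_state) should be verified against the Jobs 2.1 API reference for your workspace.

```python
import json
import os
import urllib.parse
import urllib.request


def fetch_job_runs(host, token, limit=25):
    """Page through GET /api/2.1/jobs/runs/list and return all run records.

    Pagination fields (has_more, next_page_token) follow the Jobs 2.1
    response schema; verify against the API reference before relying on them.
    """
    runs, page_token = [], None
    while True:
        params = {"limit": limit}
        if page_token:
            params["page_token"] = page_token
        url = f"{host}/api/2.1/jobs/runs/list?" + urllib.parse.urlencode(params)
        req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(req, timeout=30) as resp:
            body = json.load(resp)
        runs.extend(body.get("runs", []))
        if not body.get("has_more"):
            return runs
        page_token = body.get("next_page_token")


def summarize_runs(runs):
    """Reduce raw run records to (run_id, duration_seconds, result_state) tuples."""
    out = []
    for r in runs:
        start, end = r.get("start_time", 0), r.get("end_time", 0)
        # Timestamps are epoch milliseconds; end_time is 0 while a run is active.
        duration_s = (end - start) / 1000 if start and end else None
        state = r.get("state", {}).get("result_state", "RUNNING")
        out.append((r.get("run_id"), duration_s, state))
    return out


if __name__ == "__main__":
    # Example: print one summary line per run. Host is the workspace URL,
    # e.g. https://dbc-a1b2c3d4-e5f6.cloud.databricks.com; the token should
    # be a read-only PAT, per the note in section 4.3.
    host = os.environ.get("DATABRICKS_HOST")
    token = os.environ.get("DATABRICKS_TOKEN")
    if host and token:
        for run_id, duration_s, state in summarize_runs(fetch_job_runs(host, token)):
            print(run_id, duration_s, state)
```

From here, the tuples can be written to a Delta table as suggested in section 4.1, which keeps the raw API calls separate from the pure summarization step that is easy to test offline.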
      <pubDate>Thu, 24 Apr 2025 02:11:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/collecting-job-usage-metrics-without-unity-catalog/m-p/116417#M45306</guid>
      <dc:creator>lingareddy_Alva</dc:creator>
      <dc:date>2025-04-24T02:11:24Z</dc:date>
    </item>
    <item>
      <title>Re: Collecting Job Usage Metrics Without Unity Catalog</title>
      <link>https://community.databricks.com/t5/data-engineering/collecting-job-usage-metrics-without-unity-catalog/m-p/125998#M47607</link>
      <description>&lt;P&gt;&lt;SPAN&gt;Hi &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/18319"&gt;@William_Scardua&lt;/a&gt;&amp;nbsp;, were you able to collect the job metrics?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 22 Jul 2025 13:51:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/collecting-job-usage-metrics-without-unity-catalog/m-p/125998#M47607</guid>
      <dc:creator>alsetr</dc:creator>
      <dc:date>2025-07-22T13:51:37Z</dc:date>
    </item>
  </channel>
</rss>

