SriramMohanty

How to use System Tables with Overwatch

Welcome to our blog post on integrating system tables with Overwatch! In this article, we'll look at how system tables can serve as an audit log source for Overwatch and what that simplifies. For an introduction to Overwatch, please refer to my initial blog post, Overwatch: The Observability Tool for Databricks.

A concise overview of System Tables & Overwatch

System Tables represent a repository of analytical data specific to your account, housed within the Databricks system catalog. They facilitate historical observability across your account, offering insights into usage, operational history, and other facets of your Databricks environment. Notable system tables include Audit logs, Table lineage, and Predictive optimization. Audit logs comprehensively document all audit events occurring within your Databricks account.

Overwatch, on the other hand, gets data from three different sources:

  1. Audit Logs
    • AWS audit logs are stored in an S3 bucket.
    • GCP audit logs are stored in a Google Cloud Storage (GCS) bucket.
    • Azure audit logs are pushed to Event Hub.
  2. API Calls
  3. Cluster Logs

The audit log is a crucial source from which Overwatch retrieves data. It covers events tied to operations within Databricks, including DBFS operations, notebook actions, account activities, workspace actions, cluster operations, and login activities. Depending on the cloud, it is delivered to an S3 bucket (AWS workspaces), a GCS bucket (GCP workspaces), or an Event Hub (Azure workspaces).

In Unity Catalog (UC) enabled Workspaces, the audit log data is accessible through the system.access.audit table within the "System" catalog. This table aggregates audit data from multiple Workspaces. Therefore, data pertaining to all Workspaces associated with the account can be accessed within the system.access.audit table.
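
As a rough illustration of reading audit events from this table, the sketch below builds a Spark SQL query string in Python. The helper name `build_audit_query` and its parameters are hypothetical and not part of Overwatch. The column names (`event_time`, `event_date`, `workspace_id`, `service_name`, `action_name`) are assumed from the documented schema of system.access.audit, and workspace IDs are assumed to be numeric, so check both against your environment.

```python
from datetime import date


def build_audit_query(workspace_id: str, since: date, limit: int = 100) -> str:
    """Return a Spark SQL query for recent audit events of one workspace.

    Hypothetical helper for illustration; it only builds the query text.
    """
    # Assumption: workspace IDs are numeric. Rejecting anything else keeps
    # arbitrary text out of the generated SQL.
    if not workspace_id.isdigit():
        raise ValueError(f"workspace_id must be numeric, got {workspace_id!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        "SELECT event_time, workspace_id, service_name, action_name\n"
        "FROM system.access.audit\n"
        f"WHERE workspace_id = '{workspace_id}'\n"
        f"  AND event_date >= '{since.isoformat()}'\n"
        "ORDER BY event_time DESC\n"
        f"LIMIT {limit}"
    )


# Example workspace ID and date are placeholders.
query = build_audit_query("1234567890", date(2024, 5, 1))
print(query)
```

In a Databricks notebook, the resulting string could be passed to `spark.sql(query)`. Because the table spans every Workspace in the account, filtering on `workspace_id` is what narrows the result to a single Workspace.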

System Tables are simpler than container-based Audit Logs

Beginning with version 0.8.0.0 (0800), Overwatch supports System Tables, making it easy to automatically retrieve audit log data. Overwatch also supports a cross-account integration with System Tables. There are several benefits to using the system table as an audit log source:

  • No need to manually intervene and configure each Workspace for audit log delivery. (AWS/GCP)
  • No need to manually create a container and manage the storage of audit log data. (AWS/GCP)
  • No need to manually create an Azure Event Hub for each Workspace, then configure each Workspace to send audit logs to it.
  • Any access issues associated with the audit log container or Event Hub are eliminated.

System Tables are easy to use with Overwatch

The alignment between System Tables and Overwatch ensures a seamless integration in the following ways.

  1. Simplified Setup within Overwatch: With System Tables as the source, audit log delivery no longer needs to be configured separately, which simplifies both initial setup and ongoing management.
  2. Effortless Migration: Migrating an existing Overwatch deployment to System Tables requires minimal reconfiguration, so monitoring continues with little disruption.
  3. Extended Data Retention for Azure Deployments: Utilizing System Tables extends audit log retention beyond the standard 30 days for Azure deployments in Overwatch, eliminating the reliance on Event Hub's retention period.
  4. Hassle-free Cross-Account Integration: System Table integration facilitates the seamless aggregation of data from different accounts, enabling smooth multi-account deployments.

Enabling system tables as a data source for Overwatch is straightforward. Simply use the keyword "system" in the auditlogprefix_source_path parameter within the Overwatch config file:

  1. Go to the Overwatch configuration

  2. Update the configuration
    %sql
    
    update <overwatch_config> set auditlogprefix_source_path = "system" where workspace_id = "<workspace_id>"
  3. Check the configuration
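
When several Workspaces need to be switched at once, the update and check steps above can be sketched in Python as generated SQL. This is a minimal sketch under stated assumptions: the helper names, the example table name `overwatch_etl.overwatch_config`, and the Workspace IDs are all hypothetical, and only the `auditlogprefix_source_path` and `workspace_id` columns come from the statement shown in step 2.

```python
import re
from typing import Sequence

# One-, two-, or three-part table names such as catalog.schema.table
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){0,2}")


def _validated_id_list(config_table: str, workspace_ids: Sequence[str]) -> str:
    """Validate the inputs and return a quoted, comma-separated ID list."""
    if not _TABLE_NAME.fullmatch(config_table):
        raise ValueError(f"unexpected table name: {config_table!r}")
    if not workspace_ids:
        raise ValueError("at least one workspace_id is required")
    for ws in workspace_ids:
        # Assumption: workspace IDs are numeric strings.
        if not ws.isdigit():
            raise ValueError(f"workspace_id must be numeric, got {ws!r}")
    return ", ".join(f"'{ws}'" for ws in workspace_ids)


def build_system_source_update(config_table: str, workspace_ids: Sequence[str]) -> str:
    """Return an UPDATE that sets the audit log source to 'system'."""
    id_list = _validated_id_list(config_table, workspace_ids)
    return (
        f"UPDATE {config_table}\n"
        "SET auditlogprefix_source_path = 'system'\n"
        f"WHERE workspace_id IN ({id_list})"
    )


def build_config_check(config_table: str, workspace_ids: Sequence[str]) -> str:
    """Return a SELECT that shows the configured source per workspace."""
    id_list = _validated_id_list(config_table, workspace_ids)
    return (
        "SELECT workspace_id, auditlogprefix_source_path\n"
        f"FROM {config_table}\n"
        f"WHERE workspace_id IN ({id_list})"
    )


# Hypothetical table name and workspace IDs, for illustration only.
table = "overwatch_etl.overwatch_config"
workspaces = ["1111111111", "2222222222"]
print(build_system_source_update(table, workspaces))
print(build_config_check(table, workspaces))
```

In a notebook, each generated statement could be run with `spark.sql(...)`, provided the configuration is stored in a table that supports UPDATE, as the statement in step 2 implies.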

For further details, please refer to the provided SystemTableConfig.

Summary

System Tables hold analytical data for your account, including the audit logs that record account and workspace events. Using them in place of container-based audit logs removes per-workspace delivery configuration, storage management, and the associated access issues. Overwatch supports System Tables as an audit log source, which simplifies setup and migration and, for Azure deployments, extends audit log retention beyond Event Hub's retention period. The integration also supports cross-account deployments, so data from multiple accounts can be monitored and analyzed in one place.