Community-produced videos to help you leverage Databricks in your Data & AI journey. Tune in to explore industry trends and real-world use cases from leading data practitioners.
In this video, a Senior Specialist Solutions Architect at Databricks goes over observability on Databricks and how it can be achieved using system tables. Observability is the ability to monitor and understand how an application is behaving, which helps users gain insight into the application’s performance and identify potential issues. System tables are a Databricks-hosted analytical store of the account’s operational data. They are not enabled by default; watch this video to learn how to enable them, followed by a 10-minute demo. Databricks system tables offer several benefits that enhance the functionality and usability of the Databricks platform:
1. Historical Observability: System tables serve as a Databricks-hosted analytical store for the warm path of Databricks operational data, which customers use for historical observability. This includes cost/usage analytics, efficiency analytics, audit analytics, SLO analytics, and data quality analytics for orchestration, warehousing, ML, runtimes, compute, etc.
2. Reliability: System tables are reliable enough for production environments. They provide a standardized structured data format for storing events and an easy way to surface them to customers.
3. Operational Intelligence: System tables are the observability feature of Unity Catalog (UC), which until recently has been heavy on controls and light on observability. They cover a broad set of operational intelligence use cases but do not curate the business-level metrics that customers may require. Instead, these metrics are latent and must be computed from the raw (bronze) operational data underlying system tables.
4. Usage Analytics: System tables can be used for a wide range of use cases, including usage analytics, consumption/cost forecasting, efficiency analysis, security & compliance audits, SLO (service-level objective) analytics and reporting, actionable DataOps, and data quality monitoring and reporting.
5. Security Monitoring: System tables can be used to augment a Zero Trust Architecture on Databricks. They provide automatic data lineage tracking in real time, down to the column level, and can be queried programmatically.
6. Access to Operational Data: System tables provide access to operational data such as audit logs, table lineage, column lineage, billable usage, pricing, and cluster configurations. This data can be used for historical observability across your account.
7. Ease of Use: With little or no setup, customers can query this usage and event data directly using SQL or Spark APIs, get timely streaming updates, join them with other business data, and explore the data in notebooks or Redash dashboards.
8. Engineering Efficiency: From the engineering perspective, system tables reduce the burden of implementing common infra and data governance tasks for each use case, such as fine-grained access control, data redaction, encryption, residency, and retention requirements.
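As a rough illustration of the "query directly using SQL or Spark APIs" point above, here is a minimal Python sketch that builds a daily usage query. It assumes the billable usage data lives in a table named `system.billing.usage` with columns `usage_date`, `sku_name`, and `usage_quantity`; check these names against your own account before relying on them. The helper `daily_usage_query` is hypothetical and not part of any Databricks library.

```python
from datetime import date

# Assumed table name; verify it against the system schemas enabled in your account.
USAGE_TABLE = "system.billing.usage"


def daily_usage_query(start: date, end: date, table: str = USAGE_TABLE) -> str:
    """Build a SQL query that totals usage per day and SKU between start and end (inclusive).

    The column names (usage_date, sku_name, usage_quantity) are assumptions.
    """
    if start > end:
        raise ValueError("start must not be after end")
    return (
        "SELECT usage_date, sku_name, SUM(usage_quantity) AS total_usage\n"
        f"FROM {table}\n"
        f"WHERE usage_date BETWEEN DATE'{start.isoformat()}' AND DATE'{end.isoformat()}'\n"
        "GROUP BY usage_date, sku_name\n"
        "ORDER BY usage_date, sku_name"
    )


if __name__ == "__main__":
    query = daily_usage_query(date(2024, 1, 1), date(2024, 1, 31))
    print(query)
    # In a Databricks notebook, where a `spark` session is predefined,
    # the query could then be run with something like:
    # spark.sql(query).show()
```

Keeping the query construction separate from execution means the same string can be pasted into a SQL editor, run through Spark, or joined with other business data in a larger query.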
Target Audience - Data Governance Engineers, Data Analysts, Data/Solution Architects