Deploying Overwatch on Databricks (AWS) with System Tables as the Data Source

jiteshraut20
New Contributor III

Introduction

Overwatch is a powerful tool for monitoring and analyzing your Databricks environment, providing insights into resource utilization, cost management, and system performance. By leveraging system tables as the data source, you can gain a comprehensive view of all workspaces in your Databricks account. This article will guide you through the process of deploying Overwatch on Databricks (AWS), using system tables as the primary data source.

Official Documentation: General overview and guidance for Overwatch and its implementation, though it may not always reflect the latest updates.

Data Dictionary: Detailed definitions of the data elements available after successful deployment of Overwatch for consumption.

Architecture: Reference architecture for deploying Overwatch on AWS.

Prerequisites

1. Enable System Table Schemas

Before deploying Overwatch, you must enable system table schemas. These schemas are governed by Unity Catalog, and you must have at least one Unity Catalog-enabled workspace in your account to access system tables. The system tables aggregate data from all workspaces in your account but can only be accessed from a Unity Catalog-enabled workspace.

Key Points:

- System Table Governance: Unity Catalog governs system tables.

- Access: Only accessible from Unity Catalog-enabled workspaces.

- Enabling Schemas: System schemas must be enabled individually at the schema level by an account admin.

How to Enable System Table Schemas:

- List available system schemas using the following `curl` command:

curl -v -X GET -H "Authorization: Bearer <PAT Token>" "https://<workspace>.cloud.databricks.com/api/2.0/unity-catalog/metastores/<metastore-id>/systemschemas"

- Enable a system schema using the following `curl` command:

curl -v -X PUT -H "Authorization: Bearer <PAT Token>" "https://<workspace>.cloud.databricks.com/api/2.0/unity-catalog/metastores/<metastore-id>/systemschemas/<SCHEMA_NAME>"

For more details, refer to the official Databricks documentation: Enable System Table Schemas.
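
Once a schema such as `access` is enabled, it is worth confirming it is queryable before wiring up Overwatch. Below is a minimal sanity check from a notebook in a Unity Catalog-enabled workspace; it assumes the `access` schema has been enabled as described above:

// Confirm the audit log system table is readable; Overwatch reads from it
// when auditlogprefix_source_path is set to "system".
display(spark.sql("SELECT event_time, action_name, user_identity FROM system.access.audit LIMIT 10"))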

2. Generate a Personal Access Token (PAT)

To interact with the Databricks API, you need to generate a Personal Access Token (PAT).

Steps to Generate a PAT:

a. Log in to Databricks: Access your Databricks workspace using your credentials.

b. Access User Settings: Click on your profile icon and select “User Settings.”

c. Generate a New Token: Under the “Access Tokens” section, click “Generate New Token.”

d. Configure the Token: Set a lifetime and optionally provide a description.

e. Save the Token: Copy the token immediately, as it will not be displayed again.

3. Securely Store the PAT in a Secret Scope

To securely store the PAT, create a secret scope using the Databricks CLI.

Steps to Create a Secret Scope:

a. Set Up the Databricks CLI: Install and configure the CLI with your workspace URL and PAT.

pip install databricks-cli

databricks configure --token

b. Create a Secret Scope: Use the following command to create a scope:

databricks secrets create-scope --scope <scope-name>

c. Add the PAT to the Secret Scope: Store the PAT in the secret scope.

databricks secrets put --scope <scope-name> --key <secret-key>
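
As an optional sanity check, you can confirm the secret resolves from a notebook. Databricks redacts secret values in notebook output, so printing the token's length is enough to prove the lookup worked:

// Fetch the PAT from the secret scope (the value itself is shown as [REDACTED]).
val pat = dbutils.secrets.get(scope = "<scope-name>", key = "<secret-key>")
println(pat.length)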

Configuring Overwatch with System Tables

To deploy Overwatch using system tables as the data source, you’ll need to create a configuration file that contains the necessary details for your Databricks environment. Below is a sample schema for the configuration file with sample values:

- workspace_name: Workspace-TestOW
- workspace_id: 1857915041000000
- workspace_url: https://dbc-664fbb00-0000.cloud.databricks.com
- api_url: https://mumbai.cloud.databricks.com
- cloud: AWS
- primordial_date: 28-07-2024
- storage_prefix: /mnt/overwatch_global
- etl_database_name: overwatch_etl
- consumer_database_name: overwatch_global
- secret_scope: secret_scope_for_overwatch
- secret_key_dbpat: dapicd0c0c0000e0000a0a0d0c0f0f0e000e
- auditlogprefix_source_path: system
- eh_name: (leave empty)
- eh_scope_key: (leave empty)
- interactive_dbu_price: 0.55
- automated_dbu_price: 0.3
- sql_compute_dbu_price: 0.22
- jobs_light_dbu_price: 0.1
- max_days: 30
- excluded_scopes: (leave empty)
- active: TRUE
- proxy_host: (leave empty)
- proxy_port: (leave empty)
- proxy_user_name: (leave empty)
- proxy_password_scope: (leave empty)
- proxy_password_key: (leave empty)
- success_batch_size: (leave empty)
- error_batch_size: (leave empty)
- enable_unsafe_SSL: (leave empty)
- thread_pool_size: (leave empty)
- api_waiting_time: (leave empty)

Save this configuration as a CSV file and upload it to DBFS; the file path will be needed in the deployment steps below.

workspace_name,workspace_id,workspace_url,api_url,cloud,primordial_date,storage_prefix,etl_database_name,consumer_database_name,secret_scope,secret_key_dbpat,auditlogprefix_source_path,eh_name,eh_scope_key,interactive_dbu_price,automated_dbu_price,sql_compute_dbu_price,jobs_light_dbu_price,max_days,excluded_scopes,active,proxy_host,proxy_port,proxy_user_name,proxy_password_scope,proxy_password_key,success_batch_size,error_batch_size,enable_unsafe_SSL,thread_pool_size,api_waiting_time
Workspace-TestOW,1857915041000000,https://dbc-664fbb00-0000.cloud.databricks.com,https://mumbai.cloud.databricks.com,AWS,28-07-2024,/mnt/overwatch_global,overwatch_etl,overwatch_global,secret_scope_for_overwatch,dapicd0c0c0000e0000a0a0d0c0f0f0e000e,system,,,0.55,0.3,0.22,0.1,30,,TRUE,,,,,,,,,,
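
After uploading the file (via the workspace UI, `dbutils.fs.put`, or `databricks fs cp` from the CLI), a quick preview from a notebook confirms the path and header row are correct. The path below is the same one used in the deployment steps that follow:

// Preview the uploaded configuration to verify the path and the header row.
val configPath = "dbfs:/FileStore/overwatch/configs/overwatch Configuration.csv"
display(spark.read.option("header", "true").csv(configPath).limit(1))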

Configurations: Configuration settings and options for deploying Overwatch.

Custom Cost: Information on configuring custom cost settings in Overwatch.

Optimal Databricks Configuration

For deploying Overwatch, the following configuration worked optimally:

- Cluster DBR Version: 13.3 LTS

- Overwatch JAR Version: 0.8.1.2 (Latest as of August 2024)

This specific combination of DBR and Overwatch JAR is recommended, as it proved to be the most reliable in recent tests. Other versions may encounter compatibility issues.

Missing Libraries:

In the 13.3 LTS DBR version, a couple of libraries were missing:

- org.scalaj:scalaj-http_2.12:2.4.2

- dataframe_rules_engine_2.12:0.2.0

These libraries might need to be installed manually on the cluster to ensure full functionality.

Steps to Deploy Overwatch on Databricks (AWS)

Here is a comprehensive guide to deploying Overwatch on Databricks using the provided setup and configuration parameters:

1. Import Library

Begin by importing the necessary library for Overwatch deployment:

import com.databricks.labs.overwatch.MultiWorkspaceDeployment

2. Set Configuration Parameters

Define the parameters required for the deployment:


val PATHTOCSVCONFIG = "dbfs:/FileStore/overwatch/configs/overwatch Configuration.csv"
val CONFIGTABLENAME = "overwatch_dev_config"
val TEMPDIR = "dbfs:/mnt/overwatch/tmp" // Update as necessary
val PARALLELISM = 1 // Adjust based on the number of workspaces to deploy
val STORAGEPREFIX = "dbfs:/mnt" // Update as necessary

- PATHTOCSVCONFIG: Path to the CSV file with Overwatch configuration.

- CONFIGTABLENAME: Delta table where the Overwatch configuration will be saved or updated.

- TEMPDIR: Temporary directory for intermediate files during deployment.

- PARALLELISM: Number of workspaces to load simultaneously (up to ~20).

- STORAGEPREFIX: Prefix for storage paths.

3. Save CSV Content to Configuration Table

Load the CSV file and save its contents to the configuration table:

spark.read
  .option("header", "true")
  .option("ignoreLeadingWhiteSpace", "true")
  .option("ignoreTrailingWhiteSpace", "true")
  .csv(PATHTOCSVCONFIG)
  .coalesce(1)
  .write
  .option("mergeSchema", "true")
  .mode("overwrite")
  .format("delta")
  .saveAsTable(CONFIGTABLENAME)

4. Validate Deployment Configuration

Validate the deployment configuration using the `MultiWorkspaceDeployment` class:

MultiWorkspaceDeployment(CONFIGTABLENAME, TEMPDIR).validate(PARALLELISM)

Validation: Guidance on validating your Overwatch deployment.

5. Review Validation Results

Check the validation results to ensure everything is configured correctly:

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

// Display the latest validation results. The report is written under
// <storage_prefix>/report/validationReport (here, /mnt/overwatch_global from the config).
val windowSpec = Window.orderBy(col("snapTS").desc)
display(
  spark.read.format("delta")
    .load("dbfs:/mnt/overwatch_global/report/validationReport")
    .withColumn("rank", rank().over(windowSpec))
    .filter(col("rank") === 1)
    .drop("rank")
    .orderBy("validated")
)

6. Final Deployment

Perform the final deployment for each layer (Bronze, Silver, Gold):

Bronze Layer:

MultiWorkspaceDeployment(CONFIGTABLENAME, TEMPDIR).deploy(PARALLELISM, "Bronze")

Silver Layer:

MultiWorkspaceDeployment(CONFIGTABLENAME, TEMPDIR).deploy(PARALLELISM, "Silver")

Gold Layer:

MultiWorkspaceDeployment(CONFIGTABLENAME, TEMPDIR).deploy(PARALLELISM, "Gold")
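
Since the three calls differ only in the layer name, you can equivalently run them in sequence with a small loop:

// Deploy the Bronze, Silver, and Gold layers in order.
Seq("Bronze", "Silver", "Gold").foreach { layer =>
  MultiWorkspaceDeployment(CONFIGTABLENAME, TEMPDIR).deploy(PARALLELISM, layer)
}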

Running as Overwatch: Additional details on running Overwatch.

Final Deployment Report

Check the final deployment report using:

SELECT * FROM overwatch_etl.pipReport ORDER BY Pipeline_SnapTs DESC
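
To surface problems only, a variation of the same query can filter out successful modules. This assumes the report exposes a `status` column (present in recent Overwatch releases); adjust the column name if your version differs:

// Show only modules that did not complete successfully (assumes a `status` column).
import org.apache.spark.sql.functions.col
display(
  spark.table("overwatch_etl.pipReport")
    .filter(!col("status").startsWith("SUCCESS"))
    .orderBy(col("Pipeline_SnapTs").desc)
)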

By following these steps, you will deploy Overwatch on Databricks efficiently, using system tables as your data source and ensuring that all layers of the deployment are properly configured and validated.

If you need any further assistance or have additional questions, feel free to reach out at contact@jiteshraut.me or connect with me on LinkedIn.

Thank you!

Jitesh Raut
1 REPLY

AmanJain
Databricks Employee

Hey @jiteshraut20,

The libraries mentioned below are not required for the deployment. Please read the documentation carefully:

- org.scalaj:scalaj-http_2.12:2.4.2

- dataframe_rules_engine_2.12:0.2.0

Also, all the Overwatch documents are updated frequently after every release.
