cancel
Showing results for 
Search instead for 
Did you mean: 
Community Articles
Dive into a collaborative space where members like YOU can exchange knowledge, tips, and best practices. Join the conversation today and unlock a wealth of collective wisdom to enhance your experience and drive success.
cancel
Showing results for 
Search instead for 
Did you mean: 

Introduction to Metric Views (part 1 of 3)

KrisJohannesen
Contributor

This is part 1 of 3 in a series where I take you through working with Metric Views.

  • Part 1: Introduction to Metric Views

  • Part 2: Metric Views and the Databricks platform (AI/BI Dashboards, Genie, etc.)

  • Part 3: Metric Views with Power BI and Tabular Editor

Why do we need a semantic layer?

Organizations have long used external semantic layers (LookML, Power BI semantic models, Tabular, Metric layer tools) but these all tend to live outside the core data platform. In most organizations, analytical logic is fragmented. The same revenue calculation, customer definition, or KPI might exist in dozens of dashboards, each with its own slightly different logic. Over time, this creates inconsistent numbers, duplicate work, and disconnected understanding between data and business users.

A semantic layer addresses this by introducing a consistent, governed way of defining how business concepts map to data. It sits between the physical data model (tables and views) and the consumption layer (dashboards, notebooks, or AI assistants).

In practical terms, a semantic layer defines:

  • Measures; numeric metrics such as Sales, Profit Margin, or Conversion Rate, including how they are aggregated (e.g., SUM, AVG, COUNT DISTINCT).

  • Dimensions; the axes along which measures can be analyzed, such as Region, Customer Segment, or Month.

  • Relationships; the logical joins between datasets that determine how metrics can be combined or filtered.

  • Metadata and Descriptions; business-friendly names, definitions, formats, and documentation for consistent interpretation.

The result is a shared business vocabulary that can be used across tools and teams. Instead of asking every analyst to recreate logic, the semantic layer provides a central contract for metrics, ensuring that “total revenue” means the same thing in Excel, SQL, Power BI, or an AI-generated insight.

Want to deep dive into Semantic Models? Read this blog post by Kurt Buhler from Tabular Editor: What is a Semantic Model? 

 

Introducing Databricks Metric Views

Metric Views are a recent addition to the Databricks platform, and is currently a Public Preview feature. Databricks announced metric views in the same line of features such as Unity Catalog’s governance, lineage, and the broader narrative of turning Databricks into a unified AI/BI platform that covers the entire space from ingestion to presentation. In this context, the addition of a Semantic Layer in Databricks is a natural progression, that ties together Databricks offering of a Data Engineering Platform, to the recent upgrades to AI/BI Dashboards and the introduction of Genie Spaces.

What is a Metric View?

Metric views are Databricks Semantic Layer offering, and contains the details of a traditional semantic layer as described above; Measures, Dimensions, Relationships and Metadata. Metric views are defined in YAML format, and their output and the definition of them is registered in Unity Catalog (just like standard views). You can create them using SQL or by using the Catalog Explorer UI. For consumption, you can query it using SQL, just as if it was a table or a standard view.

Metric Views - Databricks Definition.png

An important thing to note using Metric Views, is that permissions and governance is inherited throughout Unity Catalog. This means, that in order to query a Metric View, you need the corresponding access to the underlying data as well - after all, it is only a view. This is one of the most important assets of Metric Views, as it allows organizations to keep a single line of governance. We will dive into this topic a bit more in Part 2 and 3, as we look at governance patterns across the different front-end platforms.

Metric Views are not just a new vers

 

ion of SQL views

While the name Metric Views might suggest that this is simply a view, there are some important differences to distinguish between Metric Views what we will refer in this series as Standard Views. I have tried summarizing the key differences below

Metric vs Standard view.png

The anatomy of a Metric View

A Metric View is defined as a yaml file in which you specify a list of top-level fields, each of which can then be populated with their own members. Metric Views are created on top of one or more Tables or Views (or even other Metric Views). The key driver of a Metric View is the source, and additional tables can be added via joins.

  • version: The version of the metric view specification, which specifies what attributes are available in the other definitions. Defaults to 1.1.

  • comment (optional): Description of the metric view which can be used as metadata for e.g. Genie to get a better understanding of the model.

  • source: The source data for the metric view. This can be a table-like asset or a SQL query. In a star-schema model this is often the fact table.

  • joins (optional): Additional tables needed for the definition. Star schema and snowflake schema joins are both supported. Requires an addition on clause.

  • filter (optional): A SQL boolean expression that applies to all queries; equivalent to a traditional SQL WHERE clause.

  • dimensions: An array of dimension definitions, including the dimension name and optionally any expressions needed to calculate them.

  • measures: An array of aggregate expression columns.

version: 1.1

comment: Order Metric View using TPCH data from the samples catalog
source: samples.tpch.orders
filter: o_orderdate > '1990-01-01'


dimensions:
  - name: Order Month
    expr: DATE_TRUNC('MONTH', o_orderdate)

  - name: Order Status
    expr: CASE
      WHEN o_orderstatus = 'O' then 'Open'
      WHEN o_orderstatus = 'P' then 'Processing'
      WHEN o_orderstatus = 'F' then 'Fulfilled'
      END

  - name: Order Priority
    expr: SPLIT(o_orderpriority, '-')[1]

measures:
  - name: Order Count
    expr: COUNT(1)

  - name: Total Revenue
    expr: SUM(o_totalprice)

  - name: Total Revenue per Customer
    expr: SUM(o_totalprice) / COUNT(DISTINCT o_custkey)

  - name: Total Revenue for Open Orders
    expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus='O')

For a full reference catalog of the YAML definitions, and how to work with different expressions, special characters, indentations, versions check out the Databricks Documentation below

 

Working with Metric Views

You can create a Metric View directly in a notebook or through the Databricks UI, alternatively, you can define the same logic declaratively in YAML by hand. To create a metric view, you need the following setup:

  • USE CATALOG, USE SCHEMA and CREATE TABLE privileges in the location you want to create the Metric View.

  • SELECT privileges on the source table(s) or view(s) referenced by the metric view.

  • CAN USE permissions on A SQL warehouse, running on Databricks Runtime 16.4 or above.

Defining a Metric View

Metric views can be defined in a couple of different ways. I think the most common way for developers will be using the SQL syntax, while analysts will most likely default to using the Databricks Catalog UI.

Using the Catalog UI open the table which you want to have as your source. Click Create -gt; Metric View in the top right corner which opens the YAML editor.

You will notice that Databricks by default creates a Metric View across that entire table. While no joins or additional tables are included, this is a great starting point that will help you start with the correct syntax.

Using the SQL syntax a metric view can be defined as what you see below. This query is fully executable in the SQL editor, or in a notebook. You may notice that 90% of this query is the exact same YAML anatomy parts that we saw earlier. The SQL part is merely a wrapper that allows you to execute the code.

CREATE OR REPLACE VIEW orders_metric_view
WITH METRICS
LANGUAGE YAML
AS $$
  version: 1.1
  comment: "Orders KPIs for sales and financial analysis"
  source: samples.tpch.orders
  filter: o_orderdate > '1990-01-01'
  dimensions:
    - name: Order Month
      expr: DATE_TRUNC('MONTH', o_orderdate)
    - name: Order Status
      expr: CASE
        WHEN o_orderstatus = 'O' then 'Open'
        WHEN o_orderstatus = 'P' then 'Processing'
        WHEN o_orderstatus = 'F' then 'Fulfilled'
        END
    - name: Order Priority
      expr: SPLIT(o_orderpriority, '-')[1]
  measures:
    - name: Order Count
      expr: COUNT(1)
    - name: Total Revenue
      expr: SUM(o_totalprice)
    - name: Total Revenue per Customer
      expr: SUM(o_totalprice) / COUNT(DISTINCT o_custkey)
    - name: Total Revenue for Open Orders
      expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus='O')
$$

After creation, the Metric View will appear in Unity Catalog as a securable object (similar to views or tables) and will include its metadata (measures, dimensions, lineage) in the Catalog Explorer UI. You can also govern permissions to the Metric Views similar to what you are used to for other Unity Catalog objects.

Viewing and editing the YAML definition is available at any time using the UI. Head to Details to view the definition, or press the Edit button to get back into the YAML Catalog UI as detailed above. If you wish to edit using the SQL code, the CREATE OR REPLACE logic ensures you can always overwrite the code with your new definition.

Key Considerations for your Metric Views design

  • Start small by defining a few core, high-confidence metrics with corresponding dimensions.

  • Use clear, consistent naming conventions across metric views, measures, and dimensions.

  • Leverage metadata fields proactively to aid in discoverability, especially for AI support.

  • Organize metric views consistently throughout your Unity Catalog metastore.

  • Validate output before switching over usage from legacy solutions relying on separate table queries or one-big-table definitions.

Advanced calculations

As a person who is used to working with Semantic Models in Power BI I have gotten used to DAX supporting a multitude of rather complex time based calculations out of the box such as Year-over-year or Year-to-date . Recently it was brought to my attention that Metric Views actually support quite a few of these patterns already, and while the list might not currently exhaust all your needs, I would imagine that the most common use-cases are in development as we speak.

version: 1.1

source: samples.tpch.orders
filter: o_orderdate > DATE'1998-01-01'

dimensions:
  - name: date
    expr: o_orderdate
measures:
  - name: previous_day_sales
    expr: SUM(o_totalprice)
    window:
      - order: date
        range: trailing 1 day
        semiadditive: last
  - name: current_day_sales
    expr: SUM(o_totalprice)
    window:
      - order: date
        range: current
        semiadditive: last
  - name: day_over_day_growth
    expr: (MEASURE(current_day_sales) - MEASURE(previous_day_sales)) / MEASURE(previous_day_sales) * 100

This feature is still Experimental, however I think it is important enough that I have included a few examples of advanced calculations below. For the full list of window measures, see the Databricks Documentation.

Known limitations

While Metric Views is yet to reach General Availability, it is production ready, and supports most use cases. However, there are still a few things that are not possible using Metric Views. Databricks keep their own list updated here. However, there is one thing that I stumbled upon myself, that I think deserves its own mentionGit is generally supported for Metric Views, in the sense that you can save and edit the code as YAML. However, there is currently no direct support for deployment using Databricks Asset Bundles. As a result, the best solution for now, is to define SQL queries that create your Metric Views, and have these as part of a Databricks Job, which is then included in your CI/CD setup.

Whats next?

In the next two episodes, we will take (a slightly more advanced) Metric View into action - and see how it works when moving to the front-end layer. After all, that is what our Semantic Layer is meant to do; connect the dots between Data Engineering and Data Analysis.

I hope you will read along next week for Part 2 where we explore Metric Views and the Databricks platform (AI/BI Dashboards, Genie, etc.)

0 REPLIES 0