In Databricks, tags (simple key/value metadata) have long been used to organize and manage resources. With Unity Catalog, a new type of tag was introduced: governed tags, which are controlled by Tag Policies and enable scalable governance through Attribute-Based Access Control (ABAC) and automated data classification.
Moving forward, Tag Policies will unify governed and ungoverned tags within a single framework, delivering a consistent experience for searching and discovering assets. To fully leverage this, we recommend establishing a clear tag management strategy. This allows data stewards to maintain a standardized governance taxonomy, while giving data practitioners the flexibility to tag assets for discovery and analytics. A well-designed tagging strategy fosters a common language across your organization, improving governance, usability, and collaboration.
Tagging in Databricks has evolved from a basic tool for cost attribution to a core component of governance and discoverability of assets. Initially used to track compute resources and manage infrastructure costs, tagging expanded with Cluster Policies and Serverless Budget Policies to support more granular control. As the platform grew, tags also became essential for asset discovery. To address governance challenges, Databricks has introduced Attribute-Based Access Control (ABAC) using tags to simplify permissions management.
Looking ahead, Databricks aims to unify its tagging system, creating a consistent, scalable framework for cost tracking, discovery, and access control. This unified tagging approach will enable organizations to maintain consistent metadata across diverse assets like models, dashboards, and datasets, streamlining both operational oversight and compliance. Tag Policies at the account level will further standardize governance, reducing administrative overhead and risk. As tagging is fundamental to Databricks administration, it empowers teams to implement dynamic, scalable access controls and cost management strategies. Ultimately, this evolution reflects a broader shift toward metadata-driven operations across modern data platforms. First, let us establish the types of tags that exist on Unity Catalog data+AI assets.
Parent Feature | Permission Set | What does it do? | What scope is it assigned to |
Regular Tag | APPLY TAG | Allows the principal to add tags to Unity Catalog securable objects. | Be the Owner of, or have APPLY TAG |
Tag Policies | CREATE | Allows the principal to create tag policy(s) | Account |
Tag Policies | MANAGE | Allows the principal to edit tag policy | Individual Tag Policy or Account |
Tag Policies | ASSIGN | Allows the principal to apply tag(s) governed by tag policy(s) | Individual Tag Policy or Account |
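To make the permission table concrete, here is a minimal Python sketch of the decision it implies. This is a toy model, not a Databricks API: the `Principal` class and `can_apply_tag` function are hypothetical names, and the rule that a governed tag needs both object-level tagging rights and ASSIGN on the tag policy is an assumption drawn from the descriptions in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Principal:
    """Hypothetical stand-in for a user or group; not a Databricks class."""
    name: str
    # Privileges held on the securable object, e.g. {"APPLY TAG"}.
    object_privileges: set = field(default_factory=set)
    # Privileges held on the relevant tag policy, e.g. {"ASSIGN"}.
    policy_privileges: set = field(default_factory=set)
    is_owner: bool = False


def can_apply_tag(principal: Principal, governed: bool) -> bool:
    """Return True if the principal may apply the tag under this simplified model."""
    # Regular tags: be the object owner or hold APPLY TAG on the object.
    may_tag_object = principal.is_owner or "APPLY TAG" in principal.object_privileges
    if not governed:
        return may_tag_object
    # Governed tags (assumption): additionally require ASSIGN on the tag policy.
    return may_tag_object and "ASSIGN" in principal.policy_privileges
```

For example, an object owner with no policy privileges can apply a regular tag but not a governed one in this model.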
Tag Policies (Governed Tags) and Regular Tags will co-exist. Tags governed by Tag Policies can be leveraged to set tag standards, enforce permissions, and define the tags used in ABAC, whereas ungoverned tags give data + AI teams the freedom to introduce domain-specific tags that help with search and discovery across all data and AI assets.
An effective tag strategy helps both data practitioners and data stewards
Capability: Discovery, Search, Filter
Scenario: As a Portfolio Manager in domain:portfolio-management, I want to easily browse relevant datasets, dashboards, and models related to financial assets to more easily find assets that I need for my research. Users can use ungoverned tags to mark personal or team-based categorization while leveraging governed tags to set official company-wide tags such as domain:portfolio-management. Ideally, admins and stewards can mark trusted assets such as tables and dashboards as Certified to ensure these assets are more easily discoverable at a glance.
How To: BROWSE is automatically set at the Catalog level for all users to allow discoverability of an asset.
Global Search allows search on all assets via name/tags/metadata using tag keys/values.
E.g., tag:tag_key or tag:tag_key:tag_value. (Today, UC managed tables, views, and models are discoverable by key only; search by value will be supported later.)
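The two query shapes can be illustrated with a small in-memory sketch. It only mimics the `tag:key` and `tag:key:value` syntax against a Python dictionary; `parse_tag_query` and `search` are hypothetical helpers, and nothing here reflects how Global Search is actually implemented.

```python
from typing import Optional


def parse_tag_query(query: str) -> tuple[str, Optional[str]]:
    """Split 'tag:key' or 'tag:key:value' into (key, value-or-None)."""
    prefix, _, rest = query.partition(":")
    if prefix != "tag" or not rest:
        raise ValueError(f"not a tag query: {query!r}")
    key, _, value = rest.partition(":")
    return key, (value or None)


def search(assets: dict[str, dict[str, str]], query: str) -> list[str]:
    """Return names of assets whose tags match the query, sorted for stable output."""
    key, value = parse_tag_query(query)
    return sorted(
        name
        for name, tags in assets.items()
        if key in tags and (value is None or tags[key] == value)
    )
```

A key-only query matches every asset carrying that key, while a key:value query narrows the result to assets with that exact value.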
Capability: Automation, Monitoring, Resource Organization
Scenario: As a Quant Engineer, I need to build reliable, scheduled pipelines that transform raw data into curated, tagged tables for use by ML and business teams. Tags like domain and team power observability dashboards and targeted alerts, making it easy to monitor pipeline health and trace failures to impacted areas. Additional attributes for data freshness and completeness ensure datasets meet quality standards for trusted analysis.
How To: In addition to discovery tags, apply the Databricks system tag Certified to Gold-level tables to signal they are ready for consumption. Examples of discovery tags may include Domain: wealth management; Team1: Investment specialists; Team2: Tax Advisors; etc.
Anomaly detection metadata (freshness & completeness) can be extracted from tables and displayed in dashboards. In the future, data quality rules and health indicators will provide additional automated insights.
Capability: Data Quality, Data Classification
Scenario: As an Actuary building predictive models, I need to ensure the data I use is both high quality and free from sensitive personal information. Specific classification tags, including class.name, automatically signal detected sensitive columns, helping flag them for exclusion. This supports compliance, reduces bias, and promotes ethical model development.
How To: Apply ABAC policies that reference tags like class.name to ensure Data Scientists can safely access appropriate data, while restricting or redacting sensitive information such as columns tagged with class.name.
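On the modeling side, the same tags can drive feature selection. The sketch below drops any column that carries a classification tag before training. It assumes classification tag keys share a `class.` prefix (only `class.name` is named in this post), and `safe_feature_columns` is a hypothetical helper working on a plain dictionary rather than on Unity Catalog metadata.

```python
def safe_feature_columns(column_tags: dict[str, set[str]]) -> list[str]:
    """Keep columns that carry no tag key starting with 'class.'.

    column_tags maps a column name to the set of tag keys applied to it.
    The 'class.' prefix is an assumption for illustration.
    """
    return [
        column
        for column, tag_keys in column_tags.items()
        if not any(key.startswith("class.") for key in tag_keys)
    ]
```

This complements, rather than replaces, ABAC enforcement: the policy protects the data at read time, while a check like this keeps sensitive columns out of a feature list by construction.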
Capability: Attribution, Usage/Cost
Scenario: As a Chief Architect, I need visibility into how teams—like actuaries, underwriters, and investment managers—collaborate in a shared environment and which resources they consume.
Tags on pipelines, clusters, and jobs enable cost attribution by user and team, helping optimize resource usage. Lineage and audit logs help assess trust by revealing usage patterns while Certified and Deprecated tags highlight reliable vs. outdated datasets. Together, this tagging and observability framework supports governance, accountability, and efficient platform management at scale.
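As a rough illustration of tag-based cost attribution, the following sketch rolls up cost by a tag key. The record shape (a `cost` number and a `tags` dictionary) is hypothetical and chosen for readability; real usage data would need to be mapped into it first.

```python
from collections import defaultdict


def cost_by_tag(usage_records: list[dict], tag_key: str) -> dict[str, float]:
    """Sum the 'cost' of each record under the value of its tag_key tag.

    Records without the tag are grouped under 'untagged' so that gaps in
    tagging stay visible instead of silently disappearing.
    """
    totals = defaultdict(float)
    for record in usage_records:
        owner = record.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += record["cost"]
    return dict(totals)
```

Keeping an explicit `untagged` bucket is a deliberate choice here: it gives admins a number to drive toward zero as tagging coverage improves.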
How To:
Capability: Tag Management
To assign ungoverned tags, users must be the asset owner or have APPLY TAG permission.
How To: Data Stewards and Admins collaboratively define tag policies to ensure consistency across the platform.
Examples:
When new domains or use cases are introduced, the Manager updates the allowed values in the policy.
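The idea of a policy with a controlled list of allowed values can be sketched in a few lines. `TagPolicy` below is a hypothetical in-memory class used only to show the workflow (reject an unknown value, extend the list, accept it afterwards); it is not the Databricks Tag Policy object or its API.

```python
from dataclasses import dataclass, field


@dataclass
class TagPolicy:
    """Toy model of a governed tag: one key plus its allowed values."""
    key: str
    allowed_values: set = field(default_factory=set)

    def is_allowed(self, value: str) -> bool:
        """Return True if the value may be assigned for this tag key."""
        return value in self.allowed_values

    def allow_value(self, value: str) -> None:
        """Extend the policy, e.g. when a new domain is introduced."""
        self.allowed_values.add(value)
```

For example, a `domain` policy that initially allows only `portfolio-management` would reject `wealth-management` until that value is added to the policy.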
Capability: Governance via ABAC + Data Classification
Scenario: When Data Classification is enabled on a catalog, Databricks automatically detects and classifies sensitive data with classification tags. When a steward or admin has applied an ABAC policy referencing one of these tags, UC automatically ensures that the data is protected by default as additional tables are ingested into this catalog.
How To: A Data Steward enables Data Classification for a given catalog and creates an ABAC policy at the catalog level to ensure sensitive data will be filtered or masked accordingly.
Cloud providers (e.g., AWS, Azure, GCP) offer tags with coarse-grained permissions, making them insufficient for fine-grained governance of Unity Catalog assets. Enterprise catalogs (e.g., Collibra, Alation) support tags but are limited to structured data. In contrast, Databricks enables tagging across most assets (Compute, Workflows, Unity Catalog securables, dashboards, etc.). Unity Catalog also supports federated data sources, allowing tags to extend governance and attribution to data and workloads beyond the Databricks platform.
Define a tag | - Organizations should centralize tag policies/definitions at the account level, particularly governed ones. - LoBs can dictate additional tagging policies for Workspace assets like clusters, jobs, workflows, etc. Note: With UC, data/AI assets span across workspaces, so due consideration should be given to the nomenclature. |
Grant Tag Policy Permissions | Define domain / BU / LoB specific data stewards who can create Tag Policies relevant to their specific areas and then delegate to power users. |
Assign Tag to a UC Object | Unity Catalog supports assigning tags, both governed (enforced by Tag Policies) and ungoverned, on securables such as catalogs, schemas, tables, views, columns, models, and volumes. Users must have the appropriate APPLY TAG and ASSIGN TAG POLICY permissions to apply tags to these objects. |
Search for a tag/value: UC Explorer | All users should be able to search (Global search) all objects by tag, value, and metadata. |
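For tables and columns, tag assignment can be scripted. The sketch below only builds the statement text for the `ALTER TABLE ... SET TAGS` form of Databricks SQL; it does not execute anything. `set_tags_statement` is a hypothetical helper, the table and column names in the usage are made up, and other securable types should be checked against the SQL reference before adapting it.

```python
from typing import Optional


def _literal(value: str) -> str:
    """Wrap a tag key or value in single quotes, refusing risky characters."""
    if "'" in value or "\\" in value:
        raise ValueError(f"unsupported character in {value!r}")
    return f"'{value}'"


def set_tags_statement(
    table: str, tags: dict[str, str], column: Optional[str] = None
) -> str:
    """Build an ALTER TABLE ... SET TAGS statement for a table or one column.

    The table and column identifiers are inserted as given, so they must
    come from a trusted source.
    """
    if not tags:
        raise ValueError("at least one tag is required")
    pairs = ", ".join(f"{_literal(k)} = {_literal(v)}" for k, v in tags.items())
    target = f"ALTER TABLE {table}"
    if column is not None:
        target += f" ALTER COLUMN {column}"
    return f"{target} SET TAGS ({pairs})"
```

Generating statements from a reviewed mapping of assets to tags, rather than typing them by hand, makes it easier to keep tag keys and values consistent across workspaces.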
Capability | Description | Documentation/Onboarding |
Data Classification (System Tags) | - PII detection | Beta (AWS, Azure, GCP) |
Anomaly Detection | - Data Freshness | Beta (AWS, Azure, GCP) |
Request For Access | Allows users to discover assets and request specific access from approved stewards who can control access | Private Preview. Contact your Databricks Account Team |
Tag Policy | - Control which users can create/manage/assign tag policies | Private Preview. Contact your Databricks Account Team. Beta coming soon! |
ABAC Policy | ABAC enables Data Governance Administrators to define access policies once that are applied broadly across the Data Lake | Private Preview. Contact your Databricks Account Team |
Tags on AI/BI Dashboards | Allows for dashboard certification, organization, and discovery. | Private Preview. Contact your Databricks Account Team |
UC Governance Insights Dashboards | Designed to give enterprise CDOs and admin teams key insights into the health of their data estate by providing out-of-the-box dashboards based on our system tables. | Private Preview. Contact your Databricks Account Team |
Attribute Based Access Control (ABAC) allows data governance administrators to define scalable access policies that are automatically enforced across the data lake. ABAC policies can be defined at the catalog, schema, or table level, and apply broadly based on tags. This allows administrators and data stewards to write one policy at the catalog level that governs access across many tables matching specific tag conditions.
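The "write one policy, govern many tables" idea can be simulated in plain Python. The sketch below is a conceptual model only: `MaskPolicy`, `in_scope`, and `read_row` are hypothetical names, the `***` mask and the exempt-group rule are assumptions for illustration, and real ABAC policies are defined and enforced by Unity Catalog rather than by application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskPolicy:
    scope: str                 # "catalog", "catalog.schema", or "catalog.schema.table"
    tag_key: str               # columns carrying this tag key are masked
    exempt_groups: frozenset   # groups that may see unmasked values


def in_scope(policy: MaskPolicy, table: str) -> bool:
    """A policy covers its own scope and everything nested beneath it."""
    return table == policy.scope or table.startswith(policy.scope + ".")


def read_row(policy, table, row, column_tags, user_groups):
    """Return the row as the user would see it under this toy policy."""
    if not in_scope(policy, table) or (set(user_groups) & policy.exempt_groups):
        return dict(row)
    return {
        column: "***" if policy.tag_key in column_tags.get(column, set()) else value
        for column, value in row.items()
    }
```

Because the policy matches on tags rather than on table names, a newly ingested table under the same catalog is covered as soon as its sensitive columns are tagged, with no policy change needed.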
ABAC policies work in conjunction with tags that are governed by a tag policy. Enforcement happens when someone tries to access a tagged data asset. All operations on the data asset are immediately captured and available in real time in the Databricks Audit Log. The diagram below illustrates how this is intended to work:
Collaborate with your business users, business heads, and data stewards to develop an organization-wide approach that defines who is responsible for creating and managing different parts of the tag taxonomy. Then map these responsibilities to existing Databricks roles and permissions. When applicable, workspace admins should enforce tags using compute policies and budget policies. This ensures clarity, accountability, and consistency in how tags are applied and governed across the platform.
2. Standardize the Nomenclature
Using Tag Policies, Databricks unifies how users interact with governed and ungoverned tags. To avoid confusion, establish clear naming conventions for governed tags while allowing flexibility for ungoverned tags used for discovery.
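Whatever convention you choose, it helps to encode it as a check that can run in review or CI. The sketch below assumes one possible convention for custom governed tag keys (lowercase letters, digits, and underscores, starting with a letter). Both the pattern and the `is_valid_governed_key` helper are hypothetical; substitute your organization's own rule.

```python
import re

# Hypothetical convention: lowercase snake_case keys such as "cost_center".
GOVERNED_KEY_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_governed_key(key: str) -> bool:
    """Return True if the key follows the assumed naming convention."""
    return GOVERNED_KEY_PATTERN.fullmatch(key) is not None
```

A check like this would accept `cost_center` and reject `Cost Center`, which keeps near-duplicate keys from creeping into the governed taxonomy.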
A few suggestions:
3. Change Management for Tags
Tag changes (create, update, delete) can have significant downstream impacts. Establish an enterprise review process, such as a governance review board, to oversee and control modifications of tags used for data governance.
Recommended Controls:
4. Tag Observability
Leverage Tag Policies and ABAC to manage access using tags. Using the information schema and audit logs, you can monitor tag application and usage, ensuring data quality and compliance. You can also use the Governance Insights dashboard to visualize and set alerts for privileged actions such as tag deletions or modifications.
Several announcements were made at DAIS 2025, so watch for updates as these features become available.
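One simple observability check is to find tables that are missing required tags. The sketch below works on rows already fetched into Python dictionaries. The field names (`catalog_name`, `schema_name`, `table_name`, `tag_name`) are an assumption modeled on the information schema's table tags view, and `tables_missing_tags` is a hypothetical helper.

```python
def tables_missing_tags(all_tables, tag_rows, required_keys):
    """Map each table to the sorted list of required tag keys it lacks.

    all_tables    -- iterable of fully qualified names "catalog.schema.table"
    tag_rows      -- iterable of dicts, one per (table, tag) pair
    required_keys -- tag keys every table is expected to carry
    """
    present = {}
    for row in tag_rows:
        name = ".".join((row["catalog_name"], row["schema_name"], row["table_name"]))
        present.setdefault(name, set()).add(row["tag_name"])

    missing = {}
    for table in all_tables:
        gaps = set(required_keys) - present.get(table, set())
        if gaps:
            missing[table] = sorted(gaps)
    return missing
```

The result can feed a dashboard or an alert so that untagged tables are caught soon after they are created.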