Author: Jeroen Meulemans, Solutions Architect Databricks in Amsterdam
Enterprise data professionals understand the complexities of scaling data operations in a way that promotes both governance and agility. The Databricks Lakehouse, combined with Unity Catalog, offers a robust platform for managing data at scale, but it also introduces challenges related to catalog lifecycle management and ownership delegation.
This blog explores the nuances of federated data catalog ownership in the Databricks Lakehouse, focusing on practical implementation models and leveraging the new MANAGE privilege, which provides teams with owner-like permissions. With this privilege, teams can efficiently manage entitlements, oversee object lifecycles, and handle schema evolution. This blog aims to help data leaders navigate the balance between centralized control and decentralized team autonomy.
A Center of Excellence (CoE) can play a critical role in accelerating digital transformation initiatives. Organizations often leverage their CoEs to build best practices, develop reusable blueprints, and serve as internal consultants. Others emphasize building scalable data platforms to empower individual teams to innovate faster. In both cases, managing data catalogs effectively is key to maintaining operational efficiency and compliance.
For this discussion, we assume a central team provides platform services, while use-case teams build their specific solutions on top of the platform.
Data catalog ownership can be managed using three distinct approaches, each with its own strengths and weaknesses.
Each use-case team creates and owns its own catalog, either manually, through the API, or with infrastructure-as-code tooling such as Terraform.
Example:
A fraud detection team creates its own catalog to manage data models and tables independently.
Pros:
Cons:
Security Considerations:
In a decentralized governance model, specific permissions are required for teams to create and manage their own catalogs. For example, teams must be granted the ability to create catalogs, define external locations, and configure storage credentials to use their own storage for managed data.
By default, users do not have these permissions. Granting them should be done selectively to ensure security while enabling autonomy.
When to use:
This model works well for smaller organizations or scenarios where governance requirements are minimal. The use-case team would require special privileges to create the catalogs, which are not granted by default.
The central platform team provisions catalogs based on ticket requests from use-case teams.
Example:
A marketing team files a request via ServiceNow to the central team for a new catalog to store campaign performance data.
Pros:
Cons:
When to use:
Ideal for highly regulated industries where governance and compliance are non-negotiable.
The central team creates catalogs on request and delegates management capabilities to use-case teams using the MANAGE privilege.
Example:
A customer segmentation team requests a catalog from the central team, which provisions it and grants the MANAGE privilege to the team. The team can then control access and manage schema evolution while the central platform team remains the owner of the catalog.
Pros:
Cons:
When to use:
This model is optimal for large enterprises seeking a balance between agility and governance.
The MANAGE privilege in Unity Catalog fills a critical gap in federated ownership by enabling the delegation of management responsibilities without transferring full ownership. A holder of MANAGE can manage grants, drop objects, and handle schema evolution on the securable, while the enterprise keeps control of data access: MANAGE does not implicitly grant other privileges such as SELECT, which must be assigned explicitly. Using it still requires the USE privileges on the parent catalog and schema, and it is inherited, so MANAGE granted on a catalog propagates to the schemas and tables inside it. For example, a data engineering team with MANAGE on a catalog can independently manage grants for data analysts while adhering to the governance policies established by the central team.
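As an added example, not taken from the original post, the hybrid model and the MANAGE semantics described above written out as Databricks SQL. The catalog, schema and group names are invented, and the CREATE SCHEMA grant is an assumption about what the team needs in order to create objects, since MANAGE alone does not confer it.

```sql
-- Step 1: run by the central platform team, which stays catalog owner.
-- Catalog and group names below are made up for this example.
CREATE CATALOG IF NOT EXISTS customer_segmentation
  COMMENT 'Provisioned by platform-team; day-to-day management delegated to cust-seg-team';

-- Reassign ownership from the individual creator to the platform team's group,
-- so ownership is held centrally and never transferred to the use-case team.
ALTER CATALOG customer_segmentation OWNER TO `platform-team`;

-- Delegate: MANAGE lets the team grant/revoke, drop and alter objects under the
-- catalog (it is inherited by schemas and tables) but gives no data access itself.
GRANT USE CATALOG ON CATALOG customer_segmentation TO `cust-seg-team`;
GRANT MANAGE ON CATALOG customer_segmentation TO `cust-seg-team`;
-- Assumption: the team must also create schemas and tables; MANAGE does not include this.
GRANT CREATE SCHEMA ON CATALOG customer_segmentation TO `cust-seg-team`;

-- Step 2: run by a member of cust-seg-team, with no ticket to the platform team.
-- SELECT is never implied by MANAGE, so it is granted explicitly to the analysts.
CREATE SCHEMA IF NOT EXISTS customer_segmentation.gold;
GRANT USE CATALOG ON CATALOG customer_segmentation TO `cust-seg-analysts`;
GRANT USE SCHEMA  ON SCHEMA  customer_segmentation.gold TO `cust-seg-analysts`;
GRANT SELECT      ON SCHEMA  customer_segmentation.gold TO `cust-seg-analysts`;

-- Step 3: audit check usable by either team. The first statement lists every
-- principal's privileges on the catalog; the second should show only
-- USE CATALOG, MANAGE and CREATE SCHEMA for cust-seg-team, with no SELECT row.
SHOW GRANTS ON CATALOG customer_segmentation;
SHOW GRANTS `cust-seg-team` ON CATALOG customer_segmentation;

-- Offboarding: revoking MANAGE withdraws delegated control in one step while the
-- platform team, as owner, retains full control of the catalog and its contents.
REVOKE MANAGE ON CATALOG customer_segmentation FROM `cust-seg-team`;
```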
In some organizations, especially those using external tools for privilege management, the hybrid model with delegated ownership may not be the ideal approach. Tools such as custom web applications or entitlement management solutions can centralize permission workflows for shared catalogs, requiring teams to follow predefined processes instead of manually configuring security settings. While this ensures consistent governance, it can significantly reduce agility, a trade-off that might be acceptable in highly regulated industries like banking.
Platform teams may also support the creation and management of schemas within catalogs, providing finer control over governance and resource allocation. This approach can raise interesting questions about the distinction between use-case teams and broader business units, as schemas often represent smaller, more specific scopes of ownership. While this is an important topic for some environments, it falls outside the focus of this blog, which concentrates on catalog-level governance and ownership.
Federated ownership of data catalogs is a cornerstone of modern data governance strategies. The MANAGE privilege in Unity Catalog offers a powerful tool to bridge the gap between central control and team-level autonomy. By adopting a hybrid approach, enterprises can empower their teams to innovate while maintaining compliance and operational efficiency.
For more information on how to implement federated ownership with Unity Catalog, refer to the official MANAGE privilege documentation.