Administration & Architecture

Is a central UC Catalog management a Good Practice?

Datazilla
New Contributor III

I am working at a large company with many more or less independent divisions, and we are currently rolling out Unity Catalog in Azure. The idea was to have a central infrastructure repository (deployed via Terraform) to manage all central components, such as the Databricks account and the UC metastore. We also wanted to create the UC catalogs there to enforce specific naming conventions and other standards such as tagging. Creating resources within the catalogs is then up to the respective catalog owners. The root Storage Accounts (one per catalog) also have network restrictions, so the networks of the corresponding bound workspaces must be explicitly allowed.

This month, automatic enablement of Unity Catalog was announced, which automatically enables new workspaces for UC. Furthermore, workspace admins will automatically get the permission on the metastore to create UC catalogs. With this new behaviour, we can no longer enforce our central catalog standards.

How do you deal with this situation? Do you also centrally manage all Databricks workspaces to have full control over all workspace admins? It would be great to be able to configure the permissions of workspace admins in the Account console.

6 REPLIES

Kaniz_Fatma
Community Manager

Hi @Datazilla , Managing a central UC Catalog in a large organization can be both advantageous and challenging. Let’s explore some considerations and best practices:

Unity Catalog Best Practices:

  • Unity Catalog is a fine-grained governance solution for data and AI on the Databricks Lakehouse. It simplifies security and governance by providing a central place for managing data assets.
  • Here are some best practices to consider:
    • Configure a Unity Catalog Metastore: Set up a robust metastore for your Unity Catalog to ensure consistency and reliability.
    • Use Catalogs to Organize Data: Leverage catalogs to organize and categorize your data assets effectively.
    • Configure Access Control: Define access controls at the catalog level to manage permissions for different users and teams.
    • Use Cluster Configurations: Configure clusters with an access mode that supports Unity Catalog so that compute enforces the catalog's permissions.
    • Audit Logs: Enable audit logs to track changes and monitor catalog activities.
    • Share Data using Delta Sharing: Utilize Delta Sharing to share data seamlessly across workspaces and...

Central vs. Decentralized Management:

  • Central Management:
    • Advantages:
      • Consistent standards: Enforcing naming conventions, tagging, and other standards becomes easier.
      • Streamlined governance: Centralized control simplifies management.
      • Improved visibility: A single point of reference for all components.
    • Challenges:
      • Flexibility: May not accommodate unique requirements of individual divisions.
      • Bottlenecks: Centralized management can slow down decision-making.
  • Decentralized Management:
    • Advantages:
      • Agility: Divisions can tailor solutions to their specific needs.
      • Faster deployment: Independent teams can act swiftly.
      • Customization: Allows for flexibility.
    • Challenges:
      • Inconsistencies: Naming conventions, tagging, and standards may vary.
      • Governance complexity: Ensuring compliance across divisions.
      • Visibility: Lack of centralized oversight.

Balancing Act:

  • Consider a hybrid approach:
    • Central Standards: Enforce naming conventions, tagging, and other standards at the central level.
    • Divisional Autonomy: Allow divisions to manage their resources within the established guidelines.
    • Regular Audits: Periodically review and adjust standards as needed.
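
As a rough illustration of the "Central Standards" plus "Regular Audits" idea, a periodic check could compare existing catalogs against the central rules. The sketch below is only an assumption about what such a check might look like: the naming pattern `<division>_<env>_<domain>`, the required tags `owner` and `cost_center`, and the `Catalog` class are all hypothetical and not taken from this thread or from any Databricks API.

```python
import re
from dataclasses import dataclass, field

# Hypothetical convention: <division>_<env>_<domain>, e.g. "finance_prod_sales".
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*_(dev|test|prod)_[a-z][a-z0-9]*$")

# Hypothetical set of tags that every catalog must carry.
REQUIRED_TAGS = {"owner", "cost_center"}


@dataclass
class Catalog:
    """Minimal stand-in for a catalog record (name plus tags)."""
    name: str
    tags: dict = field(default_factory=dict)


def audit(catalogs):
    """Return {catalog name: [violations]} for catalogs that break the standards."""
    findings = {}
    for cat in catalogs:
        problems = []
        if not NAME_PATTERN.match(cat.name):
            problems.append("name does not match <division>_<env>_<domain>")
        missing = sorted(REQUIRED_TAGS - set(cat.tags))
        if missing:
            problems.append("missing tags: " + ", ".join(missing))
        if problems:
            findings[cat.name] = problems
    return findings


if __name__ == "__main__":
    catalogs = [
        Catalog("finance_prod_sales", {"owner": "team-a", "cost_center": "1001"}),
        Catalog("MyCatalog", {"owner": "team-b"}),
    ]
    for name, problems in audit(catalogs).items():
        print(name, "->", "; ".join(problems))
```

Such a check only detects deviations after the fact. It does not prevent a workspace admin from creating a non-compliant catalog, so it complements central creation rather than replacing it.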

Workspace Admin Permissions:

  • While automatic enablement of Unity Catalog simplifies workspace setup, it does impact central catalog standards.
  • Workspace Admins: Consider configuring permissions for workspace admins in the Account console to strike a balance between autonomy and governance.

Remember that there is no one-size-fits-all solution. Evaluate your organization’s unique needs, consider trade-offs, and find a balance that works best for your company’s culture and operational efficiency. 🌟


@Kaniz_Fatma wrote:
  • Workspace Admins: Consider configuring permissions for workspace admins in the Account console to strike a balance between autonomy and governance.

Is there such a configuration in the Account Console? The automatic enablement is rolled out sequentially and our Account is not migrated yet.

SSundaram
Contributor

Without an option to enable/disable the automatic creation of catalogs at the account level, this feature can never support "central management". It also causes unnecessary headwinds for organizations that rely on central governance whenever a new workspace is created. I prefer the way it was before, when workspaces and catalogs were simply bound to each other. That way it supported all forms of governance.

Datazilla
New Contributor III

I totally agree.

In our central management we create a dedicated Azure Storage Account for each catalog. Depending on the catalog's isolation mode, only specific workspaces have network access to the storage. The root storage of the metastore is completely blocked. This means that automatically or decentrally created catalogs could not even be used to store managed data, because they lack network access.

Datazilla
New Contributor III
  • Workspace Admins: Consider configuring permissions for workspace admins in the Account console to strike a balance between autonomy and governance.

@Kaniz_Fatma Do you have any information about this configuration? I cannot find such a thing in the Account Console. (In my opinion, your answer looks LLM-generated, so it could be a hallucination. If it is not generated, I am sorry.)

The automatic enablement for UC has not been rolled out to our account yet.
