03-30-2023 10:18 AM
I am setting up a new workspace that will use the Unity Catalog. I want all data stored in the Unity Catalog in the following catalogs: dev, staging, prod. I want to prevent users from accidentally reading and writing data elsewhere.
For the above situation, can I hide and/or delete the following default catalogs?
03-31-2023 03:23 AM
@Kevin Rossi Unfortunately, hive_metastore can't be hidden as of now. It's not needed for UC, but a Databricks workspace doesn't work well without the default RDS connections, and removing them would require changes in the way DBR/Spark starts up. Eventually we will have a UC-only workspace with no references to HMS, but that doesn't exist today (Eng is working on it).
Here are a couple of things you can do as a workaround.
Change the default catalog from hive_metastore to another catalog using the "spark.databricks.sql.initial.catalog.name" property.
The default catalog can also be set while assigning the workspace to a metastore. If it's already assigned, unassign and reassign the workspace with a default catalog.
The samples and system catalogs are read-only catalogs; they can't be removed.
Regarding the main catalog, we have a feature, available by request, called catalog to workspace binding. By default, a catalog is bound to all workspaces, but with this feature a catalog can be bound only to the desired workspaces. In this case, if we disable all workspace access to the main catalog, it won't be visible in any workspace. Please reach out to your Databricks contact to onboard your account to this feature.
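As a rough illustration of the first workaround, here is a minimal sketch that adds the property quoted above to a cluster definition. It assumes the property is supplied through the cluster's Spark configuration and that the definition carries it under a `spark_conf` key; the `with_default_catalog` helper and the `dev-cluster` name are made up for the example, so please check the shape against your own cluster setup.

```python
import json

# Property name quoted in the reply above.
DEFAULT_CATALOG_CONF = "spark.databricks.sql.initial.catalog.name"


def with_default_catalog(cluster_spec: dict, catalog: str) -> dict:
    """Return a copy of cluster_spec whose Spark config sets the default catalog.

    Hypothetical helper: the input dict is left unmodified.
    """
    spec = dict(cluster_spec)
    spark_conf = dict(spec.get("spark_conf", {}))
    spark_conf[DEFAULT_CATALOG_CONF] = catalog
    spec["spark_conf"] = spark_conf
    return spec


# Hypothetical cluster definition; only the spark_conf entry matters here.
base_spec = {"cluster_name": "dev-cluster", "spark_conf": {}}
dev_spec = with_default_catalog(base_spec, "dev")
print(json.dumps(dev_spec, indent=2))
```

The same helper could be called with "staging" or "prod" so that each environment's clusters start in the matching catalog instead of hive_metastore.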
03-31-2023 02:39 PM
Short answers that I derived from the above:
OK, cool, thanks. I think that will enable me to effectively govern our users as desired. I am a fan of keeping everything cleanly separated. We are going to have two workspaces for our team:
Going forward, I would support features that enable data science teams to govern production pipelines in a clean manner. Removing unneeded databases/catalogs and improving the management of pipelines would be favorable in my opinion. I think everything that we need to implement this exists now.
Features like the 'catalog to workspace binding' help keep concerns separated; i.e. exposing a research catalog to only our research workspace and preventing access to that catalog in the prod workspace. This feature will prevent us from accidentally writing to the research catalog from a prod pipeline; we will also enforce this with permissions... but I like redundancy.
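For the permissions side of that redundancy, a minimal sketch of generating the grant and revoke statements might look like the following. The group names (`research-team`, `prod-pipelines`) and the helper functions are hypothetical, and the privilege names are my understanding of Unity Catalog SQL, so verify them against the documentation before running anything.

```python
from typing import List


def catalog_grants(catalog: str, group: str, read_only: bool = False) -> List[str]:
    """Build GRANT statements giving a group access to one catalog."""
    privileges = ["USE CATALOG", "USE SCHEMA", "SELECT"]
    if not read_only:
        # MODIFY allows writes; leave it out for read-only access.
        privileges.append("MODIFY")
    return [f"GRANT {p} ON CATALOG `{catalog}` TO `{group}`;" for p in privileges]


def catalog_revoke_all(catalog: str, group: str) -> str:
    """Build a REVOKE statement removing a group's privileges on one catalog."""
    return f"REVOKE ALL PRIVILEGES ON CATALOG `{catalog}` FROM `{group}`;"


# Hypothetical principals: researchers may read and write the research catalog,
# while the production pipeline principal gets nothing on it.
statements = catalog_grants("research", "research-team")
statements.append(catalog_revoke_all("research", "prod-pipelines"))

for statement in statements:
    print(statement)
```

Each printed statement could then be reviewed and run by a catalog owner, which keeps the permission layer independent of whether the workspace binding is in place.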
04-18-2023 08:06 AM
@Kevin Rossi @John Lourdu - I am also new to Databricks and am setting up an environment.
By default, "all users" have read access to the catalogs mentioned below,
My question: I see an option to revoke read access. Must "all users" have read access to all of these catalogs? If I revoke it, will there be any impact?