cancel
Showing results for 
Search instead for 
Did you mean: 
Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.
cancel
Showing results for 
Search instead for 
Did you mean: 

Mimic system table functionality at custom catalog level

Venugopal
New Contributor II

Hi,

I am exploring system tables . I want to have our environment specific data in different catalogs. While it is possible to get audit and other usage info from system tables under system catalog,how can I achieve the same in my custom catalog that I want to create for dev,qa and prod? How should the tables be created in the custom catalog? Basically I want to have 1 catalog per environment and map it to a dedicated Storage account so that data is isolated at env level. Unable to find any info related to this in the docs. fyi, we are already using Azure databricks for close to 4 years. Thanks,Venu

3 REPLIES 3

Walter_C
Databricks Employee
Databricks Employee

 

  • Create Separate Catalogs for Each Environment:

    • You can create separate catalogs for development (dev), quality assurance (qa), and production (prod) environments. This can be done using Unity Catalog in Databricks. Then on the catalogs you can handle under permissions which workspaces will have access to.
  • Map Each Catalog to a Dedicated Storage Account:

    • Each catalog can be mapped to a dedicated storage account to ensure data isolation. This involves setting up external locations in Unity Catalog that point to different storage accounts for each environment.

 

Venugopal
New Contributor II

Hi, thanks for the response. In system tables under system schema, data of audit,usage etc flows in automatically. But when I create my own catalog,I see only two schemas created. 

1. Should I create tables under the catalog for the audit and other data to be inserted?

2.If I need not create tables, how will the workspace specific data get stored in my catalog?

3. If the above is not possible, should I write some automation or sql to read workspace specific data from system tables and ingest that into the tables (in my own catalog)that I will create in my own catalog?

Pls clarify. Thanks

Walter_C
Databricks Employee
Databricks Employee

Just to be clear, what you want to do is have a set of system tables such as audit logs in each catalog for your environments, so when you query the data for those tables you just get information from your environment. On this case there is no built in process to do this, system tables are generated in their own catalog and will collect the information of all catalogs and all workspaces with UC.

You can filter this data and create new tables that stores this filtered data in each of your environments but this will require custom code and extra storage efforts.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group