10-04-2023 09:39 AM
Hi all! I am working on a DevOps project to automate the creation of UC objects through different environments (dev-test-prod). Each time we deploy our code to a different environment (using a GitHub workflow, not really relevant) we want to also create/update the UC object that goes with it (dev_<catalog_name>, test_<catalog_name>, <catalog_name>). During this automation we are going to give certain groups certain grants based on how much access we want them to have.
My initial plan is to copy the dev catalog into a new catalog with a different name (test_<catalog_name>) and then adjust permissions after the copy (all via notebook). Is there an easy way to do this initial copy via the Databricks/Azure CLI? I couldn't find anything online. My initial thought is to run a notebook to recreate an empty test catalog if it doesn't exist, then iterate through the dev catalog to fetch the schemas and create empty schemas in the test catalog, then iterate through the dev schemas to fetch the tables, views, and functions and recreate them in the test catalog.
Has anyone done anything like this? Is this the best approach?
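As far as I know there is no single CLI command that clones a whole catalog, but the iteration described above can be scripted from a notebook. A minimal sketch of the first two steps, written as pure statement generation so it is easy to test; in a notebook each statement would be executed with spark.sql(...), and the schema list would come from something like spark.sql(f"SHOW SCHEMAS IN {src_catalog}") (catalog and schema names here are placeholders):

```python
def clone_catalog_ddl(src_catalog, dest_catalog, schemas):
    """Generate DDL statements that recreate an empty copy of a catalog.

    `schemas` is the list of schema names found in `src_catalog`
    (in a notebook, fetched via SHOW SCHEMAS and .collect()).
    """
    stmts = [f"CREATE CATALOG IF NOT EXISTS {dest_catalog}"]
    for schema in schemas:
        # information_schema is a built-in UC schema and cannot be created manually
        if schema == "information_schema":
            continue
        stmts.append(f"CREATE SCHEMA IF NOT EXISTS {dest_catalog}.{schema}")
    return stmts
```

For example, clone_catalog_ddl("dev_sales", "test_sales", ["bronze", "information_schema"]) yields the catalog statement plus one CREATE SCHEMA for bronze, skipping the system schema.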
I originally created this post to ask a specific question about the recreation of functions outlined above but got carried away with explaining the entire project. The function question is below:
I have a function (multiple at this location, but describing it as 1 for simplicity's sake) at source_catalog.schema1.udf_function. I want to copy this function to a new catalog (dest_catalog.schema1.udf_function). The CREATE FUNCTION SQL command does not seem capable of this unless I missed something obvious in the documentation, and I cannot find anything else online. What is the best way to copy/recreate a function from one UC location to another?
Sorry if this post is all over the place but if anyone has any advice, please let me know. It would be extremely helpful.
Thanks!
10-04-2023 11:51 PM
Hi @DatBoi ,
- Create a new catalog using the Databricks SQL command CREATE CATALOG IF NOT EXISTS test_<catalog_name>
- Switch to the new catalog with the command USE CATALOG test_<catalog_name>
- Iterate through the schemas in the dev catalog and create them in the new catalog using the command CREATE SCHEMA IF NOT EXISTS <schema_name>
- Recreate the tables and views in the new catalog using the CREATE TABLE and CREATE VIEW commands
- For functions, there is no direct way to copy or recreate them from one UC location to another. Manual recreation in the new location is necessary.
- Set up the necessary permissions for groups that need access to the new catalog, schemas, and tables using the GRANT SQL command.
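The table and grant steps above can be sketched the same way as statement generation (object and group names here are illustrative). One assumption worth flagging: for managed Delta tables, Databricks supports CREATE TABLE ... DEEP CLONE, which copies schema and data in one statement; views would instead need their definitions fetched and replayed.

```python
def copy_objects_ddl(src_catalog, dest_catalog, schema, tables, grants):
    """Generate DDL to deep-clone a schema's tables and apply grants.

    `tables` is a list of table names in the source schema;
    `grants` maps group name -> privilege to grant on the destination schema.
    """
    stmts = []
    for t in tables:
        stmts.append(
            f"CREATE OR REPLACE TABLE {dest_catalog}.{schema}.{t} "
            f"DEEP CLONE {src_catalog}.{schema}.{t}"
        )
    for group, privilege in grants.items():
        stmts.append(
            f"GRANT {privilege} ON SCHEMA {dest_catalog}.{schema} TO `{group}`"
        )
    return stmts
```

In a notebook, each returned statement would again be run through spark.sql(...), with the table list fetched via SHOW TABLES IN <catalog>.<schema>.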
10-04-2023 12:13 PM
Hi @DatBoi, To copy or recreate a function from one UC location to another, you would first need to retrieve the definition and implementation of the original function and then recreate that function in the new location using the CREATE FUNCTION SQL command.
Unfortunately, there is no direct way to copy a function from one catalog to another. The CREATE FUNCTION command does not support copying or moving existing functions; it only creates a new function from the definition and implementation you provide.
10-04-2023 12:19 PM
Is there a way to get that information programmatically? Also, I started looking into the Databricks SDK because I figured I would have to do something like that.
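One way to get the definitions programmatically (an assumption to verify against your workspace) is Unity Catalog's information_schema.routines view, which exposes a routine_definition column, or the Python SDK's WorkspaceClient().functions.list(...), whose FunctionInfo objects carry the SQL body. Either way, you still have to reassemble a full CREATE FUNCTION statement from the pieces. A small sketch of that assembly step, with all field values illustrative:

```python
def create_function_ddl(dest_catalog, schema, name, params, returns, body):
    """Assemble a CREATE FUNCTION statement for a SQL UDF from pieces
    fetched from the source catalog (e.g. signature and body pulled from
    information_schema.routines or the SDK's FunctionInfo)."""
    return (
        f"CREATE OR REPLACE FUNCTION {dest_catalog}.{schema}.{name}({params}) "
        f"RETURNS {returns} RETURN {body}"
    )
```

For example, create_function_ddl("dest_catalog", "schema1", "udf_function", "x INT", "INT", "x + 1") produces a statement you could run with spark.sql(...) in the destination catalog. Note this only covers simple SQL-bodied UDFs; Python UDFs and more complex signatures would need additional fields.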