Hello everyone,
I was the lead on a data platform modernization project. This was my first time administering Databricks, and I got myself into quite the situation. Essentially, I made the mistake of linking our enterprise-wide Unity Catalog (UC) to our DEV Azure storage account. This means all catalogs created going forward will be stored in this dev storage account, which was created specifically for an individual project.
My goal is to move to a Databricks-managed storage account so that there won't be a storage account for us to manage. As far as I know, there is no way to remove the storage account from the UC, so I would have to delete and recreate it. This would mean losing all of the metadata in our current UC.
Our current setup looks like this: 3 environments (dev, uat, prod). Each environment has its own dedicated Databricks instance and an Azure Data Lake Storage Gen2 storage account. All of the UC tables are stored in these storage accounts, so the customer data will remain after the UC is deleted. My concern is losing all of the metadata and permissions that we have set up so far.
I would like to understand what my options are here and whether my thought process is correct. I believe there is no way to deep clone to another UC in a secondary region (which would retain the metadata) and then deep clone back to the new UC once it is stood up. But please correct me if I'm wrong.
If I need to manually re-create the tables via script and re-link them to their storage locations, I believe I would lose all of their metadata, such as history. I would then have to re-create groups and users and reassign them to catalogs/schemas/tables. However, my research so far suggests this is the only option.
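For reference, here is a minimal sketch of the kind of script I have in mind for this manual route. It is plain Python that only builds the SQL text. It assumes the tables are external Delta tables whose files stay at their current abfss:// paths, and that the table list and grants have already been exported by some other means. Every catalog, schema, table, path, and group name below is a hypothetical placeholder, and the `TableDef`/`Grant` shapes are my own invention, not a Databricks export format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TableDef:
    full_name: str  # three-level name: catalog.schema.table (hypothetical)
    location: str   # existing storage path of the table files (hypothetical)


@dataclass(frozen=True)
class Grant:
    principal: str       # user or group name (hypothetical)
    privilege: str       # e.g. SELECT, MODIFY, USE SCHEMA
    securable_type: str  # e.g. CATALOG, SCHEMA, TABLE
    securable_name: str  # dotted name of the securable


def quote_name(dotted_name: str) -> str:
    """Backtick-quote each part of a dotted identifier."""
    return ".".join(f"`{part}`" for part in dotted_name.split("."))


def create_table_sql(table: TableDef) -> str:
    """Build a statement that registers an existing Delta location as a table."""
    return (
        f"CREATE TABLE IF NOT EXISTS {quote_name(table.full_name)} "
        f"USING DELTA LOCATION '{table.location}'"
    )


def grant_sql(grant: Grant) -> str:
    """Build a GRANT statement for one exported permission row."""
    return (
        f"GRANT {grant.privilege} ON {grant.securable_type} "
        f"{quote_name(grant.securable_name)} TO `{grant.principal}`"
    )


def build_script(tables: List[TableDef], grants: List[Grant]) -> List[str]:
    """Tables first, then grants, so every grant targets an existing object."""
    return [create_table_sql(t) for t in tables] + [grant_sql(g) for g in grants]


if __name__ == "__main__":
    # Placeholder inputs; real values would come from an export of the old UC.
    tables = [
        TableDef(
            "dev_catalog.sales.orders",
            "abfss://data@examplestorage.dfs.core.windows.net/sales/orders",
        )
    ]
    grants = [Grant("data_engineers", "SELECT", "TABLE", "dev_catalog.sales.orders")]
    for statement in build_script(tables, grants):
        print(statement + ";")
```

The generated statements would still need to be reviewed and run in the new UC (after the catalogs, schemas, and storage credentials exist), so this is only meant to illustrate the approach I am asking about.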
In short, I would like to know the best route for backing up and restoring the Unity Catalog in my current situation.