Data Governance
Join discussions on data governance practices, compliance, and security within the Databricks Community. Exchange strategies and insights to ensure data integrity and regulatory compliance.

Backing up your unity catalog metadata

DouglasMoore
Databricks Employee

Unity Catalog (UC) tracks the metadata, while your cloud storage accounts store the data itself. The Python script below extracts the metadata from {catalog}.information_schema into folders in a storage location; paste it into a notebook to run it. Back up the data itself from your cloud storage console. {catalog} can also be system, which covers every catalog in UC. The UC configuration is recoverable from the information_schema metadata, although I have not yet had time to make the recovery run in parallel and perform to reasonable expectations.

%python

storage_location = dbutils.widgets.get("storageLocation")
catalog_name = dbutils.widgets.get("catalogName")
(storage_location, catalog_name)
 
%python
#
# For every table in the catalog's information_schema, write a backup
# copy (along with all of the other metadata tables) to the storage
# location as a Delta table.
#
table_list = spark.catalog.listTables(f"`{catalog_name}`.information_schema")
for table in table_list:
    print(f'backing up {table.catalog}.information_schema.{table.name} to {storage_location}/{table.name}...')
    info_schema_table_df = spark.sql(f"SELECT * FROM `{table.catalog}`.information_schema.{table.name}")
    info_schema_table_df.write.format("delta").mode("overwrite").save(f"{storage_location}/{table.name}")
print('backup complete')
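Since the post notes the UC configuration is recoverable from this metadata, here is a minimal sketch of reading the backups back in. It assumes a running Spark session (e.g., a Databricks notebook) and that the backups were written with the {storage_location}/{table.name} layout used above; the table name "tables" is just an illustration.

```python
def backup_path(storage_location: str, table_name: str) -> str:
    """Build the Delta path used by the backup loop above."""
    return f"{storage_location.rstrip('/')}/{table_name}"

# In a notebook, each backed-up metadata table could be reloaded like this
# (sketch only; restore ordering and parallelism are left to the reader):
#
# df = spark.read.format("delta").load(backup_path(storage_location, "tables"))
# df.createOrReplaceTempView("restored_tables")
```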
