cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Community Discussions
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Sync the production data in environment into test environment

zyang
Contributor

Hello,

I have a database called sales which contain several delta tables and views in both production and test workspace. But the data is not synced because some people develop the code in test workspace. As time passed, both the data and the tables in test become different compared to the ones in the production database. May I ask the best practice to sync the data and the tables/view in the production into the test environment?  I am using Azure platform. Now I have a script to copy the production data in the production storage into the test storage and create the table from the test storage again. I think I can expand it into the whole database but the thing is I cannot get the full list of tables/views in production database by using "show tables in database" in test workspace. Would you mind giving some advice? Any advice regarding to getting the full production table list from the test workspace or the best practice to sync data between production and test.

 

Thanks.

Best regards

5 REPLIES 5

Hemant
Valued Contributor II

HI@zyang  Unity Catalog Enabled workspace, you can access your databases in a cross environment.

Hemant Soni

Chris_Shehu
Valued Contributor III

Not always the best if your trying to keep environments isolated between workspaces. 

Anonymous
Not applicable

Hi @zyang 

Hope everything is going great.

Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you. 

Cheers!

No, I would like to wait for more advice

Kaniz
Community Manager
Community Manager

Hi @zyang

To sync data and tables/views between production and test workspaces in Azure, the recommended approach is to use the Databricks Sync (DBSync) project, which is an object synchronization tool that backs up, restores, and syncs Databricks workspaces. Using the Data Explorer upgrade wizard, you can use the DBSync project to copy complete schemas (databases) and multiple external tables from your default Hive metastore to the Unity Catalog metastore.

Here are the steps you can follow:

1. Use the DBSync project to sync the production workspace to the test workspace.
2. After the sync succeeds, go to the test workspace and click Queries.
3. Click Create Query and choose the name of the SQL warehouse.
4. Enter a query, for example SELECT * FROM database.table.
5. Click Run to display the ingested data. Suppose you cannot get the complete list of tables/views in the production database by using "show tables in database" in the test workspace. In that case, you can use the Databricks REST API to get the list of tables/views in the production database and then create them in the test workspace. You can also use the DBSync project to copy the entire database from production to test, including all tables/views.

 
Additional Resources