Hi everyone,
I'm a Cloud Engineer working on a multi-environment Databricks setup (Dev, QA, Prod), and I've received a request from our Data Engineering team that's a bit unconventional: they are asking for access to Production Unity Catalogs from lower environments (Dev/QA).
The rationale they've provided is that this access would help them:
- Debug production issues more efficiently
- Validate data discrepancies
- Reproduce production scenarios locally without affecting live data
While I understand the operational benefits, I'm concerned about governance, security, and data compliance risks, especially with exposing production data to non-production environments.
Before I proceed further, I wanted to consult the community:
- What are the best practices or governance policies around this scenario?
- Are there any secure ways to simulate or expose production data safely in lower environments without violating data integrity or compliance rules (e.g., masking, snapshotting, Delta Sharing)?
- Is it ever considered a good practice to grant direct access from Dev/QA to Prod catalogs, or should we avoid it altogether?
- Are there any official recommendations from Databricks/Microsoft regarding cross-environment catalog access via Unity Catalog?
Would love to hear how other organizations have tackled this. Appreciate any guidance, war stories, or architectural suggestions you can share.
Thanks!
Note: Please explain in detail so that I can make progress quickly. I work in an Azure environment.