Hey @JoaoPigozzo — great question. This one comes up all the time with the customers I train.
I’ve been doing this for a while now and have seen a wide range of implementations and approaches out in the wild. There’s no single “best” answer — it really depends on the business context and the goals you’re trying to achieve — but there are a few best practices I feel pretty strongly about.
With that framing in mind, here’s how I generally think about it…
Your current architecture is in a really good place. It aligns cleanly with Unity Catalog best practices and provides a strong, scalable foundation for mixed Data Engineering and Data Science workloads. In particular, the environment-based catalog separation (dev vs. prod), combined with domain or project-level schemas, reflects the most commonly recommended production pattern I see in the field today.
Let’s dig in.
Your dev/prod catalog split, paired with medallion layers (dev_bronze, dev_silver, dev_gold, prod_bronze, prod_silver, prod_gold), follows what I’d call the “gold standard” Unity Catalog approach. You’re using catalogs as the primary unit of isolation, which is the single most important architectural principle in Unity Catalog. Organizing domains and projects at the schema level within those catalogs is exactly where that responsibility belongs—it gives you clean logical organization while preserving fine-grained access control through the UC privilege hierarchy.
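To make the layout concrete, here’s a minimal sketch of how a three-level Unity Catalog name falls out of that structure. The helper function and the `payments` schema name are my own illustrative assumptions, not part of your setup:

```python
# Hypothetical helper illustrating the env/layer catalog split described above.
# The schema name "payments" is an assumption for illustration.

def table_name(env: str, layer: str, schema: str, table: str) -> str:
    """Build a three-level Unity Catalog name: <env>_<layer>.<schema>.<table>."""
    if env not in {"dev", "prod"}:
        raise ValueError(f"unknown environment: {env}")
    if layer not in {"bronze", "silver", "gold"}:
        raise ValueError(f"unknown medallion layer: {layer}")
    return f"{env}_{layer}.{schema}.{table}"

print(table_name("dev", "silver", "payments", "transactions"))
# dev_silver.payments.transactions
```

The nice property is that promotion from dev to prod is a single-parameter change, which is exactly what makes CI/CD-driven writes to `prod_*` straightforward.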
I also want to call out the governance signal here: CI/CD pipelines being the only writers to prod_* catalogs, with humans developing exclusively in dev_*. That’s a strong, intentional pattern, and it’s one I consistently see in mature platforms.
On the question of schema versus catalog isolation, your current choice is the right default.
Using schemas for project isolation gives you several advantages. It simplifies privilege management by allowing team-level USE SCHEMA grants without multiplying catalogs. It reduces operational overhead and avoids the “catalog explosion” anti-pattern. It preserves lineage visibility and makes cross-project analysis far more natural—especially important when teams share upstream data or need to join across domains.
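To show how simple the grant surface stays under schema-level isolation, here’s a sketch that emits a minimal grant set for one team on one project schema. The catalog, schema, and group names are placeholders I made up:

```python
# Sketch of schema-level isolation: one USE CATALOG grant plus per-schema
# grants, instead of multiplying catalogs per project. Group and schema
# names are hypothetical.

def schema_grants(catalog: str, schema: str, group: str) -> list:
    """Emit the minimal grant set for a team working in one project schema."""
    return [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO `{group}`",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO `{group}`",
        f"GRANT SELECT ON SCHEMA {catalog}.{schema} TO `{group}`",
    ]

for stmt in schema_grants("dev_silver", "payments", "payments-engineers"):
    print(stmt)
```

Adding a second project is one more schema and two more grants — no new catalog, no new workspace bindings.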
Separate catalogs per project or team do make sense in specific cases—typically when compliance requires physical storage separation or when workspace-catalog binding must be enforced very strictly. The trade-off is increased operational complexity, harder cross-catalog queries, and fragmented lineage. For most organizations, schemas are the right tool unless compliance explicitly forces a different answer.
In short: use schemas for project isolation unless you have a clear, documented requirement for physical data separation.
On staging environments, the answer is situational rather than prescriptive.
Staging adds the most value in regulated environments with formal approval gates, organizations with strict change-control processes, or teams deploying complex transformations that genuinely benefit from prod-scale integration testing before release. It can also help when CI/CD maturity is still evolving and additional safety nets are required.
That said, many teams operate very successfully with just dev and prod. If you have strong CI/CD, automated testing, disciplined code reviews, and reliable rollback strategies, staging often provides diminishing returns—especially when weighed against the cost of duplicating storage and compute.
If you choose not to add staging, there are solid operational controls that compensate well. These include robust automated testing in dev using production-scale samples, feature flags or blue-green deployment patterns, mandatory senior review for production changes, proactive monitoring at the table level, and strict enforcement of service principals (not users) for all production writes.
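For the last control on that list, here’s one way you might sketch a “service principals only” guard in pipeline code. The principal-detection rule (an `sp-` name prefix) is purely an assumption for illustration — a real implementation would check the Databricks identity itself, not a naming convention:

```python
# Sketch of the "service principals (not users) write to prod" control.
# The "sp-" prefix check is a made-up convention; real code would verify
# the identity type via the platform, not a string.

def assert_prod_writer(principal: str, target_catalog: str) -> None:
    """Block human users from writing to prod_* catalogs."""
    is_service_principal = principal.startswith("sp-")  # hypothetical convention
    if target_catalog.startswith("prod_") and not is_service_principal:
        raise PermissionError(
            f"{principal} may not write to {target_catalog}; "
            "route the change through the CI/CD service principal"
        )

assert_prod_writer("sp-cicd-deployer", "prod_gold")    # allowed
# assert_prod_writer("lou@example.com", "prod_gold")   # would raise PermissionError
```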
For mixed DE and DS workloads, one real-world pattern I see work extremely well is a hybrid schema model.
Bronze and silver schemas are organized by data producer or source system—things like salesforce, stripe, or application event streams—and are owned by Data Engineering. Gold schemas, on the other hand, are organized by business domain or product area, such as finance or marketing, and are typically owned by Analytics Engineers and Data Scientists.
This pattern solves a common pain point: bronze-to-silver-to-gold pipelines don’t always map cleanly to a single schema boundary. Separating producer-oriented upstream layers from consumer-oriented downstream layers preserves clarity, ownership, and lineage. In practice, Analytics and DS teams develop in dev_gold schemas and promote via CI/CD, while Data Engineering maintains control over upstream ingestion and refinement.
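The hybrid routing rule above can be captured in a few lines. The source-system and domain names are illustrative assumptions:

```python
# Minimal sketch of the hybrid schema model: bronze/silver schemas are named
# after the producer (source system), gold schemas after the business domain.
# "salesforce" and "finance" are illustrative, not prescriptive.

def target_schema(layer: str, source: str, domain: str) -> str:
    """Pick the schema a dataset lands in under the hybrid model."""
    if layer in {"bronze", "silver"}:
        return source   # producer-oriented upstream layers, owned by DE
    if layer == "gold":
        return domain   # consumer-oriented layer, owned by Analytics/DS
    raise ValueError(f"unknown layer: {layer}")

print(target_schema("silver", "salesforce", "finance"))  # salesforce
print(target_schema("gold", "salesforce", "finance"))    # finance
```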
A few additional governance refinements to consider as you continue to mature the platform:
Transfer ownership of production catalogs and schemas to groups, not individual users. Use service principals for all CI/CD writes to production. Grant BROWSE broadly to support discoverability, while keeping USE CATALOG and USE SCHEMA tightly scoped. And if you need full physical separation between environments, managed storage locations at the catalog level are a clean way to achieve it.
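As a sketch of what the first two items in that checklist look like in practice, here’s a helper that emits the corresponding statements. The group name is a placeholder, not a recommendation:

```python
# Sketch turning the governance checklist into concrete statements:
# group ownership of the catalog, plus broad BROWSE for discoverability.
# "data-platform-admins" is a hypothetical group name.

def governance_statements(catalog: str, owner_group: str) -> list:
    return [
        f"ALTER CATALOG {catalog} OWNER TO `{owner_group}`",      # group, not a user
        f"GRANT BROWSE ON CATALOG {catalog} TO `account users`",  # discoverability
    ]

for stmt in governance_statements("prod_gold", "data-platform-admins"):
    print(stmt)
```

Keeping `USE CATALOG` and `USE SCHEMA` narrowly scoped while `BROWSE` is broad gives everyone metadata visibility without data access.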
Net-net: your architecture is well aligned with industry patterns and Unity Catalog design principles. The main opportunity for refinement—if it’s not already in place—is adopting the hybrid producer-versus-domain schema model to better support mixed DE, analytics, and ML workloads at scale.
Cheers, Lou.