Datamart creation

billykimber
New Contributor

In a scenario where multiple teams access overlapping but not identical datasets from a shared data lake, is it better to create separate datamarts for each team (despite data redundancy) or to maintain a single datamart and use views for team-specific access? What are the trade-offs in terms of performance, maintenance, and scalability?

-werners-
Esteemed Contributor III

IMO there is no single best approach.
It depends on the case, I would say. Both options have pros and cons.
If the differences between teams are really small, views could be a solution.
On the other hand, if you work on massive data, a plain view is recomputed every time it is queried, so this can take a while.
In that case you could use materialized views, which store the precomputed result and are refreshed when the source data changes.
If the differences between teams are big, coding all of that in a view might not be optimal.
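
To make the view vs. materialized view difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (orders, sales_orders_v, sales_orders_mat) are made up for illustration, and since SQLite has no real materialized views, a table built with CREATE TABLE AS stands in for one. Your platform's actual syntax and refresh behaviour will differ.

```python
import sqlite3

# Hypothetical shared table standing in for the common datamart.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, team TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, "EU", "sales", 100.0), (2, "US", "sales", 250.0), (3, "EU", "finance", 75.0)],
)

# Option 1: a plain view. Nothing is stored; the query runs on every read.
conn.execute(
    "CREATE VIEW sales_orders_v AS "
    "SELECT id, region, amount FROM orders WHERE team = 'sales'"
)

# Option 2: a precomputed copy, standing in for a materialized view.
# Reads are cheap, but the copy goes stale until it is rebuilt.
conn.execute(
    "CREATE TABLE sales_orders_mat AS "
    "SELECT id, region, amount FROM orders WHERE team = 'sales'"
)

# New data arrives in the shared table after both objects were created.
conn.execute("INSERT INTO orders VALUES (4, 'US', 'sales', 40.0)")

view_rows = conn.execute("SELECT COUNT(*) FROM sales_orders_v").fetchone()[0]
mat_rows = conn.execute("SELECT COUNT(*) FROM sales_orders_mat").fetchone()[0]
print(view_rows, mat_rows)
```

The view picks up the new row immediately, while the precomputed copy still reflects the data as of its last build. That is the trade-off: fresh but recomputed on every read, versus fast but dependent on a refresh.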


Making separate datasets also makes sense, as you can optimize each one for the team that uses it. Also, all logic then resides in a single place (and not in view definitions).
But this might be overkill for your situation.
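
As a rough sketch of what "optimize each one" could look like, the snippet below builds two small per-team tables from the same source, each aggregated and indexed for its own access pattern. It again uses sqlite3 purely for illustration; the names (mart_sales, mart_finance, order_month) are hypothetical, and on a real platform the optimization would more likely be partitioning or clustering than an index.

```python
import sqlite3

# Hypothetical shared source table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, order_month TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, "EU", "2024-01", 100.0), (2, "US", "2024-01", 250.0), (3, "EU", "2024-02", 75.0)],
)

# Sales datamart: aggregated per region and indexed on region.
conn.execute(
    "CREATE TABLE mart_sales AS "
    "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"
)
conn.execute("CREATE INDEX idx_mart_sales_region ON mart_sales (region)")

# Finance datamart: aggregated per month and indexed on month.
conn.execute(
    "CREATE TABLE mart_finance AS "
    "SELECT order_month, SUM(amount) AS total FROM orders GROUP BY order_month"
)
conn.execute("CREATE INDEX idx_mart_finance_month ON mart_finance (order_month)")

sales_eu = conn.execute(
    "SELECT total FROM mart_sales WHERE region = 'EU'"
).fetchone()[0]
finance_jan = conn.execute(
    "SELECT total FROM mart_finance WHERE order_month = '2024-01'"
).fetchone()[0]
print(sales_eu, finance_jan)
```

Each team gets a table shaped for its own queries, at the cost of storing overlapping data twice and having two build steps to maintain.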