Datamart creation
10-24-2024 12:38 AM
In a scenario where multiple teams access overlapping but not identical datasets from a shared data lake, is it better to create separate datamarts for each team (despite data redundancy) or to maintain a single datamart and use views for team-specific access? What are the trade-offs in terms of performance, maintenance, and scalability?
10-24-2024 06:55 AM
IMO there is no single best approach; it depends on the case.
Both options have pros and cons.
If the difference between teams is really small, views could be a solution.
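On the other hand, if you work with massive data, a view has to be computed at query time, so each query can take a while.
In that case you could use materialized views, which store the precomputed result and only need to be refreshed when the underlying data changes.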
If there is a big difference between teams, coding that in a view might not be optimal.
Making separate datasets also makes sense, as you can optimize each one for its team. It also keeps all logic in a single place rather than spread across view definitions.
But this might be overkill for your situation.
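To make the view vs. materialized view trade-off a bit more concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. Everything in it is hypothetical: the `sales` table, the `finance` team, and the names `finance_sales_v`, `finance_sales_m`, and `refresh_materialized` are made up for illustration. SQLite has no native materialized views, so the sketch emulates one with a snapshot table that is refreshed by hand; on your actual platform you would use its own materialized view feature instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical shared table standing in for the data lake
    CREATE TABLE sales (region TEXT, team TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('EU', 'finance', 100.0),
        ('EU', 'marketing', 40.0),
        ('US', 'finance', 250.0);

    -- Option 1: a plain view, recomputed on every query
    CREATE VIEW finance_sales_v AS
        SELECT region, SUM(amount) AS total
        FROM sales
        WHERE team = 'finance'
        GROUP BY region;

    -- Option 2: a snapshot table emulating a materialized view
    CREATE TABLE finance_sales_m AS
        SELECT * FROM finance_sales_v;
""")


def refresh_materialized(conn):
    """Rebuild the snapshot table from the view (manual refresh)."""
    conn.executescript("""
        DELETE FROM finance_sales_m;
        INSERT INTO finance_sales_m SELECT * FROM finance_sales_v;
    """)


# New data arrives in the shared table
conn.execute("INSERT INTO sales VALUES ('EU', 'finance', 50.0)")

query_v = "SELECT region, total FROM finance_sales_v ORDER BY region"
query_m = "SELECT region, total FROM finance_sales_m ORDER BY region"

view_rows = conn.execute(query_v).fetchall()   # always current, computed now
stale_rows = conn.execute(query_m).fetchall()  # cheap to read, but outdated

refresh_materialized(conn)
fresh_rows = conn.execute(query_m).fetchall()  # current again after refresh
```

The view always reflects the latest data but pays the aggregation cost on every read. The snapshot is cheap to read but goes stale until it is refreshed, so you trade freshness and a refresh job for query speed.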

