Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Best Practice: Data Modeling for Customer 360 with Refined/Gold Source Data

rabbitturtles
New Contributor

Hi community,

I'm looking for advice on the best data modeling approach for a Customer 360 (C360) project where our source data is already highly refined.

I understand the standard Medallion architecture guidelines, which often recommend using Data Vault in the Silver layer for its auditability and integration capabilities, and a Kimball-style star schema in the Gold layer for optimized analytics and BI. This works great when ingesting raw data through a Bronze -> Silver -> Gold pipeline.

However, our situation is a bit different. The source datasets for our C360 are coming from the Gold layers of other teams within the organization. For example, we receive:

  • A clean, validated customer_transactions fact table from the Sales team.

  • A conformed customer_dimension from the master data team.

  • A refined support_tickets table from the Customer Service team.

Since we are essentially starting with "Gold" data, building a full Bronze and Silver layer feels redundant. My primary goal is to create a single, unified view of the customer for operational use cases (like a "single pane of glass" for our support agents), not necessarily for complex OLAP analytics.

Given this context, I'm leaning towards a simpler approach: creating a single, wide, denormalized C360 table directly in our Gold layer. This seems straightforward to build and very easy for end-users to query.
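To make the idea concrete, here is a minimal sketch of what "one wide row per customer" could look like. It uses plain Python with made-up sample rows; the column names (`customer_id`, `amount`, `status`, and so on) and the aggregates chosen are assumptions for illustration, not the actual schemas of the source tables.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the three upstream Gold tables.
customer_dimension = [
    {"customer_id": 1, "name": "Ada", "segment": "enterprise"},
    {"customer_id": 2, "name": "Lin", "segment": "smb"},
]
customer_transactions = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 1, "amount": 80.0},
    {"customer_id": 2, "amount": 15.5},
]
support_tickets = [
    {"customer_id": 1, "status": "open"},
    {"customer_id": 1, "status": "closed"},
]


def build_c360(dimension, transactions, tickets):
    """Return one wide, denormalized row per customer in the dimension."""
    # Pre-aggregate the fact-like sources so each contributes
    # a fixed set of columns per customer.
    total_spend = defaultdict(float)
    txn_count = defaultdict(int)
    for txn in transactions:
        total_spend[txn["customer_id"]] += txn["amount"]
        txn_count[txn["customer_id"]] += 1

    open_tickets = defaultdict(int)
    for ticket in tickets:
        if ticket["status"] == "open":
            open_tickets[ticket["customer_id"]] += 1

    # Left-join the aggregates onto the customer dimension:
    # customers with no activity still get a row with zero values.
    rows = []
    for customer in dimension:
        cid = customer["customer_id"]
        rows.append({
            **customer,
            "total_spend": total_spend[cid],
            "txn_count": txn_count[cid],
            "open_tickets": open_tickets[cid],
        })
    return rows


c360 = build_c360(customer_dimension, customer_transactions, support_tickets)
```

Note that a row built this way only reflects the current state of the sources; it keeps no history unless snapshots or validity dates are added on top.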

My questions for the community are:

  1. Is creating a wide, denormalized C360 table directly from high-quality sources a sound architectural choice, or are there long-term pitfalls I'm not considering (e.g., maintainability, scalability, tracking history)?

  2. In this "Gold-to-Gold" integration pattern, is there still value in creating an intermediate Silver layer to integrate the source tables before building the final C360 view?

  3. Has anyone implemented a similar C360 model? I'd be very interested to hear about your approach and any lessons learned.

There is a Databricks blog post that aligns with building C360 in the Gold layer using a Kimball modeling approach. However, Kimball is organized around business processes, whereas a C360 team has no inherent business process or actors of its own; its purpose is to represent a 360-degree view of the customer. https://www.databricks.com/blog/2022/06/24/data-warehousing-modeling-techniques-and-their-implementa...

Thoughts?

Thanks in advance for your insights!

2 REPLIES

BS_THE_ANALYST
Esteemed Contributor II

@rabbitturtles, if one of your objectives is to enable your business users to take advantage of AI on the Databricks platform, there's value in having the Gold tables in Databricks. You could create Genie spaces, for instance, that only interact with the Gold tables you give them access to. Genie spaces would be just one of many AI-related use cases, of course.

The denormalised table doesn't sound like a bad idea in my opinion. Part of data modelling is enabling the business users to interact with the model, so if this suits your business needs, it's worth at least trying it out. You could always run a type of "A/B" test and see how the users interact with both forms of the model. You could also benchmark efficiency and cost, and trial changes to the table (data types, column names, removing or adding columns) to see which ones have the biggest impact for you.

On a slightly different note, thinking longer term about your table and the size it may reach, you could plan ahead for partitioning or liquid clustering: https://docs.databricks.com/aws/en/delta/clustering and https://docs.databricks.com/aws/en/tables/partitions. This could pay off in query performance and cost savings if you know the common query patterns the team would use.
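As a rough sketch of the liquid clustering suggestion, the snippet below only builds the `ALTER TABLE ... CLUSTER BY` statement described in the clustering docs linked above. The table name `gold.c360_customer` and the clustering column `customer_id` are hypothetical, and the helper `cluster_by_ddl` is made up for illustration; in a Databricks notebook the resulting string would typically be passed to `spark.sql(...)`.

```python
def cluster_by_ddl(table_name, columns):
    """Build an ALTER TABLE statement that sets liquid clustering keys.

    `table_name` and `columns` are illustrative inputs; pick the columns
    that the most common queries filter on.
    """
    if not columns:
        raise ValueError("at least one clustering column is required")
    return f"ALTER TABLE {table_name} CLUSTER BY ({', '.join(columns)})"


# Hypothetical C360 table, clustered on the usual lookup key.
statement = cluster_by_ddl("gold.c360_customer", ["customer_id"])
print(statement)
```

For a "single pane of glass" lookup by customer, clustering on the customer key is the obvious first candidate, but the right choice depends on the query patterns you actually observe.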

All the best,
BS

rabbitturtles
New Contributor

@BS_THE_ANALYST Thank you so much for your response.

The goal is to treat this as a flexible platform rather than a single data product. With that in mind, the customer data platform should enable contributions from different teams, so that the core data engineering team does not become a bottleneck for new data requirements. End users should not be limited to business users; the data should be available to different teams for varied use cases, primarily as the single source of truth for the single customer view, i.e. Customer 360.

Thanks for the suggestion of Genie. I agree it adds value for the AI integration and other possible use cases.

Given that the source teams already provide refined data, the idea of a separate Gold layer in another team, specifically for the Customer 360 data platform, does not make sense to me. The "Gold" label conveys the data quality, but the Bronze and Silver context is missing within our team's scope.

Reviewing articles online, I see that most suggest a unified view, but maybe there are internal abstractions that have not been shared explicitly.

https://aws.amazon.com/blogs/big-data/create-an-end-to-end-data-strategy-for-customer-360-on-aws/

If not a denormalized table, what data modelling approach would you suggest to support a contribution model across different stakeholders? In my opinion, data modelling is derived from business rules, and Customer 360 as a team does not really have business processes with actors and entities; it is more of an integration platform for a holistic customer view. What's your take on this? Would data modelling driven only by performance and structure make sense?

I would also love to hear about your experience exposing such systems as an A/B experiment and how that helped in the process.
