Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Best practices for implementing early arriving fact handling

Phani1
Valued Contributor II

 

Hi All,

Could you please share best practices for handling early arriving facts in Databricks, for streaming data processed in near real time with Structured Streaming?

There are many ways to handle this use case in batch or mini-batch processing. We are specifically looking for best practices for handling it with Structured Streaming in near real time.

Example:

[Image: Phani1_0-1724754033290.png]

Example of an early arriving fact:

The tables in the attached image illustrate an early arriving fact scenario.

  • One record (highlighted in red) is received in the SalesDetail transaction data, but its customer (C4) has not yet been loaded into the DimCustomer dimension.
  • In other words, the data for the fact table (FactSalesDetail) arrived earlier than the corresponding dimension data (C4 in DimCustomer).
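
To make the scenario concrete, here is a minimal plain-Python sketch of one commonly used approach: load the fact row with a placeholder key, then re-point it once the dimension row shows up. It is an illustration under stated assumptions, not a Structured Streaming implementation and not a Databricks recommendation. The sample rows, the `-1` placeholder key, and the names `resolve_facts` and `reconcile` are all hypothetical; only the table and customer names (DimCustomer, C4) come from the example above.

```python
from dataclasses import dataclass

# Hypothetical surrogate key meaning "customer not loaded into DimCustomer yet".
UNKNOWN_CUSTOMER_KEY = -1


@dataclass
class FactRow:
    sale_id: int
    customer_id: str   # business key carried on the fact for later reconciliation
    amount: float
    customer_key: int  # surrogate key from DimCustomer, or the placeholder


def resolve_facts(sales, dim_customer):
    """Look up each sale's customer; return all fact rows plus the early arriving ones."""
    resolved, early = [], []
    for sale_id, customer_id, amount in sales:
        key = dim_customer.get(customer_id, UNKNOWN_CUSTOMER_KEY)
        row = FactRow(sale_id, customer_id, amount, key)
        resolved.append(row)
        if key == UNKNOWN_CUSTOMER_KEY:
            early.append(row)
    return resolved, early


def reconcile(facts, dim_customer):
    """Re-point placeholder rows whose customer has since arrived; return how many changed."""
    updated = 0
    for row in facts:
        if row.customer_key == UNKNOWN_CUSTOMER_KEY and row.customer_id in dim_customer:
            row.customer_key = dim_customer[row.customer_id]
            updated += 1
    return updated


# Made-up sample data: C4 is missing from the dimension when its sale arrives.
dim_customer = {"C1": 1, "C2": 2, "C3": 3}
sales = [(101, "C1", 20.0), (102, "C4", 35.5), (103, "C3", 12.0)]

facts, early = resolve_facts(sales, dim_customer)

# Later, the C4 dimension row is loaded and the placeholder can be replaced.
dim_customer["C4"] = 4
updated = reconcile(facts, dim_customer)
```

Keeping the business key (`customer_id`) on the fact row is what makes the later reconciliation possible. Whether to park such rows, insert an inferred dimension member, or delay the join is a design decision that depends on the pipeline.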

Regards,

Phani

1 REPLY

Phani1
Valued Contributor II

Greetings Team, does anyone have suggestions regarding this question?
