
The Hardest Part of Our SAP Migration Wasn't the Data. It Was Timing

savlahanish27
Databricks Partner

Part 2 of my series on building an enterprise data platform on Databricks. This one's about Silver.

Part 1 covered why we ran two ingestion paths in parallel (GoldenGate CDC + JDBC batch) and kept them as separate bronze tables. If you missed it:
https://medium.com/@savlahanish/why-we-used-two-bronze-tables-instead-of-one-and-why-it-mattered-9c4...

Part 2 is where it got harder.

When both Bronze tables exist simultaneously, you inevitably end up with the same logical record in two places: captured differently, timestamped differently, and with neither timestamp fully reliable on its own.

Three things this covers that most CDC tutorials don't:

→ The 5-minute overlap window where _ingest_time alone gives you the wrong answer, and the tiebreaker we added to fix it

→ How CDC DELETE events silently keep deleted SAP records alive in Silver if you don't handle them explicitly in your MERGE statement

→ The natural key mistake we made on one table, only caught when a business analyst noticed transaction counts in Silver didn't match SAP
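
To make the first two points concrete, here is a minimal plain-Python sketch of the idea, not the actual pipeline from the full post. This teaser does not say what the tiebreaker was, so the rule below is an assumption: if the newest row for a key came from the batch path and a CDC row for the same key arrived within the 5-minute window before it, the CDC row wins. The `BronzeRow` fields, the `"cdc"`/`"batch"` labels, and the `"I"`/`"U"`/`"D"` op codes are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable, List

OVERLAP = timedelta(minutes=5)  # window length mentioned in the post


@dataclass(frozen=True)
class BronzeRow:
    key: str               # natural key (hypothetical single column)
    ingest_time: datetime  # stands in for _ingest_time
    source: str            # "cdc" or "batch" (hypothetical labels)
    op: str                # "I", "U" or "D" (hypothetical op codes)
    payload: str


def pick_winner(rows: List[BronzeRow]) -> BronzeRow:
    """Pick one row per key; ingest time alone is not trusted in the overlap."""
    newest = max(rows, key=lambda r: r.ingest_time)
    if newest.source == "batch":
        # Assumed tiebreaker: a CDC row inside the window beats the batch row.
        near_cdc = [
            r for r in rows
            if r.source == "cdc" and newest.ingest_time - r.ingest_time <= OVERLAP
        ]
        if near_cdc:
            return max(near_cdc, key=lambda r: r.ingest_time)
    return newest


def merge_into_silver(
    silver: Dict[str, BronzeRow], rows: Iterable[BronzeRow]
) -> Dict[str, BronzeRow]:
    """Apply the winners to a copy of Silver, with an explicit delete branch."""
    by_key: Dict[str, List[BronzeRow]] = defaultdict(list)
    for row in rows:
        by_key[row.key].append(row)

    result = dict(silver)
    for key, candidates in by_key.items():
        winner = pick_winner(candidates)
        if winner.op == "D":
            # Without this branch the deleted record would stay in Silver.
            result.pop(key, None)
        else:
            # Simplification: incoming winners always overwrite Silver.
            result[key] = winner
    return result
```

In a Delta Lake MERGE the delete branch would correspond to a `WHEN MATCHED ... THEN DELETE` clause placed before the update clause; the dictionary here only stands in for the Silver table.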

Full post: https://medium.com/@savlahanish/the-hardest-part-of-our-sap-migration-wasnt-the-data-it-was-timing-e...

Has anyone else hit timing issues during the initial load window on a similar migration?
Curious how others handled the overlap period between snapshot and streaming.
