pradeep_singh
Contributor III

To process late-arriving data correctly, you need a business column that records when the data was created or updated at the source, and it should be the first column in your sequence_by struct.

A correct sequence_by column (or set of columns) must be a monotonically increasing representation of the true event order, with one distinct update per key at each sequencing value. _commit_timestamp alone is enough only if the bronze commit time is exactly the event order you want to preserve.
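
To make the difference concrete, here is a small plain-Python sketch. It is not the pipeline API, only a simulation of "keep the row with the highest sequencing value per key". The source_updated_at column, the key, and all the numbers are made up for illustration. The tuple comparison stands in for a struct whose first field is the business column and whose second field is _commit_timestamp.

```python
from typing import Dict, List, Tuple

# Each change record: (key, source_updated_at, commit_timestamp, value).
# `source_updated_at` is a hypothetical business column; all values are made up.
Change = Tuple[str, int, int, str]


def latest_per_key(changes: List[Change], use_business_column: bool) -> Dict[str, str]:
    """Keep, for each key, the value with the highest sequencing value."""
    best: Dict[str, Tuple[tuple, str]] = {}
    for key, updated_at, commit_ts, value in changes:
        # Tuples compare field by field, like a struct used for sequencing.
        seq = (updated_at, commit_ts) if use_business_column else (commit_ts,)
        if key not in best or seq > best[key][0]:
            best[key] = (seq, value)
    return {key: value for key, (_, value) in best.items()}


changes = [
    ("order-1", 200, 10, "shipped"),  # newer at the source, committed to bronze first
    ("order-1", 100, 11, "created"),  # older at the source, arrived late
]

by_commit_only = latest_per_key(changes, use_business_column=False)
by_business_first = latest_per_key(changes, use_business_column=True)

print(by_commit_only)     # the late, stale row wins
print(by_business_first)  # the row that is newest at the source wins
```

With commit time alone, the late "created" row overwrites "shipped". With the business column first, the late row is recognized as older and ignored, and the commit time only acts as a tie-breaker.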

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev