Hey @Avnish_Jain, we're implementing a very similar inter- and intra-batch deduplication process, albeit with SCD Type 1. However, we're afraid that drop_duplicates() (in your case dropDuplicates(["data_hash"])) might be looking at the whole stream of...
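To make the concern concrete, here is a minimal plain-Python sketch (not Spark code) of the two behaviours being contrasted. As far as I understand it, dropDuplicates on a streaming DataFrame without a watermark keeps state across micro-batches, whereas the same call on the static DataFrame handed to foreachBatch only sees that one batch. The names `dedupe_within_batch` and `StreamWideDeduper` and the sample rows are hypothetical and only illustrate the semantics; only the `data_hash` column name comes from the thread.

```python
from typing import Dict, List, Set

Row = Dict[str, str]


def dedupe_within_batch(batch: List[Row], key: str = "data_hash") -> List[Row]:
    """Intra-batch only: keep the first row per key, with no memory of earlier batches."""
    seen: Set[str] = set()
    out: List[Row] = []
    for row in batch:
        if row[key] not in seen:
            seen.add(row[key])
            out.append(row)
    return out


class StreamWideDeduper:
    """Hypothetical stand-in for stateful deduplication: remembers every key
    seen so far, so duplicates are dropped across batches as well."""

    def __init__(self, key: str = "data_hash") -> None:
        self.key = key
        self.seen: Set[str] = set()  # grows without bound unless state is expired

    def process(self, batch: List[Row]) -> List[Row]:
        out: List[Row] = []
        for row in batch:
            if row[self.key] not in self.seen:
                self.seen.add(row[self.key])
                out.append(row)
        return out


# Two made-up micro-batches; hash "a" repeats inside batch 1 and again in batch 2.
batch_1 = [
    {"id": "1", "data_hash": "a"},
    {"id": "1", "data_hash": "a"},
    {"id": "2", "data_hash": "b"},
]
batch_2 = [
    {"id": "1", "data_hash": "a"},
    {"id": "3", "data_hash": "c"},
]

# Per-batch deduplication: "a" survives in both batches.
per_batch = [dedupe_within_batch(b) for b in (batch_1, batch_2)]

# Stream-wide deduplication: "a" is dropped from batch 2 because it was seen in batch 1.
stream_wide = StreamWideDeduper()
across_batches = [stream_wide.process(b) for b in (batch_1, batch_2)]

print([[r["data_hash"] for r in b] for b in per_batch])
print([[r["data_hash"] for r in b] for b in across_batches])
```

If the per-batch behaviour is what actually applies, rows that repeat across batches would still reach the target and would need to be handled there, for example by the merge step.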