Hi Ruc
How are you doing today?, as per my understanding, Since your source table has a ReqDetails column with dynamic key-value pairs and only ProductID is fixed, you can parse the ReqDetails column dynamically and merge it into the target table using PySpark and Delta Lake. First, read the source data into a DataFrame and use a UDF (User-Defined Function) to split ReqDetails into a dictionary of key-value pairs. Then, extract only the available keys (e.g., ProductName, Price, ProductParentgroup) and create new columns dynamically. Finally, use Delta Lake’s MERGE operation to update only the modified values in the target table, using COALESCE() to retain existing data when a column is missing in the request. This way, you avoid a rigid schema while ensuring that updates happen correctly. Hope this helps!
Regards,
Brahma