I am loading data from CSV files into live tables. I have a live Delta table with columns like this:
WaterMeterID, ReadingDateTime1, ReadingValue1, ReadingDateTime2, ReadingValue2
It needs to be unpivoted into one row per reading pair:
WaterMeterID, ReadingDateTime1, ReadingValue1
WaterMeterID, ReadingDateTime2, ReadingValue2
There are typically 24 of these pivoted pairs (ReadingDateTime1 through ReadingDateTime24). I could write a SQL statement with 24 UNIONs to unpivot the data, but the UNPIVOT clause doesn't seem to handle multiple value columns like this.
Can you recommend a way to do this that works with live tables for both full and incremental refreshes of the unpivoted data? We could write Spark code to do it, but then we'd either have to unpivot all the data every time we process the upstream table, or track what we've already loaded in a watermark table and process only the newly added rows. And if a full refresh is performed via the pipeline, we'd also need to reset the watermark table, which complicates maintenance.
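For reference, the Spark code we have in mind would build the unpivot expression programmatically instead of hand-writing 24 UNIONs, using Spark SQL's stack() generator function. A minimal sketch (the table name LIVE.readings_raw is just a placeholder, and it assumes the columns follow the ReadingDateTimeN/ReadingValueN naming pattern):

```python
def build_unpivot_sql(source_table: str, n_pairs: int = 24) -> str:
    """Build a SELECT that unpivots n_pairs (ReadingDateTimeN, ReadingValueN)
    column pairs into one (ReadingDateTime, ReadingValue) row each, using
    Spark SQL's stack() generator function instead of repeated UNIONs."""
    pairs = ", ".join(
        f"ReadingDateTime{i}, ReadingValue{i}" for i in range(1, n_pairs + 1)
    )
    return (
        f"SELECT WaterMeterID, "
        f"stack({n_pairs}, {pairs}) AS (ReadingDateTime, ReadingValue) "
        f"FROM {source_table}"
    )

# In a notebook this would be executed as, e.g.:
#   df = spark.sql(build_unpivot_sql("LIVE.readings_raw"))
print(build_unpivot_sql("LIVE.readings_raw", n_pairs=2))
```

This keeps the 24-pair list out of hand-written SQL, but it doesn't by itself answer the full-vs-incremental refresh question above.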
Once we have all the water meters running, we'll have about 300K pivoted rows per day to process.