I am loading data from CSV files into live tables. I have a live Delta table with columns like this:
WaterMeterID, ReadingDateTime1, ReadingValue1, ReadingDateTime2, ReadingValue2
It needs to be unpivoted into one row per reading pair:
WaterMeterID, ReadingDateTime1, ReadingValue1
WaterMeterID, ReadingDateTime2, ReadingValue2
There are typically 24 of these pivoted pairs (ReadingDateTime1 through ReadingDateTime24). I could write a SQL statement with 24 UNIONs to unpivot the data, but the UNPIVOT clause doesn't seem to handle multiple value columns like this.
Can you recommend a way to do this that works with live tables for both full and incremental refreshes of the unpivoted data? We could write Spark code to do it, but then we'd either have to unpivot all the data every time we process the upstream table, or track what we've already loaded in a watermark table and process only the newly added rows. And if a full refresh is performed via the pipeline, we'd also need to reset the watermark table, which complicates maintenance.
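For reference, the Spark code we have in mind would build the unpivot expression programmatically instead of hand-writing 24 UNIONs, using Spark SQL's stack() generator function. A minimal sketch (the table name LIVE.readings_raw is just a placeholder, and it assumes the columns follow the ReadingDateTimeN/ReadingValueN naming pattern):

```python
def build_unpivot_sql(source_table: str, n_pairs: int = 24) -> str:
    """Build a SELECT that unpivots n_pairs (ReadingDateTimeN, ReadingValueN)
    column pairs into one (ReadingDateTime, ReadingValue) row each, using
    Spark SQL's stack() generator function instead of repeated UNIONs."""
    pairs = ", ".join(
        f"ReadingDateTime{i}, ReadingValue{i}" for i in range(1, n_pairs + 1)
    )
    return (
        f"SELECT WaterMeterID, "
        f"stack({n_pairs}, {pairs}) AS (ReadingDateTime, ReadingValue) "
        f"FROM {source_table}"
    )

# In a notebook this would be executed as, e.g.:
#   df = spark.sql(build_unpivot_sql("LIVE.readings_raw"))
print(build_unpivot_sql("LIVE.readings_raw", n_pairs=2))
```

This keeps the 24-pair list out of hand-written SQL, but it doesn't by itself answer the full-vs-incremental refresh question above.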
Once we have all the water meters running, we'll have about 300K pivoted rows per day to process.