@De Vos Meaker:
One potential reason the code works in a notebook but returns an empty array in a Delta Live Tables pipeline is that the pipeline is processing different data. If the rows it sees have no keys in the json_map column, the result will be an empty array.
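If you want to rule that out, here is a minimal diagnostic sketch, assuming json_map is a MapType column as in your snippet (adjust the column name if yours differs):
from pyspark.sql.functions import col, map_keys, size
# Rough check: how many rows actually carry a non-empty json_map?
df.select(
    (size(map_keys(col("json_map"))) > 0).alias("has_keys")
).groupBy("has_keys").count().show()
If every row in the pipeline run shows has_keys = false, the empty array is coming from the input data rather than the extraction logic.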
As an alternative way to extract the fields of a map or JSON column into separate columns, you can try the getItem function in PySpark. Here's an example code snippet:
from pyspark.sql.functions import col
df = df.select(
    col("json_map").getItem("key1").alias("column1"),
    col("json_map").getItem("key2").alias("column2"),
    col("json_map").getItem("key3").alias("column3")
)
This code creates new columns column1, column2, and column3 by extracting the values of the keys "key1", "key2", and "key3" from the json_map column using the getItem function. You can customize this code to extract the specific keys you need.
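If you're not sure which keys actually appear in the pipeline's data, one way to list them (again assuming json_map is a MapType column) is:
from pyspark.sql.functions import explode, map_keys
# List the distinct keys present across all rows of json_map
df.select(explode(map_keys("json_map")).alias("key")).distinct().show()
You can then plug the keys you care about into the getItem calls above.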