Replace one tag in a Jason file in the data bricks table .

Bhabs
New Contributor

 

There is a column (src_json) in emp_table . I need to replace (ages to age )in each json in the src_json column in emp_table.

Can you pls suggest the best way to do it .

szymon_dybczak
Esteemed Contributor III

Hi @Bhabs ,

You can do it in following way (assuming that src_json contains json string):

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, expr

spark = SparkSession.builder.appName("Replace JSON Keys").getOrCreate()

data = [
    ('{"name": "John", "ages": 30, "department": "HR"}',),
    ('{"name": "Jane", "ages": 25, "department": "Finance"}',),
    ('{"name": "Doe", "ages": 35, "department": "IT"}',)
]


columns = ["src_json"]
emp_table = spark.createDataFrame(data, columns)

# Use expr to update the JSON string
emp_table = emp_table.withColumn("src_json", expr("regexp_replace(src_json, '\"ages\"', '\"age\"')"))

emp_table.show(truncate=False)