04-14-2024 07:39 AM
Hello,
I managed to parse the column using the "json_tuple" function.
There are other functions that can help as well:
- schema_of_json
- get_json_object
- from_json
from pyspark.sql.functions import col, json_tuple, regexp_replace

dd = (
    # Extract the top-level fields; json_tuple returns generic column names, so rename them right away.
    df.select(json_tuple(col("AdditionalRequestParameters"), "Locale", "Setversion", "Flags", "ScalingID", "VersionInfo", "SegmentCountry", "KnowledgeType"))
    .toDF("Locale", "Setversion", "Flags", "ScalingID", "VersionInfo", "SegmentCountry", "KnowledgeType")
    # Unpack the nested VersionInfo object; the new columns are appended at the end.
    .select("*", json_tuple(col("VersionInfo"), "PracticeSubType", "Version"))
    .drop("VersionInfo")
    # toDF renames by position, so the names must follow the current column order.
    .toDF("Locale", "Setversion", "Flags", "ScalingID", "SegmentCountry", "KnowledgeType", "PracticeSubType", "Version")
    # Strip the array brackets so json_tuple can read the inner object.
    .withColumn("SegmentCountry", regexp_replace(col("SegmentCountry"), r"[\[\]]", ""))
    .select("*", json_tuple(col("SegmentCountry"), "Country", "IndustrySegment"))
    .drop("SegmentCountry")
    .toDF("Locale", "Setversion", "Flags", "ScalingID", "KnowledgeType", "PracticeSubType", "Version", "Country", "IndustrySegment")
    # Same treatment for KnowledgeType. "Name" is requested twice, so Name and Name1 will normally hold the same value.
    .withColumn("KnowledgeType", regexp_replace(col("KnowledgeType"), r"[\[\]]", ""))
    .select("*", json_tuple(col("KnowledgeType"), "Name", "Name"))
    .drop("KnowledgeType")
    .toDF("Locale", "Setversion", "Flags", "ScalingID", "PracticeSubType", "Version", "Country", "IndustrySegment", "Name", "Name1")
)
dd.display()
Regards,
Thomaz Antonio Rossito Neto
Master Data Specialist - Data Architect | Data Engineer @ CI&T