08-19-2022 09:50 AM
You can use PySpark to remove the unwanted Unicode characters.
This example removes the null character (U+0000). You will need to find out which characters appear in your data and match those instead, or search online for a pattern that fits your case.
from pyspark.sql.functions import regexp_replace

# Replace the null character (U+0000) with an empty string in each column
null = u'\u0000'
dfCnae = df\
    .withColumn('id', regexp_replace(df['id'], null, ''))\
    .withColumn('description', regexp_replace(df['description'], null, ''))
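
If you need to strip more than the null character, here is a rough sketch of how the same idea could be generalized. It assumes the unwanted characters are the C0 control characters other than tab, newline, and carriage return; that set, and the names `CONTROL_CHARS_PATTERN`, `find_control_chars`, `strip_control_chars_local`, and `strip_control_chars`, are my own for illustration, so adjust them to whatever actually shows up in your data.

```python
import re

# Assumption: "unwanted" means C0 control characters except tab (\x09),
# newline (\x0A) and carriage return (\x0D). Adjust the ranges for your data.
CONTROL_CHARS_PATTERN = r"[\x00-\x08\x0B\x0C\x0E-\x1F]"


def find_control_chars(text):
    """List the distinct matching characters in a sample value as U+XXXX codes."""
    return sorted({f"U+{ord(ch):04X}" for ch in re.findall(CONTROL_CHARS_PATTERN, text)})


def strip_control_chars_local(text):
    """Try the pattern on a plain Python string before running it in Spark."""
    return re.sub(CONTROL_CHARS_PATTERN, "", text)


def strip_control_chars(df, columns):
    """Apply the same pattern to each named string column of a Spark DataFrame."""
    # Imported here so the local helpers above work without pyspark installed.
    from pyspark.sql.functions import col, regexp_replace

    for name in columns:
        df = df.withColumn(name, regexp_replace(col(name), CONTROL_CHARS_PATTERN, ""))
    return df
```

With the DataFrame from the example above, this would be called as `dfCnae = strip_control_chars(df, ['id', 'description'])`. Running `find_control_chars` on a few sample values first should tell you which characters you are actually dealing with.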