Transform a dataframe column into a concatenated string
11-17-2023 08:48 AM
Hello,
I have a single-column dataframe and I want to transform its content into one string.
E.g. df =
abc |
def |
xyz |
To
abc, def, xyz |
Thanks
3 REPLIES
11-23-2023 07:01 AM
Thanks, but can you give me an example of this? The following still gives me an error:
result = ', '.join(df.select(col("meterId")).tolist())
11-23-2023 06:00 AM
I get the following error:
'DataFrame' object has no attribute 'tolist'
11-23-2023 06:24 AM
Sure:
%python
from pyspark.sql.functions import from_json, col, concat_ws
from pyspark.sql.types import *
schema = StructType([
    StructField('meterDateTime', StringType(), True),
    StructField('meterId', LongType(), True),
    StructField('meteringState', StringType(), True),
    StructField('value', DoubleType(), True),
    StructField('versionTimestamp', StringType(), True),
    StructField('file_name', StringType(), False),
    StructField('file_modification_time', TimestampType(), False),
])
df = (
    spark.read
    .format("json")
    .schema(schema)
    .load(f'{path_sep}/*/*/*/*.json')
    .select("meterId")
)
result = ', '.join(df.tolist())
print(result)
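The error above comes from calling tolist() on a Spark DataFrame, which has no such method. One possible way around it, sketched here under assumptions rather than taken from the thread, is to collect the rows to the driver and join the values there. The helper name join_column and the sample values are hypothetical. Because meterId is a LongType column, each value is converted to a string before joining. Note that collect() pulls every row onto the driver, so this only suits columns that fit in memory.

```python
def join_column(rows, column, sep=", "):
    """Join one field from an iterable of row-like objects into a single string.

    Works with the list returned by DataFrame.collect(), because a PySpark Row
    supports row["name"] lookup, and equally with plain dicts.
    """
    # str() is required: str.join only accepts strings, and meterId is numeric.
    return sep.join(str(row[column]) for row in rows)


# Stand-in for df.select("meterId").collect(); these values are made up.
sample_rows = [{"meterId": 101}, {"meterId": 102}, {"meterId": 103}]
print(join_column(sample_rows, "meterId"))  # prints: 101, 102, 103

# With the DataFrame from the thread (needs a Spark session, so left commented):
# result = join_column(df.select("meterId").collect(), "meterId")
#
# Alternative that aggregates inside Spark and returns one string.
# The order of values from collect_list is not guaranteed.
# from pyspark.sql.functions import col, collect_list, concat_ws
# result = df.agg(
#     concat_ws(", ", collect_list(col("meterId").cast("string")))
# ).first()[0]
```

The commented variant may be preferable when the result should stay a single-row DataFrame, as in the original question: drop the .first()[0] and keep the aggregated column instead.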

