Transform a dataframe column into a concatenated string

geertvanhove
New Contributor III

Hello,
I have a single-column dataframe and I want to transform its contents into a single string.

E.g. df =

abc
def
xyz

to

abc, def, xyz

Thanks

Thanks, but can you give me an example of this? The following still gives me an error:

result = ', '.join(df.select(col("meterId")).tolist())

geertvanhove
New Contributor III

I get the following error:

'DataFrame' object has no attribute 'tolist'

geertvanhove
New Contributor III

Sure:

%python
from pyspark.sql.functions import from_json, col, concat_ws
from pyspark.sql.types import *

schema = StructType([
    StructField('meterDateTime', StringType(), True),
    StructField('meterId', LongType(), True),
    StructField('meteringState', StringType(), True),
    StructField('value', DoubleType(), True),
    StructField('versionTimestamp', StringType(), True),
    StructField('file_name', StringType(), False),
    StructField('file_modification_time', TimestampType(), False),
])

df = ( spark
        .read
        .format("json")
        .schema(schema)
        .load(f'{path_sep}/*/*/*/*.json')
        .select("meterId")
  )

result = ', '.join(df.tolist())
print(result)
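
The error above arises because a PySpark DataFrame has no `tolist` method (that call belongs to pandas Series and NumPy arrays). Below is a minimal sketch of one possible workaround, assuming the result is small enough to bring back to the driver: collect the rows and join the values in plain Python. The helper name `join_column` and the sample values are made up for illustration. The values go through `str()` because `meterId` is a `LongType` and `str.join` only accepts strings.

```python
def join_column(rows, column, sep=", "):
    """Join one field from each row into a single string.

    `rows` is any iterable of mappings or Row-like objects that support
    row[column]. Values are passed through str() because str.join rejects
    non-string items such as the LongType meterId values.
    """
    return sep.join(str(row[column]) for row in rows)


# Stand-in data so the sketch runs without a Spark session (hypothetical values).
sample_rows = [{"meterId": 101}, {"meterId": 102}, {"meterId": 103}]
print(join_column(sample_rows, "meterId"))  # 101, 102, 103

# With the Spark DataFrame from the post, the rows would come from collect():
#     result = join_column(df.collect(), "meterId")
# Alternatively, the aggregation can stay in Spark:
#     from pyspark.sql.functions import col, collect_list, concat_ws
#     result = df.agg(
#         concat_ws(", ", collect_list(col("meterId").cast("string")))
#     ).first()[0]
```

Note that `collect()` pulls every row to the driver, so this is only reasonable for small results, and neither variant guarantees the order of the values unless the DataFrame is sorted explicitly.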