ilir_nuredini
Honored Contributor

Hello,

There are a few ways to define an empty Spark DataFrame. Here are some of them:

1. Create an empty dataframe with a schema

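from pyspark.sql.types import StructType, StructField, StringType, IntegerType

# Both columns are nullable (third argument is True)
schema = StructType([
    StructField('name', StringType(), True),
    StructField('age', IntegerType(), True)
])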

empty_df = spark.createDataFrame([], schema)

2. Create an empty dataframe without specifying any columns

empty_df_without_cols = spark.createDataFrame([], StructType([]))

3. Create an empty RDD, then convert it to a dataframe (just FYI, this option won't work in Free Edition, because it runs on serverless compute, where the RDD API is not available)

schema = StructType([
    StructField('name', StringType(), True),
    StructField('age', IntegerType(), True)
])

emptyRDD = spark.sparkContext.emptyRDD()
empty_df1 = emptyRDD.toDF(schema)

Hope that helps.

Best, Ilir