Hello,
There are a few ways to define an empty Spark DataFrame. Here are some of them:
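1. Create an empty DataFrame with a schema
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

# `spark` is the active SparkSession (predefined in notebooks)
schema = StructType([
    StructField('name', StringType(), True),
    StructField('age', IntegerType(), True)
])
empty_df = spark.createDataFrame([], schema)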
2. Create an empty DataFrame without specifying any columns
empty_df_without_cols = spark.createDataFrame([], StructType([]))
3. Create an empty RDD, then convert it to a DataFrame (just FYI, this option won't work in the Free Edition, because serverless compute does not support the RDD API)
schema = StructType([
    StructField('name', StringType(), True),
    StructField('age', IntegerType(), True)
])
emptyRDD = spark.sparkContext.emptyRDD()
empty_df1 = emptyRDD.toDF(schema)
Hope that helps.
Best, Ilir