Hi, there is a way to retain a copy of the data frame even if the data in the underlying table is manipulated afterwards, but it is a memory-expensive operation, so be careful when using it:
df1 = spark.createDataFrame(df.rdd.map(lambda x: x), schema=df.schema)
Here we create a new data frame from the existing one by going through its RDD and reusing the original schema.
RDD is the fundamental data structure in Spark that represents an immutable distributed collection of objects.
Harshit Kesharwani