Is the following kind of union safe in Spark Structured Streaming?
I am unioning multiple streaming DataFrames, each read from a different source.
Is there a better solution?
For example:
df1 = spark.readStream.table(f"{bronze_catalog}.{bronze_schema}.table1")
df2 = spark.readStream.table(f"{bronze_catalog}.{bronze_schema}.table2")
df3 = spark.readStream.table(f"{bronze_catalog}.{bronze_schema}.table3")
df1a = df1.select(....).transform(....)
df2a = df2.select(....).transform(....)
df3a = df3.select(....).transform(....)
df = df1a.unionByName(df2a).unionByName(df3a).dropDuplicates(.....)
df.writeStream.format("delta").outputMode("append").option(
    "checkpointLocation", my_checkpoint_path
).trigger(availableNow=True).table(f"{silver_catalog}.{silver_schema_}.my_silver_table")