Hi All,
We are trying to use the Spark 3 structured streaming feature/option ".option('cleanSource','archive')" to archive processed files.
This works as expected with the standard Spark file source, but it does not appear to work with Auto Loader. I cannot find any documentation that says whether it is supported there, and we have tried various tweaks to no avail.
Is this a bug or expected behaviour?
Is there an alternative approach using Auto Loader?
Thanks, Larry
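from datetime import datetime
from pyspark.sql.functions import lit

df = (
    spark.readStream
    .format("cloudFiles")  # Auto Loader source
    .option("cloudFiles.format", "csv")
    .option("cleanSource", "archive")  # the option that seems to have no effect here
    .option("sourceArchiveDir", archivePath)
    .option("header", "true")
    .schema(schema)
    .load(path)
    # utcnow() is evaluated once, when the query is defined, not per row
    .withColumn("loadDate", lit(datetime.utcnow()))
)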
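
For comparison, the variant that does archive files uses the built-in file source rather than "cloudFiles". The following is only a rough sketch of that setup: the helper names archive_options and read_csv_with_archive are made up for illustration, and spark, schema, path and the archive directory are assumed to be defined as in the snippet above.

```python
# Illustrative sketch only: both helper names are hypothetical.
def archive_options(archive_dir):
    """Options for Spark's built-in file source that archive processed files."""
    return {
        "cleanSource": "archive",         # move completed files after processing
        "sourceArchiveDir": archive_dir,  # must be set when cleanSource is "archive"
        "header": "true",
    }


def read_csv_with_archive(spark, schema, path, archive_dir):
    """Build the streaming reader with the plain "csv" format (no cloudFiles)."""
    return (
        spark.readStream
        .format("csv")
        .options(**archive_options(archive_dir))
        .schema(schema)
        .load(path)
    )
```

The only intended difference from the Auto Loader attempt is the format: "csv" here versus "cloudFiles" plus "cloudFiles.format" above.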