Hi,
I am using a medallion architecture where Auto Loader picks up files from AWS S3 and writes them to a bronze Delta Lake table. The next layer reads the changes from the bronze table and does some processing. I am able to set a batch size in Auto Loader and it works (a simplified sketch of that bronze ingest is included below for reference). But in the bronze-to-silver layer I am unable to set a batch limit; it picks up all the files in one go. Here is my code for the bronze-to-silver stream:
(spark.readStream.format("delta")
# The four options below are Auto Loader (cloudFiles) style options carried over
# from the ingest layer; as far as I can tell, the Delta streaming source has its
# own option set and silently ignores options it does not recognize.
.option("useNotification", "true")
.option("includeExistingFiles", "true")
.option("allowOverwrites", "true")
.option("ignoreMissingFiles", "true")
# Intended to cap each micro-batch at 100 files, but it is not taking effect.
.option("maxFilesPerTrigger", 100)
.load(bronze_path)
.writeStream
.option("checkpointLocation", silver_checkpoint_path)
.trigger(processingTime="1 minute")
.foreachBatch(foreachBatchFunction)
.start()
)
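For reference, here is roughly how the working S3-to-bronze Auto Loader stream is set up (a simplified sketch; the source path, file format, schema location, and checkpoint variables are placeholders, not my exact values):

# Simplified sketch of the S3-to-bronze ingest where the batch limit works.
# s3_source_path, bronze_schema_path, bronze_checkpoint_path, and bronze_path
# are placeholders.
(spark.readStream.format("cloudFiles")
.option("cloudFiles.format", "json")  # placeholder source file format
.option("cloudFiles.useNotifications", "true")  # S3 event notifications
.option("cloudFiles.includeExistingFiles", "true")
.option("cloudFiles.maxFilesPerTrigger", 100)  # batch size limit; this one works
.option("cloudFiles.schemaLocation", bronze_schema_path)
.load(s3_source_path)
.writeStream
.format("delta")
.option("checkpointLocation", bronze_checkpoint_path)
.trigger(processingTime="1 minute")
.start(bronze_path)
)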
Appreciate any help.
Regards,
Sanjay