Hi @Miguel_Salas, While the primary purpose of using Auto Loader with checkpointing is to process new files incrementally, you can still achieve reading only the last updated file within a folder. One approach is to use the cloudFiles.includeExistingFiles
option set to true
and then filter the files based on their modification timestamp. This way, you can identify and process only the most recently updated file.
To obtain information when Auto Loader returns an empty DataFrame without running a count, you can use the cloudFiles
options to get metadata about the files processed. Specifically, the cloudFiles.maxBytesPerTrigger
and cloudFiles.maxFilesPerTrigger
options can help y.... Additionally, you can monitor the streaming queryโs progress to get insights into the number of files processed.
Would you like more details on configuring these options?