Hi,
I'm importing large numbers of Parquet files (ca. 5,200 files per day, each landing in a separate folder) into Azure Data Lake Storage (ADLS).
I have a DLT streaming table reading from the root folder.
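For reference, the pipeline definition looks roughly like this (table name and storage path are placeholders, not my actual values); it uses Auto Loader in its default directory-listing mode:

```python
import dlt

@dlt.table(name="raw_events")  # placeholder table name
def raw_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        # Default directory-listing discovery mode: I suspect this is what
        # repeatedly lists the root folder and all its subfolders.
        .load("abfss://container@account.dfs.core.windows.net/landing/")
    )
```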
I noticed a massive spike in storage account costs due to file system reads.
Questions: How does DLT identify newly arriving files? Does it always have to scan the entire folder tree, including all historical files?
Are there any design patterns to resolve this (e.g., folder structure, archiving of processed files)?
Many thanks for your help!