Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Self Dependency TumblingWindowTrigger in adf

fjrodriguez
New Contributor III

Hey!

I would like to migrate an ADF batch ingestion that has a TumblingWindowTrigger on top of the pipeline. It checks every 15 minutes whether a file has landed; the files normally land on a daily basis, so it processes them once a day. The trigger's self-dependency guarantees that file N + 1 is only processed if the previous file was ingested correctly.

I see that Databricks Workflows offers three kinds of triggers: Scheduled, File arrival, and Continuous. Which one is the homologue of the TumblingWindowTrigger, and how do I set up a self-dependency to maintain the same approach?

1 ACCEPTED SOLUTION

Accepted Solutions

szymon_dybczak
Esteemed Contributor III

Hi @fjrodriguez ,

What about using Databricks Auto Loader and triggering the workflow every 15 minutes? Auto Loader automatically detects which new files have arrived since the last run of the job and loads only the new files into the target table. You can use the available-now trigger option, which consumes all available records as an incremental batch.


So, let's say you prepare a notebook that uses Auto Loader. You then schedule this notebook with Databricks Workflows, setting Max concurrent runs = 1. This ensures the job runs every 15 minutes, consumes all new files that appeared within that period, and, if processing takes longer than 15 minutes, waits for the previous run to finish.
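A minimal sketch of such a notebook, assuming hypothetical paths and a target table name (`/mnt/landing`, `/mnt/checkpoints/events`, and `main.bronze.events` are placeholders for your own):

```python
# Sketch of an Auto Loader notebook run with an available-now trigger.
# All paths and the table name below are placeholders -- substitute your own.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

(spark.readStream
    .format("cloudFiles")                          # Auto Loader source
    .option("cloudFiles.format", "json")           # adjust to your file format
    .load("/mnt/landing")
 .writeStream
    .option("checkpointLocation", "/mnt/checkpoints/events")
    .trigger(availableNow=True)                    # incremental batch, then stop
    .toTable("main.bronze.events"))
```

With `availableNow=True` the stream consumes everything that arrived since the last checkpoint and then stops, so each scheduled run behaves like one tumbling-window batch.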


5 REPLIES


@szymon_dybczak - so let's assume the file ingestion fails tomorrow morning. What will happen to the next one? I want the next ingestion to be held back until the stuck one is fixed.

With ADF this is straightforward: you re-trigger the run that failed, and the queued files are automatically ingested once the failed ingestion is fixed.

szymon_dybczak
Esteemed Contributor III

That's the beauty of Auto Loader. It records successfully processed files in the checkpoint location. If processing fails for whatever reason, Auto Loader will try to re-ingest all the files that weren't successfully loaded in the previous run, plus any new files that have appeared.
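A plain-Python simulation of that checkpoint behaviour (not Auto Loader's actual implementation, just the idea): files recorded in the checkpoint are skipped, and anything else, whether it failed last run or newly arrived, is picked up on the next run.

```python
# Hypothetical file names (day1.json etc.) are for illustration only.
def run_batch(landed_files, checkpoint, process):
    """Process every landed file not yet in the checkpoint; record successes."""
    for f in sorted(landed_files - checkpoint):
        process(f)            # may raise -> file stays out of the checkpoint
        checkpoint.add(f)

checkpoint = set()

# First run: day1 succeeds, day2 fails mid-batch.
def flaky(f):
    if f == "day2.json":
        raise IOError("bad file")

try:
    run_batch({"day1.json", "day2.json"}, checkpoint, flaky)
except IOError:
    pass
# day1 is now checkpointed; day2 is not.

# After the fix, the next run re-ingests day2 plus the newly arrived day3.
processed = []
run_batch({"day1.json", "day2.json", "day3.json"}, checkpoint, processed.append)
```

This mirrors the exactly-once guarantee the reply describes: a failed file is retried on the next run, and already-checkpointed files are never reprocessed.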

fjrodriguez
New Contributor III

Hi @szymon_dybczak ,

Sounds reasonable, I will propose this approach. Thanks 🙂