szymon_dybczak
Esteemed Contributor III

Hi @ChristianRRL ,

Great question. In my case, we use Auto Loader exactly as you described, in what we could call "batch mode", using .trigger(availableNow=True). Our pipeline runs once a day and uses Auto Loader to load new data into the bronze layer.
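A minimal sketch of this pattern might look like the following. It is only an illustration, not the actual pipeline: the function name `ingest_new_files`, the paths, the table name, and the JSON file format are all hypothetical placeholders, and it assumes you pass in the `spark` session that Databricks notebooks and jobs provide.

```python
def ingest_new_files(spark, source_path, checkpoint_path, target_table,
                     file_format="json"):
    """Load the files that arrived since the last run, then stop.

    All argument values are placeholders chosen for illustration.
    """
    stream = (
        spark.readStream
        .format("cloudFiles")                       # Auto Loader source
        .option("cloudFiles.format", file_format)   # format of the incoming files
        .option("cloudFiles.schemaLocation", checkpoint_path)  # inferred schema is stored here
        .load(source_path)
    )

    query = (
        stream.writeStream
        .option("checkpointLocation", checkpoint_path)  # remembers which files were already loaded
        .trigger(availableNow=True)                     # process the current backlog, then stop
        .toTable(target_table)
    )

    # Block until the backlog is processed so the job run can finish cleanly.
    query.awaitTermination()
    return query


# Example call from a notebook or job task (hypothetical paths and table name):
# ingest_new_files(spark, "/mnt/landing/events", "/mnt/checkpoints/events", "bronze.events")
```

Because the checkpoint keeps track of the files that were already ingested, scheduling this once a day should pick up only the files that landed since the previous run.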
Regarding compute recommendations: Databricks always recommends using job compute clusters. However, I don’t see anything wrong with using an all-purpose cluster, especially if you plan to use the ‘available now’ option, since the stream then processes whatever data is available and stops instead of running continuously.

But nonetheless, the official recommendation is to use job compute.

Production considerations for Structured Streaming - Azure Databricks | Microsoft Learn

PS. If the answer was helpful to you, consider marking it as the accepted answer. This way, we help others find the solution to their problem more quickly.
