@Kenny Shaevel:
You are correct that Auto Loader picks up new data files continuously as they land in cloud storage. It does not wait for a large batch of files to accumulate before processing them. Instead, it detects new files as they arrive and feeds them to a Structured Streaming query, which typically writes them to a Delta table, so you can query the data with Spark SQL or other tools shortly afterwards.
The statement "Auto Loader incrementally ingests new data files in batches" can be misleading. Auto Loader does process data incrementally, but "batches" here does not mean waiting for a large set of files to build up. Because Auto Loader runs on Structured Streaming, each trigger processes the files discovered since the previous trigger as a small incremental micro-batch, which may hold one file or many. This lets you query new data without waiting for a larger batch to accumulate.
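To make the incremental behaviour concrete, here is a minimal sketch of an Auto Loader stream in PySpark. It assumes a Databricks runtime where `spark` is an existing SparkSession and JSON input files; the helper names (`autoloader_options`, `start_ingest`) and all paths are hypothetical, chosen only for illustration.

```python
def autoloader_options(file_format, schema_location):
    """Build the option map for an Auto Loader (cloudFiles) source."""
    return {
        # Format of the incoming files, e.g. "json", "csv", "parquet".
        "cloudFiles.format": file_format,
        # Where Auto Loader stores the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
    }


def start_ingest(spark, source_path, target_path, checkpoint_path):
    """Start a stream that picks up new files and appends them to a Delta path.

    `spark` is assumed to be a SparkSession on a Databricks cluster;
    all three paths are placeholders supplied by the caller.
    """
    stream = (
        spark.readStream.format("cloudFiles")
        .options(**autoloader_options("json", checkpoint_path))
        .load(source_path)
    )
    # The checkpoint records which files were already processed,
    # so each micro-batch only contains files that are new.
    return (
        stream.writeStream.format("delta")
        .option("checkpointLocation", checkpoint_path)
        .start(target_path)
    )
```

With the default trigger, the query starts the next micro-batch as soon as the previous one finishes, which is why new files usually show up in the target table shortly after they land.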
It is important to note that Auto Loader's performance depends on the size and frequency of the incoming data files, as well as on how the streaming query is configured. For example, you can add processing steps such as data validation or transformation to the query before the data is written to the Delta table. You can also limit how much data each micro-batch takes in, for example with the `cloudFiles.maxFilesPerTrigger` and `cloudFiles.maxBytesPerTrigger` options, and adjust the trigger interval to suit your workload and latency requirements.
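As a rough illustration of that kind of tuning, the sketch below caps the micro-batch size and adds a simple validation step before the write. It assumes a Databricks runtime with an existing `spark` session; the helper names, the paths, the `id` column, and the specific limits are hypothetical values, not recommendations.

```python
def tuned_options(file_format, schema_location, max_files=None, max_bytes=None):
    """Build cloudFiles options, optionally capping the size of each micro-batch."""
    opts = {
        "cloudFiles.format": file_format,
        "cloudFiles.schemaLocation": schema_location,
    }
    if max_files is not None:
        # Upper bound on the number of new files per micro-batch.
        opts["cloudFiles.maxFilesPerTrigger"] = str(max_files)
    if max_bytes is not None:
        # Soft upper bound on the data volume per micro-batch, e.g. "10g".
        opts["cloudFiles.maxBytesPerTrigger"] = max_bytes
    return opts


def start_validated_ingest(spark, source_path, target_path, checkpoint_path):
    """Ingest new files, drop rows without an `id`, and append to a Delta path.

    `spark` is assumed to be a SparkSession on Databricks; the `id`
    column is a hypothetical example of a validation rule.
    """
    raw = (
        spark.readStream.format("cloudFiles")
        .options(**tuned_options("json", checkpoint_path, max_files=100))
        .load(source_path)
    )
    # Validation step applied before the data reaches the Delta table.
    valid = raw.filter("id IS NOT NULL")
    return (
        valid.writeStream.format("delta")
        .option("checkpointLocation", checkpoint_path)
        # Start a micro-batch at most once per minute.
        .trigger(processingTime="1 minute")
        .start(target_path)
    )
```

Smaller limits keep each micro-batch short and predictable, while larger limits trade latency for throughput; which side to favour depends on how many files arrive and how quickly you need them queryable.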