Databricks Autoloader Best Practices



Databricks Autoloader is a popular mechanism for ingesting files from cloud storage into Delta. For a very high-throughput source, what best practices should be followed when scaling an Autoloader-based pipeline to millions of events per minute?

Tuning "cloudFiles.fetchParallelism" is one thing to look at, but are there any other configurations that need tuning? Presumably an increase in the fetch rate should be paired with a matching increase in the delete rate from SQS/AQS as well?
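
For context, here is a minimal sketch of the kind of configuration in question, assuming file notification mode and JSON input. The helper names (`build_autoloader_options`, `start_ingest`) are hypothetical, and the numeric values are placeholders for illustration, not recommendations. The `cloudFiles.*` keys are the documented Autoloader options for fetch threads and per-batch rate limits.

```python
def build_autoloader_options(schema_location,
                             fetch_parallelism=8,
                             max_files_per_trigger=10000,
                             max_bytes_per_trigger="10g"):
    """Collect Autoloader options as strings (values here are illustrative only)."""
    return {
        "cloudFiles.format": "json",              # assumed input format
        "cloudFiles.useNotifications": "true",    # file notification mode (queue-backed)
        # Number of threads used to fetch messages from the queueing service.
        "cloudFiles.fetchParallelism": str(fetch_parallelism),
        # Per-micro-batch rate limits; the stream stops adding files to a
        # batch once either limit is reached.
        "cloudFiles.maxFilesPerTrigger": str(max_files_per_trigger),
        "cloudFiles.maxBytesPerTrigger": max_bytes_per_trigger,
        "cloudFiles.schemaLocation": schema_location,
    }


def start_ingest(spark, source_path, target_table, checkpoint_path, options):
    """Start a streaming ingest into a Delta table; `spark` is an existing SparkSession."""
    stream = (
        spark.readStream.format("cloudFiles")
        .options(**options)
        .load(source_path)
    )
    return (
        stream.writeStream
        .option("checkpointLocation", checkpoint_path)
        .toTable(target_table)
    )
```

On a Databricks cluster this might be called as `start_ingest(spark, source_path, "bronze.events", checkpoint_path, build_autoloader_options(schema_path))`, where the paths and table name are placeholders.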