Databricks Auto Loader best practices


Databricks Auto Loader is a popular mechanism for ingesting files from cloud storage into Delta. For a very high-throughput source, what are the best practices to follow when scaling an Auto Loader based pipeline to millions of events per minute?

Tuning "cloudFiles.fetchParallelism" is one thing to look at, but are there any other configurations that need tuning? Presumably an increase in the fetch rate should be paired with a matching increase in the delete rate from SQS/AQS as well?
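For context, here is a minimal sketch of where these settings would sit in a pipeline running in file notification mode. It is not a tuned or recommended configuration. The helper names (`autoloader_options`, `build_stream`), the JSON format, and the values 8 and 10000 are placeholders assumed for illustration only. The option keys are the documented `cloudFiles.*` names.

```python
def autoloader_options(fetch_parallelism, max_files_per_trigger):
    """Build Auto Loader reader options for file notification mode.

    Hypothetical helper: the values passed in are illustrative, not recommendations.
    """
    if fetch_parallelism < 1 or max_files_per_trigger < 1:
        raise ValueError("both settings must be positive integers")
    return {
        "cloudFiles.format": "json",  # assumed source format
        "cloudFiles.useNotifications": "true",  # read file events from the queue
        "cloudFiles.fetchParallelism": str(fetch_parallelism),  # queue fetch threads
        "cloudFiles.maxFilesPerTrigger": str(max_files_per_trigger),  # micro-batch cap
    }


def build_stream(spark, source_path, schema_location, options):
    """Attach the options to an Auto Loader stream (needs a Databricks Spark session)."""
    return (
        spark.readStream.format("cloudFiles")
        .options(**options)
        .option("cloudFiles.schemaLocation", schema_location)
        .load(source_path)
    )


# Placeholder values, chosen only to show the shape of the configuration.
opts = autoloader_options(fetch_parallelism=8, max_files_per_trigger=10000)
```

On a cluster this would be used as `build_stream(spark, source_path, schema_location, opts)`, with both paths supplied by the pipeline. The open question is which of these settings, and which others, matter at millions of events per minute.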