Data Engineering
What is the most efficient way to read in a partitioned Parquet file with PySpark?

User16790091296
Contributor II

I work with Parquet files stored in AWS S3 buckets. They are multiple TB in size and partitioned by a numeric column containing integer values between 1 and 200; call it my_partition. I read in this data and perform compute actions on it in Databricks, with autoscaling turned off.
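For context, the read pattern I'm asking about looks roughly like the sketch below. The bucket path and partition value are placeholders, not my actual setup:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Option 1: read the whole dataset and filter on the partition column.
# Spark applies partition pruning, so only the matching S3 prefixes
# (my_partition=42/ here) should actually be scanned.
df = spark.read.parquet("s3://my-bucket/my-dataset/").filter("my_partition = 42")

# Option 2: point the reader directly at one partition directory.
# Supplying basePath keeps my_partition as a column in the result;
# without it, the partition column is dropped from the schema.
df_single = (
    spark.read
    .option("basePath", "s3://my-bucket/my-dataset/")
    .parquet("s3://my-bucket/my-dataset/my_partition=42/")
)

df.count()
```

Is one of these approaches clearly more efficient than the other at this scale, or is there a better pattern altogether?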

0 REPLIES