cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

User16790091296
by Contributor II
  • 1661 Views
  • 0 replies
  • 1 kudos

What is the most efficient way to read in a partitioned parquet file with pyspark?

I work with parquet files stored in AWS S3 buckets. They are multiple TB in size and partitioned by a numeric column containing integer values between 1 and 200, call it my_partition. I read in and perform compute actions on this data in Databricks w...

  • 1661 Views
  • 0 replies
  • 1 kudos
Labels