Hello everyone,
I’m encountering a performance issue when querying large Parquet files in Databricks, particularly files exceeding 1 GB. Queries run extremely slowly, and at times they even time out. I’ve tried tuning the file sizes and the partitioning strategy, but the problem persists.
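For context, here is roughly what I tried so far. The paths and the `event_date` column below are placeholders standing in for my actual data; the idea was to rewrite the data partitioned on a frequently filtered column and cap the records per output file so the resulting Parquet files are neither tiny nor enormous:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder input path; my real data lives elsewhere.
df = spark.read.parquet("/mnt/raw/events")

(
    df.repartition("event_date")              # placeholder partition column
      .write
      .partitionBy("event_date")              # prune partitions on date filters
      .option("maxRecordsPerFile", 5_000_000) # indirectly cap file size
      .mode("overwrite")
      .parquet("/mnt/curated/events")
)
```

Even after this rewrite, queries filtering on the partition column are still much slower than I'd expect.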
Has anyone faced a similar issue, or does anyone have insights into optimizing performance for large Parquet files in Databricks?
Thanks in advance.