Super slow SQL queries on an HC cluster

brickster_2018 — Fri, 25 Jun 2021 17:37:03 GMT

I have a high concurrency cluster with multiple users running queries at the same time. However, the queries are running very slowly. I checked the logs and see that most of the time is spent on the Spark driver. On the Spark UI, I do not see any slowness.

Re: Super slow SQL queries on an HC cluster

brickster_2018 — Fri, 25 Jun 2021 17:40:29 GMT

It's possible that connectivity to the Hive metastore is causing the delay, particularly when there is a high degree of concurrency and contention for metastore access. Interactive clusters in DBR are configured to use up to 5 Hive clients (spark.databricks.hive.metastore.client.pool.size). So if more than 5 concurrently running queries are accessing the metastore for a long time, the remaining queries have to wait for a free client, and that shows up as slowness.
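
To get a feel for why the pool size matters, here is a rough back-of-the-envelope sketch. It is only a simplified model, not how DBR actually schedules metastore calls: it assumes every query makes one metastore call of the same length and that at most pool_size calls run at once. The function name and the numbers (20 queries, 2 seconds per call) are made up for illustration.

```python
import math


def estimated_metastore_wait(concurrent_queries: int, pool_size: int,
                             seconds_per_call: float) -> float:
    """Rough time until the last query finishes its metastore call.

    Hypothetical model: equal-length calls, at most `pool_size` in flight,
    so the queries are served in ceil(queries / pool_size) waves.
    """
    if concurrent_queries < 0 or pool_size < 1:
        raise ValueError("need concurrent_queries >= 0 and pool_size >= 1")
    waves = math.ceil(concurrent_queries / pool_size)
    return waves * seconds_per_call


# Compare the default pool size from the thread (5) with the suggested 32.
for pool_size in (5, 32):
    print(pool_size, estimated_metastore_wait(20, pool_size, 2.0))
```

Under these assumptions, 20 queries need four waves with 5 clients but only one wave with 32, so the last query waits about four times longer with the default pool. Real workloads will not be this regular, so treat it as an illustration of the contention, not a prediction.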

The easy fix to try is to increase spark.databricks.hive.metastore.client.pool.size. Try raising it to 32 and see if there is an improvement.
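
As a sketch, assuming the setting is applied through the cluster's Spark config field (one key and value per line, separated by a space) and the cluster is restarted afterwards, the entry would look like the line below. Whether the value can also be changed on a running cluster is not covered in this thread.

```
spark.databricks.hive.metastore.client.pool.size 32
```

The value 32 is just the starting point suggested above; if the driver logs still show time spent waiting on the metastore, the bottleneck may be elsewhere.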
