Hi, we understand that the client performs a two-phase query planning process. In our case, the table has around 145,000 Parquet files, and we've observed that the first request becomes a significant bottleneck: the response body is large (655 MB) and takes ~10 seconds to return. We also tried gzip-compressing the response, but it still takes ~10 seconds before the second (actual query) request is sent to the server. The majority of those files are ultimately filtered out in the second request.
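For reference, this is roughly how we measured that first (planning/metadata) request. It is only a minimal Python sketch with the requests library; the endpoint path, token, and empty request body are placeholders standing in for our actual setup, not the real values:

import time

import requests

# Hypothetical endpoint and token, only to illustrate the measurement;
# the real URL and auth details are omitted.
PLANNING_URL = "https://<server>/shares/<share>/schemas/<schema>/tables/<table>/query"
TOKEN = "<bearer-token>"

start = time.monotonic()
resp = requests.post(
    PLANNING_URL,
    headers={
        "Authorization": f"Bearer {TOKEN}",
        # gzip is accepted on the wire, but after decompression the client
        # still has to download and parse a ~655 MB body (~145,000 file entries).
        "Accept-Encoding": "gzip",
    },
    json={},  # no predicates attached at this stage
    timeout=300,
)
elapsed = time.monotonic() - start

print(f"status={resp.status_code} bytes={len(resp.content):,} elapsed={elapsed:.1f}s")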
Is this response cached or reused by the client (e.g., Databricks Runtime), or is this metadata fetch expected to occur for every query?
For simple queries like:
SELECT COUNT(1) FROM table WHERE part_col1 = 'val1'
Even though such a query ultimately touches only a small number of files after partition pruning, the protocol requires returning all 145,000 file entries in the initial response just for the client to begin planning.
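To make the asymmetry concrete, here is a rough sketch of the client-side pruning we assume happens after the first response arrives. The newline-delimited JSON layout and the "file" / "partitionValues" field names are our assumptions about the wire format, not something we have confirmed:

import json

def prune_files(response_lines, part_col1_value):
    # Every one of the ~145,000 file entries is transferred and parsed,
    # only to be dropped here if its partition value does not match.
    kept = []
    for line in response_lines:
        entry = json.loads(line)
        file_info = entry.get("file")
        if file_info is None:
            continue  # protocol/metadata lines rather than file entries
        if file_info.get("partitionValues", {}).get("part_col1") == part_col1_value:
            kept.append(file_info)
    return kept

# e.g. prune_files(resp.iter_lines(decode_unicode=True), "val1") would keep
# only a handful of the 145,000 entries that were shipped over the wire.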
Is this the expected performance behavior in such scenarios, or are there best practices to mitigate the overhead?
We'd appreciate your insights and the design rationale here.