cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Materilized view quite slower than table and lots of time on "Optimizing query & pruning files"

Roxio
New Contributor II

I have a query that calls different materialized views, anyway most of the time of the query is spent in "Optimizing query & pruning files" vs the execution.

The difference is like 2-3 secs for the optimization and 300-400ms for the execution

Similar if I run a query against tables, with the same data structure and data, but in the case of tables the difference is 700-800ms for optimization and 70-80ms for the execution

These are results when data is returned by the cache.

My question is:

- Why such a big difference between materialized views and table, when in theory it should be much similar results

- Is there a way that I can try to reduce the time used for "Optimizing query & pruning files"?

 
 
1 ACCEPTED SOLUTION

Accepted Solutions

Brahmareddy
Valued Contributor III

Hi Roxio, How are you doing today?

The difference in query times between materialized views and tables likely comes from the complexity of the views, as they often involve more steps in the background. To reduce the optimization time, you can try simplifying the materialized view, ensuring good data partitioning, and using techniques like Z-ordering or clustering. Also, make sure your caching is efficient and check the query execution plan for any bottlenecks. These steps should help speed things up!

Reagtds,

Brahma

View solution in original post

1 REPLY 1

Brahmareddy
Valued Contributor III

Hi Roxio, How are you doing today?

The difference in query times between materialized views and tables likely comes from the complexity of the views, as they often involve more steps in the background. To reduce the optimization time, you can try simplifying the materialized view, ensuring good data partitioning, and using techniques like Z-ordering or clustering. Also, make sure your caching is efficient and check the query execution plan for any bottlenecks. These steps should help speed things up!

Reagtds,

Brahma

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group