cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

SELECT from VIEW to CREATE a table or view

TamD
New Contributor III

Hi; I'm new to Databricks, so apologies if this is a dumb question.

I have a notebook with SQL cells that are selecting data from various Delta tables into temporary views.  Then I have a query that joins up the data from these temporary views.

I'd like to use that query to populate a materialized view, however, the CREATE statement throws an error complaining that the temp views cannot be found.  This is course rubbish, because it finds the views just fine when it's just a SELECT.  The error only happens when the CREATE AS is appended to the front.

So,

"SELECT foo, bar FROM viewa" --  works fine. 

"CREATE OR REPLACE MATERIALIZED VIEW matview AS SELECT foo, bar FROM viewa" -- says viewa cannot be found.

I just ran this with CREATE OR REPLACE TABLE instead, and that worked.  So I guess this is a limitation of materialized views?  If so,  1) can this limitation be noted in the mat view documentation, and b) can the error message be corrected to explain that?

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions

TamD
New Contributor III

Thanks, FelixIvy.  Just to clarify, the reason you can't use temporary views to load a materialized view is because materialized views (like regular views) must be created using a single query that is saved as part of the view definition.  So the solution in this case is to rewrite the temporary views as ctes or subqueries, so all of the code is in one single query -- as if it were a regular view.

Or, if the code is complex enough, it may be cleaner, more visible, and easier to maintain and optimise to leave the code as-is, in a notebook:  load the data into a table, and then schedule the notebook to refresh the data in the table periodically.

 

View solution in original post

6 REPLIES 6

Akshay_Petkar
New Contributor III

Hi @TamD 

You cannot create a materialized view on a temporary view because temporary views are session-based. Once the session expires, the temporary view will be dropped. To resolve this, you need to create either a regular view or a table, and then you can successfully create a materialized view.
The best practice is to create a materialized view directly on a table.

TamD
New Contributor III

Hi Akshay

Yes, the temporary view exists for the duration of the session.  The table in my example above is loaded in the same session as the temp views and that works fine.  The materialized view is also loaded in the same session as the temp views, only the mat view creation "cannot find" the temp views.  

It makes sense that a VIEW would not work created against temporary views, however the data in a materialized view is persisted and does not change until the notebook is run again to refresh the data:  once the data is loaded, the materialized view has no dependency on the temp views used to load it until the notebook is run again (and the temp views are recreated).  Therefore, I don't see a reason why a table load would work and a mat view load would not.

Akshay_Petkar
New Contributor III

Hi @TamD 

A materialized view looks for a permanent table. In your case, since the temporary view used to load the data is deleted after the session expires, the materialized view cannot find it when the session ends. For example, if you create a materialized view based on a temporary view and then refresh it daily, the refresh will fail if the temporary view is no longer available. This is why a materialized view requires a permanent table or view.

TamD
New Contributor III

I see.  The issue is that although a mat view has to be scheduled and loaded (like a table), it still must be defined as if it were a regular view.  A table can be loaded and refreshed via a notebook, but a view (even a materialised one) has to have a single query as its underlying definition (you can't use a notebook).  So I can either keep my notebook as-is and schedule it to load/refresh a table, or I can re-write it into a bunch of ctes and use that to define a mat view.

 

FelixIvy
New Contributor

Temporary views are session-based and might not be used for materialized views. Use a permanent table or view.

TamD
New Contributor III

Thanks, FelixIvy.  Just to clarify, the reason you can't use temporary views to load a materialized view is because materialized views (like regular views) must be created using a single query that is saved as part of the view definition.  So the solution in this case is to rewrite the temporary views as ctes or subqueries, so all of the code is in one single query -- as if it were a regular view.

Or, if the code is complex enough, it may be cleaner, more visible, and easier to maintain and optimise to leave the code as-is, in a notebook:  load the data into a table, and then schedule the notebook to refresh the data in the table periodically.

 

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group