cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Cached Views in MERGE INTO operation

deng_dev
New Contributor III

Hi everyone!

I want to use in-memory cached views in a merge into operation, but I am not entirely sure if the exactly saved in-memory view is used in this operation or not.

So, suppose I have a table named table_1 and a cached view named cached_view_1.
When I run explain select * from cached_view_1 from the output I see that in-memory view is used:

== Physical Plan ==
Scan In-memory table cached_view_1 [path#1415]
   +- InMemoryRelation [path#1415], StorageLevel(disk, memory, deserialized, 1 replicas)
....

However, when I run explain merge into table_1 tgt using cached_view_1 src ..., I don't see any information indicating that it's using the in-memory view:

== Physical Plan ==
Execute MergeIntoCommandEdge
   +- MergeIntoCommandEdge SubqueryAlias src, SubqueryAlias tgt, `catalog`.`db`.`table_1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, ....

 

1 REPLY 1

shan_chandra
Honored Contributor III
Honored Contributor III

@deng_dev - Are you using external metastore by any chance. From the physical plan, we could see the catalog`.`db`.`table_1` is not cached.  If it is glue catalog, then caching can be enabled based on the below configs in the article below

https://docs.databricks.com/en/archive/external-metastores/aws-glue-metastore.html#higher-latency-wi...

Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.