cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

Cached Views in MERGE INTO operation

deng_dev
New Contributor III

Hi everyone!

I want to use in-memory cached views in a merge into operation, but I am not entirely sure if the exactly saved in-memory view is used in this operation or not.

So, suppose I have a table named table_1 and a cached view named cached_view_1.
When I run explain select * from cached_view_1 from the output I see that in-memory view is used:

== Physical Plan ==
Scan In-memory table cached_view_1 [path#1415]
   +- InMemoryRelation [path#1415], StorageLevel(disk, memory, deserialized, 1 replicas)
....

However, when I run explain merge into table_1 tgt using cached_view_1 src ..., I don't see any information indicating that it's using the in-memory view:

== Physical Plan ==
Execute MergeIntoCommandEdge
   +- MergeIntoCommandEdge SubqueryAlias src, SubqueryAlias tgt, `catalog`.`db`.`table_1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, ....

 

1 REPLY 1

shan_chandra
Esteemed Contributor
Esteemed Contributor

@deng_dev - Are you using external metastore by any chance. From the physical plan, we could see the catalog`.`db`.`table_1` is not cached.  If it is glue catalog, then caching can be enabled based on the below configs in the article below

https://docs.databricks.com/en/archive/external-metastores/aws-glue-metastore.html#higher-latency-wi...