<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Cached Views in MERGE INTO operation in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/cached-views-in-merge-into-operation/m-p/67182#M33300</link>
    <description>&lt;P&gt;Hi everyone!&lt;/P&gt;&lt;P&gt;I want to use in-memory cached views in a merge into operation, but I am not entirely sure if the exactly saved in-memory view is used in this operation or not.&lt;/P&gt;&lt;P&gt;So, suppose I have a table named table_1 and a cached view named cached_view_1.&lt;BR /&gt;When I run&amp;nbsp;&lt;STRONG&gt;&lt;FONT color="#808080"&gt;&lt;EM&gt;explain select * from&amp;nbsp;cached_view_1&amp;nbsp;&lt;/EM&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT color="#000000"&gt;from the output I see that in-memory view is used:&lt;BR /&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;== Physical Plan ==
Scan In-memory table cached_view_1 [path#1415]
   +- InMemoryRelation [path#1415], StorageLevel(disk, memory, deserialized, 1 replicas)
....&lt;/LI-CODE&gt;&lt;P&gt;&lt;FONT color="#000000"&gt;However, when I run explain &lt;EM&gt;&lt;STRONG&gt;&lt;FONT color="#808080"&gt;merge into table_1 tgt using cached_view_1 src&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/EM&gt; ..., I don't see any information indicating that it's using the in-memory view:&lt;/FONT&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;== Physical Plan ==
Execute MergeIntoCommandEdge
   +- MergeIntoCommandEdge SubqueryAlias src, SubqueryAlias tgt, `catalog`.`db`.`table_1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, ....&lt;/LI-CODE&gt;&lt;P&gt;&lt;FONT color="#000000"&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 24 Apr 2024 12:18:03 GMT</pubDate>
    <dc:creator>deng_dev</dc:creator>
    <dc:date>2024-04-24T12:18:03Z</dc:date>
    <item>
      <title>Cached Views in MERGE INTO operation</title>
      <link>https://community.databricks.com/t5/data-engineering/cached-views-in-merge-into-operation/m-p/67182#M33300</link>
      <description>&lt;P&gt;Hi everyone!&lt;/P&gt;&lt;P&gt;I want to use in-memory cached views in a merge into operation, but I am not entirely sure if the exactly saved in-memory view is used in this operation or not.&lt;/P&gt;&lt;P&gt;So, suppose I have a table named table_1 and a cached view named cached_view_1.&lt;BR /&gt;When I run&amp;nbsp;&lt;STRONG&gt;&lt;FONT color="#808080"&gt;&lt;EM&gt;explain select * from&amp;nbsp;cached_view_1&amp;nbsp;&lt;/EM&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT color="#000000"&gt;from the output I see that in-memory view is used:&lt;BR /&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;== Physical Plan ==
Scan In-memory table cached_view_1 [path#1415]
   +- InMemoryRelation [path#1415], StorageLevel(disk, memory, deserialized, 1 replicas)
....&lt;/LI-CODE&gt;&lt;P&gt;&lt;FONT color="#000000"&gt;However, when I run explain &lt;EM&gt;&lt;STRONG&gt;&lt;FONT color="#808080"&gt;merge into table_1 tgt using cached_view_1 src&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/EM&gt; ..., I don't see any information indicating that it's using the in-memory view:&lt;/FONT&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;== Physical Plan ==
Execute MergeIntoCommandEdge
   +- MergeIntoCommandEdge SubqueryAlias src, SubqueryAlias tgt, `catalog`.`db`.`table_1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, ....&lt;/LI-CODE&gt;&lt;P&gt;&lt;FONT color="#000000"&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Apr 2024 12:18:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/cached-views-in-merge-into-operation/m-p/67182#M33300</guid>
      <dc:creator>deng_dev</dc:creator>
      <dc:date>2024-04-24T12:18:03Z</dc:date>
    </item>
    <item>
      <title>Re: Cached Views in MERGE INTO operation</title>
      <link>https://community.databricks.com/t5/data-engineering/cached-views-in-merge-into-operation/m-p/67211#M33304</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/95043"&gt;@deng_dev&lt;/a&gt;&amp;nbsp;- Are you using external metastore by any chance. From the physical plan, we could see the catalog`.`db`.`table_1` is not cached.&amp;nbsp; If it is glue catalog, then caching can be enabled based on the below configs in the article below&lt;/P&gt;
&lt;P&gt;&lt;A href="https://docs.databricks.com/en/archive/external-metastores/aws-glue-metastore.html#higher-latency-with-glue-catalog-than-databricks-hive-metastore" target="_blank"&gt;https://docs.databricks.com/en/archive/external-metastores/aws-glue-metastore.html#higher-latency-with-glue-catalog-than-databricks-hive-metastore&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Apr 2024 19:19:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/cached-views-in-merge-into-operation/m-p/67211#M33304</guid>
      <dc:creator>shan_chandra</dc:creator>
      <dc:date>2024-04-24T19:19:04Z</dc:date>
    </item>
  </channel>
</rss>

