<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Does Databricks get cached result for a subquery? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/does-databricks-get-cached-result-for-a-subquery/m-p/35890#M25985</link>
    <description>&lt;P&gt;If I run the query &lt;EM&gt;"SELECT fare_amount FROM nyctaxi.trips WHERE fare_amount &amp;gt; 1.5"&lt;/EM&gt;, the query results will be cached for 24 hours.&lt;/P&gt;&lt;P&gt;I then compose a second query that uses the previous query as a subquery: &lt;EM&gt;"SELECT * FROM nyctaxi.trips WHERE fare_amount IN (SELECT fare_amount FROM nyctaxi.trips WHERE fare_amount &amp;gt; 1.5)"&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;Will Databricks use the cached result for the subquery to speed up execution of the second query?&lt;/P&gt;&lt;P&gt;Note that the ask is not to cache the subquery result, but to reuse the result that was already cached when the subquery was run on its own earlier.&lt;/P&gt;</description>
    <pubDate>Wed, 28 Jun 2023 21:27:46 GMT</pubDate>
    <dc:creator>whleeman</dc:creator>
    <dc:date>2023-06-28T21:27:46Z</dc:date>
    <item>
      <title>Does Databricks get cached result for a subquery?</title>
      <link>https://community.databricks.com/t5/data-engineering/does-databricks-get-cached-result-for-a-subquery/m-p/35890#M25985</link>
      <description>&lt;P&gt;If I run the query &lt;EM&gt;"SELECT fare_amount FROM nyctaxi.trips WHERE fare_amount &amp;gt; 1.5"&lt;/EM&gt;, the query results will be cached for 24 hours.&lt;/P&gt;&lt;P&gt;I then compose a second query that uses the previous query as a subquery: &lt;EM&gt;"SELECT * FROM nyctaxi.trips WHERE fare_amount IN (SELECT fare_amount FROM nyctaxi.trips WHERE fare_amount &amp;gt; 1.5)"&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;Will Databricks use the cached result for the subquery to speed up execution of the second query?&lt;/P&gt;&lt;P&gt;Note that the ask is not to cache the subquery result, but to reuse the result that was already cached when the subquery was run on its own earlier.&lt;/P&gt;</description>
      <pubDate>Wed, 28 Jun 2023 21:27:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/does-databricks-get-cached-result-for-a-subquery/m-p/35890#M25985</guid>
      <dc:creator>whleeman</dc:creator>
      <dc:date>2023-06-28T21:27:46Z</dc:date>
    </item>
  </channel>
</rss>

