<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Sub-Query behavior in sql statements in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/sub-query-behavior-in-sql-statements/m-p/87347#M37428</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/111456"&gt;@Harsha777&lt;/a&gt;&amp;nbsp;,&lt;BR /&gt;&lt;BR /&gt;What occurs is called column shadowing.&lt;BR /&gt;What happens is that the column names in main query and sub query&amp;nbsp; are identica and the databricks engine after not finding it in sub query searches in the main query.&lt;/P&gt;&lt;P&gt;The simplest way to avoid the issue is to add to the column the table alias like below:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;SELECT count(*) FROM `catalog`.`schema`.`t_table` AS main
WHERE main._col_check IN (
    SELECT DISTINCT sub._col_check FROM `catalog`.`schema`.`t_check_table` AS sub
)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 02 Sep 2024 19:02:06 GMT</pubDate>
    <dc:creator>filipniziol</dc:creator>
    <dc:date>2024-09-02T19:02:06Z</dc:date>
    <item>
      <title>Sub-Query behavior in sql statements</title>
      <link>https://community.databricks.com/t5/data-engineering/sub-query-behavior-in-sql-statements/m-p/87242#M37424</link>
      <description>&lt;P&gt;Hi Team,&lt;BR /&gt;&lt;BR /&gt;I have a query with below construct in my project&lt;BR /&gt;&lt;BR /&gt;SELECT count(*) FROM `catalog`.`schema`.`t_table`&lt;BR /&gt;WHERE _col_check IN (SELECT DISTINCT _col_check FROM `catalog`.`schema`.`t_check_table`)&lt;BR /&gt;&lt;BR /&gt;Actually, there is no column "_col_check" in the sub-query table "t_check_table". (but present in main table&amp;nbsp;"t_table"&lt;BR /&gt;The expectation is that the query will fail but interestingly the query is giving the total count of the main table "t_table".&lt;BR /&gt;When the sub-query is executed individually, it fails with column not found error.&lt;BR /&gt;&lt;BR /&gt;If I use a column which is not present in the main table then the query fails.&lt;/P&gt;&lt;P&gt;I assume it is using the values of the main table when the column is present in the main table and raising error when the column is not present in both the tables.&lt;BR /&gt;&lt;BR /&gt;I am not sure if I am over-looking anything here or is it an issue, could someone comment of any such observations?&lt;/P&gt;</description>
      <pubDate>Mon, 02 Sep 2024 14:38:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/sub-query-behavior-in-sql-statements/m-p/87242#M37424</guid>
      <dc:creator>Harsha777</dc:creator>
      <dc:date>2024-09-02T14:38:30Z</dc:date>
    </item>
    <item>
      <title>Re: Sub-Query behavior in sql statements</title>
      <link>https://community.databricks.com/t5/data-engineering/sub-query-behavior-in-sql-statements/m-p/87347#M37428</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/111456"&gt;@Harsha777&lt;/a&gt;&amp;nbsp;,&lt;BR /&gt;&lt;BR /&gt;What occurs is called column shadowing.&lt;BR /&gt;What happens is that the column names in main query and sub query&amp;nbsp; are identica and the databricks engine after not finding it in sub query searches in the main query.&lt;/P&gt;&lt;P&gt;The simplest way to avoid the issue is to add to the column the table alias like below:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;SELECT count(*) FROM `catalog`.`schema`.`t_table` AS main
WHERE main._col_check IN (
    SELECT DISTINCT sub._col_check FROM `catalog`.`schema`.`t_check_table` AS sub
)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 02 Sep 2024 19:02:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/sub-query-behavior-in-sql-statements/m-p/87347#M37428</guid>
      <dc:creator>filipniziol</dc:creator>
      <dc:date>2024-09-02T19:02:06Z</dc:date>
    </item>
    <item>
      <title>Re: Sub-Query behavior in sql statements</title>
      <link>https://community.databricks.com/t5/data-engineering/sub-query-behavior-in-sql-statements/m-p/87798#M37445</link>
      <description>&lt;P&gt;thanks&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/117376"&gt;@filipniziol&lt;/a&gt;&amp;nbsp;for sharing the information, that helps!!&lt;/P&gt;&lt;P&gt;Just curious to know if it is the databricks sql behavior or in general sql behavior with all databases?&lt;/P&gt;</description>
      <pubDate>Tue, 03 Sep 2024 08:56:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/sub-query-behavior-in-sql-statements/m-p/87798#M37445</guid>
      <dc:creator>Harsha777</dc:creator>
      <dc:date>2024-09-03T08:56:46Z</dc:date>
    </item>
    <item>
      <title>Re: Sub-Query behavior in sql statements</title>
      <link>https://community.databricks.com/t5/data-engineering/sub-query-behavior-in-sql-statements/m-p/87799#M37446</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/111456"&gt;@Harsha777&lt;/a&gt;&amp;nbsp;,&lt;BR /&gt;It's general, not related to databricks only.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 03 Sep 2024 09:05:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/sub-query-behavior-in-sql-statements/m-p/87799#M37446</guid>
      <dc:creator>filipniziol</dc:creator>
      <dc:date>2024-09-03T09:05:19Z</dc:date>
    </item>
  </channel>
</rss>

