<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic NamedStruct fails in the 'IN' query in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/namedstruct-fails-in-the-in-query/m-p/69257#M33879</link>
    <description>&lt;P&gt;I've posted the same question on stackoverflow (&lt;A href="https://stackoverflow.com/questions/78495050/namedstruct-fails-in-the-in-query" target="_self"&gt;link&lt;/A&gt;) as well. I will post any solution I get there.&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I was trying to understand using many columns in the&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;IN&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;query and came across this statement.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;SELECT (1, 2) IN (SELECT c1, c2 FROM VALUES(1, 2), (3, 4) AS (c1, c2));&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Its output is as follows&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="100%"&gt;&lt;STRONG&gt;(named_struct('1', 1, '2', 2) IN (listquery()))&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="100%"&gt;true&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;SPAN&gt;I tried using&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;named_struct&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;directly as follows&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;SELECT NAMED_STRUCT('1', 1, '2', 2) IN (SELECT c1, c2 FROM VALUES(1, 2), (3, 4) AS (c1, c2));&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;It gives the following error&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Cannot resolve "(named_struct(1, 1, 2, 2) IN (listquery()))" due to data type mismatch:
The number of columns in the left hand side of an IN subquery does not match the number of columns in the output of subquery.
Left hand side columns(length: 1): ["named_struct(1, 1, 2, 2)"],
right hand side columns(length: 2): ["c1", "c2"].;
line 1 pos 36;
'Project [unresolvedalias(named_struct(1, 1, 2, 2) IN (list#1529887 []), None)]
:  +- Project [c1#1529888, c2#1529889]
:     +- SubqueryAlias AS
:        +- LocalRelation [c1#1529888, c2#1529889]
+- OneRowRelation&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Why does the number of columns becomes an issue now? The above query works fine even when I replace&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;named_struct&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;with&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;struct&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;Any help is appreciated.&lt;/P&gt;</description>
    <pubDate>Fri, 17 May 2024 11:30:53 GMT</pubDate>
    <dc:creator>Dhruv-22</dc:creator>
    <dc:date>2024-05-17T11:30:53Z</dc:date>
    <item>
      <title>NamedStruct fails in the 'IN' query</title>
      <link>https://community.databricks.com/t5/data-engineering/namedstruct-fails-in-the-in-query/m-p/69257#M33879</link>
      <description>&lt;P&gt;I've posted the same question on stackoverflow (&lt;A href="https://stackoverflow.com/questions/78495050/namedstruct-fails-in-the-in-query" target="_self"&gt;link&lt;/A&gt;) as well. I will post any solution I get there.&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I was trying to understand using many columns in the&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;IN&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;query and came across this statement.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;SELECT (1, 2) IN (SELECT c1, c2 FROM VALUES(1, 2), (3, 4) AS (c1, c2));&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Its output is as follows&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="100%"&gt;&lt;STRONG&gt;(named_struct('1', 1, '2', 2) IN (listquery()))&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="100%"&gt;true&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;SPAN&gt;I tried using&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;named_struct&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;directly as follows&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;SELECT NAMED_STRUCT('1', 1, '2', 2) IN (SELECT c1, c2 FROM VALUES(1, 2), (3, 4) AS (c1, c2));&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;It gives the following error&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Cannot resolve "(named_struct(1, 1, 2, 2) IN (listquery()))" due to data type mismatch:
The number of columns in the left hand side of an IN subquery does not match the number of columns in the output of subquery.
Left hand side columns(length: 1): ["named_struct(1, 1, 2, 2)"],
right hand side columns(length: 2): ["c1", "c2"].;
line 1 pos 36;
'Project [unresolvedalias(named_struct(1, 1, 2, 2) IN (list#1529887 []), None)]
:  +- Project [c1#1529888, c2#1529889]
:     +- SubqueryAlias AS
:        +- LocalRelation [c1#1529888, c2#1529889]
+- OneRowRelation&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Why does the number of columns becomes an issue now? The above query works fine even when I replace&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;named_struct&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;with&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;struct&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;Any help is appreciated.&lt;/P&gt;</description>
      <pubDate>Fri, 17 May 2024 11:30:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/namedstruct-fails-in-the-in-query/m-p/69257#M33879</guid>
      <dc:creator>Dhruv-22</dc:creator>
      <dc:date>2024-05-17T11:30:53Z</dc:date>
    </item>
  </channel>
</rss>

