<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Databricks-Connect shows different partitions than Databricks for the same delta table in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26716#M18733</link>
    <description>&lt;P&gt;db-connect version 9.1.9&lt;/P&gt;&lt;P&gt;cluster db-runtime 9.1 LTS&lt;/P&gt;&lt;P&gt;Python 3.8.10&lt;/P&gt;</description>
    <pubDate>Wed, 02 Mar 2022 10:53:51 GMT</pubDate>
    <dc:creator>s_plank</dc:creator>
    <dc:date>2022-03-02T10:53:51Z</dc:date>
    <item>
      <title>Databricks-Connect shows different partitions than Databricks for the same delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26714#M18731</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here is a small code snippet:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('example_app').getOrCreate()
spark.sql('SHOW PARTITIONS database.table').show()&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The output inside the Databricks notebook:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;+-------------+-------+--------------------+
|projectNumber|plantId|                name|
+-------------+-------+--------------------+
|         xxxx|     P0|***.yyyy............|
|         yyyy|     P2|***.yyyy............|
...&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;When I run the same code as above in Visual Studio Code, connected to the same cluster through Databricks-Connect, I receive this output:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;+---------+
|partition|
+---------+
|     xxxx|
|     yyyy|
...&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;This output has the wrong column name and shows only the first partition column (projectNumber).&lt;/P&gt;&lt;P&gt;This is strange. Everything is identical, so the output should be the same.&lt;/P&gt;&lt;P&gt;DESCRIBE TABLE returns the correct partition columns in both Databricks-Connect and Databricks:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;spark.sql('describe table database.table').show()
+--------------+-------------+-------+
|      col_name|    data_type|comment|
+--------------+-------------+-------+
|# Partitioning|             |       |
|        Part 0|projectNumber|       |
|        Part 1|      plantId|       |
|        Part 2|         name|       |
+--------------+-------------+-------+&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The table is a Delta table located in Azure Blob Storage.&lt;/P&gt;&lt;P&gt;I tried refreshing the table, but it made no difference.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I found a difference in the SQL tab of the Spark UI.&lt;/P&gt;&lt;P&gt;There are 3 queries for the db-connect run and 4 for the Databricks run.&lt;/P&gt;&lt;P&gt;The physical execution plan is identical, but the second "&lt;I&gt;Execute ShowPartitionsDeltaCommand&lt;/I&gt;" query is missing in the db-connect run.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Queries for db-connect:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Execute ShowPartitionsDeltaCommand | Output: [projectNumber, plantId, name]&lt;/LI&gt;&lt;LI&gt;bigger execution plan (identical in both cases) | Output: [projectNumber, plantId, name]&lt;/LI&gt;&lt;LI&gt;LocalTableScan | Output: [&lt;B&gt;partition&lt;/B&gt;]&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Queries for Databricks:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Execute ShowPartitionsDeltaCommand | Output: [projectNumber, plantId, name]&lt;/LI&gt;&lt;LI&gt;bigger execution plan (identical in both cases) | Output: [projectNumber, plantId, name]&lt;/LI&gt;&lt;LI&gt;Execute ShowPartitionsDeltaCommand | Output: [projectNumber, plantId, name]&lt;/LI&gt;&lt;LI&gt;LocalTableScan | Output: [projectNumber, plantId, name]&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I don't know why or how, but two of the three partition columns get lost in the db-connect query.&lt;/P&gt;&lt;P&gt;Any ideas?&lt;/P&gt;</description>
      <pubDate>Tue, 01 Mar 2022 12:13:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26714#M18731</guid>
      <dc:creator>s_plank</dc:creator>
      <dc:date>2022-03-01T12:13:20Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks-Connect shows different partitions than Databricks for the same delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26715#M18732</link>
      <description>&lt;P&gt;The docs say the SQL API is supported for Delta Lake, so I would expect both to return the same results.&lt;/P&gt;&lt;P&gt;But clearly that is not the case.&lt;/P&gt;&lt;P&gt;Which version of db-connect are you using?&lt;/P&gt;</description>
      <pubDate>Wed, 02 Mar 2022 08:38:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26715#M18732</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2022-03-02T08:38:33Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks-Connect shows different partitions than Databricks for the same delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26716#M18733</link>
      <description>&lt;P&gt;db-connect version 9.1.9&lt;/P&gt;&lt;P&gt;cluster db-runtime 9.1 LTS&lt;/P&gt;&lt;P&gt;Python 3.8.10&lt;/P&gt;</description>
      <pubDate>Wed, 02 Mar 2022 10:53:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26716#M18733</guid>
      <dc:creator>s_plank</dc:creator>
      <dc:date>2022-03-02T10:53:51Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks-Connect shows different partitions than Databricks for the same delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26717#M18734</link>
      <description>&lt;P&gt;Hi @Stefan Plank,&lt;/P&gt;&lt;P&gt;There seems to be an issue with Databricks Connect and SQL queries. Could you please try the SQL connector instead?&lt;/P&gt;&lt;P&gt;More info: &lt;A href="https://docs.databricks.com/dev-tools/python-sql-connector.html" target="test_blank"&gt;https://docs.databricks.com/dev-tools/python-sql-connector.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The SQL connector is usually recommended when you develop in Python and run SQL queries.&lt;/P&gt;&lt;P&gt;More info: &lt;A href="https://docs.databricks.com/dev-tools/databricks-connect.html#overview" target="test_blank"&gt;https://docs.databricks.com/dev-tools/databricks-connect.html#overview&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Let me know if this works for you.&lt;/P&gt;</description>
      <pubDate>Tue, 15 Mar 2022 10:08:07 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26717#M18734</guid>
      <dc:creator>User16763506477</dc:creator>
      <dc:date>2022-03-15T10:08:07Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks-Connect shows different partitions than Databricks for the same delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26718#M18735</link>
      <description>&lt;P&gt;Hi @Stefan Plank,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Just checking if you still need help. Did @Gaurav Rupnar's recommendation help you resolve your issue?&lt;/P&gt;</description>
      <pubDate>Tue, 05 Apr 2022 23:20:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26718#M18735</guid>
      <dc:creator>jose_gonzalez</dc:creator>
      <dc:date>2022-04-05T23:20:31Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks-Connect shows different partitions than Databricks for the same delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26719#M18736</link>
      <description>&lt;P&gt;Hi @Jose Gonzalez,&lt;/P&gt;&lt;P&gt;Yes, the SQL connector works fine. Thank you!&lt;/P&gt;</description>
      <pubDate>Wed, 06 Apr 2022 06:16:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26719#M18736</guid>
      <dc:creator>s_plank</dc:creator>
      <dc:date>2022-04-06T06:16:48Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks-Connect shows different partitions than Databricks for the same delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26720#M18737</link>
      <description>&lt;P&gt;Hi @Stefan Plank,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you for your reply. I will mark the response as "best".&lt;/P&gt;</description>
      <pubDate>Mon, 11 Apr 2022 18:29:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/databricks-connect-shows-different-partitions-than-databricks/m-p/26720#M18737</guid>
      <dc:creator>jose_gonzalez</dc:creator>
      <dc:date>2022-04-11T18:29:06Z</dc:date>
    </item>
  </channel>
</rss>

