<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Issue with quotes in struct type columns when using ODBC in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32812#M23937</link>
    <description>&lt;P&gt;Hey Bilal,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for your reply. This indeed works and is what I'm currently using as a work around (see one of the previous posts).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Derk&lt;/P&gt;</description>
    <pubDate>Mon, 03 Jan 2022 18:46:34 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2022-01-03T18:46:34Z</dc:date>
    <item>
      <title>Issue with quotes in struct type columns when using ODBC</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32802#M23927</link>
      <description>&lt;P&gt;I'm trying to connect to Databricks using pyodbc and I'm running into an issue with struct columns. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;As far as I understand, struct columns and array columns are not supported by pyodbc, but they are converted to JSON. However, when there are nested columns that contain quotes (") , the JSON that is returned is invalid, i.e. the quotes are not escaped.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Dataset I used:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql import Row
&amp;nbsp;
data = [
  Row(nested=Row(nested_column="nested with a \" character")),
]
df = spark.sparkContext.parallelize(data).toDF()
&amp;nbsp;
(
  df.write
  .format("delta")
  .mode("overwrite")
  .option("path", "/delta/quotes_issue/")
  .saveAsTable("default.quotes_issue")
)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Querying the data afterwards through pyodbc (using docs &lt;A href="https://docs.databricks.com/dev-tools/pyodbc.html?_ga=2.124165066.1358822627.1639993561-717396139.1632140283" alt="https://docs.databricks.com/dev-tools/pyodbc.html?_ga=2.124165066.1358822627.1639993561-717396139.1632140283" target="_blank"&gt;here&lt;/A&gt;)&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;conn = pyodbc.connect("DSN=Databricks_Cluster", autocommit=True)
&amp;nbsp;
cursor = conn.cursor()
cursor.execute("SELECT * FROM default.quotes_issue")
rows = cursor.fetchall()
for r in rows:
    print(r)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;This gives me the following invalid JSON output.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;'{"nested_column":"nested with a " character"}'&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I've tried with both the Databricks Compute cluster endpoint, as well as the newer SQL Endpoint. Also tried the &lt;A href="https://docs.databricks.com/dev-tools/python-sql-connector.html" alt="https://docs.databricks.com/dev-tools/python-sql-connector.html" target="_blank"&gt;Databricks SQL Connection for Python&lt;/A&gt;, but they all have the same problem.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Is there a way to configure Databricks, or the Simba driver somehow to escape quotes correctly?&lt;/P&gt;</description>
      <pubDate>Mon, 20 Dec 2021 14:57:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32802#M23927</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-12-20T14:57:10Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with quotes in struct type columns when using ODBC</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32804#M23929</link>
      <description>&lt;P&gt;Hi @Kaniz Fatma​&amp;nbsp;, &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for the reply. I'm still curious to the answer for the question above, but I did manage to find a work around.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;U&gt;Work Around&lt;/U&gt;&lt;/P&gt;&lt;P&gt;Using the &lt;A href="https://spark.apache.org/docs/latest/api/sql/#to_json" alt="https://spark.apache.org/docs/latest/api/sql/#to_json" target="_blank"&gt;to_json&lt;/A&gt; method, you can let Spark convert all the complex columns into json, instead of relying on the SQL endpoint / Simba Driver to do this for you. If you want to completely automate the use of to_json, the &lt;A href="https://spark.apache.org/docs/latest/sql-ref-syntax-aux-describe-table.html" alt="https://spark.apache.org/docs/latest/sql-ref-syntax-aux-describe-table.html" target="_blank"&gt;describe table&lt;/A&gt; statement can help you find out which datatypes your columns have.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 22 Dec 2021 08:39:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32804#M23929</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-12-22T08:39:27Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with quotes in struct type columns when using ODBC</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32806#M23931</link>
      <description>&lt;P&gt;Hi @Kaniz Fatma​&amp;nbsp;,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My question wasn't related to connecting Databricks to SQL Server, but on an error I'm getting while connecting to Databricks using pyodbc.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Derk&lt;/P&gt;</description>
      <pubDate>Wed, 22 Dec 2021 13:03:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32806#M23931</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-12-22T13:03:29Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with quotes in struct type columns when using ODBC</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32808#M23933</link>
      <description>&lt;P&gt;Hi @Kaniz Fatma​&amp;nbsp;, thanks for the reply. I think I misunderstood. So the suggestion is to install a number of package and update pyodbc on the Databricks cluster you are using? Will give it a try.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 22 Dec 2021 14:12:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32808#M23933</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-12-22T14:12:32Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with quotes in struct type columns when using ODBC</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32810#M23935</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;A href="https://community.databricks.com/s/profile/0053f000000tvpbAAA" alt="https://community.databricks.com/s/profile/0053f000000tvpbAAA" target="_blank"&gt;@Kaniz Fatma&lt;/A&gt;&amp;nbsp;(Databricks)​&amp;nbsp;. Have tried adding above script as a startup script to the cluster configuration I'm using within databricks, but issue remains.&lt;/P&gt;</description>
      <pubDate>Thu, 23 Dec 2021 10:48:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32810#M23935</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-12-23T10:48:26Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with quotes in struct type columns when using ODBC</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32811#M23936</link>
      <description>&lt;P&gt;@Derk Crezee​&amp;nbsp;- I learned something today. Apparently ODBC does not convert to JSON. There is no defined spec on how to return complex types, in fact that was added only in SQL 2016. That's exactly what you are running into!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;End of history lesson &lt;span class="lia-unicode-emoji" title=":grinning_face_with_smiling_eyes:"&gt;😄&lt;/span&gt; Here's a fix:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;SELECT to_json(nested)
FROM default.quotes_issue&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;This will return:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;{"nested_column":"nested with a \" character"}	&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;which is valid JSON.&lt;/P&gt;</description>
      <pubDate>Tue, 28 Dec 2021 00:21:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32811#M23936</guid>
      <dc:creator>BilalAslamDbrx</dc:creator>
      <dc:date>2021-12-28T00:21:43Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with quotes in struct type columns when using ODBC</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32812#M23937</link>
      <description>&lt;P&gt;Hey Bilal,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for your reply. This indeed works and is what I'm currently using as a work around (see one of the previous posts).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Derk&lt;/P&gt;</description>
      <pubDate>Mon, 03 Jan 2022 18:46:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32812#M23937</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-01-03T18:46:34Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with quotes in struct type columns when using ODBC</title>
      <link>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32813#M23938</link>
      <description>&lt;P&gt;@Derk Crezee​&amp;nbsp; sorry I missed that you already had a fix for this.&lt;/P&gt;</description>
      <pubDate>Tue, 04 Jan 2022 10:53:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/issue-with-quotes-in-struct-type-columns-when-using-odbc/m-p/32813#M23938</guid>
      <dc:creator>BilalAslamDbrx</dc:creator>
      <dc:date>2022-01-04T10:53:48Z</dc:date>
    </item>
  </channel>
</rss>

