<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Google BigQuery Foreign Catalog - Incorrect Data Format in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/google-bigquery-foreign-catalog-incorrect-data-format/m-p/109017#M43205</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/98258"&gt;@RobsonNLPT&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;This is a limitation: the conversion you are seeing is expected behavior given the data type mappings that Lakehouse Federation currently supports. As a result, the JSON format you see in Google BigQuery results is not preserved when the data is accessed through the foreign catalog in Databricks. The BigQuery types &lt;CODE&gt;array&lt;/CODE&gt;, &lt;CODE&gt;geography&lt;/CODE&gt;, &lt;CODE&gt;interval&lt;/CODE&gt;, &lt;CODE&gt;json&lt;/CODE&gt;, &lt;CODE&gt;string&lt;/CODE&gt;, and &lt;CODE&gt;struct&lt;/CODE&gt; are all mapped to &lt;CODE&gt;VarcharType&lt;/CODE&gt; in Spark. I will check whether there is a feature request to change this.&lt;/P&gt;</description>
    <pubDate>Wed, 05 Feb 2025 20:39:23 GMT</pubDate>
    <dc:creator>Alberto_Umana</dc:creator>
    <dc:date>2025-02-05T20:39:23Z</dc:date>
    <item>
      <title>Google BigQuery Foreign Catalog - Incorrect Data Format</title>
      <link>https://community.databricks.com/t5/data-engineering/google-bigquery-foreign-catalog-incorrect-data-format/m-p/109013#M43203</link>
      <description>&lt;P&gt;I've tested a foreign catalog connected to a Google BigQuery project.&lt;/P&gt;&lt;P&gt;The connection worked, and I was able to see my datasets and tables.&lt;/P&gt;&lt;P&gt;The problem: for columns with regular data types the data format is perfect, but for columns of type record and repeated (arrays) I was expecting to see the JSON format that I see in Google BigQuery results.&lt;/P&gt;&lt;P&gt;The data comes back as JSON, but with a completely different schema that doesn't make any sense. The foreign catalog maps the record and repeated data types to &lt;CODE&gt;varchar(65535)&lt;/CODE&gt;.&lt;/P&gt;&lt;P&gt;Federation is a great feature, but these incorrect data conversions are a disaster.&lt;/P&gt;&lt;P&gt;Any help?&lt;/P&gt;</description>
      <pubDate>Wed, 05 Feb 2025 19:17:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/google-bigquery-foreign-catalog-incorrect-data-format/m-p/109013#M43203</guid>
      <dc:creator>RobsonNLPT</dc:creator>
      <dc:date>2025-02-05T19:17:52Z</dc:date>
    </item>
    <item>
      <title>Re: Google BigQuery Foreign Catalog - Incorrect Data Format</title>
      <link>https://community.databricks.com/t5/data-engineering/google-bigquery-foreign-catalog-incorrect-data-format/m-p/109017#M43205</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/98258"&gt;@RobsonNLPT&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;This is a limitation: the conversion you are seeing is expected behavior given the data type mappings that Lakehouse Federation currently supports. As a result, the JSON format you see in Google BigQuery results is not preserved when the data is accessed through the foreign catalog in Databricks. The BigQuery types &lt;CODE&gt;array&lt;/CODE&gt;, &lt;CODE&gt;geography&lt;/CODE&gt;, &lt;CODE&gt;interval&lt;/CODE&gt;, &lt;CODE&gt;json&lt;/CODE&gt;, &lt;CODE&gt;string&lt;/CODE&gt;, and &lt;CODE&gt;struct&lt;/CODE&gt; are all mapped to &lt;CODE&gt;VarcharType&lt;/CODE&gt; in Spark. I will check whether there is a feature request to change this.&lt;/P&gt;</description>
      <pubDate>Wed, 05 Feb 2025 20:39:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/google-bigquery-foreign-catalog-incorrect-data-format/m-p/109017#M43205</guid>
      <dc:creator>Alberto_Umana</dc:creator>
      <dc:date>2025-02-05T20:39:23Z</dc:date>
    </item>
    <item>
      <title>Re: Google BigQuery Foreign Catalog - Incorrect Data Format</title>
      <link>https://community.databricks.com/t5/data-engineering/google-bigquery-foreign-catalog-incorrect-data-format/m-p/109020#M43207</link>
      <description>&lt;P&gt;Hi Alberto,&lt;/P&gt;&lt;P&gt;Converting the values to strings is one thing.&lt;/P&gt;&lt;P&gt;Delivering completely wrong JSON is another. Federation should at least deliver the original JSON as a string.&lt;/P&gt;&lt;P&gt;This is more than a limitation. You can't release a feature with such unacceptable issues. Data is an asset.&lt;/P&gt;</description>
      <pubDate>Wed, 05 Feb 2025 20:47:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/google-bigquery-foreign-catalog-incorrect-data-format/m-p/109020#M43207</guid>
      <dc:creator>RobsonNLPT</dc:creator>
      <dc:date>2025-02-05T20:47:50Z</dc:date>
    </item>
    <item>
      <title>Re: Google BigQuery Foreign Catalog - Incorrect Data Format</title>
      <link>https://community.databricks.com/t5/data-engineering/google-bigquery-foreign-catalog-incorrect-data-format/m-p/109164#M43232</link>
      <description>&lt;P&gt;Hi Alberto,&lt;/P&gt;&lt;P&gt;I've found a solution: using the Spark connector with the credentials.&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;CODE&gt;spark.read.format("bigquery")&lt;/CODE&gt;&lt;/DIV&gt;&lt;DIV&gt;This returns the correct data in the format I expect.&lt;/DIV&gt;&lt;DIV&gt;I highly recommend fixing the federation engine so that it properly supports BigQuery as a foreign catalog.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Best regards&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Thu, 06 Feb 2025 11:27:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/google-bigquery-foreign-catalog-incorrect-data-format/m-p/109164#M43232</guid>
      <dc:creator>RobsonNLPT</dc:creator>
      <dc:date>2025-02-06T11:27:05Z</dc:date>
    </item>
  </channel>
</rss>

