<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to export a struct field of Big Query to Parquet/DeltaTable with all struct fields? in Warehousing &amp; Analytics</title>
    <link>https://community.databricks.com/t5/warehousing-analytics/how-to-export-a-struct-field-of-big-query-to-parquet-deltatable/m-p/23804#M560</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am hoping someone can help with a question about BigQuery and Parquet in Databricks.&lt;/P&gt;
&lt;P&gt;We exported a field named EVENT_PARAMS from BigQuery to a Parquet table in Databricks.&lt;/P&gt;
&lt;P&gt;In BigQuery, this column is an ARRAY of STRUCTs with the following composition:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;  ARRAY&amp;lt;STRUCT&amp;lt;key STRING, value STRUCT&amp;lt;string_value STRING, int_value bigint, float_value float, double_value float&amp;gt;&amp;gt;&amp;gt;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;In Parquet, the data arrived in a "raw" format that does not carry the names of the STRUCT's sub-fields:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;  {"v":[{"v":{"f":[{"v":"firebase_conversion"},{"v":{"f":[{"v":null},{"v":"1"},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"item_list_name"},{"v":{"f":[{"v":"lista-premios"},{"v":null},{"v":null},{"v":null}]}}]}}]}&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;As you can see, the result contains only the field values, not the field names, which makes it very difficult to manipulate in PySpark or Spark SQL.&lt;/P&gt;
&lt;P&gt;Is there any way to translate this raw format so that it is as faithful as possible to the original schema definition?&lt;/P&gt;
&lt;P&gt;Ideally, the result would let us reference the fields directly in Python, PySpark, or SQL, and would look similar to this example:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;{"v":[{{"key":"firebase_conversion"},{"value":{"f":[{"string_value":null},{"int_value":"1"},{"float_value":null},{"double_value":null}]}}},{{"key":"item_list_name"},{"value":{"f":[{"string_value":"lista_premios"},{"int_value":null},{"float_value":null},{"double_value":null}]}}}]}
  &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Thanks for any help you can give me.&lt;/P&gt;
&lt;P&gt;Best regards,&lt;/P&gt;
&lt;P&gt;Sergio Coutinho&lt;/P&gt;</description>
    <pubDate>Thu, 20 Mar 2025 17:06:37 GMT</pubDate>
    <dc:creator>Coutinho</dc:creator>
    <dc:date>2025-03-20T17:06:37Z</dc:date>
    <item>
      <title>How to export a struct field of Big Query to Parquet/DeltaTable with all struct fields?</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/how-to-export-a-struct-field-of-big-query-to-parquet-deltatable/m-p/23804#M560</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am hoping someone can help with a question about BigQuery and Parquet in Databricks.&lt;/P&gt;
&lt;P&gt;We exported a field named EVENT_PARAMS from BigQuery to a Parquet table in Databricks.&lt;/P&gt;
&lt;P&gt;In BigQuery, this column is an ARRAY of STRUCTs with the following composition:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;  ARRAY&amp;lt;STRUCT&amp;lt;key STRING, value STRUCT&amp;lt;string_value STRING, int_value bigint, float_value float, double_value float&amp;gt;&amp;gt;&amp;gt;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;In Parquet, the data arrived in a "raw" format that does not carry the names of the STRUCT's sub-fields:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;  {"v":[{"v":{"f":[{"v":"firebase_conversion"},{"v":{"f":[{"v":null},{"v":"1"},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"item_list_name"},{"v":{"f":[{"v":"lista-premios"},{"v":null},{"v":null},{"v":null}]}}]}}]}&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;As you can see, the result contains only the field values, not the field names, which makes it very difficult to manipulate in PySpark or Spark SQL.&lt;/P&gt;
&lt;P&gt;Is there any way to translate this raw format so that it is as faithful as possible to the original schema definition?&lt;/P&gt;
&lt;P&gt;Ideally, the result would let us reference the fields directly in Python, PySpark, or SQL, and would look similar to this example:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;{"v":[{{"key":"firebase_conversion"},{"value":{"f":[{"string_value":null},{"int_value":"1"},{"float_value":null},{"double_value":null}]}}},{{"key":"item_list_name"},{"value":{"f":[{"string_value":"lista_premios"},{"int_value":null},{"float_value":null},{"double_value":null}]}}}]}
  &lt;/CODE&gt;&lt;/PRE&gt;
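&lt;P&gt;To make the goal concrete, here is a minimal sketch in plain Python of the kind of translation we have in mind. It assumes the raw text follows the row encoding used by the BigQuery REST API, where "f" holds a record's fields in schema order and "v" holds a value, so names can be re-attached by position. The helper names (decode_value, decode_struct, decode_event_params), the schema constants, and the type labels are hypothetical and only for illustration.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;
```python
import json

# Schema of EVENT_PARAMS as described in this post, written as ordered
# (name, type) pairs. A list of pairs stands for a nested STRUCT.
# The type labels are assumptions chosen for this sketch.
VALUE_FIELDS = [
    ("string_value", "STRING"),
    ("int_value", "INTEGER"),
    ("float_value", "FLOAT"),
    ("double_value", "FLOAT"),
]
PARAM_FIELDS = [("key", "STRING"), ("value", VALUE_FIELDS)]

# Scalar converters: the raw export carries numbers as strings ("1").
CASTS = {"STRING": str, "INTEGER": int, "FLOAT": float}


def decode_value(raw, field_type):
    """Decode the content of one raw "v" entry according to its schema type."""
    if raw is None:
        return None
    if isinstance(field_type, list):  # nested STRUCT
        return decode_struct(raw, field_type)
    return CASTS[field_type](raw)


def decode_struct(raw, fields):
    """Re-attach field names to a raw {"f": [...]} record by position."""
    cells = raw["f"]
    if len(cells) != len(fields):
        raise ValueError("schema and record have different field counts")
    return {
        name: decode_value(cell["v"], field_type)
        for (name, field_type), cell in zip(fields, cells)
    }


def decode_event_params(raw_json):
    """Decode the whole ARRAY of STRUCT column value from its raw JSON text."""
    raw = json.loads(raw_json)
    if raw["v"] is None:
        return None
    return [decode_struct(item["v"], PARAM_FIELDS) for item in raw["v"]]


# The raw value shown earlier in this post.
RAW = '{"v":[{"v":{"f":[{"v":"firebase_conversion"},{"v":{"f":[{"v":null},{"v":"1"},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"item_list_name"},{"v":{"f":[{"v":"lista-premios"},{"v":null},{"v":null},{"v":null}]}}]}}]}'

decoded = decode_event_params(RAW)
print(json.dumps(decoded, indent=2))
```
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;In Spark, a function like this could presumably be wrapped in a UDF whose return type is a matching ArrayType of StructType, but we have not tried that, and it depends on the positional assumption above holding for every row.&lt;/P&gt;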
&lt;P&gt;Thanks for any help you can give me.&lt;/P&gt;
&lt;P&gt;Best regards,&lt;/P&gt;
&lt;P&gt;Sergio Coutinho&lt;/P&gt;</description>
      <pubDate>Thu, 20 Mar 2025 17:06:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/how-to-export-a-struct-field-of-big-query-to-parquet-deltatable/m-p/23804#M560</guid>
      <dc:creator>Coutinho</dc:creator>
      <dc:date>2025-03-20T17:06:37Z</dc:date>
    </item>
  </channel>
</rss>

