<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Databricks XML - Bypassing rootTag and rowTag in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/databricks-xml-bypassing-roottag-and-rowtag/m-p/59786#M6496</link>
    <description>&lt;P&gt;The current conversion of a DataFrame to XML needs to be improved.&lt;/P&gt;&lt;P&gt;My DataFrame schema is a fully nested schema based on structs, but when I write it as XML I run into the following issues:&lt;/P&gt;&lt;P&gt;1) I can't add elements to the root.&lt;/P&gt;&lt;P&gt;2) rootTag and rowTag are required.&lt;/P&gt;&lt;P&gt;In the end I remove the first level of the hierarchy (rowTag) using string methods or manually. The rowTag is already part of the DataFrame's nested schema, so requiring it separately doesn't make sense.&lt;/P&gt;</description>
    <pubDate>Fri, 09 Feb 2024 12:05:59 GMT</pubDate>
    <dc:creator>RobsonNLPT</dc:creator>
    <dc:date>2024-02-09T12:05:59Z</dc:date>
    <item>
      <title>Databricks XML - Bypassing rootTag and rowTag</title>
      <link>https://community.databricks.com/t5/get-started-discussions/databricks-xml-bypassing-roottag-and-rowtag/m-p/59786#M6496</link>
      <description>&lt;P&gt;The current conversion of a DataFrame to XML needs to be improved.&lt;/P&gt;&lt;P&gt;My DataFrame schema is a fully nested schema based on structs, but when I write it as XML I run into the following issues:&lt;/P&gt;&lt;P&gt;1) I can't add elements to the root.&lt;/P&gt;&lt;P&gt;2) rootTag and rowTag are required.&lt;/P&gt;&lt;P&gt;In the end I remove the first level of the hierarchy (rowTag) using string methods or manually. The rowTag is already part of the DataFrame's nested schema, so requiring it separately doesn't make sense.&lt;/P&gt;</description>
      <pubDate>Fri, 09 Feb 2024 12:05:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/databricks-xml-bypassing-roottag-and-rowtag/m-p/59786#M6496</guid>
      <dc:creator>RobsonNLPT</dc:creator>
      <dc:date>2024-02-09T12:05:59Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks XML - Bypassing rootTag and rowTag</title>
      <link>https://community.databricks.com/t5/get-started-discussions/databricks-xml-bypassing-roottag-and-rowtag/m-p/59789#M6498</link>
      <description>&lt;P&gt;Hi Kaniz. I will test your suggestions, but I think the documentation provided by Databricks / Spark should cover these topics in depth. I've seen lots of posts on the web about this topic.&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Fri, 09 Feb 2024 12:16:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/databricks-xml-bypassing-roottag-and-rowtag/m-p/59789#M6498</guid>
      <dc:creator>RobsonNLPT</dc:creator>
      <dc:date>2024-02-09T12:16:19Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks XML - Bypassing rootTag and rowTag</title>
      <link>https://community.databricks.com/t5/get-started-discussions/databricks-xml-bypassing-roottag-and-rowtag/m-p/59790#M6499</link>
      <description>&lt;P&gt;Hi Kaniz. I tested &lt;SPAN&gt;option("rowTag", "")&lt;/SPAN&gt; using the library com.databricks:spark-xml_2.12:0.17.0 and also the Databricks native format (runtime 14.3), but in both cases I got the error "&lt;SPAN&gt;requirement failed: 'rowTag' option should not be empty string&lt;/SPAN&gt;".&lt;/P&gt;</description>
      <pubDate>Fri, 09 Feb 2024 12:36:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/databricks-xml-bypassing-roottag-and-rowtag/m-p/59790#M6499</guid>
      <dc:creator>RobsonNLPT</dc:creator>
      <dc:date>2024-02-09T12:36:25Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks XML - Bypassing rootTag and rowTag</title>
      <link>https://community.databricks.com/t5/get-started-discussions/databricks-xml-bypassing-roottag-and-rowtag/m-p/59832#M6500</link>
      <description>&lt;P&gt;Here is one way to use the struct field name as rowTag:&lt;/P&gt;
&lt;DIV&gt;
&lt;DIV&gt;&lt;LI-CODE lang="java"&gt;import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// A single top-level struct column named "Record" wraps each row's fields.
val schema = new StructType().add("Record",
  new StructType().add("age", IntegerType).add("name", StringType))
val data = Seq(Row(Row(18, "John Doe")), Row(Row(19, "Mary Doe")))

val df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)
// Use the struct field's name as rowTag and write its children as the row content.
val rowTag = schema.fields.head.name
df.coalesce(1).select(s"$rowTag.*").write.mode("overwrite").option("rowTag", rowTag).xml("/tmp/xml_test")&lt;/LI-CODE&gt;
&lt;P&gt;If the XML file generated above is read back, it will have a flattened schema with two fields ('age' and 'name') instead of a single struct column.&lt;/P&gt;
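&lt;P&gt;A minimal sketch of that read-back, assuming the same notebook session (the predefined spark), the "/tmp/xml_test" output written above, and a runtime where the .xml(...) reader shortcut is available (otherwise .format("xml").load(...) may be needed). The names readBack and nested, and the re-nesting step, are illustrative and not part of the original answer:&lt;/P&gt;
&lt;LI-CODE lang="java"&gt;import org.apache.spark.sql.functions.{col, struct}

// Read the files back using the same rowTag that was used for writing.
val readBack = spark.read.option("rowTag", "Record").xml("/tmp/xml_test")
readBack.printSchema() // top-level columns 'age' and 'name', no 'Record' struct

// If the nested shape is needed again, wrap the columns in a struct (assumed workaround).
val nested = readBack.select(struct(col("age"), col("name")).as("Record"))
nested.printSchema()&lt;/LI-CODE&gt;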
&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Sat, 10 Feb 2024 06:15:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/databricks-xml-bypassing-roottag-and-rowtag/m-p/59832#M6500</guid>
      <dc:creator>sandip_a</dc:creator>
      <dc:date>2024-02-10T06:15:09Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks XML - Bypassing rootTag and rowTag</title>
      <link>https://community.databricks.com/t5/get-started-discussions/databricks-xml-bypassing-roottag-and-rowtag/m-p/59839#M6501</link>
      <description>&lt;P&gt;Hi. In this case rootTag is also required; otherwise it defaults to "ROWS".&lt;/P&gt;&lt;P&gt;I have elements at the root level (in bold) before the rows:&lt;/P&gt;&lt;P&gt;&amp;lt;?xml version="1.0" encoding="UTF-8" standalone="yes"?&amp;gt;&lt;BR /&gt;&amp;lt;root x="1"&amp;gt;&lt;BR /&gt;&amp;nbsp;&lt;STRONG&gt;&amp;lt;rat1&amp;gt;434343&amp;lt;/rat1&amp;gt;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;nbsp;&amp;lt;rat2&amp;gt;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;nbsp;&amp;lt;x&amp;gt;4&amp;lt;/x&amp;gt;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;nbsp;&amp;lt;y&amp;gt;6&amp;lt;/y&amp;gt;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;nbsp;&amp;lt;/rat2&amp;gt;&lt;/STRONG&gt;&lt;BR /&gt;&amp;nbsp;&amp;lt;rows&amp;gt;&lt;BR /&gt;&amp;nbsp; &amp;lt;row&amp;gt;&lt;BR /&gt;&amp;nbsp; &amp;nbsp;&amp;lt;a&amp;gt;5&amp;lt;/a&amp;gt;&lt;BR /&gt;&amp;nbsp; &amp;nbsp;&amp;lt;b&amp;gt;5&amp;lt;/b&amp;gt;&lt;BR /&gt;&amp;nbsp; &amp;lt;/row&amp;gt;&lt;BR /&gt;&amp;nbsp; &amp;lt;row&amp;gt;&lt;BR /&gt;&amp;nbsp; &amp;nbsp;&amp;lt;a&amp;gt;5&amp;lt;/a&amp;gt;&lt;BR /&gt;&amp;nbsp; &amp;nbsp;&amp;lt;b&amp;gt;5&amp;lt;/b&amp;gt;&lt;BR /&gt;&amp;nbsp; &amp;lt;/row&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;lt;/rows&amp;gt;&lt;BR /&gt;&amp;lt;/root&amp;gt;&lt;/P&gt;&lt;P&gt;Ideally I could bypass rootTag and rowTag, since my DataFrame already has the full nested structure. The behaviour should be the same as with the JSON libraries.&lt;/P&gt;</description>
      <pubDate>Sat, 10 Feb 2024 17:35:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/databricks-xml-bypassing-roottag-and-rowtag/m-p/59839#M6501</guid>
      <dc:creator>RobsonNLPT</dc:creator>
      <dc:date>2024-02-10T17:35:43Z</dc:date>
    </item>
  </channel>
</rss>

