<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: XML Unmarshalling using JAXB from JavaRDD in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/xml-unmarshalling-using-jaxb-from-javardd/m-p/58543#M31197</link>
    <description>&lt;P&gt;I think this should work:&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;// Read the input as (path, stream) pairs; collectAsMap() brings them to the driver.&lt;BR /&gt;JavaPairRDD&amp;lt;String, PortableDataStream&amp;gt; jrdd = javaSparkContext.binaryFiles("&amp;lt;path_to_file&amp;gt;");&lt;BR /&gt;Map&amp;lt;String, PortableDataStream&amp;gt; mp = jrdd.collectAsMap();&lt;BR /&gt;// f is the local java.io.File to write to; try-with-resources flushes and closes the stream.&lt;BR /&gt;try (OutputStream os = new FileOutputStream(f)) {&lt;BR /&gt;    for (PortableDataStream pd : mp.values()) {&lt;BR /&gt;        os.write(pd.toArray());&lt;BR /&gt;    }&lt;BR /&gt;}&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;The file can then be supplied to the JAXB unmarshaller. I'm not sure if there is a better way.&lt;/P&gt;</description>
    <pubDate>Sat, 27 Jan 2024 18:08:52 GMT</pubDate>
    <dc:creator>ShankarReddy</dc:creator>
    <dc:date>2024-01-27T18:08:52Z</dc:date>
    <item>
      <title>XML Unmarshalling using JAXB from JavaRDD</title>
      <link>https://community.databricks.com/t5/data-engineering/xml-unmarshalling-using-jaxb-from-javardd/m-p/58540#M31196</link>
      <description>&lt;P&gt;&lt;SPAN&gt;I have a JavaRDD with complex nested XML content that I want to unmarshal using JAXB to get the data into Java objects. Can anyone please help with how I can achieve this?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Thanks&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 27 Jan 2024 17:04:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/xml-unmarshalling-using-jaxb-from-javardd/m-p/58540#M31196</guid>
      <dc:creator>ShankarReddy</dc:creator>
      <dc:date>2024-01-27T17:04:15Z</dc:date>
    </item>
    <item>
      <title>Re: XML Unmarshalling using JAXB from JavaRDD</title>
      <link>https://community.databricks.com/t5/data-engineering/xml-unmarshalling-using-jaxb-from-javardd/m-p/58543#M31197</link>
      <description>&lt;P&gt;I think this should work:&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;// Read the input as (path, stream) pairs; collectAsMap() brings them to the driver.&lt;BR /&gt;JavaPairRDD&amp;lt;String, PortableDataStream&amp;gt; jrdd = javaSparkContext.binaryFiles("&amp;lt;path_to_file&amp;gt;");&lt;BR /&gt;Map&amp;lt;String, PortableDataStream&amp;gt; mp = jrdd.collectAsMap();&lt;BR /&gt;// f is the local java.io.File to write to; try-with-resources flushes and closes the stream.&lt;BR /&gt;try (OutputStream os = new FileOutputStream(f)) {&lt;BR /&gt;    for (PortableDataStream pd : mp.values()) {&lt;BR /&gt;        os.write(pd.toArray());&lt;BR /&gt;    }&lt;BR /&gt;}&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;The file can then be supplied to the JAXB unmarshaller. I'm not sure if there is a better way.&lt;/P&gt;</description>
      <pubDate>Sat, 27 Jan 2024 18:08:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/xml-unmarshalling-using-jaxb-from-javardd/m-p/58543#M31197</guid>
      <dc:creator>ShankarReddy</dc:creator>
      <dc:date>2024-01-27T18:08:52Z</dc:date>
    </item>
  </channel>
</rss>

