<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: What types of files does Auto Loader support for streaming ingestion? I see good support for CSV and JSON; how can I ingest files like XML, Avro, Parquet, etc.? Would XML rely on spark-xml? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/what-types-of-files-does-autoloader-support-for-streaming/m-p/20322#M13706</link>
    <description>&lt;P&gt;Please raise a feature request for XML support in Auto Loader via the &lt;A href="https://docs.databricks.com/resources/ideas.html" alt="https://docs.databricks.com/resources/ideas.html" target="_blank"&gt;ideas&lt;/A&gt; portal.&lt;/P&gt;&lt;P&gt;As a workaround, you could read the files with &lt;A href="https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html#wholeTextFiles-java.lang.String-int-" alt="https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html#wholeTextFiles-java.lang.String-int-" target="_blank"&gt;wholeTextFiles&lt;/A&gt; (which loads the data into a pair RDD with one record per input file) and parse each record with &lt;A href="https://github.com/databricks/spark-xml/blob/2ba6736a385bd5dc8e18d64aedb1d82891d227dc/README.md#parsing-nested-xml" alt="https://github.com/databricks/spark-xml/blob/2ba6736a385bd5dc8e18d64aedb1d82891d227dc/README.md#parsing-nested-xml" target="_blank"&gt;from_xml&lt;/A&gt; from the spark-xml package.&lt;/P&gt;</description>
    <pubDate>Fri, 25 Jun 2021 00:26:26 GMT</pubDate>
    <dc:creator>sajith_appukutt</dc:creator>
    <dc:date>2021-06-25T00:26:26Z</dc:date>
    <item>
      <title>What types of files does Auto Loader support for streaming ingestion? I see good support for CSV and JSON; how can I ingest files like XML, Avro, Parquet, etc.? Would XML rely on spark-xml?</title>
      <link>https://community.databricks.com/t5/data-engineering/what-types-of-files-does-autoloader-support-for-streaming/m-p/20321#M13705</link>
      <description>&lt;P&gt;What types of files does Auto Loader support for streaming ingestion? I see good support for CSV and JSON; how can I ingest files like XML, Avro, Parquet, etc.? Would XML rely on spark-xml?&lt;/P&gt;</description>
      <pubDate>Thu, 24 Jun 2021 21:07:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-types-of-files-does-autoloader-support-for-streaming/m-p/20321#M13705</guid>
      <dc:creator>User16783853501</dc:creator>
      <dc:date>2021-06-24T21:07:43Z</dc:date>
    </item>
    <item>
      <title>Re: What types of files does Auto Loader support for streaming ingestion? I see good support for CSV and JSON; how can I ingest files like XML, Avro, Parquet, etc.? Would XML rely on spark-xml?</title>
      <link>https://community.databricks.com/t5/data-engineering/what-types-of-files-does-autoloader-support-for-streaming/m-p/20322#M13706</link>
      <description>&lt;P&gt;Please raise a feature request for XML support in Auto Loader via the &lt;A href="https://docs.databricks.com/resources/ideas.html" alt="https://docs.databricks.com/resources/ideas.html" target="_blank"&gt;ideas&lt;/A&gt; portal.&lt;/P&gt;&lt;P&gt;As a workaround, you could read the files with &lt;A href="https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html#wholeTextFiles-java.lang.String-int-" alt="https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html#wholeTextFiles-java.lang.String-int-" target="_blank"&gt;wholeTextFiles&lt;/A&gt; (which loads the data into a pair RDD with one record per input file) and parse each record with &lt;A href="https://github.com/databricks/spark-xml/blob/2ba6736a385bd5dc8e18d64aedb1d82891d227dc/README.md#parsing-nested-xml" alt="https://github.com/databricks/spark-xml/blob/2ba6736a385bd5dc8e18d64aedb1d82891d227dc/README.md#parsing-nested-xml" target="_blank"&gt;from_xml&lt;/A&gt; from the spark-xml package.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jun 2021 00:26:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-types-of-files-does-autoloader-support-for-streaming/m-p/20322#M13706</guid>
      <dc:creator>sajith_appukutt</dc:creator>
      <dc:date>2021-06-25T00:26:26Z</dc:date>
    </item>
  </channel>
</rss>

