What types of files does autoloader support for streaming ingestion ? I see good support for CSV and JSON, how can I ingest files like XML, avro, parquet etc ? would XML rely on Spark-XML ?
What types of files does autoloader support for streaming ingestion ? I see good support for CSV and JSON, how can I ingest files like XML, avro, parquet etc ? would XML rely on Spark-XML ?
Please raise a feature request via ideas portal for XML support in autoloader
As a workaround, you could look at reading this with wholeTextFiles (which loads the data into a PairRDD with one record per input file) and parsing it with from_xml from the spark-xml package