Autoloader Solution for Binary files

Veeru245 — Tue, 30 May 2023 09:26:33 GMT

We have implemented a solution for ingesting binary files (.ZIP) into Delta Lake. We are currently using the solution below within our pipeline.

This solution works fine for a small set of files (25). When we process a large set of files (650), it takes more time than expected.

We would like to know if there is a better solution to speed up the process.

A few things to note about the XML file: it is a nested XML file with around 600 columns.
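
For readers who want a concrete picture of the per-file work described above (open a ZIP held as binary content, then flatten a nested XML document into columns), here is a minimal sketch. It is not the code from the pipeline in question. The function names `flatten_xml` and `rows_from_zip` are hypothetical, the dotted column naming is an assumption, and it uses only the Python standard library. In a Spark job, the `bytes` argument would typically come from the `content` column that the `binaryFile` source produces.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter


def flatten_xml(element, path=None):
    """Flatten a nested XML element into a {dotted_path: value} dict.

    Hypothetical helper: attributes become "<path>.@<name>" keys, and
    repeated sibling tags get an index suffix such as "item[0]".
    """
    path = path or element.tag
    row = {f"{path}.@{name}": value for name, value in element.attrib.items()}

    children = list(element)
    if not children:
        # Leaf element: store its text under the accumulated path.
        row[path] = (element.text or "").strip()
        return row

    totals = Counter(child.tag for child in children)
    seen = Counter()
    for child in children:
        child_path = f"{path}.{child.tag}"
        if totals[child.tag] > 1:
            # Disambiguate repeated tags so no column is overwritten.
            child_path += f"[{seen[child.tag]}]"
            seen[child.tag] += 1
        row.update(flatten_xml(child, child_path))
    return row


def rows_from_zip(content):
    """Yield (member_name, flat_row) for each .xml member of a ZIP given as bytes."""
    with zipfile.ZipFile(io.BytesIO(content)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".xml"):
                root = ET.fromstring(archive.read(name))
                yield name, flatten_xml(root)
```

Because everything here runs in memory on one file at a time, the same function can be timed on a single archive outside the pipeline to see whether unzipping or XML parsing dominates the cost.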
