Flat-file ingestion in the Bronze layer: to schematize or not to schematize?

Hi all, what is the general guideline for handling flat files (XML or JSON with several levels of nesting and an evolving schema) in the bronze layer?

Should I persist the file content as text in a single column of the Parquet file?

Or should I let Spark infer a schema and write a Parquet file with several columns representing the content of the XML/JSON file?
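
To make the two options concrete, here is a small Spark-free sketch in plain Python. It is only an illustration under assumptions: the payloads, file names, and the helpers `to_raw_row` and `infer_columns` are hypothetical, and `infer_columns` is a simplified stand-in for schema inference, not how Spark actually infers a schema.

```python
import json

# Two hypothetical JSON payloads from the same feed; the second adds a nested field.
batch_1 = '{"id": 1, "customer": {"name": "Ann"}}'
batch_2 = '{"id": 2, "customer": {"name": "Bob", "address": {"city": "Oslo"}}}'


def to_raw_row(payload, source_file):
    """Option 1: keep the payload untouched in a single text column."""
    return {"source_file": source_file, "raw": payload}


def infer_columns(obj, prefix=""):
    """Option 2 (simplified stand-in): derive flat column names from the content."""
    columns = set()
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, using dotted names for the leaves.
            columns |= infer_columns(value, prefix=name + ".")
        else:
            columns.add(name)
    return columns


# Option 1: every row has the same two columns, whatever the payload looks like.
raw_rows = [to_raw_row(batch_1, "a.json"), to_raw_row(batch_2, "b.json")]

# Option 2: the derived column set changes when the source schema evolves.
schema_1 = infer_columns(json.loads(batch_1))
schema_2 = infer_columns(json.loads(batch_2))
new_columns = schema_2 - schema_1

print(sorted(new_columns))
```

In this sketch the raw-text rows keep an identical layout across both batches and the original payload can still be parsed later, while the inferred layout gains a column in the second batch, which is the schema-evolution concern behind the question.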