I have a mounted external directory backed by an S3 bucket, with multiple subdirectories containing call log files in JSON format. The files are irregular and complex: when I try to use spark.read.json or spark.sql (SELECT *), I get the UNABLE_TO_INFER_SCHEMA error. The files are too complex to build a schema for by hand, and there are thousands of them. What is the best approach for creating a DataFrame from this data?
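
For reference, this is roughly what I'm running. It's a minimal sketch of my two attempts; the mount point `/mnt/call-logs` is a placeholder for my actual mount path:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder for the real mount point of the S3 bucket.
base_path = "/mnt/call-logs"

# Attempt 1: read all JSON files across the subdirectories and let Spark infer the schema.
df = spark.read.option("recursiveFileLookup", "true").json(base_path)

# Attempt 2: query the files directly with SQL.
df2 = spark.sql("SELECT * FROM json.`/mnt/call-logs/*`")

# Both attempts fail with the UNABLE_TO_INFER_SCHEMA error
# ("Unable to infer schema for JSON. It must be specified manually.").
```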