Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Reading different file structures for json files in blob stores

turagittech
New Contributor III

Hi All,

We are planning to store JSON files with several different structures in blob storage and read them into Databricks. I am questioning whether we should have a container for each structure, or whether the various tools in Databricks can successfully read the different types from a single container. I have my doubts, since blob storage is a flat namespace: there is no real way to separate the files, regardless of how we make the paths look to us humans.

I can filter the files in a Python script, but that seems to rule them out of things like Auto Loader. Or am I missing something about how to use Auto Loader in this scenario?

How have others approached this?

5 REPLIES

holly
Databricks Employee

If they're all JSON but have different structures, you can use the VARIANT type:

https://docs.databricks.com/aws/en/sql/language-manual/data-types/variant-type

There are a few examples in this blog, too: https://www.databricks.com/blog/introducing-open-variant-data-type-delta-lake-and-apache-spark
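
As a minimal sketch (assuming Databricks Runtime 15.3+ for VARIANT support; the path and column names below are hypothetical), each JSON document can be landed in a single VARIANT column and queried with the colon path syntax:

# Read mixed-structure JSON files into one VARIANT column.
# The landing path below is hypothetical.
df = (
    spark.read.format("json")
    .option("singleVariantColumn", "payload")  # parse each document into one VARIANT column
    .load("abfss://landing@mystorage.dfs.core.windows.net/raw_json/")
)

# Extract typed fields downstream with the colon path syntax.
df.createOrReplaceTempView("raw_events")
spark.sql("SELECT payload:customer_id::string AS customer_id FROM raw_events").show()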

turagittech
New Contributor III

This doesn't quite hit the mark, as each JSON file represents a different table of data. I think multiple structures in one blob container confuse a lot of tools, which forces file-by-file loading, and that is going to be the least efficient approach.

holly
Databricks Employee

Variant should work in this scenario too. There have also been performance improvements with variant, so much more of the metadata now carries statistics for efficient processing.

You also don't have to go file by file. You can use something like Auto Loader, which checkpoints everything it reads, or you can use "*" in a location to match everything under that path.
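
A minimal sketch of that pattern (all locations and names below are hypothetical, and landing everything as VARIANT via singleVariantColumn is my assumption about what you want):

# Auto Loader over everything under the landing path.
stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/Volumes/main/raw/_schemas/events")  # hypothetical
    .option("singleVariantColumn", "payload")  # land each document as one VARIANT column
    .load("abfss://landing@mystorage.dfs.core.windows.net/*/")  # "*" matches every subpath
)

(stream.writeStream
    .option("checkpointLocation", "/Volumes/main/raw/_checkpoints/events")  # hypothetical
    .trigger(availableNow=True)  # process the backlog, then stop
    .toTable("main.raw.events"))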

turagittech
New Contributor III

I'll look at this once we go to production with the source files. I have split the logs by file type to simplify things for now, but I'll go back and test the mixed-file case again in the test space.

sandeepmankikar
Contributor

Organize files by schema into subfolders (e.g., /schema_type_a/, /schema_type_b/) in the same container. Avoid putting all JSON types in one folder.
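
As a minimal sketch of that layout (the folder, schema-location, checkpoint, and table names below are hypothetical), you can then run one Auto Loader stream per schema subfolder:

# One Auto Loader stream per schema subfolder, each with its own
# schema location, checkpoint, and target table (all names hypothetical).
for schema_type in ["schema_type_a", "schema_type_b"]:
    (spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", f"/Volumes/main/raw/_schemas/{schema_type}")
        .load(f"abfss://landing@mystorage.dfs.core.windows.net/{schema_type}/")
        .writeStream
        .option("checkpointLocation", f"/Volumes/main/raw/_checkpoints/{schema_type}")
        .trigger(availableNow=True)
        .toTable(f"main.raw.{schema_type}"))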