How do I efficiently read image data for a deep learning model?

User16788317466
Databricks Employee

How do I efficiently read image data for a deep learning model?

User16788317466
Databricks Employee

Use Uber’s Petastorm (https://github.com/uber/petastorm) to read the image data and write it out as a Parquet dataset. The Petastorm APIs can then load that dataset as a TF Dataset, etc.
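As a rough sketch of that workflow, assuming Spark 3+, Petastorm, and TensorFlow 2 are installed on the cluster: the helper names (`to_file_url`, `load_images`, `make_converter`, `first_batch`) and any directory paths are made up for illustration, and reading only `*.jpg` files is an assumption about the data. The Spark `binaryFile` source and Petastorm's `make_spark_converter` / `make_tf_dataset` are the real entry points.

```python
import os


def to_file_url(path):
    """Return the file:// URL form that Petastorm expects for a local path."""
    return "file://" + os.path.abspath(path)


def load_images(spark, image_dir):
    """Read every JPEG under image_dir into a DataFrame of raw bytes."""
    return (
        spark.read.format("binaryFile")
        .option("pathGlobFilter", "*.jpg")        # assumption: images are JPEGs
        .option("recursiveFileLookup", "true")    # descend into subdirectories
        .load(image_dir)
        .select("path", "content")                # keep file path and raw bytes
    )


def make_converter(spark, df, cache_dir):
    """Materialize df as Parquet under cache_dir and return a Petastorm converter."""
    # Imported lazily so the helpers above stay usable without Petastorm installed.
    from petastorm.spark import SparkDatasetConverter, make_spark_converter

    # Petastorm writes the intermediate Parquet files under this directory.
    spark.conf.set(
        SparkDatasetConverter.PARENT_CACHE_DIR_URL_CONF, to_file_url(cache_dir)
    )
    return make_spark_converter(df)


def first_batch(converter, batch_size=32):
    """Open the cached Parquet data as a tf.data.Dataset and return one batch."""
    with converter.make_tf_dataset(batch_size=batch_size) as dataset:
        # Each element is a namedtuple with one field per DataFrame column
        # (here: path and content).
        return next(iter(dataset))
```

In a notebook this could be driven with something like `converter = make_converter(spark, load_images(spark, image_dir), cache_dir)`, followed by `first_batch(converter)` to check the data, and `converter.delete()` once training is done to remove the cached Parquet files. Decoding and resizing the `content` bytes would normally happen in a `dataset.map(...)` step before the data reaches the model.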

Joseph_B
Databricks Employee

Our documentation provides nice examples of preparing image data for training and inference.

Training: See docs for AWS, Azure, GCP

Inference: See reference solution for AWS, Azure, GCP