Data Engineering

Delta tables and YOLO computer vision tasks

Andrewcon
New Contributor

 

Hi all,

I would really appreciate it if someone could help me out. I feel it’s both a data engineering and an ML question.

One thing we use at work is YOLO for object detection. I’ve managed to run YOLO by loading data from blob storage, but I’ve seen that the best way to do deep learning tasks in Databricks is to train your ML models on Delta Live Tables.

I currently have my training dataset as a Delta table, and I was wondering if anyone has managed to train computer vision models on Delta tables.

I’ve read the documentation and have seen repos such as petastorm that try to implement training on Delta tables, but I can’t for the life of me understand how to actually run YOLO this way, especially since YOLO uses YAML for config.

Thanks in advance for your help! 😇

1 REPLY

Kaniz
Community Manager

Hi @Andrewcon, training computer vision models on Delta Live Tables in Databricks is an interesting challenge. Let’s break it down:

  1. Delta Live Tables:

  2. Training Computer Vision Models:

    • While YOLO (You Only Look Once) is a powerful object detection algorithm, integrating it with Delta Live Tables requires some additional steps.
    • Here’s a high-level approach to train YOLO using your Delta table:
  3. Steps:

    • Data Preparation:

    • Model Configuration:

      • YOLO uses a yaml configuration file to define model architecture, hyperparameters, and other settings.
      • Adapt your YOLO configuration to work with your Delta table. You might need to modify the data input pipeline so that it reads from the Delta table instead of blob storage.
    • Training:

      • Set up a Databricks cluster with appropriate resources for training.
      • Write custom code (likely in Python) that:
        • Reads data from the Delta table.
        • Parses the yaml configuration.
        • Initializes the YOLO model.
        • Trains the model using your data.
    • Evaluation and Deployment:

      • After training, evaluate the model’s performance using validation data.
      • Once satisfied, deploy the model for inference.
  4. Challenges:

    • YOLO’s yaml configuration: Adapting it to work with Delta tables might involve custom code to load data dynamically.
    • Ensuring efficient data access: Delta Live Tables provides optimizations, but you’ll need to handle data loading efficiently.
  5. Resources:

Remember that this integration might require some experimentation and custom development. Good luck, and feel free to ask if you need further assistance! 😊
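
The training steps listed above (read from the Delta table, prepare the YAML config, initialize YOLO, train) can be sketched as follows. This is only a minimal sketch under several assumptions that are not stated in the thread: the YOLO implementation is taken to be the Ultralytics package, which reads images and label files from disk and takes a dataset YAML; the Delta table is assumed to have the hypothetical columns `filename`, `content` (image bytes), `width`, `height`, and `boxes` (a list of `class_id, x_min, y_min, x_max, y_max` in pixels); and `/local_disk0/yolo_data` is a hypothetical scratch directory on the driver. The idea is to export the rows into the on-disk layout YOLO expects rather than to stream from Delta directly.

```python
from pathlib import Path


def to_yolo_lines(boxes, img_w, img_h):
    """Convert pixel boxes to YOLO label lines:
    class x_center y_center width height, all normalized to [0, 1]."""
    lines = []
    for class_id, x_min, y_min, x_max, y_max in boxes:
        x_c = (x_min + x_max) / 2 / img_w
        y_c = (y_min + y_max) / 2 / img_h
        w = (x_max - x_min) / img_w
        h = (y_max - y_min) / img_h
        lines.append(f"{int(class_id)} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
    return lines


def export_split(rows, root, split):
    """Write one image file and one label file per row under
    root/images/<split> and root/labels/<split>. Returns the row count.
    The column names used here are hypothetical."""
    img_dir = Path(root) / "images" / split
    lbl_dir = Path(root) / "labels" / split
    img_dir.mkdir(parents=True, exist_ok=True)
    lbl_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for row in rows:
        name = Path(row["filename"])
        (img_dir / name.name).write_bytes(bytes(row["content"]))
        lines = to_yolo_lines(row["boxes"], row["width"], row["height"])
        (lbl_dir / f"{name.stem}.txt").write_text("\n".join(lines))
        count += 1
    return count


def write_dataset_yaml(root, class_names):
    """Write a dataset YAML (Ultralytics-style keys) that points at the export."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    body = [f"path: {root}", "train: images/train", "val: images/val", "names:"]
    body += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    yaml_path = root / "data.yaml"
    yaml_path.write_text("\n".join(body) + "\n")
    return yaml_path


def train_from_delta(train_table, val_table, class_names,
                     root="/local_disk0/yolo_data"):
    """Runs only where pyspark and ultralytics are installed (assumption).
    Table names and the root path are supplied by the caller."""
    from pyspark.sql import SparkSession
    from ultralytics import YOLO

    spark = SparkSession.builder.getOrCreate()
    # toLocalIterator streams rows to the driver without collecting them all at once.
    export_split(spark.read.table(train_table).toLocalIterator(), root, "train")
    export_split(spark.read.table(val_table).toLocalIterator(), root, "val")
    yaml_path = write_dataset_yaml(root, class_names)
    # Pretrained weights, epochs, and image size are placeholder choices.
    return YOLO("yolov8n.pt").train(data=str(yaml_path), epochs=10, imgsz=640)
```

The export helpers only need an iterable of dict-like rows, so they can be tried locally with a few hand-made rows before pointing `train_from_delta` at a real table. For datasets too large to copy through the driver, a distributed export or a streaming loader would likely be a better fit; this sketch does not cover that.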

 