Hi Folks,
I'm evaluating Delta Lake for storing and version-controlling image data used to train models. I watched a session explaining how to do this, which also uses MLflow to manage training (https://databricks.com/session_na21/image-processing-on-delta-lake).
Note: it would be helpful to have a link to the source code used in the demo.
My scenario is slightly different, though: I'm testing on a local machine by following the quick start tutorial (https://docs.delta.io/latest/quick-start.html). In this setup, what is the best way (using out-of-the-box components as much as possible) to "grab" a local folder of images organized into subfolders (one per class), dump them into Delta Lake, and then train on a specific snapshot with TensorFlow?
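To make the question concrete, here is a rough sketch of what I have in mind. The function names are mine, the Spark/Delta session setup is copied from the quick start, and I'm assuming images would be stored as a binary column alongside path and label; I haven't verified this whole flow, so corrections are welcome:

```python
import os


def list_image_records(root):
    """Walk a folder of class subfolders and return (path, label) pairs."""
    records = []
    for label in sorted(os.listdir(root)):
        class_dir = os.path.join(root, label)
        if not os.path.isdir(class_dir):
            continue
        for name in sorted(os.listdir(class_dir)):
            records.append((os.path.join(class_dir, name), label))
    return records


def write_images_to_delta(root, delta_path):
    """Load image bytes plus labels into a DataFrame, write a Delta table,
    then read back a pinned snapshot (hypothetical helper, untested end to end)."""
    # Local imports so the plain folder-scanning helper above works
    # even on a machine without Spark/delta-spark installed.
    import pyspark
    from delta import configure_spark_with_delta_pip

    # Session setup as shown in the Delta Lake quick start.
    builder = (
        pyspark.sql.SparkSession.builder.appName("image-ingest")
        .config("spark.sql.extensions",
                "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    )
    spark = configure_spark_with_delta_pip(builder).getOrCreate()

    # One row per image: original path, class label, raw bytes.
    rows = []
    for path, label in list_image_records(root):
        with open(path, "rb") as f:
            rows.append((path, label, f.read()))
    df = spark.createDataFrame(
        rows, schema="path string, label string, content binary"
    )
    df.write.format("delta").mode("overwrite").save(delta_path)

    # Time travel: pin a specific table version for reproducible training.
    snapshot = (
        spark.read.format("delta").option("versionAsOf", 0).load(delta_path)
    )
    return snapshot
```

On the TensorFlow side I was thinking of collecting the snapshot (e.g. via `toPandas()`) and building a `tf.data.Dataset` from the decoded bytes and labels, but I'm not sure that's the intended out-of-the-box path, which is really what I'm asking about.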
Thanks