<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Delta Lake as source of images to train a classification model on a local computer in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/delta-lake-as-source-of-images-to-train-a-classification-model/m-p/17429#M935</link>
    <description>&lt;P&gt;Hi Folks,&lt;/P&gt;
&lt;P&gt;I'm evaluating Delta Lake as a version-controlled store for the images and data used to train models. I watched a session that explains how to do this and how to use MLflow to manage training (https://databricks.com/session_na21/image-processing-on-delta-lake).&lt;/P&gt;
&lt;P&gt;Note: it would be helpful to have a link to the source code used in the demo.&lt;/P&gt;
&lt;P&gt;My scenario is slightly different, though. I'm testing on a local machine, following the quick-start tutorial (https://docs.delta.io/latest/quick-start.html). In this scenario, what is the best way (using as many out-of-the-box components as possible) to "grab" a local folder of images organized into subfolders (one per class), load them into Delta Lake, and then use a specific snapshot in TensorFlow?&lt;/P&gt;
&lt;P&gt;Thanks&lt;/P&gt;</description>
    <pubDate>Fri, 23 Jul 2021 15:52:32 GMT</pubDate>
    <dc:creator>MCosta</dc:creator>
    <dc:date>2021-07-23T15:52:32Z</dc:date>
    <item>
      <title>Delta Lake as source of images to train a classification model on a local computer</title>
      <link>https://community.databricks.com/t5/machine-learning/delta-lake-as-source-of-images-to-train-a-classification-model/m-p/17429#M935</link>
      <description>&lt;P&gt;Hi Folks,&lt;/P&gt;
&lt;P&gt;I'm evaluating Delta Lake as a version-controlled store for the images and data used to train models. I watched a session that explains how to do this and how to use MLflow to manage training (https://databricks.com/session_na21/image-processing-on-delta-lake).&lt;/P&gt;
&lt;P&gt;Note: it would be helpful to have a link to the source code used in the demo.&lt;/P&gt;
&lt;P&gt;My scenario is slightly different, though. I'm testing on a local machine, following the quick-start tutorial (https://docs.delta.io/latest/quick-start.html). In this scenario, what is the best way (using as many out-of-the-box components as possible) to "grab" a local folder of images organized into subfolders (one per class), load them into Delta Lake, and then use a specific snapshot in TensorFlow?&lt;/P&gt;
&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jul 2021 15:52:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/delta-lake-as-source-of-images-to-train-a-classification-model/m-p/17429#M935</guid>
      <dc:creator>MCosta</dc:creator>
      <dc:date>2021-07-23T15:52:32Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Lake as source of images to train a classification model on a local computer</title>
      <link>https://community.databricks.com/t5/machine-learning/delta-lake-as-source-of-images-to-train-a-classification-model/m-p/17431#M937</link>
      <description>&lt;P&gt;I can think of three ways to do this:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Use the web UI (the Create Table option, or upload the data to DBFS).&lt;/LI&gt;&lt;LI&gt;Use databricks-connect, which connects your local machine to remote Databricks clusters.&lt;/LI&gt;&lt;LI&gt;Use the databricks-cli to copy local files to DBFS.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Your cloud vendor might also have a tool for copying local data into the cloud environment.&lt;/P&gt;&lt;P&gt;For your purpose (evaluation), the web UI option might be the easiest.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/data/data.html" alt="https://docs.databricks.com/data/data.html" target="_blank"&gt;https://docs.databricks.com/data/data.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.microsoft.com/en-us/azure/databricks/data/databricks-file-system#file-upload-interface" alt="https://docs.microsoft.com/en-us/azure/databricks/data/databricks-file-system#file-upload-interface" target="_blank"&gt;https://docs.microsoft.com/en-us/azure/databricks/data/databricks-file-system#file-upload-interface&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Sep 2021 14:17:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/delta-lake-as-source-of-images-to-train-a-classification-model/m-p/17431#M937</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2021-09-09T14:17:40Z</dc:date>
    </item>
  </channel>
</rss>

