<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Best practice for Image manipulation in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/best-practice-for-image-manipulation/m-p/23166#M1310</link>
    <description>&lt;P&gt;Spark has a built-in 'image' data source that reads a directory of image files as a DataFrame: spark.read.format("image").load(...). The resulting DataFrame has the pixel data, dimensions, channels, etc.&lt;/P&gt;&lt;P&gt;You can also read image files 'manually' by using the 'binaryFile' data source (spark.read.format("binaryFile")), which gives you the raw bytes of each file. You would then decode them with (for example) PIL in Python.&lt;/P&gt;&lt;P&gt;For Python, PIL (now maintained as the Pillow package) is pretty much the standard for image manipulation. For the JVM, I think I'd still use the old java.awt.image classes like BufferedImage.&lt;/P&gt;</description>
    <pubDate>Thu, 17 Jun 2021 18:13:58 GMT</pubDate>
    <dc:creator>sean_owen</dc:creator>
    <dc:date>2021-06-17T18:13:58Z</dc:date>
    <item>
      <title>Best practice for Image manipulation</title>
      <link>https://community.databricks.com/t5/machine-learning/best-practice-for-image-manipulation/m-p/23165#M1309</link>
      <description>&lt;P&gt;Can you recommend an approach for image manipulation once the data has been read as an image? Is there a specific library to use?&lt;/P&gt;</description>
      <pubDate>Thu, 17 Jun 2021 16:28:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/best-practice-for-image-manipulation/m-p/23165#M1309</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-06-17T16:28:44Z</dc:date>
    </item>
    <item>
      <title>Re: Best practice for Image manipulation</title>
      <link>https://community.databricks.com/t5/machine-learning/best-practice-for-image-manipulation/m-p/23166#M1310</link>
      <description>&lt;P&gt;Spark has a built-in 'image' data source that reads a directory of image files as a DataFrame: spark.read.format("image").load(...). The resulting DataFrame has the pixel data, dimensions, channels, etc.&lt;/P&gt;&lt;P&gt;You can also read image files 'manually' by using the 'binaryFile' data source (spark.read.format("binaryFile")), which gives you the raw bytes of each file. You would then decode them with (for example) PIL in Python.&lt;/P&gt;&lt;P&gt;For Python, PIL (now maintained as the Pillow package) is pretty much the standard for image manipulation. For the JVM, I think I'd still use the old java.awt.image classes like BufferedImage.&lt;/P&gt;</description>
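      <content:encoded><![CDATA[
A minimal sketch of the two reading approaches mentioned in the reply, not a definitive implementation. The helper names (pixel_rgb, decoded_size, demo) and the `path` argument are hypothetical. It assumes a 3-channel, 8-bit image whose bytes are stored row-wise in BGR order, which is how Spark documents the image.data field, and it assumes the Pillow package is installed wherever decoded_size runs.

```python
import io
from typing import Tuple


def pixel_rgb(data: bytes, width: int, n_channels: int,
              row: int, col: int) -> Tuple[int, int, int]:
    """Return the (R, G, B) value at (row, col) from an 'image' source row.

    Assumption: 3 channels, 8 bits each, stored row-wise in BGR order.
    """
    if n_channels != 3:
        raise ValueError("this sketch only handles 3-channel images")
    offset = (row * width + col) * n_channels
    b, g, r = data[offset:offset + 3]
    return (r, g, b)


def decoded_size(raw: bytes) -> Tuple[int, int]:
    """Decode raw file bytes with Pillow and return (width, height)."""
    from PIL import Image  # imported lazily; requires the Pillow package
    with Image.open(io.BytesIO(raw)) as img:
        return img.size


def demo(spark, path: str) -> None:
    """Hypothetical driver: `spark` is a SparkSession, `path` an image directory."""
    # Option 1: the built-in image source decodes files into an `image` struct.
    images = spark.read.format("image").load(path)
    images.select("image.origin", "image.width",
                  "image.height", "image.nChannels").show()

    # Option 2: binaryFile yields the raw file bytes in a `content` column.
    files = spark.read.format("binaryFile").load(path)
    for row in files.select("path", "content").limit(5).collect():
        print(row["path"], decoded_size(bytes(row["content"])))
```

Collecting a handful of rows to the driver keeps the sketch short; for a large directory the decoding step would more likely run inside a UDF so the work stays on the executors.
]]></content:encoded>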
      <pubDate>Thu, 17 Jun 2021 18:13:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/best-practice-for-image-manipulation/m-p/23166#M1310</guid>
      <dc:creator>sean_owen</dc:creator>
      <dc:date>2021-06-17T18:13:58Z</dc:date>
    </item>
  </channel>
</rss>

