Best practice for Image manipulation

Anonymous

Can you please recommend best practices for image manipulation once the data has been read as an image? Is there a specific library to use?

1 ACCEPTED SOLUTION

sean_owen
Honored Contributor II

Spark has a built-in 'image' data source that reads a directory of image files as a DataFrame: spark.read.format("image").load(...). The resulting DataFrame has a single 'image' column holding each file's origin, height, width, number of channels, mode, and pixel data.
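As a rough sketch of what that looks like in practice, the snippet below reads a directory and unpacks one row's pixel bytes. The helper names `load_images` and `bgr_bytes_to_rgb_rows` are made up for illustration, `spark` is assumed to be an existing SparkSession, and the code assumes 3-channel images, which Spark stores row by row in BGR order.

```python
from typing import List, Tuple


def load_images(spark, path: str):
    """Read a directory of images with Spark's built-in 'image' source.

    `spark` is an existing SparkSession and `path` is whatever directory
    holds your images (both are assumptions of this sketch).
    """
    df = (
        spark.read.format("image")
        .option("dropInvalid", True)  # skip files that cannot be decoded
        .load(path)
    )
    # Flatten the 'image' struct into ordinary columns.
    return df.select(
        "image.origin",
        "image.height",
        "image.width",
        "image.nChannels",
        "image.data",
    )


def bgr_bytes_to_rgb_rows(
    height: int, width: int, n_channels: int, data: bytes
) -> List[List[Tuple[int, int, int]]]:
    """Turn the flat pixel buffer of a 3-channel image into rows of RGB tuples."""
    if n_channels != 3:
        raise ValueError("this sketch only handles 3-channel images")
    if len(data) != height * width * 3:
        raise ValueError("buffer length does not match the dimensions")
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            i = (y * width + x) * 3
            b, g, r = data[i], data[i + 1], data[i + 2]  # stored as B, G, R
            row.append((r, g, b))
        rows.append(row)
    return rows
```

On a cluster you would call something like `load_images(spark, path).first()` and pass the row's height, width, nChannels and data fields to the helper. For anything beyond inspection, handing the buffer to NumPy or Pillow is likely to be faster than a Python loop.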

You can also read image files 'manually' by using the 'binaryFile' data source, which will give you the raw bytes of each image file. You would then decode them with (for example) PIL in Python.
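A minimal sketch of that route, assuming Pillow is installed on the cluster: the raw bytes come from the `content` column of the binaryFile source and are decoded inside a UDF. The function names, the `*.png` filter and the choice to return only the image size are illustrative assumptions, not part of the original answer.

```python
import io


def image_size(content: bytes):
    """Decode raw image bytes with Pillow and return (width, height)."""
    # Imported lazily so the import resolves on the executors running the UDF.
    from PIL import Image

    with Image.open(io.BytesIO(content)) as img:
        return img.size


def load_image_sizes(spark, path: str):
    """Read files as raw bytes and compute each image's size in a UDF.

    `spark` is an existing SparkSession; `path` is your image directory.
    """
    from pyspark.sql.functions import udf
    from pyspark.sql.types import ArrayType, IntegerType

    size_udf = udf(
        lambda content: list(image_size(content)), ArrayType(IntegerType())
    )
    df = (
        spark.read.format("binaryFile")
        .option("pathGlobFilter", "*.png")  # hypothetical filter; adjust to your files
        .load(path)
    )
    # binaryFile rows carry path, modificationTime, length and content.
    return df.select("path", size_udf("content").alias("size"))
```

The same pattern works for any per-image operation: decode in the UDF, manipulate with Pillow, and return either a small result or re-encoded bytes.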

For Python, PIL (these days maintained as the Pillow fork) is pretty much the standard for image manipulation. For the JVM, I think I'd still use the old java.awt classes like BufferedImage.
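To give a flavour of Pillow for typical manipulation, here is a small sketch that converts an image to grayscale and shrinks it while keeping the aspect ratio. The helper name `make_thumbnail` and the default size are made up for this example.

```python
from PIL import Image, ImageOps


def make_thumbnail(img: Image.Image, max_side: int = 128) -> Image.Image:
    """Return a grayscale copy of `img` that fits in max_side x max_side."""
    gray = ImageOps.grayscale(img)  # new image in mode "L"; input is untouched
    gray.thumbnail((max_side, max_side))  # resizes in place, keeps aspect ratio
    return gray
```

For example, `make_thumbnail(Image.new("RGB", (400, 200), "red"), 100)` yields a 100x50 image in mode "L".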
