Data Engineering

If I write pandas code using Koalas and have Photon enabled, will my pandas code run on Photon?

User16752240150
New Contributor II
 

Accepted Solutions

holly
Databricks Employee

Hi there! Appreciate this reply is 3 years later than it was originally asked, but people might be coming across it still. A few things:

  1. Koalas was deprecated in Spark 3.2 (Databricks Runtime 10.4). The recommendation is to use the pandas API on Spark instead, with `import pyspark.pandas as ps`. The Spark migration guide and the pandas API on Spark documentation cover the details and further usage.
  2. As of writing, Photon works with SQL and equivalent DataFrame API statements. SQL-like operations such as filters, joins, and aggregates will run on Photon, but more complex operations for analytics or data science won't.
  3. More functionality may be brought out in the future, but keep in mind that UDFs and RDDs are unlikely ever to work with Photon, because they bypass Spark's Catalyst optimizer, which Photon needs in order to work.
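If you want to check this for your own code, one rough approach is to run a pandas API on Spark operation and look at the physical plan. The sketch below is only an illustration: the helper names (`plan_mentions_photon`, `explain_to_string`, `demo`) are made up for this example, and it assumes that Photon operators show up in the plan with a `Photon` prefix (for example `PhotonGroupingAgg`), so treat the result as a hint rather than a guarantee.

```python
import contextlib
import io


def plan_mentions_photon(plan_text: str) -> bool:
    """Return True if the plan text contains an operator named with "Photon".

    Assumption: Photon operators appear in the physical plan with a
    "Photon" prefix, e.g. PhotonGroupingAgg.
    """
    return "Photon" in plan_text


def explain_to_string(spark_df) -> str:
    """Capture the text that DataFrame.explain() prints to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        spark_df.explain()
    return buf.getvalue()


def demo() -> None:
    # Imported here so the helpers above can be used without Spark installed.
    import pyspark.pandas as ps

    psdf = ps.DataFrame({"team": ["a", "a", "b"], "score": [1, 2, 3]})

    # SQL-like operations: a filter followed by a grouped aggregate.
    result = psdf[psdf["score"] > 1].groupby("team").sum()

    # Convert to a regular Spark DataFrame and inspect its physical plan.
    plan = explain_to_string(result.to_spark())
    print(plan)
    print("Photon operators in plan:", plan_mentions_photon(plan))
```

Calling `demo()` in a notebook attached to a Photon-enabled cluster prints the plan and whether any Photon operator appears in it. On a cluster without Photon, or for operations that fall back to regular Spark execution, you would expect the check to report `False`.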

