count or toPandas taking too long

jimcast
New Contributor

Hi,

I am fetching data from Unity Catalog in a notebook using spark.sql(). The query takes just a few seconds - I am actually trying to retrieve only 2 rows - but operations like count() or toPandas() take forever. I wonder why it takes so long and whether there is a way to speed up those operations.

Compute: Personal Compute, m5d.2xlarge, Databricks Runtime 14.1 (includes Apache Spark 3.5.0, Scala 2.12)

Thanks!

 

2 REPLIES

Hkesharwani
Contributor II

Hi, it is quite normal for converting a Spark DataFrame to pandas to take time.
There is, however, a way to optimize it.
Enable Arrow optimization: starting from Spark 3.0.0, you can enable Arrow optimization, which speeds up the conversion by using Apache Arrow for faster data transfer between Spark and Python.

spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
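
To check whether the setting actually helps in your case, here is a minimal sketch that times toPandas() with Arrow off and then on. The helper names time_action and compare_arrow are my own, not part of any Spark API, and the table name in the comment is hypothetical. Keep in mind that spark.sql() only builds a lazy plan, so the query itself runs when toPandas() is called; for a two-row result the gain from Arrow may be small, and the second run may also benefit from warm-up.

```python
import time


def time_action(label, action):
    """Run a zero-argument callable and report its wall-clock time in seconds."""
    start = time.perf_counter()
    result = action()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f}s")
    return result, elapsed


def compare_arrow(spark, df):
    """Time df.toPandas() with Arrow disabled, then enabled.

    Hypothetical helper: `spark` is the active SparkSession and `df` a
    Spark DataFrame. The Arrow setting is left enabled afterwards.
    """
    timings = {}
    for flag in ("false", "true"):
        spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", flag)
        _, timings[flag] = time_action(f"toPandas (arrow={flag})", df.toPandas)
    return timings


# In a Databricks notebook, where `spark` already exists
# (the table name below is a placeholder, not from the original post):
# df = spark.sql("SELECT * FROM my_catalog.my_schema.my_table LIMIT 2")
# compare_arrow(spark, df)
```

If both timings are about the same, the time is most likely spent executing the query rather than transferring the result to Python.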


Harshit Kesharwani
Data engineer at Rsystema

anardinelli
Databricks Employee

Hey @jimcast, how are you?

You can inspect the internals and get a good hint of what's happening using the Spark UI. Filter for the jobs that take the longest, then check what is being requested on the SQL / DataFrame tab, as well as their query plans.
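
As a complement to the Spark UI, a rough sketch like the one below prints the physical plan from the notebook and times the action itself. The function name profile_dataframe is hypothetical, and df is assumed to be the DataFrame returned by your spark.sql() call. Since spark.sql() is lazy, the time measured around count() includes running the whole query, which is usually where the delay comes from.

```python
import time


def profile_dataframe(df):
    """Print the formatted query plan, then time count() on the DataFrame.

    Hypothetical helper: `df` is assumed to be a Spark DataFrame.
    """
    # Shows the physical plan (scans, filters, joins) without running the query.
    df.explain(mode="formatted")

    # count() is an action, so this is where the query actually executes.
    start = time.perf_counter()
    n_rows = df.count()
    elapsed = time.perf_counter() - start
    print(f"count() returned {n_rows} rows in {elapsed:.2f}s")
    return n_rows, elapsed


# In a notebook (the query is a placeholder):
# df = spark.sql("SELECT * FROM my_catalog.my_schema.my_table LIMIT 2")
# profile_dataframe(df)
```

Things worth looking for in the plan are full table scans or filters that are not pushed down to the source, which can make even a two-row result expensive.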

If your data is public, please also share more details (such as logs, printed output, and dumps) so we can better help you.

Best,

Alessandro
