Why is a GPU-accelerated node much slower than a CPU node for training a random forest model on Databricks?

zzy
New Contributor III

I have a dataset of about 5 million rows with 14 features and a binary target. I decided to train a PySpark random forest classifier on Databricks. The CPU cluster I created contains 2 c4.8xlarge workers (60 GB, 36 cores each) and 1 r4.xlarge driver (31 GB, 4 cores). The GPU cluster I created contains 3 g4dn.4xlarge nodes (64 GB, 16 cores each), 2 as workers and 1 as the driver. The hourly costs are very similar. I assumed the GPU cluster would outperform the CPU cluster, since random forest is an algorithm well suited to parallel computing, but the result surprised me: the GPU cluster trained the model nearly 5 times slower than the CPU cluster. Is there something I have misunderstood about GPU acceleration, or is it simply not used by the pyspark.ml modules?
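For context, the training code follows the standard pyspark.ml pattern, roughly like the sketch below; the column names and the numTrees value are illustrative rather than my exact settings:

from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import RandomForestClassifier

# df: the ~5M-row training DataFrame; placeholder names for the 14 feature columns
feature_cols = [f"f{i}" for i in range(14)]

# Assemble the feature columns into the single vector column MLlib expects
assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
train_df = assembler.transform(df).select("features", "label")

# Plain MLlib random forest; numTrees is just an example value
rf = RandomForestClassifier(labelCol="label", featuresCol="features", numTrees=100)
model = rf.fit(train_df)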

2 REPLIES

Debayan
Esteemed Contributor III

Hi @Simon Zhang, could you please go through this: https://www.databricks.com/session/gpu-support-in-spark-and-gpu-cpu-mixed-resource-scheduling-at-pro... and let us know if it addresses your concern?

Hubert-Dudek
Esteemed Contributor III

In many cases, you need to adjust your code to utilize the GPU: standard pyspark.ml estimators, including RandomForestClassifier, run on the CPU only, so a GPU cluster gives no training speedup by itself.
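As a rough sketch of what such an adjustment can look like, one common option is to train tree models through a GPU-capable library such as XGBoost's Spark estimator instead of pyspark.ml's RandomForestClassifier. This assumes xgboost >= 1.7 (which provides the xgboost.spark module) is installed on the cluster, and all parameter values below are illustrative:

from xgboost.spark import SparkXGBClassifier

# Gradient-boosted trees rather than a random forest, but the trees are
# built on the GPUs: each Spark task is mapped to one GPU via use_gpu=True.
xgb = SparkXGBClassifier(
    features_col="features",   # same assembled vector column as in pyspark.ml
    label_col="label",
    num_workers=2,             # one task per GPU worker node
    use_gpu=True,
    max_depth=8,
    n_estimators=200,
)
model = xgb.fit(train_df)

Note that this trains gradient-boosted trees rather than a true random forest, so it is not a drop-in replacement for the MLlib model.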
