Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
Why is a GPU-accelerated node much slower than a CPU node for training a random forest model on Databricks?

zzy
New Contributor III

I have a dataset of about 5 million rows with 14 features and a binary target. I decided to train a PySpark random forest classifier on Databricks. The CPU cluster I created has 2 c4.8xlarge workers (60 GB, 36 cores each) and 1 r4.xlarge driver (31 GB, 4 cores). The GPU cluster I created has 3 g4dn.4xlarge nodes (64 GB, 16 cores each), 2 as workers and 1 as the driver. The hourly costs are very similar. I assumed the GPU cluster would outperform, since random forest is an algorithm well suited to parallel computing, but the result shocked me: the GPU cluster trained the model nearly 5 times slower than the CPU cluster. Did I misunderstand something about GPU acceleration, or is it just not used by the pyspark.ml modules?

2 REPLIES

Debayan
Esteemed Contributor III

Hi @Simon Zhang, could you please go through this: https://www.databricks.com/session/gpu-support-in-spark-and-gpu-cpu-mixed-resource-scheduling-at-pro... and let us know if it addresses your concern?

Hubert-Dudek
Esteemed Contributor III

In many cases, you need to adjust your code to utilize GPU.
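To make this concrete: stock spark.ml estimators never dispatch work to GPUs on their own. The RAPIDS Accelerator for Apache Spark can offload SQL/DataFrame operations to the GPU once the plugin is enabled, but MLlib tree training itself still runs on the CPU. A sketch of the Spark configuration that enables the plugin on a GPU cluster (the resource amounts are illustrative, not tuned values):

```
spark.plugins                       com.nvidia.spark.SQLPlugin
spark.rapids.sql.enabled            true
spark.executor.resource.gpu.amount  1
spark.task.resource.gpu.amount      0.25
```

For the training step itself, getting a speedup from GPU nodes generally means switching to a GPU-aware library (for example, distributed XGBoost for Spark with a GPU tree method) rather than `pyspark.ml.classification.RandomForestClassifier`.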
