Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Runtime issue

choi_2
New Contributor II

Hello,

I am working on a machine learning project. The dataset I am using has more than 5,000,000 rows.

I am using PySpark, and the attached screenshot shows the block where I used RandomForestRegressor to train the model.

It worked the first time, although it took a pretty long time, but when I tried to run the same block again it no longer worked. I even let it run for a whole night, but it never started any Spark jobs and kept showing the message "Filtering files for query". I am using 10 features for the model, so I am wondering whether the high dimensionality of the features is the cause. But even then, why does it not work now when it did work before?

I even tried a sample dataset using 10% of the total data, but it still does not work. I also tried using PCA to reduce the dimensionality, but that did not run either.

I tried to increase the number of worker nodes in the cluster, but that is not allowed because I am using the Azure Databricks free trial. The policy of my cluster is "Personal Compute". I am very new to the Databricks platform, and I am trying to figure out how to deal with these issues. I searched and tried everything I could, but nothing seems to work. Can anyone please tell me whether there is a way to work with large data and train the model in less time, or at least offer any suggestions for my situation?

I would very much appreciate your help!

