
AutoML "need to sample" not working as expected

sangramraje
New Contributor

tl;dr:

When an AutoML run determines that it needs to sample, because the driver / worker node memory is too small to load and process the entire dataset, the run fails. I did NOT provide a sample weight column, but somewhere in the process AutoML appears to assume one was supplied, tries to resolve it, and hits this error:

[UNRESOLVED_COLUMN.WITH_SUGGESTION] A column, variable, or function parameter with name `_automl_sample_weight_0000` cannot be resolved. Did you mean one of the following? [`_automl_split_col_0000`, `t*****d`, `o******l`, `s*******o`, `f*******o`]. SQLSTATE: 42703

More details:

I am running an AutoML experiment on a cluster with the 15.4 LTS ML runtime, with i3.2xlarge as the driver node type and r5d.2xlarge as the worker node type. The AutoML run logs the following:

2024/11/22 17:41:53 INFO databricks.automl.internal.size_estimator: 
            mem_req_data_load_mb = 52148.01129505962
            mem_req_training_mb_dense = 30748.212867736816
            mem_req_training_mb_sparse = 28898.903835296627
            mem_req_training_mb = 30748.212867736816
2024/11/22 17:41:53 INFO databricks.automl.internal.size_estimator: fraction (0.34224123905739046) = min of
                      (available_memory_mb_per_trial (41139.0) / worker_max_memory_req_mb (52148.01129505962)),
                      (self._memory_mb_on_driver (17847.199999999997) / mem_req_data_load_mb (52148.01129505962))
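
The fraction in this log can be reproduced by hand. The snippet below is a minimal sketch that assumes the formula is exactly what the log message states (the minimum of two memory ratios). The function name `sample_fraction` is hypothetical and is not part of the AutoML API; the argument names and values are copied from the log above.

```python
def sample_fraction(available_memory_mb_per_trial,
                    worker_max_memory_req_mb,
                    memory_mb_on_driver,
                    mem_req_data_load_mb):
    """Recompute the sampling fraction as described in the log message.

    Assumption: the fraction is the smaller of the worker-side ratio and
    the driver-side ratio, with no further adjustment.
    """
    worker_ratio = available_memory_mb_per_trial / worker_max_memory_req_mb
    driver_ratio = memory_mb_on_driver / mem_req_data_load_mb
    return min(worker_ratio, driver_ratio), worker_ratio, driver_ratio


# Values taken verbatim from the size_estimator log lines.
fraction, worker_ratio, driver_ratio = sample_fraction(
    available_memory_mb_per_trial=41139.0,
    worker_max_memory_req_mb=52148.01129505962,
    memory_mb_on_driver=17847.199999999997,
    mem_req_data_load_mb=52148.01129505962,
)
print(f"worker ratio: {worker_ratio:.4f}")
print(f"driver ratio: {driver_ratio:.4f}")
print(f"fraction:     {fraction:.4f}")
```

With these inputs the driver-side ratio is the smaller of the two, so it is the driver memory, not the worker memory, that determines the logged fraction and triggers sampling.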

At this point, the run appears to enter the sampling code path and then fails. I am attaching the stack trace as a PDF.

The log snippet below shows that sample_weight_col was NOT provided (sample_weight_col=None):

2024/11/22 17:38:56 INFO databricks.automl.internal.supervised_learner: AutoML called with params: target_col=w***s, data_dir=None exclude_cols=None exclude_columns=['p***d', 'E***d', 'T***D', 'M***e', 'T***y'] exclude_frameworks=['lightgbm', 'sklearn'] imputers=None metric=roc_auc max_trials=10000000000.0 timeout_minutes=120 experiment_id=4***6 time_col=None experiment_dir=/Users/s***m/databricks_automl pos_label=1 split_col=None sample_weight_col=None 

sangramraje_0-1732300084616.png

sangramraje_1-1732300133987.png

The second image above (Classifier._sample) shows that at that point `sample_weight_col` is not None, since execution enters the if branch. The first image shows the code that looks suspicious to me, which is why I included it.

Any help is most appreciated! I believe this part of the source code is internal and closed, so posting here is the best I could do.

 
