Auto ML training - Early Stopping (training time) / Data Split

spearitchmeta — Mon, 11 Aug 2025 08:48:01 GMT

Greetings dear community,

I am using AutoML for the first time and was wondering whether it is possible to use early stopping, or to incorporate some other approach in my code, so that training stops when performance plateaus. Early stopping is something one can implement when training models the traditional way (without AutoML). I would also like to track the loss function, the evolution of performance, and so on.
I would be interested to hear your thoughts on this, since I am giving a client demo in the coming days.

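# Run AutoML on a manually split dataset (0.8 train / 0.2 test)
from databricks import automl

# train_df, group_workspace_base and experiment_auto_ml_manual_split are defined earlier in the notebook
automl_result_manual_split = automl.classify(
    dataset=train_df,  # the 0.8 training portion of the manual split
    target_col="cae_type",
    primary_metric="f1",
    timeout_minutes=30,  # upper bound on how long the experiment may run
    experiment_dir=f"{group_workspace_base}/manual_split",
    experiment_name=experiment_auto_ml_manual_split,
)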

2) My second question is about the data split. As you can see above, I did a manual split (0.8 training data / 0.2 testing data), but I am aware that AutoML can split the data automatically. Are there any resources that recommend one approach over the other? (I also have class imbalance, but I did not account for it in this first demo trial.)

Best regards

Re: Auto ML training - Early Stopping (training time) / Data Split

Louis_Frolio — Mon, 11 Aug 2025 19:36:22 GMT

First question: See here for what is possible. https://docs.databricks.com/aws/en/machine-learning/automl/classification
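
To make "stop when the performance plateaus" concrete, here is a minimal sketch of patience-based early stopping in plain Python. It is only an illustration of the general idea and not AutoML's internal logic; the function name `early_stopping_round`, the `patience` and `min_delta` parameters, and the F1 values are all made up for the example.

```python
def early_stopping_round(scores, patience=3, min_delta=0.0):
    """Return the index of the round where training would stop, or None.

    scores:    validation metric per round (higher is better, e.g. F1).
    patience:  number of consecutive rounds without improvement to tolerate.
    min_delta: smallest gain over the best score that counts as improvement.
    """
    best = float("-inf")
    rounds_without_gain = 0
    for i, score in enumerate(scores):
        if score > best + min_delta:
            # Real improvement: remember it and reset the counter.
            best = score
            rounds_without_gain = 0
        else:
            rounds_without_gain += 1
            if rounds_without_gain >= patience:
                return i
    return None


# Hypothetical validation F1 per trial; it flattens out around 0.73.
f1_history = [0.61, 0.68, 0.72, 0.73, 0.73, 0.729, 0.731, 0.73]
stop_at = early_stopping_round(f1_history, patience=3, min_delta=0.005)
print(stop_at)
```

With `timeout_minutes=30` in your call, the timeout still acts as the hard upper bound; a rule like this one only decides whether to stop sooner.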

Second question: See here for what is possible. https://docs.databricks.com/gcp/en/machine-learning/automl/classification-data-prep
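
Since you mention class imbalance, one option if you keep a manual split is to stratify it, so that every class keeps the same proportions in each subset. Below is a minimal stdlib-only sketch under stated assumptions: the function name `stratified_split_labels`, the 60/20/20 fractions and the label values "train" / "validate" / "test" are illustrative choices, not something taken from your setup. Recent Databricks Runtime ML versions document a `split_col` argument for passing such a column to AutoML; please check the linked page for the exact accepted values and the minimum runtime before relying on it.

```python
import random
from collections import defaultdict


def stratified_split_labels(labels, fractions=(0.6, 0.2, 0.2), seed=42):
    """Assign 'train', 'validate' or 'test' to each row, class by class."""
    rng = random.Random(seed)

    # Group row indices by class so each class is split separately.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    assignment = [None] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        for pos, idx in enumerate(indices):
            if pos < n_train:
                assignment[idx] = "train"
            elif pos < n_train + n_val:
                assignment[idx] = "validate"
            else:
                assignment[idx] = "test"  # whatever remains
    return assignment


# Hypothetical imbalanced target: 90 rows of class "a", 10 of class "b".
labels = ["a"] * 90 + ["b"] * 10
split = stratified_split_labels(labels)
minority_test = sum(1 for l, s in zip(labels, split) if l == "b" and s == "test")
print(minority_test)
```

The resulting list can be added to the DataFrame as an extra column, which keeps the minority class represented in every subset instead of leaving that to chance.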

Hope this helps, Louis.
