Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

AutoML split with dt column not working properly

Noura_azza
New Contributor II

I am using AutoML and want to split my data into train, validation, and test sets using a dt column (one date for train, a different date for validation, and a third date for test). The problem is that AutoML fails: there are only training metrics (no validation or test ones), and when I check the data exploration notebook, all samples appear to be treated as training data even though their dt values differ. When I look at the model artifacts, I see that the dt column was also used as a feature by the model.

2 REPLIES

Noura_azza
New Contributor II

This is what I see in my data exploration notebook. All dates are considered part of the training split:

[Screenshot: Noura_azza_1-1706102702277.png]

 

 

maggiewang
Databricks Employee

Hello! Did you try specifying a column name as the manual split column?

Then you can fully control which rows are in train / validate / test splits: https://docs.databricks.com/en/machine-learning/automl/automl-data-preparation.html#split-data-for-r...
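As a rough illustration of the manual split idea, here is a minimal sketch that derives a split label from the dt value of each row. The dates, the column names (`dt`, `split`, `x`, `y`), and the helper `add_split_label` are all hypothetical, and the label values `train` / `validate` / `test` are my assumption of what the linked docs expect, so please check them against the documentation for your runtime version. The AutoML call itself is left as a comment because it only runs on a Databricks ML runtime, and the `split_col` parameter name there is likewise an assumption to verify.

```python
from collections import Counter
from datetime import date

# Hypothetical dates: one per split, as described in the question.
SPLIT_BY_DATE = {
    date(2024, 1, 1): "train",
    date(2024, 1, 2): "validate",
    date(2024, 1, 3): "test",
}


def add_split_label(rows, date_key="dt", split_key="split"):
    """Return copies of the rows with a split label derived from the date column."""
    labeled = []
    for row in rows:
        label = SPLIT_BY_DATE.get(row[date_key])
        if label is None:
            # Fail loudly instead of silently defaulting a row to "train".
            raise ValueError(f"no split defined for date {row[date_key]}")
        labeled.append({**row, split_key: label})
    return labeled


# Tiny made-up dataset; x is a feature and y is the target.
rows = [
    {"dt": date(2024, 1, 1), "x": 0.1, "y": 0},
    {"dt": date(2024, 1, 1), "x": 0.7, "y": 1},
    {"dt": date(2024, 1, 2), "x": 0.4, "y": 0},
    {"dt": date(2024, 1, 3), "x": 0.9, "y": 1},
]

labeled_rows = add_split_label(rows)
split_counts = Counter(row["split"] for row in labeled_rows)
print(split_counts)

# On Databricks, the labeled data would then be passed to AutoML, roughly like
# this (assumed call, check the parameter name in the docs for your runtime):
#   from databricks import automl
#   summary = automl.classify(dataset=labeled_df, target_col="y", split_col="split")
```

Checking the per-split row counts before starting the run is a quick way to confirm that not everything ends up in the training split. If dt should not act as a feature, it may also be worth dropping it from the dataset once the split column exists.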
