We are trying to train a predictive ML model using the XGBoost classifier. One of the requirements from our business team is to implement feature weighting, since they have identified certain features as mattering more than others. The dataset has 69 features.
We are trying to fit the model with these parameters:
model.fit(
    X_train,
    y_train,
    classifier__feature_weights=feature_weights,
    classifier__early_stopping_rounds=5,
    classifier__verbose=False,
    classifier__eval_set=[(X_val_processed, y_val_processed)],
)
For testing, feature_weights is set as follows:
feature_weights = np.zeros(X_train.shape[1])
feature_weights[:10] = 2.0
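For completeness, here is a minimal standalone construction of the two weight vectors described above (the array shapes are illustrative stand-ins; our real X_train comes out of a preprocessing pipeline, so the post-transform column count may differ from the raw 69):

```python
import numpy as np

# Illustrative stand-in for our training matrix: 69 features, as in the real dataset.
X_train = np.random.rand(1000, 69)

# The configuration that segfaults: first 10 features weighted 2.0, rest 0.0.
feature_weights = np.zeros(X_train.shape[1])
feature_weights[:10] = 2.0

# The configuration that runs cleanly: first 5 features weighted 1.0, rest 0.0.
feature_weights_ok = np.zeros(X_train.shape[1])
feature_weights_ok[:5] = 1.0

# In both cases the vector length matches the raw feature count and the dtype
# is float64, so the shape/dtype handed to XGBoost's native code looks correct.
print(feature_weights.shape, feature_weights.dtype)
```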
When running this, we get the following error:
The Python process exited with exit code 139 (SIGSEGV: Segmentation fault).
However, when feature_weights is set as follows, no error occurs:
feature_weights = np.zeros(X_train.shape[1])
feature_weights[:5] = 1.0
Do you have any insight into this error and how we can fix it going forward? Our research suggests it's a memory issue, but the cluster metrics show only 90 GB of 220 GB of memory in use.