Data Engineering

Forum Posts

by mrcity • New Contributor II

02-06-2023 2:35:38 PM

3661 Views
3 replies
1 kudos

Exclude absent lookup keys from dataframes made by create_training_set()

I've got data stored in feature tables, plus in a data lake. The feature tables are expected to lag the data lake by at least a little bit. I want to filter data coming out of the feature store by querying the data lake for lookup keys out of my inde...

Latest Reply

Quinten
New Contributor II

08-14-2024 7:04:56 AM

1 kudos

I'm facing the same issue as described by @mrcity. There is no easy way to alter the dataframe, which is created inside the score_batch() function. Filtering out rows in the (sklearn) pipeline itself is also not convenient since these transformers ar...

by Nasreddin • New Contributor

11-02-2021 1:20:19 PM

6869 Views
0 replies
0 kudos

ColumnTransformer not fitted after sklearn Pipeline loaded from Mlflow

I am building a machine learning model using an sklearn Pipeline that includes a ColumnTransformer as a preprocessor before the actual model. The code below shows how the pipeline is created.
transformers = []
num_pipe = Pipeline(steps=[ ('imputer', Si...

by Maser_AZ • New Contributor II

07-29-2019 6:07:11 PM

18247 Views
4 replies
1 kudos

Resolved! How to fix TypeError: __init__() got an unexpected keyword argument 'max_iter'?

# Create the model using sklearn (don't worry about the parameters for now):
model = SGDRegressor(loss='squared_loss', verbose=0, eta0=0.0003, max_iter=3000)
# Train/fit the model to the train-part of the dataset:
model.fit(X_train, y_train)
ERROR: Typ...

Latest Reply

Fantomas_nl
New Contributor II

08-13-2019 3:55:31 AM

1 kudos

Replacing max_iter with n_iter resolves the error. Thanks! It is a bit unusual to run into errors like this with this type of solution from Microsoft, as if it could not have been prevented.
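Whether `max_iter` or `n_iter` works depends on the scikit-learn version installed on the cluster. As far as I know, `max_iter` was added to `SGDRegressor` in scikit-learn 0.19 and `n_iter` was removed in a later release, so the swap above only applies to older installs. Below is a minimal sketch, assuming you want code that runs on either side of that change: it inspects the constructor and passes whichever keyword it accepts. The helper name `iteration_kwarg` and the two stand-in classes are hypothetical and exist only for illustration.

```python
import inspect


def iteration_kwarg(estimator_cls, n_iterations):
    """Return the iteration-count keyword that estimator_cls accepts.

    Hypothetical helper: checks the constructor signature for 'max_iter'
    first, then falls back to 'n_iter'.
    """
    params = inspect.signature(estimator_cls.__init__).parameters
    for name in ("max_iter", "n_iter"):
        if name in params:
            return {name: n_iterations}
    raise TypeError(
        f"{estimator_cls.__name__} accepts neither 'max_iter' nor 'n_iter'"
    )


# Stand-in classes (not real estimators) that mimic the two signatures.
class OldStyleRegressor:
    def __init__(self, eta0=0.01, n_iter=5):
        self.eta0, self.n_iter = eta0, n_iter


class NewStyleRegressor:
    def __init__(self, eta0=0.01, max_iter=1000):
        self.eta0, self.max_iter = eta0, max_iter


old_kwargs = iteration_kwarg(OldStyleRegressor, 3000)
new_kwargs = iteration_kwarg(NewStyleRegressor, 3000)
old_model = OldStyleRegressor(eta0=0.0003, **old_kwargs)
new_model = NewStyleRegressor(eta0=0.0003, **new_kwargs)
```

With scikit-learn installed, the same pattern would read `SGDRegressor(eta0=0.0003, **iteration_kwarg(SGDRegressor, 3000))`. Upgrading the library on the cluster, where that is possible, avoids the need for this kind of shim.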

by AlexRomano • New Contributor

08-15-2019 9:54:41 AM

8922 Views
1 reply
0 kudos

PicklingError: Could not pickle the task to send it to the workers.

I am using sklearn in a Databricks notebook to fit an estimator in parallel. sklearn uses joblib with the loky backend to do this. Now, I have a file in Databricks from which I can import my custom Classifier, and everything works fine. However, if I lite...

Latest Reply

Anonymous
Not applicable

07-16-2020 2:06:42 PM

0 kudos

Hi, aromano. I know this issue was opened almost a year ago, but I faced the same problem and was able to solve it, so I'm sharing the solution to help others. You're probably using SparkTrials to optimize the model's hyperparameters ...
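For anyone debugging a similar PicklingError, one quick check is whether the object handed to the workers can be serialized at all. Below is a minimal sketch using only the standard library `pickle` module; the helper name `is_picklable` and the example classes are hypothetical. One assumption to keep in mind: joblib's loky backend serializes with cloudpickle, as far as I know, which handles more cases than plain `pickle`, so a failure here is a hint about the cause and not proof of it.

```python
import pickle


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


class ModuleLevelClassifier:
    """Stand-in for a classifier defined at module level."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha


def make_local_classifier():
    # A class defined inside a function cannot be looked up by name from
    # its module, so standard pickle cannot serialize its instances.
    class LocalClassifier:
        pass

    return LocalClassifier()


module_level_ok = is_picklable(ModuleLevelClassifier(alpha=0.5))
local_ok = is_picklable(make_local_classifier())
```

If a class fails this check when pasted into a notebook cell but passes when imported from a file, keeping it in an importable module is one workaround consistent with what the original post describes.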

Databricks Community
