cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

Nasreddin
by New Contributor
  • 4396 Views
  • 2 replies
  • 0 kudos

ColumnTransformer not fitted after sklearn Pipeline loaded from Mlflow

I am building a machine learning model using sklearn Pipeline which includes a ColumnTransformer as a preprocessor before the actual model. Below is the code how the pipeline is created.transformers = [] num_pipe = Pipeline(steps=[ ('imputer', Si...

  • 4396 Views
  • 2 replies
  • 0 kudos
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Nasreddin, MLflow is compatible with sklearn Pipeline with multiple steps. The error you're encountering, "This ColumnTransformer instance is not fitted yet. Call’ fit’ with appropriate arguments before using this estimator." is likely because  C...

  • 0 kudos
1 More Replies
mrcity
by New Contributor II
  • 1229 Views
  • 2 replies
  • 1 kudos

Exclude absent lookup keys from dataframes made by create_training_set()

I've got data stored in feature tables, plus in a data lake. The feature tables are expected to lag the data lake by at least a little bit. I want to filter data coming out of the feature store by querying the data lake for lookup keys out of my inde...

  • 1229 Views
  • 2 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Stephen Wylie​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Than...

  • 1 kudos
1 More Replies
MudassarA
by New Contributor II
  • 11533 Views
  • 4 replies
  • 1 kudos

Resolved! How to fix TypeError: __init__() got an unexpected keyword argument 'max_iter'?

# Create the model using sklearn (don't worry about the parameters for now): model = SGDRegressor(loss='squared_loss', verbose=0, eta0=0.0003, max_iter=3000) Train/fit the model to the train-part of the dataset: odel.fit(X_train, y_train) ERROR: Typ...

  • 11533 Views
  • 4 replies
  • 1 kudos
Latest Reply
Fantomas_nl
New Contributor II
  • 1 kudos

Replacing max_iter with n_iter resolves the error. Thnx! It is a bit unusual to expect errors like this with this type of solution from Microsoft. As if it could not be prevented..

  • 1 kudos
3 More Replies
AlexRomano
by New Contributor
  • 5716 Views
  • 1 replies
  • 0 kudos

PicklingError: Could not pickle the task to send it to the workers.

I am using sklearn in a databricks notebook to fit an estimator in parallel. Sklearn uses joblib with loky backend to do this. Now, I have file in databricks which I can import my custom Classifier from, and everything works fine. However, if I lite...

  • 5716 Views
  • 1 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi, aromano I know this issue was opened almost a year ago, but I faced the same problem and I was able to solve it. So, I'm sharing the solution in order to help others. Probably, you're using SparkTrials to optimize the model's hyperparameters ...

  • 0 kudos
Labels