Data Engineering

Forum Posts

by JonHMDavis • New Contributor II

12-10-2021 1:00:39 AM

5112 Views
5 replies
2 kudos

Graphframes not importing on Databricks 9.1 LTS ML

Is GraphFrames for Python meant to be installed by default on Databricks 9.1 LTS ML? Previously I was running the attached Python command on 7.3 LTS ML with no issue; however, now I am getting "no module named graphframes" when trying to import the pa...

Latest Reply

malz
New Contributor II

11-07-2024 10:25:53 PM

2 kudos

Hi @MuthuLakshmi, the documentation says that graphframes comes preinstalled in Databricks Runtime for Machine Learning, but when I try to import the graphframes Python module I get a "no module found" error. from graphframes i...

4 More Replies

by User16789201666 • Databricks Employee

06-07-2021 4:36:53 PM

9074 Views
3 replies
4 kudos

Resolved! How do you detect model drift using Databricks?

Latest Reply

arun_pamulapati
Databricks Employee

10-15-2023 3:24:51 AM

4 kudos

Use Lakehouse Monitoring: https://docs.databricks.com/en/lakehouse-monitoring/index.html
Specifically, the drift metrics table: https://docs.databricks.com/en/lakehouse-monitoring/monitor-output.html#drift-metrics-table
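
To make the idea of a drift metric concrete, here is a minimal sketch of one widely used drift statistic, the Population Stability Index (PSI), in plain Python. It is only an illustration: it is not the Lakehouse Monitoring API, and the reply does not say which statistics the drift metrics table contains (the linked docs are the reference for that). The function name, the bin count, and the epsilon floor are all assumptions made for this example.

```python
import math
from typing import List, Sequence


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    n_bins: int = 10,      # assumption: 10 equal-width bins
    eps: float = 1e-6,     # assumption: floor to avoid log(0) on empty bins
) -> float:
    """Compare two samples of one numeric feature; 0.0 means identical binned shares."""
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline must contain at least two distinct values")
    width = (hi - lo) / n_bins

    def bin_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range fall into the first or last bin.
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    p = bin_fractions(baseline)   # reference window (e.g. training data)
    q = bin_fractions(current)    # recent window (e.g. scoring data)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical data: a uniform baseline and a window shifted to the upper half.
baseline_sample = [i / 100 for i in range(100)]
shifted_sample = [0.5 + i / 200 for i in range(100)]

psi_same = population_stability_index(baseline_sample, baseline_sample)
psi_shifted = population_stability_index(baseline_sample, shifted_sample)
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the threshold is a judgment call and should be tuned per feature and use case.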

2 More Replies

by Zoumana • New Contributor II

11-13-2021 5:22:34 AM

17948 Views
5 replies
6 kudos

Resolved! How to get probability score for each prediction from mlflow

I trained my model and was able to get the batch prediction from that model as specified below. But I also want to get the probability scores for each prediction. Do you have any idea? Thank you! logged_model = path_to_model # Load model as a PyFuncMod...

Latest Reply

OndrejHavlicek
New Contributor III

08-08-2023 1:38:41 AM

6 kudos

Now you can log the model using this parameter:

    mlflow.sklearn.log_model(
        ...,  # the usual params
        pyfunc_predict_fn="predict_proba"
    )

which will apparently return probabilities for the first class when using the model for inference (e.g. when...

4 More Replies

by Nikhil3107 • New Contributor III

06-02-2023 8:08:18 AM

10309 Views
1 reply
0 kudos

Model Serving error - Java gateway process exited before sending its port number

Hello, I am trying to serve a model endpoint (using the Databricks GUI) for a model that was successfully logged to the Model Registry. However, the endpoint creation failed with the following errors: Endpoint logs with error messages; Endpoint events with...

Latest Reply

Anonymous
Not applicable

06-19-2023 11:13:12 PM

0 kudos

Hi @Nikhil Gajghate, we haven't heard from you since the last response from @Kaniz Fatma, and I was checking back to see if her suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful to o...

by Chengcheng • New Contributor III

06-15-2023 12:21:56 AM

1986 Views
1 reply
4 kudos

Is Feature Store packaged model compatible with Spark UDF?

Hi, I tried to deploy a Feature Store packaged model into a Delta Live Table using mlflow.pyfunc.spark_udf in Azure Databricks. This model was built by Databricks AutoML with a joined Feature Table inside it. I'm trying to make predictions using the fol...

Latest Reply

Anonymous
Not applicable

06-17-2023 2:34:31 AM

4 kudos

Hi @Chengcheng Guo, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer. Thanks.

by s_plank • New Contributor III

01-23-2023 12:27:10 AM

969 Views
0 replies
0 kudos

Model registry does not load adapted Normalization-Layer (Keras) correctly

Hello, I have a problem with the model registry. I'm not sure if I'm registering the model incorrectly or if it's a bug. Here are some code snippets: import pandas as pd import mlflow import mlflow.keras import mlflow.tensorflow from mlflow.tracking.c...

by rubenteixeira • New Contributor III

01-09-2023 7:16:03 AM

3943 Views
2 replies
0 kudos

Can't parallelize model training with sc.parallelize, even though I can run the same code without parallelizing

I'm training a NeuralProphet model for a time series forecasting problem. I'm trying to parallelize my training, but this error is appearing. The folder lightning_logs has an hparams.yaml, but it's empty. Is this related to permissions on the cluster? Thanks i...

Latest Reply

Debayan
Databricks Employee

01-09-2023 2:07:40 PM

0 kudos

Hi, please let us know if this was checked already:

1 More Replies

by User16826992666 • Valued Contributor

06-25-2021 10:38:31 AM

2088 Views
3 replies
2 kudos

Resolved! What is the best method for bringing an already trained model into MLflow?

I already have a trained and saved model that was created outside of MLflow. What is the best way to handle it if I want this model to be added to an MLflow experiment?

Latest Reply

Anonymous
Not applicable

04-22-2022 7:11:52 AM

2 kudos

Hi @Trevor Bishop, just wanted to check in: were you able to resolve your issue, or do you need more help? We'd love to hear from you. Thanks!

2 More Replies

by admo • New Contributor III

03-17-2022 2:11:05 AM

9554 Views
4 replies
7 kudos

Scaling issue for inference with a spark.mllib model

Hello, I'm writing this because I have tried a lot of different directions to get a simple model inference working, with no success. Here is the outline of the job: # 1 - Load the base data (~1 billion lines of ~6 columns) interaction = build_initial_df()...

Latest Reply

Hubert-Dudek
Esteemed Contributor III

03-17-2022 3:42:49 AM

7 kudos

It is hard to analyze without the Spark UI and more detailed information, but here are a few tips anyway: look for data skew, since some partitions can be very big and some small because of incorrect partitioning. You can use the Spark UI to do that, but also debug your code a bit...
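
As a rough illustration of the data-skew check mentioned in this reply, the sketch below compares each partition's row count with the median partition size. It works on a plain list of counts so that it runs without a cluster; in Spark those counts could be gathered, for example, by grouping on `spark_partition_id()` and counting. The helper names and the 5x threshold are assumptions for this example, not anything taken from the thread.

```python
from statistics import median
from typing import List, Sequence


def skew_ratio(partition_sizes: Sequence[int]) -> float:
    """Return largest partition size divided by the median partition size."""
    if not partition_sizes:
        raise ValueError("need at least one partition size")
    med = median(partition_sizes)
    if med == 0:
        # All (or most) partitions are empty; any non-empty one is an outlier.
        return float("inf") if max(partition_sizes) > 0 else 1.0
    return max(partition_sizes) / med


def skewed_partitions(partition_sizes: Sequence[int], threshold: float = 5.0) -> List[int]:
    """Return indices of partitions larger than `threshold` times the median."""
    med = median(partition_sizes)
    return [i for i, n in enumerate(partition_sizes) if med > 0 and n > threshold * med]


# Hypothetical row counts per partition (assumption: collected beforehand,
# e.g. via df.groupBy(spark_partition_id()).count() in PySpark).
sizes = [100, 110, 90, 105, 5000]

ratio = skew_ratio(sizes)               # how lopsided the worst partition is
outliers = skewed_partitions(sizes)     # which partitions to look at first
```

A ratio close to 1 suggests evenly sized partitions. A large ratio points at the partitions worth inspecting, for instance for a hot join key, before trying a repartition.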

3 More Replies

by gibbona1 • New Contributor II

02-07-2022 8:28:46 AM

4422 Views
2 replies
1 kudos

Resolved! Correct setup and format for calling REST API for image classification

I trained a basic image classification model on MNIST using TensorFlow, logging the experiment run with MLflow. Model: "my_sequential" _________________________________________________________________ Layer (type) Output Shape ...

Latest Reply

Atanu
Databricks Employee

03-15-2022 9:40:04 PM

1 kudos

@Anthony Gibbons, maybe this GitHub issue will work for your use case: https://github.com/mlflow/mlflow/issues/1661

1 More Replies

by MichaelO • New Contributor III

01-28-2022 1:49:44 PM

13014 Views
2 replies
2 kudos

Resolved! Transfer files saved in filestore to either the workspace or to a repo

I built a machine learning model:

    lr = LinearRegression()
    lr.fit(X_train, y_train)

which I can save to the filestore by:

    filename = "/dbfs/FileStore/lr_model.pkl"
    with open(filename, 'wb') as f:
        pickle.dump(lr, f)

Ideally, I wanted to save the model ...
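
The save step in this question, together with the matching load step, can be sketched as a small round trip. To keep the example runnable anywhere, it makes two assumptions that are not in the post: a plain dict of coefficients stands in for the fitted `LinearRegression`, and a temporary directory stands in for `/dbfs/FileStore`. The helper names are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any


def save_model(model: Any, path: Path) -> None:
    """Pickle `model` to `path`, creating parent directories if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path: Path) -> Any:
    """Load a previously pickled model. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for the fitted estimator (assumption: y = 2.0 * x + 1.0).
model = {"coef": 2.0, "intercept": 1.0}

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for "/dbfs/FileStore/lr_model.pkl" from the post.
    target = Path(tmp) / "FileStore" / "lr_model.pkl"
    save_model(model, target)
    restored = load_model(target)

# Use the restored parameters exactly as the original would be used.
predictions = [restored["coef"] * x + restored["intercept"] for x in [0.0, 1.0, 2.0]]
```

The same two helpers work with a real scikit-learn estimator, provided the environment that loads the file has a compatible scikit-learn version installed.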

Latest Reply

Hubert-Dudek
Esteemed Contributor III

02-01-2022 7:25:47 AM

2 kudos

Workspace and Repos are not fully available via DBFS, as they have separate access rights. It is better to use MLflow for your models, as it is like git but for ML. I think that with MLOps you can then also put your model into git.

1 More Replies

by maranBH • New Contributor III

11-24-2021 6:21:55 AM

2343 Views
3 replies
1 kudos

Resolved! Trained model artifact, CI/CD and Databricks without MLFlow.

Hi all, we are constructing our CI/CD pipelines with the Repos feature following this guide: https://databricks.com/blog/2021/09/20/part-1-implementing-ci-cd-on-databricks-using-databricks-notebooks-and-azure-devops.html I'm trying to implement my pipes...

Latest Reply

sean_owen
Databricks Employee

01-05-2022 7:14:39 PM

1 kudos

So you are managing your models with MLflow, and want to include them in a git repository? You can do that in a CI/CD process; it would run the mlflow CLI to copy the model you want (e.g. models:/my_model/production) to a git checkout and then commit i...

2 More Replies

by marchello • New Contributor III

11-24-2021 10:59:57 AM

2742 Views
5 replies
6 kudos

Resolved! register model - need python 3, but get only python 2

Hi all, I'm trying to register a model with Python 3 support, but I keep getting only Python 2. I can see that runtime 6.0 and above get Python 3 by default, but I don't see a way to set either the runtime version or the Python version during model regi...

Latest Reply

marchello
New Contributor III

12-03-2021 9:13:17 AM

6 kudos

Hi team, thanks for getting back to me. Let's put this on hold for now. I will update once it's needed again. It was solely for educational purposes, and right now I have quite urgent stuff to do. Have a great day.

4 More Replies

by Orianh • Valued Contributor II

11-14-2021 1:00:06 AM

4276 Views
3 replies
1 kudos

Train deep learning model with numpy arrays.

Hey guys, I'm trying to train a deep learning model on Databricks ML with NumPy arrays as input. For now I have organized all the data inside a DataFrame. df contains 4 columns: col1, col2, col3, col4. col1 and col2 have arrays with shape (1,3,3,3,3), col3 has an array wit...

Latest Reply

Hubert-Dudek
Esteemed Contributor III

11-15-2021 2:06:47 AM

1 kudos

Maybe you could share some of your code. It would be easier to answer, and we could also learn about deep learning in Databricks from your code.

2 More Replies

by Nasreddin • New Contributor

11-02-2021 1:20:19 PM

5993 Views
0 replies
0 kudos

ColumnTransformer not fitted after sklearn Pipeline loaded from Mlflow

I am building a machine learning model using an sklearn Pipeline, which includes a ColumnTransformer as a preprocessor before the actual model. Below is the code showing how the pipeline is created. transformers = [] num_pipe = Pipeline(steps=[ ('imputer', Si...
