Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
When loading an xgboost model from Databricks-hosted MLflow following the provided instructions, the input sizes shown on the job are over 1 TB. Is anyone else using an xgboost.spark model and noticing the same behavior? Below are som...
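For reference, a minimal sketch of the loading pattern in question, assuming the model was logged under the default "model" artifact path (the run ID is a placeholder):

    import mlflow

    # Placeholder run ID; substitute the run that logged the xgboost.spark model.
    run_id = "<run_id>"

    # Load either as a Spark ML model for distributed scoring...
    spark_model = mlflow.spark.load_model(f"runs:/{run_id}/model")

    # ...or as a generic pyfunc for single-node pandas scoring.
    pyfunc_model = mlflow.pyfunc.load_model(f"runs:/{run_id}/model")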
I am trying to save a model after distributed training via the following code:

    import sys
    from spark_tensorflow_distributor import MirroredStrategyRunner
    import mlflow.keras

    mlflow.keras.autolog()
    mlflow.log_param("learning_rate", 0.001)
    import...
I think I finally worked this out. Here is the extra code to save out the model only once and from the 1st node:

    context = pyspark.BarrierTaskContext.get()
    if context.partitionId() == 0:
        mlflow.keras.log_model(model, "mymodel")
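For readers landing here, a self-contained sketch of where that guard sits, assuming MirroredStrategyRunner from spark-tensorflow-distributor (the model itself is a placeholder):

    import pyspark
    import mlflow.keras
    from spark_tensorflow_distributor import MirroredStrategyRunner

    def train():
        import tensorflow as tf
        # Placeholder model; each barrier task trains its replica here.
        model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
        model.compile(optimizer="adam", loss="mse")
        # ... model.fit(...) ...

        # Only the first barrier task logs, so the model is saved exactly once.
        context = pyspark.BarrierTaskContext.get()
        if context.partitionId() == 0:
            mlflow.keras.log_model(model, "mymodel")

    MirroredStrategyRunner(num_slots=2).run(train)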
I am trying to load a simple MinMaxScaler model that was logged as a run through Spark's ML Pipeline API for reuse. On average it takes 40+ seconds just to load the model with the following example: This is fine and the model transforms my data corre...
Hello, any solutions found for this issue? I'm serving up a large number of models at a time, but since we converted to PySpark (due to our data demands), mlflow.spark.load_model() is taking hours. Part of the reason to switch to Spark was to help w...
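Not a confirmed fix, but one mitigation worth sketching: load each model once and cache it, so the slow load cost is paid per model rather than per call (the cache and helper below are illustrative names, not part of MLflow):

    import mlflow

    # Illustrative in-process cache keyed by model URI.
    _model_cache = {}

    def get_model(model_uri):
        # Pay the slow mlflow.spark.load_model cost only on first use.
        if model_uri not in _model_cache:
            _model_cache[model_uri] = mlflow.spark.load_model(model_uri)
        return _model_cache[model_uri]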
I have a pyfunc model that I can use to get predictions. It takes time series data with context information at each date and produces a string of predictions. For example: The data is set up like below (temp/pressure/output are different from my inpu...
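For context, the general shape of such a model as a minimal pyfunc sketch; the column name and the constant prediction are placeholders, not the poster's actual schema:

    import mlflow.pyfunc
    import pandas as pd

    class SeriesPredictor(mlflow.pyfunc.PythonModel):
        def predict(self, context, model_input: pd.DataFrame) -> pd.Series:
            # Placeholder logic: emit one string prediction per date group.
            return model_input.groupby("date").apply(lambda g: "predicted-string")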
I'm currently immersed in a project where I'm leveraging PyTorch to develop an object detection model using satellite imagery. My immediate objective is to perform distributed training on this model using PySpark. While I have found several tutorials...
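One option, if you are on Spark 3.4+ or a recent Databricks ML runtime, is TorchDistributor; a minimal sketch, with the training function standing in for the actual detection loop:

    from pyspark.ml.torch.distributor import TorchDistributor

    def train_fn():
        import torch
        # Placeholder for the object-detection training loop (model, data,
        # optimizer, epochs) that would normally run on each process.
        return torch.__version__

    # Distribute across the cluster on GPUs; tune num_processes to your cluster.
    result = TorchDistributor(num_processes=2, local_mode=False, use_gpu=True).run(train_fn)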
Hi @Jaeseon Song, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...
When I try to serve a model stored with FeatureStoreClient().log_model using the feature-store-online-example-cosmosdb tutorial Notebook, I get errors suggesting that the primary key schema is not configured properly. However, if I look in the Featur...
Hello @Thomas Michielsen, this error seems to occur when you have created the table yourself. You must use publish_table() to create the table in the online store. Do not manually create a database or container inside Cosmos DB. publish_table()...
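For completeness, a sketch of the publish_table() flow from the Cosmos DB tutorial, with placeholder account URI, secret prefixes, and table name:

    from databricks.feature_store import FeatureStoreClient
    from databricks.feature_store.online_store_spec import AzureCosmosDBSpec

    fs = FeatureStoreClient()

    # Placeholder account URI and secret prefixes; use your own scope/keys.
    online_store = AzureCosmosDBSpec(
        account_uri="https://<account>.documents.azure.com:443/",
        write_secret_prefix="my-scope/cosmos-write",
        read_secret_prefix="my-scope/cosmos-read",
    )

    # publish_table() creates the Cosmos DB database and container itself,
    # with the primary key schema the online lookup expects.
    fs.publish_table(name="feature_db.user_features", online_store=online_store)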
Hi. We have around 30 models in model storage that we use for batch scoring. These were created at different times, by different people, and on different cluster runtimes. Now we have run into problems where we can't deserialize the models and use them for in...
@Jonas Lindberg: To address the issues you are facing with model serialization and versioning, I would recommend the following approach: Use MLflow to manage the lifecycle of your models, including versioning, deployment, and monitoring. MLflow is an...
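As an illustration of that recommendation, a minimal sketch of logging a model under a registered name so every retrain produces a tracked version (the model and name here are placeholders):

    import mlflow
    from sklearn.linear_model import LogisticRegression

    # Placeholder model standing in for any of the ~30 batch-scoring models.
    model = LogisticRegression().fit([[0.0], [1.0]], [0, 1])

    with mlflow.start_run():
        # Each call creates a new version under the same registered name,
        # regardless of who trained it or on which cluster runtime.
        mlflow.sklearn.log_model(model, "model",
                                 registered_model_name="batch_scoring_model")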
Hey, we have two models, A and B. Model A is fed from raw data that is first cleaned, enriched, and forecasted. The results from model A are fed into model B. The processes for cleaning, enriching, forecasting, model A, and model B are all under ver...
Hi @polly halton, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...
Dear community, I basically want to store 2 pickle files during training and model registration with my keras model, so that when I access the model from another workspace (using mlflow.set_registry_uri()), these files can be accessed as well. The ...
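One pattern that may fit, sketched with placeholder file names: log the pickles as run artifacts next to the keras model, then fetch them from the other workspace via the model version's run:

    import mlflow
    import tensorflow as tf

    # Placeholder keras model; substitute the real trained model.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

    with mlflow.start_run():
        mlflow.keras.log_model(model, "model",
                               registered_model_name="my_keras_model")
        # Placeholder pickle files produced during training; logging them keeps
        # them under the same run so they can be fetched alongside the model.
        mlflow.log_artifact("scaler.pkl", artifact_path="extras")
        mlflow.log_artifact("encoder.pkl", artifact_path="extras")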
Hey guys, I'm training a TF model in Databricks and logging to TensorBoard using SummaryWriter. At the end of each epoch, SummaryWriter.flush() is called, which should send any buffered data into storage. But I can't see the TensorBoard files while th...
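One thing worth checking, stated as an assumption: point the writer at a /dbfs path rather than the driver's local disk, so flushed event files are visible outside the job while training is still running. A sketch with the PyTorch SummaryWriter and a placeholder log directory:

    from torch.utils.tensorboard import SummaryWriter

    # Placeholder DBFS path; files on the driver's local disk only become
    # visible after the job finishes.
    writer = SummaryWriter(log_dir="/dbfs/tensorboard/my_experiment")

    for epoch in range(3):
        writer.add_scalar("loss", 1.0 / (epoch + 1), epoch)
        writer.flush()  # push buffered events to the DBFS log_dir each epoch

    writer.close()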
Hi @orian hindi, hope everything is going great. Just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so w...
Have you heard about Databricks' latest open-source language model called Dolly? It's a ChatGPT-like model that uses the tatsu-lab/alpaca dataset with examples of questions and answers. To train Dolly, you can combine this dataset (simple solution on ...
Thanks for posting this! I am so excited about the possibilities this opens up for us. It's an exciting development in the natural language processing field, and it has the potential to be a valuable tool for businesses looking to implement chatb...
I'd like to continue/fine-tune training of an existing keras/tensorflow model. We use MLflow to store the model. How can I load the weights from an existing model into the model and continue "fit", preferably with a different learning rate? Just loading ...
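A minimal sketch of the continue-training pattern, with placeholder model URI, loss, and data; re-compiling after loading is what lets you swap in a different learning rate:

    import mlflow
    import numpy as np
    import tensorflow as tf

    # Placeholder registry URI; could also be a runs:/<run_id>/model URI.
    model = mlflow.keras.load_model("models:/my_keras_model/1")

    # Re-compile with a new optimizer to change the learning rate; the loaded
    # weights are kept, only the training configuration is replaced.
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss="mse")  # placeholder loss; match the original training loss

    # Placeholder data standing in for the real training set.
    x_train, y_train = np.random.rand(8, 4), np.random.rand(8, 1)
    model.fit(x_train, y_train, epochs=5)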
Hi @Tilo Wünsche, hope all is well! Just wanted to check in to see if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else, please let us know if you need more help. We'd love to hear from you. Thank...
I'm using Databricks and trying to log a model to MLflow using the Feature Store log_model function, but I get this error: TypeError: join() argument must be str, bytes, or os.PathLike object, not 'dict'. I'm using the Databricks ML runtime (10.4 LTS M...
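For comparison, a sketch of the expected call shape on that runtime; the model, training set, and names are placeholders, since the original snippet is not shown:

    from databricks.feature_store import FeatureStoreClient
    import mlflow

    fs = FeatureStoreClient()

    # Placeholders: `model` is the trained estimator and `training_set` is the
    # object returned by fs.create_training_set(...) used to train it.
    fs.log_model(
        model,
        artifact_path="model",
        flavor=mlflow.sklearn,
        training_set=training_set,
        registered_model_name="my_fs_model",
    )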
Hi! I want to call the generated endpoint directly with a JSON file consisting of texts. Could this endpoint take the raw texts, transform them into vectors, and then output the prediction? Is there a way to support this? Thanks in advance!
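One way to get that behavior, sketched as an assumption about the setup: wrap the text-to-vector step and the predictor together in a pyfunc model, so the serving endpoint itself performs the transformation (the artifact keys and column name are hypothetical):

    import mlflow.pyfunc
    import pandas as pd

    class TextServingModel(mlflow.pyfunc.PythonModel):
        def load_context(self, context):
            import pickle
            # Hypothetical artifact keys registered when the model was logged.
            with open(context.artifacts["vectorizer"], "rb") as f:
                self.vectorizer = pickle.load(f)
            with open(context.artifacts["classifier"], "rb") as f:
                self.classifier = pickle.load(f)

        def predict(self, context, model_input: pd.DataFrame):
            # The endpoint receives raw text and vectorizes it here.
            vectors = self.vectorizer.transform(model_input["text"])
            return self.classifier.predict(vectors)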
Hi, the updated document is: https://docs.databricks.com/machine-learning/model-inference/serverless/serverless-real-time-inference.html (as mentioned in the document stated above: "This documentation has been retired and might not be updated. The prod...
Hi all, I've deployed a model, moved it to production, and served it (MLflow), but when testing it in the Python notebook I get a 400 error. Code/details below:

    import os
    import requests
    import json
    import pandas as pd
    import numpy as np
    # Create two record...
data_json in the score_model function should be defined as follows:

    ds_dict = {"dataframe_split": dataset.to_dict(orient='split')} if isinstance(dataset, pd.DataFrame) else create_tf_serving_json(dataset)
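Putting that together, a sketch of a working score_model, with placeholder workspace URL and endpoint name, and assuming a DATABRICKS_TOKEN environment variable:

    import os
    import json
    import requests
    import pandas as pd

    def create_tf_serving_json(data):
        # Matches the helper from the generated scoring snippet.
        return {"inputs": {name: data[name].tolist() for name in data.keys()}
                if isinstance(data, dict) else data.tolist()}

    def score_model(dataset):
        # Placeholder workspace URL and endpoint name.
        url = "https://<workspace>/serving-endpoints/<endpoint>/invocations"
        headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}",
                   "Content-Type": "application/json"}
        ds_dict = ({"dataframe_split": dataset.to_dict(orient="split")}
                   if isinstance(dataset, pd.DataFrame)
                   else create_tf_serving_json(dataset))
        response = requests.post(url, headers=headers, data=json.dumps(ds_dict))
        response.raise_for_status()  # surfaces 400s with the server's message
        return response.json()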