- 1755 Views
- 1 reply
- 1 kudos
What is the best way to deal with PyMC3 models in MLflow on Databricks?
Last week, we started using MLflow within Databricks. The Bayesian models we are using right now are PyMC3 models (https://docs.pymc.io/en/v3/index.html). We could use the experiment feature of Databricks/MLflow to save the models as an ...
Hi @Siebert Looije, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer first; otherwise, Bricksters will get back to you soon. Thanks.
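One approach that fits this setup is wrapping the fitted PyMC3 trace in a custom pyfunc model. A minimal sketch, assuming a simple linear model whose trace contains `intercept` and `slope` variables (those names, and the pickled-trace artifact, are assumptions):

```python
import pickle
import numpy as np
import mlflow.pyfunc

class PyMC3Wrapper(mlflow.pyfunc.PythonModel):
    # Serves a fitted PyMC3 trace through the generic pyfunc interface.
    def load_context(self, context):
        # The trace was pickled at training time and logged as an artifact.
        with open(context.artifacts["trace"], "rb") as f:
            self.trace = pickle.load(f)

    def predict(self, context, model_input):
        # Point prediction from posterior means; adapt to your model's structure.
        intercept = np.mean(self.trace["intercept"])
        slope = np.mean(self.trace["slope"])
        return intercept + slope * model_input["x"]

# At training time, inside an MLflow run:
#   with open("trace.pkl", "wb") as f:
#       pickle.dump(trace, f)
#   mlflow.pyfunc.log_model("model", python_model=PyMC3Wrapper(),
#                           artifacts={"trace": "trace.pkl"})
```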
- 1469 Views
- 0 replies
- 1 kudos
Is it possible to change the boilerplate code on a logged/saved pyfunc MLflow model?
When I log a pyfunc MLflow model, it generates a page with this helpful code for using the model in production. Make Predictions / Predict on a Spark DataFrame: import mlflow; from pyspark.sql.functions import struct, col; logged_model = 'runs:/1d...
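For context, the generated snippet typically continues along these lines (the run ID is shown as a placeholder since it is truncated above):

```python
import mlflow
from pyspark.sql.functions import struct, col

logged_model = 'runs:/<run_id>/model'  # placeholder for the truncated run URI

# Load the model as a Spark UDF.
loaded_model = mlflow.pyfunc.spark_udf(spark, model_uri=logged_model, result_type='double')

# Predict on a Spark DataFrame.
df.withColumn('predictions', loaded_model(struct(*map(col, df.columns))))
```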
- 5901 Views
- 3 replies
- 3 kudos
Resolved! ML Practitioner | ML 11 - XGBoost notebook | cannot import keras.applications.resnet50
The following code... from sparkdl.xgboost import XgboostRegressor; from pyspark.ml import Pipeline; params = {"n_estimators": 100, "learning_rate": 0.1, "max_depth": 4, "random_state": 42, "missing": 0}; xgboost = XgboostRegressor(**params); pipeline = Pipel...
You need to choose the ML runtime instead of the standard one.
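On an ML runtime (per the answer above), the truncated snippet from the question plausibly completes to something like this sketch; the pipeline stages and `train_df` are assumptions:

```python
from sparkdl.xgboost import XgboostRegressor
from pyspark.ml import Pipeline

params = {"n_estimators": 100, "learning_rate": 0.1, "max_depth": 4,
          "random_state": 42, "missing": 0}
xgboost = XgboostRegressor(**params)
pipeline = Pipeline(stages=[xgboost])  # add feature-engineering stages as needed
model = pipeline.fit(train_df)         # train_df: a placeholder training DataFrame
```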
- 2430 Views
- 1 reply
- 1 kudos
Pass parameter from Python to SQL - null result
Hello. Could someone please explain, using the example below, why having the prefix "da" in the parameter name allows us to select the parameter value, while omitting it returns a null value? (Screenshots: correct value vs. null value.) Thank you in advance.
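Without the screenshots it is hard to be definitive, but the pattern in question usually looks like this minimal sketch (key and value are hypothetical):

```python
# Python cell: publish a value into the Spark session conf under a dotted key.
spark.conf.set("da.user_name", "alice")
```

A SQL cell can then read it back with SELECT '${da.user_name}'. If the reference does not correspond to a key that was actually set, the substitution has nothing to resolve, which can surface as a null-looking result.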
- 3097 Views
- 2 replies
- 7 kudos
Can't Add Cluster-scoped Init Script to Model Serving Cluster
Similar to this other question: https://community.databricks.com/s/question/0D58Y00008hahwuSAA/cant-edit-the-cluster-created-by-mlflow-model-serving. We're using Azure Databricks and have a model that requires a WHL to be downloaded from a private add...
Has anyone had success with this? I'm trying to solve a resolution issue.
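If editing the serving cluster stays blocked, one workaround (a sketch, not an official recommendation) is to bundle the wheel with the model itself, so serving installs it from the model's pip requirements; the class and wheel path here are hypothetical:

```python
import mlflow.pyfunc

class MyModel(mlflow.pyfunc.PythonModel):
    # Stand-in for the real model class that needs the private package.
    def predict(self, context, model_input):
        return model_input

mlflow.pyfunc.log_model(
    "model",
    python_model=MyModel(),
    # Hypothetical wheel location, staged to DBFS ahead of time:
    pip_requirements=["/dbfs/FileStore/wheels/private_pkg-1.0-py3-none-any.whl"],
)
```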
- 6080 Views
- 5 replies
- 7 kudos
Parallelization in training machine learning models using MLflow
I'm training an ML model (e.g., XGBoost) over a large grid of 5 hyperparameters; if each parameter has 5 candidates, that is 5^5 = 3,125 combos. Now I want to parallelize the grid search over all the hyperparameter combos for ...
Hi @Chen Mu, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!
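A common Databricks pattern for this is Hyperopt with SparkTrials, which runs trials in parallel across the cluster. A sketch, where `train_and_eval` is a placeholder for your XGBoost fit plus validation loss and the search space is illustrative:

```python
from hyperopt import fmin, tpe, hp, SparkTrials

search_space = {
    "max_depth": hp.choice("max_depth", [3, 4, 5, 6, 8]),
    "learning_rate": hp.uniform("learning_rate", 0.01, 0.3),
    "n_estimators": hp.choice("n_estimators", [50, 100, 200, 400, 800]),
}

def objective(params):
    # Placeholder: fit XGBoost with `params` and return the validation loss.
    return train_and_eval(params)

best = fmin(fn=objective, space=search_space, algo=tpe.suggest,
            max_evals=128, trials=SparkTrials(parallelism=8))
```

This trades the exhaustive 3,125-combo grid for a guided search; on Databricks each trial can also be tracked in MLflow, and SparkTrials caps concurrency at the parallelism you set.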
- 13724 Views
- 4 replies
- 17 kudos
Error when reading Excel file: "java.lang.NoClassDefFoundError: shadeio/poi/schemas/vmldrawing/XmlDocument"
Hi, I want to read an Excel file by: filepath_xlsx = "dbfs:/FileStore/data.xlsx" sampleDF = (spark.read.format("com.crealytics.spark.excel") .option("Header", "true") .option("inferSchema", "false") .option("treatEmptyValuesAsNulls", ...
For this dataset, I also tried binary file reading as below:
xldf_xlsx = (
    spark.read.format("binaryFile")
    .option("pathGlobFilter", "*.xls*")
    .load(filepath_xlsx)
)
excel_content = xldf_xlsx.head(1)[0].content
file_like_obj = io.BytesIO(excel...
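A hedged completion of that binary-file approach, assuming `openpyxl` is available on the cluster and the file's bytes fit on the driver:

```python
import io
import pandas as pd

file_like_obj = io.BytesIO(excel_content)              # excel_content from the snippet above
pdf = pd.read_excel(file_like_obj, engine="openpyxl")  # parse the raw bytes with pandas
sampleDF = spark.createDataFrame(pdf)                  # back to a Spark DataFrame
```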
- 1544 Views
- 0 replies
- 2 kudos
How to get the probability of a prediction from a Real-Time Inference model
Hello, I have used AutoML to create a model. When using that model, I want the probability of the predictions returned. I was able to do this in a notebook with: loaded_model = mlflow.pyfunc.load_model(logged_model) # Predict on a Pandas DataFram...
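One way to get probabilities in a notebook (a sketch, assuming the AutoML model has an sklearn flavor) is to load the native model instead of the pyfunc wrapper:

```python
import mlflow.sklearn

# logged_model: the same 'runs:/...' URI used in the question;
# input_df: a pandas DataFrame of features (placeholder).
model = mlflow.sklearn.load_model(logged_model)
probabilities = model.predict_proba(input_df)
```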
- 4009 Views
- 2 replies
- 4 kudos
Feature Store - Feature Lookup Engine with join on partial key and Filter
Hello, I am working with the FeatureLookup functions. However, we have some feature tables whose granularity is more detailed than the input dataframe. Please find an example: table A has unique keys on two features, numero_p and numero_s. So while performing F...
Hi @SERET Nathalie, I can check internally on this ask. In the meantime, please let us know if this helps: https://docs.databricks.com/machine-learning/feature-store/feature-tables.html and https://docs.databricks.com/machine-learning/feature-store/i...
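For reference, a minimal FeatureLookup sketch against the table from the question; the schema, feature name, label, and `input_df` are assumptions:

```python
from databricks.feature_store import FeatureLookup, FeatureStoreClient

fs = FeatureStoreClient()
lookups = [
    FeatureLookup(
        table_name="db.table_a",              # hypothetical database/table name
        lookup_key=["numero_p", "numero_s"],  # the two unique keys from the question
        feature_names=["feature_1"],          # hypothetical feature
    )
]
training_set = fs.create_training_set(
    df=input_df, feature_lookups=lookups, label="label",
)
```

Note that the lookup key has to map onto the feature table's full primary key, so a join on a partial key generally needs the extra filtering handled before or after the lookup.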
- 18792 Views
- 2 replies
- 2 kudos
Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
I have been getting this error sporadically. I'm loading a dataset and training a model using the dataset in notebook. Sometimes it works and sometimes it doesn't. I have seen similar posts and tried all solutions mentioned, log output size limit, sp...
@Leo Bao, are you seeing this issue with datasets of different sizes, or is your dataset size the same? If the issue is due to a larger dataset, please check the link below and try to increase the number of partitions: Databricks Spark Py...
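A sketch of that repartitioning suggestion, assuming a file-based dataset (the path and partition count are placeholders to tune):

```python
# Smaller, more numerous partitions reduce per-task memory pressure on executors.
spark.conf.set("spark.sql.shuffle.partitions", "512")
df = spark.read.parquet("dbfs:/path/to/dataset").repartition(512)
```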
- 3686 Views
- 1 reply
- 1 kudos
Resolved! Problem with Autoloader, S3, and wildcard
Hello, I have Auto Loader code that is pretty standard; we have a file path variable that points to an S3 bucket. Example #2 executed successfully and example #1 throws an exception. It seems like source 1 always throws an exception whereas sour...
The error was more related to a lot of stuff on the AWS side, so we deleted and cleared the SQS and SNS. We also used the CloudFilesAWSResourceManager:
val manager = CloudFilesAWSResourceManager
  .newManager
  .option("path", filePath)
  .create...
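For comparison, a minimal Auto Loader source that avoids embedding a wildcard in the path itself and filters with `pathGlobFilter` instead (the bucket, prefix, and format are assumptions):

```python
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("pathGlobFilter", "*.json")  # filter files instead of wildcarding the URI
    .load("s3://my-bucket/landing/")
)
```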
- 1706 Views
- 1 reply
- 3 kudos
Can the HTML behind SQL visualisations be accessed?
We are using MLflow to manage the usage of some self-service notebooks. This involves logging parameters, tables and figures. Figures are logged using: mlflow.log_figure(figure=fig, artifact_file="visual/fig.html") Usually the fig object is gener...
There is no way to access the HTML used, though you can download the images. The editor uses Redash, so you can try looking at that library for more information.
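As a point of comparison, figures built in a notebook (e.g., with Plotly) can be logged as HTML directly; it is only the SQL editor's visualisations that are not exportable this way:

```python
import mlflow
import plotly.express as px

# MLflow writes Plotly figures to HTML when the artifact file ends in .html.
fig = px.bar(x=["a", "b", "c"], y=[3, 1, 2])
with mlflow.start_run():
    mlflow.log_figure(fig, artifact_file="visual/fig.html")
```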
- 2521 Views
- 2 replies
- 2 kudos
Is it wise to use a more recent MLflow Python package version, or is the DB Runtime compatibility matrix strict about MLflow versions?
More concretely, should we pin the dependency version at the MAJOR, MINOR or PATCH level? For example, MLflow 1.30.0 is available and the latest DBR 11.3 LTS is compatible with 1.29.0. My question comes from the fact that installing our own libraries that use MLflow...
Hi! Thanks for the reply, although maybe you didn't notice that I linked to the same URL, so we're aware of the matrix. The question is: is it compatible solely with 1.29.0? We want to know which dependency we should use in all our projects that migh...
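One hedged convention, given that compatibility is only validated against the version the runtime ships, is to pin at the minor level and let patches float:

```python
# In a Databricks notebook: track the DBR-validated 1.29 series, patch updates only.
%pip install "mlflow==1.29.*"
```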
- 2953 Views
- 3 replies
- 0 kudos
Two or more different ML models on one cluster
Hi, have you already dealt with a situation where you would like to have two different ML models on one cluster? I.e., I have a project which contains two or more different models with different purposes. The goal is to have three differ...
Hi @Tomas Peterek, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Than...
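One pattern for serving several models behind a single endpoint is a router pyfunc model. A sketch, where the artifact names and the 'model' request column are assumptions:

```python
import mlflow.pyfunc

class MultiModel(mlflow.pyfunc.PythonModel):
    # Routes each request to one of several previously logged models.
    def load_context(self, context):
        # Each artifact points at a directory containing a logged MLflow model.
        self.models = {
            name: mlflow.pyfunc.load_model(path)
            for name, path in context.artifacts.items()
        }

    def predict(self, context, model_input):
        # Assumes the input carries a 'model' column naming the target model.
        name = model_input["model"].iloc[0]
        features = model_input.drop(columns=["model"])
        return self.models[name].predict(features)

# mlflow.pyfunc.log_model("router", python_model=MultiModel(),
#                         artifacts={"model_a": "path/to/exported_model_a"})
```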
- 3927 Views
- 4 replies
- 3 kudos
Resolved! Certification badge not received
I completed the Databricks Certified Data Engineer Associate exam on 29th October and received a mail with my score; the mail mentioned that I would receive the badge within 24 hours. It has been 4 days since I completed the exam and still the certificate...
Hi @Manasa Tanguturu, hope everything is going great. Just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please visit th...