Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

jonathan-dufaul
by Valued Contributor
  • 1586 Views
  • 0 replies
  • 1 kudos

Is it possible to change the boilerplate code on a logged/saved pyfunc MLflow model?

When I log a pyfunc MLflow model, it generates a page that has this helpful code for using the model in production, under "Make Predictions", "Predict on a Spark DataFrame": import mlflow; from pyspark.sql.functions import struct, col; logged_model = 'runs:/1d......

Slalom_Tobias
by New Contributor III
  • 6637 Views
  • 3 replies
  • 3 kudos

Resolved! ML Practitioner | ML 11 - XGBoost notebook | cannot import keras.applications.resnet50

The following code... from sparkdl.xgboost import XgboostRegressor; from pyspark.ml import Pipeline; params = {"n_estimators": 100, "learning_rate": 0.1, "max_depth": 4, "random_state": 42, "missing": 0}; xgboost = XgboostRegressor(**params); pipeline = Pipel...

Latest Reply
Prabakar
Databricks Employee
  • 3 kudos

You need to choose the Databricks Runtime for ML instead of the standard runtime.

2 More Replies
elgeo
by Valued Contributor II
  • 2633 Views
  • 1 replies
  • 1 kudos

Pass parameter from python to SQL - Null result

Hello. Could someone please explain why, in the example below, having the prefix "da" in the parameter name allows us to select the parameter value, whereas omitting it returns a null value? Correct value. Null value. Thank you in advance.

UDF_OK UDF_NOT_OK
Latest Reply
elgeo
Valued Contributor II
  • 1 kudos

Any insight on this? Thank you!

anthonylavado
by New Contributor III
  • 3338 Views
  • 2 replies
  • 7 kudos

Can't Add Cluster-scoped Init Script to Model Serving Cluster

Similar to this other question: https://community.databricks.com/s/question/0D58Y00008hahwuSAA/cant-edit-the-cluster-created-by-mlflow-model-serving We're using Azure Databricks, and have a model that requires a WHL to be downloaded from a private add...

Latest Reply
939772
New Contributor III
  • 7 kudos

Has anyone had success with this? Trying to solve a resolve issue.

1 More Replies
ianchenmu
by New Contributor III
  • 6520 Views
  • 5 replies
  • 7 kudos

Parallelization in training machine learning models using MLflow

I'm training an ML model (e.g., XGBoost) and I have a large combination of 5 hyperparameters; say each parameter has 5 candidates, so there are 5^5 = 3,125 combos. Now I want to parallelize the grid search over all the hyperparameter combos for ...
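As a rough illustration of the setup described in this question, the sketch below enumerates a 5 x 5 x 5 x 5 x 5 grid and scores every combination in parallel using only the Python standard library. The parameter names, the candidate values, and the `evaluate` function are all assumptions made up for the example; a real version would train and validate a model inside `evaluate`. On a Spark cluster, a distributed scheduler (Hyperopt's `SparkTrials` is one option) would typically take the place of the local thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical search space: 5 hyperparameters with 5 candidates each.
search_space = {
    "n_estimators": [50, 100, 200, 400, 800],
    "learning_rate": [0.01, 0.03, 0.1, 0.3, 1.0],
    "max_depth": [2, 4, 6, 8, 10],
    "subsample": [0.5, 0.6, 0.7, 0.8, 0.9],
    "min_child_weight": [1, 2, 4, 8, 16],
}


def evaluate(params):
    """Placeholder objective; a real version would train and validate a model."""
    # Toy loss that is smallest at learning_rate=0.1 and max_depth=6.
    return abs(params["learning_rate"] - 0.1) + abs(params["max_depth"] - 6)


# Build the full Cartesian product of candidates as a list of dicts.
names = list(search_space)
grid = [dict(zip(names, values)) for values in product(*search_space.values())]

# Score every combination concurrently; map() preserves the grid order.
with ThreadPoolExecutor(max_workers=8) as pool:
    losses = list(pool.map(evaluate, grid))

# Pick the combination with the lowest loss (first one wins on ties).
best_loss, best_params = min(zip(losses, grid), key=lambda pair: pair[0])
print(len(grid), best_loss, best_params)
```

Threads are used here only to keep the sketch self-contained; for CPU-bound training, separate processes or cluster workers are usually the better fit.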

Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hi @Chen Mu, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

4 More Replies
Mado
by Valued Contributor II
  • 14268 Views
  • 4 replies
  • 17 kudos

Error when reading Excel file: "java.lang.NoClassDefFoundError: shadeio/poi/schemas/vmldrawing/XmlDocument"

Hi, I want to read an Excel file with: filepath_xlsx = "dbfs:/FileStore/data.xlsx"; sampleDF = (spark.read.format("com.crealytics.spark.excel").option("Header", "true").option("inferSchema", "false").option("treatEmptyValuesAsNulls", ...

Latest Reply
Mado
Valued Contributor II
  • 17 kudos

For this dataset, I also tried binary file reading as below: xldf_xlsx = (spark.read.format("binaryFile").option("pathGlobFilter", "*.xls*").load(filepath_xlsx)); excel_content = xldf_xlsx.head(1)[0].content; file_like_obj = io.BytesIO(excel...

3 More Replies
NSRBX
by Contributor
  • 4298 Views
  • 2 replies
  • 4 kudos

Feature Store - Feature Lookup Engine with join on partial key and Filter

Hello, I am working with lookupEngine functions. However, we have some feature tables whose granularity is more detailed than that of the input DataFrame. Please find an example: table A with unique keys on two features: numero_p, numero_s. So while performing F...

Latest Reply
Debayan
Databricks Employee
  • 4 kudos

Hi @SERET Nathalie, I can check internally on the ask here. In the meantime please let us know if this helps: https://docs.databricks.com/machine-learning/feature-store/feature-tables.html https://docs.databricks.com/machine-learning/feature-store/i...

1 More Replies
Leodatabricks
by Contributor
  • 19248 Views
  • 2 replies
  • 2 kudos

Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

I have been getting this error sporadically. I'm loading a dataset and training a model on it in a notebook. Sometimes it works and sometimes it doesn't. I have seen similar posts and tried all the solutions mentioned: log output size limit, sp...

Latest Reply
karthik_p
Esteemed Contributor
  • 2 kudos

@Leo Bao Are you seeing this issue with data sets of different sizes, or is the data set size always the same? If the issue is due to a larger dataset, please check the link below and try to increase the partition size: Databricks Spark Py...

1 More Replies
Raymond_Garcia
by Contributor II
  • 3924 Views
  • 1 replies
  • 1 kudos

Resolved! Problem with Autoloader, S3, and wildcard

Hello, I have some fairly standard Auto Loader code with a file path variable that points to an S3 bucket. Example 2 executes successfully and example 1 throws an exception. It seems like source 1 always throws an exception whereas sour...

Latest Reply
Raymond_Garcia
Contributor II
  • 1 kudos

The error was mostly related to several things on the AWS side, so we deleted and cleared the SQS and SNS resources. We also used the CloudFilesAWSResourceManager: val manager = CloudFilesAWSResourceManager.newManager.option("path", filePath).create...

bs_77
by New Contributor II
  • 1825 Views
  • 1 replies
  • 3 kudos

Can the HTML behind SQL visualisations be accessed?

We are using MLflow to manage the usage of some self-service notebooks. This involves logging parameters, tables and figures. Figures are logged using: mlflow.log_figure(figure=fig, artifact_file="visual/fig.html"). Usually the fig object is gener...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

There is no way to access the HTML used. You can download the images. The editor uses Redash, so you can try looking at that library for more information.

fermin_vicente
by New Contributor III
  • 2870 Views
  • 2 replies
  • 2 kudos

Is it wise to use a more recent MLflow Python package version, or is the DB Runtime compatibility matrix strict about MLflow versions?

More concretely, should we fix the dependency version at MAJOR, MINOR or PATCH? For example, MLflow 1.30.0 is available and the latest DBR 11.3 LTS is compatible with 1.29.0. My question comes from the fact that installing our own libraries that use MLFlow...
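To make the three pinning levels in this question concrete, here is a minimal sketch of how each one treats the versions mentioned above (1.29.0 as the base, 1.30.0 as the newer release). The `parse` and `satisfies` helpers are hypothetical, written only for this illustration, and assume plain `X.Y.Z` version strings; the pip-style specifiers in the docstring are the usual equivalents for each level. This says nothing about which MLflow versions a given Databricks Runtime actually supports.

```python
def parse(version):
    """Turn 'X.Y.Z' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(candidate, base, level):
    """Check whether `candidate` is allowed when `base` is pinned at `level`.

    level="major": same MAJOR, at least base       (like mlflow>=1.29.0,<2)
    level="minor": same MAJOR.MINOR, at least base (like mlflow~=1.29.0)
    level="patch": exact match                     (like mlflow==1.29.0)
    """
    cand, ref = parse(candidate), parse(base)
    if level == "major":
        return cand[0] == ref[0] and cand >= ref
    if level == "minor":
        return cand[:2] == ref[:2] and cand >= ref
    if level == "patch":
        return cand == ref
    raise ValueError(f"unknown pin level: {level}")


# Would 1.30.0 be accepted if the dependency is pinned relative to 1.29.0?
results = {
    level: satisfies("1.30.0", "1.29.0", level)
    for level in ("major", "minor", "patch")
}
print(results)
```

Only the MAJOR-level pin admits 1.30.0 here; pinning at MINOR or PATCH keeps the install on the 1.29 line.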

Latest Reply
fermin_vicente
New Contributor III
  • 2 kudos

Hi! Thanks for the reply, although maybe you didn't notice that I linked to the same URL, so we're aware of the matrix. The question is: is it compatible solely with 1.29.0? We want to know which dependency we should use in all our projects that migh...

1 More Replies
TomasP
by New Contributor III
  • 3284 Views
  • 3 replies
  • 0 kudos

Two or more different ML models on one cluster

Hi, have you already dealt with a situation where you would like to have two different ML models on one cluster? I.e., I have a project that contains two or more different models with different purposes. The goal is to have three differ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Tomas Peterek, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...

2 More Replies
Developer35
by New Contributor III
  • 4194 Views
  • 4 replies
  • 3 kudos

Resolved! Certification badge not received

I completed the Databricks Certified Data Engineer Associate exam on 29th October and received an email with my score, which said that I would receive the badge within 24 hours. It has been 4 days since I completed the exam, still certificate...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Manasa Tanguturu, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please visit th...

3 More Replies
eshaanpathak
by New Contributor III
  • 6196 Views
  • 2 replies
  • 4 kudos

AttributeError: 'NoneType' object has no attribute 'enum_types_by_name'

I run into this error while using MLflow: AttributeError: 'NoneType' object has no attribute 'enum_types_by_name'. Here is the relevant stack trace: /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/mlflow/tracking/fluent....

Latest Reply
Debayan
Databricks Employee
  • 4 kudos

Hi, could you please refer to https://github.com/protocolbuffers/protobuf/issues/10151 to check whether this is the same issue? Also, could you please let us know the DBR version you are using?

1 More Replies