Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

yorabhir
by New Contributor III
  • 1068 Views
  • 0 replies
  • 0 kudos

ModuleNotFoundError: No module named 'model_train' when using mlflow.sklearn.load_model

Hello, I have multiple versions of a model registered in the model registry. When I try to load any version other than version 1 with mlflow.sklearn.load_model(f"models:/{model_name}/{model_version}"), I get ModuleNotFoundError: No module...

ukaplan
by New Contributor III
  • 761 Views
  • 0 replies
  • 0 kudos

Serving Endpoint Container Image Creation Fails

Hello, yesterday I sent this message, but I guess some AI flagging tool or non-technical moderator thought the error logs were spam, so no one could see it. I am therefore restating my problem without error logs this time. Essentially, after I train my m...

Quinten
by New Contributor II
  • 1750 Views
  • 2 replies
  • 0 kudos

TrainingSet schema difference during training and inference

Hi, I'm using the Feature Store to train an ML model and log it using MLflow and FeatureStoreClient(). This model is then used for inference. I understand the schema of the TrainingSet should not differ between training time and inference time. However...

Latest Reply
KumaranT
Databricks Employee
  • 0 kudos

Hi @Quinten, you can consider creating a custom feature group to store the "weight" column during training. This way, you can keep the schema of the TrainingSet consistent between training and inference time. Here are the steps you can follow: Create a...

1 More Replies
MohsenJ
by Contributor
  • 1829 Views
  • 2 replies
  • 0 kudos

FeatureEngineeringClient failing to run inference with mlflow.spark flavor

I am using the Databricks FeatureEngineeringClient to log my spark.ml model for batch inference. I use the ALS model on the MovieLens dataset. My dataset has three columns: user_id, item_id and rank. Here is my code to prepare the dataset: fe_data = fe.crea...

Latest Reply
MohsenJ
Contributor
  • 0 kudos

@KumaranT I already did that, with the same result:
import mlflow.pyfunc
# Load the model as a PyFuncModel
model = mlflow.pyfunc.load_model(model_uri=f"{model_version_uri}")
# Create a Spark UDF for scoring
predict_udf = mlflow.pyfunc.spark_udf(spark, ...

1 More Replies
HappyScientist
by New Contributor
  • 3362 Views
  • 1 replies
  • 0 kudos

Received Fatal error: The Python kernel is unresponsive.

I am running a Databricks job on a cluster and I keep running into the following issue (pasted below in bold). The job trains a machine learning model on a modestly sized dataset (~ half GB). Note that I use pandas dataframes for the data, sklearn for...

Latest Reply
KumaranT
Databricks Employee
  • 0 kudos

Hi @HappyScientist, can you increase the memory size of your cluster and try again?

c3
by New Contributor II
  • 1620 Views
  • 1 replies
  • 0 kudos

AutoML workflows will no longer run with job compute

We have a few workflows that have been running fine with job compute (runtime 14x). They started failing on 6/3 with the following error: The cluster [xxx] is not an all-purpose cluster. existing_cluster_id only supports all-purpose cluster IDs. I w...

Latest Reply
KumaranT
Databricks Employee
  • 0 kudos

Hi @c3, we can see that this AutoML issue has been fixed. Can you check whether you are still getting the same error?

abd
by Contributor
  • 2013 Views
  • 1 replies
  • 0 kudos

Error - Langchain to interact with a SQL database

I am using Databricks Community Edition to run LangChain against a SQL database in Databricks. I am following this link: Interact with SQL database - Databricks. But I am facing an issue on this line: db = SQLDatabase.from_databricks(catalog="samples", schema="ny...

Machine Learning
Connection
Database
langchain
sql
Latest Reply
KumaranT
Databricks Employee
  • 0 kudos

Hi @abd, can you try upgrading the SQL driver?

espartaco
by New Contributor
  • 2885 Views
  • 1 replies
  • 0 kudos

MLflow autologging is not registering my experiments

When training any ML model in a Databricks notebook, the model used to be saved automatically after calling model.fit(), but now it gives me this error: WARNING mlflow.utils.autologging_utils: Encountered unexpected error durin...

Latest Reply
KumaranT
Databricks Employee
  • 0 kudos

Hi @espartaco, the error message shows that there's an issue with SSL certificate verification when trying to connect to the Azure storage endpoint. Check network and firewall configurations: You need to ensure that the network and firewall configuratio...

fh
by New Contributor
  • 1327 Views
  • 2 replies
  • 0 kudos

applyInPandas executed twice

Hi, I have a dataframe containing records (sales) over time for roughly 1000 different items, so based on these records each item has its own time series. The goal is to make predictions for each of these items. Since the behaviour of these items is very di...

Latest Reply
KumaranT
Databricks Employee
  • 0 kudos

Hi @fh, to avoid this double execution, you can try using the concurrent.futures module in Python to parallelize the training of your models. This module provides a high-level interface for asynchronously executing callables.
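
A minimal sketch of what that suggestion could look like, assuming one small model per item. The records, the fit_and_forecast placeholder (a simple mean) and the helper names are hypothetical and are not taken from the original post. Note that concurrent.futures runs in the process that calls it, so this would train on the driver rather than being distributed across the cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Hypothetical sales records: (item_id, sales value), one row per time step.
records = [
    ("item_a", 10.0), ("item_a", 12.0), ("item_a", 14.0),
    ("item_b", 5.0), ("item_b", 7.0),
]


def group_by_item(rows):
    """Collect each item's values into its own series, preserving row order."""
    series = {}
    for item_id, value in rows:
        series.setdefault(item_id, []).append(value)
    return series


def fit_and_forecast(item_id, values):
    """Placeholder 'model': forecast the next value as the mean of the series."""
    return item_id, mean(values)


def forecast_all(rows, max_workers=4):
    """Fit one model per item, submitting each fit to a thread pool exactly once."""
    series = group_by_item(rows)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(fit_and_forecast, item_id, values)
            for item_id, values in series.items()
        ]
        # result() blocks until the fit finishes and re-raises any exception.
        return dict(future.result() for future in futures)


forecasts = forecast_all(records)
print(forecasts)
```

For CPU-bound training in pure Python, swapping ThreadPoolExecutor for ProcessPoolExecutor may give better parallelism, at the cost of requiring the function and its arguments to be picklable.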

1 More Replies
acdello
by New Contributor
  • 1101 Views
  • 2 replies
  • 0 kudos

Databricks documentation for training a local LLM

I'm in the process of training a chatbot for my team to use to learn about Databricks and relevant tools quickly. Is there a place where I can easily (and legally) grab learning material in PDF or text?

Latest Reply
KumaranT
Databricks Employee
  • 0 kudos

Hi @acdello, could you check whether this doc helps in the meantime?

1 More Replies
chagoo
by New Contributor
  • 782 Views
  • 1 replies
  • 0 kudos

Error running BTYD model

I ran the model in April and it was OK, but today I need to run it again and I get an error, so I cannot continue. I changed the penalizer_coef and nothing changed. # fit a model with a larger penalizer coefficient bgf_engagement = BetaGeoFitter(penalizer_coef=100...

Latest Reply
Retired_mod
Esteemed Contributor III
  • 0 kudos

Hi @chagoo, to fix this, try lowering the penalizer coefficient, checking the data quality for anomalies, scaling the data, increasing the number of iterations, or experimenting with different initial parameters. These steps should help resolve the co...

EijayK
by New Contributor
  • 1480 Views
  • 1 replies
  • 0 kudos

Debugging using vscode & databricks connect

Hi all, I'm facing some difficulties when I use Databricks Connect to debug my ML solution. Long story short, I want to investigate a few variables after I've conducted training. With the debugger at hand, I can simply place a breakpoint on the line ...

Latest Reply
Retired_mod
Esteemed Contributor III
  • 0 kudos

Hi @EijayK, Ensure that the package is installed on the cluster itself, which you can verify through the cluster's library installation logs. Additionally, make sure your cluster meets all Databricks Connect requirements, including proper configurati...

Kjetil
by Contributor
  • 1212 Views
  • 2 replies
  • 2 kudos

Feature Store - lookback_window does not work with primary keys of "date" type

I just discovered what I believe is a bug in Feature Store. The expected value (of the "value" column) is 'NULL' but the actual value is "a". If I instead change the format of the "date" column to timestamp (i.e. remove the .date() in the generation...

Latest Reply
Kjetil
Contributor
  • 2 kudos

Thank you for answering. Yes, that is also what I figured out. In other words, the lookback_window argument only works when the primary key uses the timestamp format. I cannot see this behavior described in the documentation.

1 More Replies
yorabhir
by New Contributor III
  • 6949 Views
  • 2 replies
  • 2 kudos

Resolved! How to search the run id of an experiment run created in another notebook?

Hello, I have created an experiment run using with mlflow.start_run(run_name='experment_1'): in a notebook, say 'notebook_1'. In the 'Experiments' tab, if I click on 'notebook_1', I am able to see 'experiment_1'. Now I am trying to search the experiment in ...

Latest Reply
yorabhir
New Contributor III
  • 2 kudos

Thank you @atmcqueen , the solution is working.
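
The accepted solution from @atmcqueen is not shown in this listing. As a hedged sketch of one common approach, a run can be looked up by its run name with mlflow.search_runs. The helper names and the experiment path below are hypothetical; the sketch assumes a recent MLflow version that accepts experiment_names, and that the notebook experiment is named after the notebook's workspace path.

```python
def run_name_filter(run_name):
    """Build an MLflow filter string that matches runs by their run name tag."""
    return f"tags.mlflow.runName = '{run_name}'"


def find_run_ids(experiment_path, run_name):
    """Return the IDs of all runs in the given experiment with the given name.

    experiment_path is an assumption: for a notebook experiment it would be
    the notebook's workspace path, e.g. "/Users/<you>/notebook_1".
    """
    # Imported here so the filter helper above can be used without MLflow.
    import mlflow

    runs = mlflow.search_runs(
        experiment_names=[experiment_path],
        filter_string=run_name_filter(run_name),
    )
    # search_runs returns a pandas DataFrame by default, one row per run.
    return runs["run_id"].tolist()


print(run_name_filter("experment_1"))
```

From the second notebook, a call such as find_run_ids("/Users/<you>/notebook_1", "experment_1") would then return the matching run IDs, which can be passed to mlflow.get_run or used to build a runs:/ URI.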

1 More Replies