Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

163050
by New Contributor II
  • 3226 Views
  • 3 replies
  • 0 kudos

Error installing datasets needed for LLM course

I signed up for this course via Databricks Academy: LLMs: Application through Production. However, I am getting this error when trying to download the needed datasets for the course: Installing datasets: | from "wasbs://courseware@dbacademy.blob.core.wi...

Latest Reply
david_for_db
New Contributor II
  • 0 kudos

You would need to install the Python library. You can either:
1) Run %pip install datasets
2) Add it to the PyPI packages installed on your cluster
This should solve your issue.
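For example, a minimal sketch of the notebook-scoped install (assuming a standard Databricks Python notebook; the two steps below are separate cells):

# cell 1: install the library for this notebook session
%pip install datasets
dbutils.library.restartPython()  # restarts Python so the new package becomes importable

# cell 2: confirm the package is importable
import datasets
print(datasets.__version__)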

2 More Replies
Noura_azza
by New Contributor II
  • 1238 Views
  • 2 replies
  • 0 kudos

AutoML split with dt column not working properly

I am using AutoML and want to split my data into train/validation/test using a dt column (one date for train, a different date for validation, and a third date for test). The problem is that AutoML fails: there are only training metrics (no valiat...

Latest Reply
maggiewang
Databricks Employee
  • 0 kudos

Hello! Did you try specifying a column name as a manual split column? Then you can fully control which rows are in the train / validate / test splits: https://docs.databricks.com/en/machine-learning/automl/automl-data-preparation.html#split-data-for-regressi...
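A minimal sketch of what that could look like with the AutoML Python API, assuming a Databricks Runtime ML version that supports the split_col argument (table, column names, and dates below are placeholders):

from databricks import automl
from pyspark.sql import functions as F

df = spark.table("my_catalog.my_schema.training_data")  # placeholder table

# Derive an explicit split column from the dt column: one date per split (placeholder dates)
df = df.withColumn(
    "split",
    F.when(F.col("dt") == "2024-01-01", "train")
     .when(F.col("dt") == "2024-01-02", "validate")
     .otherwise("test"),
)

summary = automl.regress(
    dataset=df,
    target_col="target",   # placeholder target column
    split_col="split",     # rows labeled train / validate / test
    timeout_minutes=30,
)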

1 More Replies
stochastic
by New Contributor
  • 681 Views
  • 0 replies
  • 0 kudos

Why is spark mllib not encouraged on the platform? / Why is ML dependent on .toPandas() on dbricks?

I'm new to Spark and Databricks and am surprised by how the Databricks tutorials for ML are using pandas DF > Spark DF. Of the tutorials I've seen, most data processing is done in a distributed manner, but then it's just cast to a pandas dataframe. From...
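For contrast, a minimal sketch of a training pipeline that stays on Spark DataFrames end to end with pyspark.ml (no .toPandas()); table and column names are placeholders:

from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

df = spark.table("my_catalog.my_schema.training_data")  # placeholder table

assembler = VectorAssembler(inputCols=["feature_a", "feature_b"], outputCol="features")
lr = LinearRegression(featuresCol="features", labelCol="label")

pipeline = Pipeline(stages=[assembler, lr])
model = pipeline.fit(df)           # training runs distributed on the cluster
predictions = model.transform(df)  # scoring also stays distributed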

Kjetil
by Contributor
  • 719 Views
  • 0 replies
  • 0 kudos

FeatureEngineeringClient and Unity Catalog

When testing this code:
(
    fe.score_batch(
        df=dataset.drop("Target").limit(10),
        model_uri=f"models:/{model_name}/{mv.version}",
    )
    .select("prediction")
    .limit(10)
    .display()
)
I get the error: "MlflowException: The...
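For reference, a minimal sketch of the batch-scoring path with FeatureEngineeringClient against a Unity Catalog registered model (assumes the databricks-feature-engineering package is installed; model and column names are placeholders):

import mlflow
from databricks.feature_engineering import FeatureEngineeringClient

mlflow.set_registry_uri("databricks-uc")  # models registered in Unity Catalog
fe = FeatureEngineeringClient()

model_name = "my_catalog.my_schema.my_model"  # placeholder three-level UC name
scored = fe.score_batch(
    model_uri=f"models:/{model_name}/1",
    df=dataset.drop("Target"),  # same input columns as at training time
)
display(scored.select("prediction"))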

__paolo_c__
by Contributor II
  • 969 Views
  • 0 replies
  • 0 kudos

Feature tables & Null Values

Hi! I was wondering if any of you has ever dealt with feature tables and null values (more specifically, via feature engineering objects rather than the feature store, although I don't think it really matters). In brief, null values are allowed to be stor...

Ariane
by New Contributor II
  • 2258 Views
  • 3 replies
  • 0 kudos

Error using score_batch for batch inference

Hey everybody, I have been learning to use the Databricks feature store and I was trying to train the model using the stored features and compute batch inference. I am getting an error, though: running prediction using score_batch, I have been getting ...

Latest Reply
Ariane
New Contributor II
  • 0 kudos

Hey @Kumaran, I am using a Random Forest classifier; however, I have tried to set the max depth to None since it is the default value, but the error still exists.

2 More Replies
yorabhir
by New Contributor III
  • 647 Views
  • 0 replies
  • 0 kudos

ModuleNotFoundError: No module named 'model_train' when using mlflow.sklearn.load_model

Hello, I have multiple versions of a model registered in the model registry. When I try to load any version other than model version 1 with mlflow.sklearn.load_model(f"models:/{model_name}/{model_version}") I am getting ModuleNotFoundError: No module...
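A common cause of this is custom training code (e.g. a model_train module) that was importable when one version was logged but is not packaged with the others. A hedged sketch of logging the model so the module ships with it, assuming the module lives in a local model_train.py (all names are placeholders):

import mlflow
import mlflow.sklearn

with mlflow.start_run():
    mlflow.sklearn.log_model(
        sk_model=trained_model,            # placeholder fitted sklearn model
        artifact_path="model",
        code_paths=["model_train.py"],     # package the custom module alongside the model
        registered_model_name=model_name,  # placeholder registered model name
    )

# Later, loading any registered version should then find model_train on the path:
loaded = mlflow.sklearn.load_model(f"models:/{model_name}/{model_version}")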

ukaplan
by New Contributor III
  • 427 Views
  • 0 replies
  • 0 kudos

Serving Endpoint Container Image Creation Fails

Hello, yesterday I sent this message but I guess some AI flagging tool or non-technical moderator thought error logs were spam, so no one could see my message. Thus, I am restating my problem without error logs this time. Essentially, after I train my m...

Quinten
by New Contributor II
  • 778 Views
  • 2 replies
  • 0 kudos

TrainingSet schema difference during training and inference

Hi, I'm using the Feature Store to train an ML model and log it using MLflow and FeatureStoreClient(). This model is then used for inference. I understand the schema of the TrainingSet should not differ between training time and inference time. However...

Latest Reply
KumaranT
New Contributor III
  • 0 kudos

Hi @Quinten, you can consider creating a custom feature group to store the "weight" column during training. This way, you can keep the schema of the TrainingSet consistent between training and inference time. Here are the steps you can follow: Create a...
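For orientation, a minimal sketch of a create_training_set call with FeatureStoreClient (table, key, and column names are placeholders); the exclude_columns argument shown here is one way to keep a training-only column such as "weight" out of the schema the model is logged and served with:

from databricks.feature_store import FeatureStoreClient, FeatureLookup

fs = FeatureStoreClient()

feature_lookups = [
    FeatureLookup(
        table_name="my_catalog.my_schema.customer_features",  # placeholder feature table
        lookup_key="customer_id",                              # placeholder lookup key
    )
]

training_set = fs.create_training_set(
    df=raw_df,                      # placeholder DataFrame with label, keys and weight
    feature_lookups=feature_lookups,
    label="Target",
    exclude_columns=["weight"],     # dropped from the TrainingSet, so not part of the model's input schema
)
training_df = training_set.load_df()

If the weight is still needed during fit, it can be read from the raw DataFrame separately rather than through the TrainingSet.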

1 More Replies
MohsenJ
by Contributor
  • 774 Views
  • 2 replies
  • 0 kudos

FeatureEngineeringClient failing to run inference with mlflow.spark flavor

I am using the Databricks FeatureEngineeringClient to log my spark.ml model for batch inference. I use the ALS model on the movielens dataset. My dataset has three columns: user_id, item_id and rank. Here is my code to prepare the dataset: fe_data = fe.crea...

Latest Reply
MohsenJ
Contributor
  • 0 kudos

@KumaranT I did it already with the same result:
import mlflow.pyfunc
# Load the model as a PyFuncModel
model = mlflow.pyfunc.load_model(model_uri=f"{model_version_uri}")
# Create a Spark UDF for scoring
predict_udf = mlflow.pyfunc.spark_udf(spark, ...
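For completeness, a minimal sketch of the pyfunc spark_udf scoring path the snippet above is heading toward (model name, version, and column names are placeholders):

import mlflow.pyfunc
from pyspark.sql import functions as F

model_uri = f"models:/{model_name}/{model_version}"  # placeholder name/version
predict_udf = mlflow.pyfunc.spark_udf(spark, model_uri=model_uri, result_type="double")

scored = df.withColumn(
    "prediction",
    predict_udf(F.struct("user_id", "item_id")),  # pass the model's input columns
)
display(scored)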

1 More Replies
HappyScientist
by New Contributor
  • 1940 Views
  • 1 replies
  • 0 kudos

Received Fatal error: The Python kernel is unresponsive.

I am running a Databricks job on a cluster and I keep running into the following issue (pasted below in bold). The job trains a machine learning model on a modestly sized dataset (~half a GB). Note that I use pandas dataframes for the data, sklearn for...

Latest Reply
KumaranT
New Contributor III
  • 0 kudos

Hi @HappyScientist, can you increase the memory size of your cluster and try again?

c3
by New Contributor II
  • 694 Views
  • 1 replies
  • 0 kudos

AutoML workflows will no longer run with job compute

We have a few workflows that have been running fine with job compute (runtime 14.x). They started failing on 6/3 with the following error: "The cluster [xxx] is not an all-purpose cluster. existing_cluster_id only supports all-purpose cluster IDs." I w...

Latest Reply
KumaranT
New Contributor III
  • 0 kudos

Hi @c3, we can see this AutoML issue has been fixed. Can you check whether you are still getting the same issue?

abd
by Contributor
  • 922 Views
  • 1 replies
  • 0 kudos

Error - Langchain to interact with a SQL database

I am using Databricks Community Edition to use LangChain on a SQL database in Databricks. I am following this link: Interact with SQL database - Databricks. But I am facing an issue on this line: db = SQLDatabase.from_databricks(catalog="samples", schema="ny...
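For context, a minimal sketch of that setup, assuming a LangChain version where SQLDatabase.from_databricks is available and a running SQL warehouse; the warehouse id is a placeholder, and the samples.nyctaxi sample schema is used only for illustration:

from langchain_community.utilities import SQLDatabase

# Inside a Databricks notebook, host and token can usually be inferred from the runtime context;
# a running SQL warehouse (or cluster) to execute the queries is still required.
db = SQLDatabase.from_databricks(
    catalog="samples",
    schema="nyctaxi",
    warehouse_id="1234567890abcdef",  # placeholder SQL warehouse id
)
print(db.get_usable_table_names())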

Machine Learning
Connection
Database
langchain
sql
Latest Reply
KumaranT
New Contributor III
  • 0 kudos

Hi @abd, can you try upgrading the SQL driver?


Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group