Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

AChang
by New Contributor III
  • 7111 Views
  • 1 replies
  • 1 kudos

MlflowException: Unable to download model artifacts in Databricks while registering model with MLflo

I am attempting to log, register, and deploy a fine-tuned GPT-2 model in Databricks. My logging code runs, but when I run my registration code I get an MlflowException. Here is my model logging code: mlflow.set_r...

Latest Reply
TimoLeco_56656
New Contributor II
  • 1 kudos

I've experienced the same error. The issue is that the model URI is not correct. The model is logged with: `mlflow.transformers.log_model( ... , artifact_path="gpt2", ...)`. The `artifact_path` is the last part of the model URI. If you don't specify it, it's ...
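
As a minimal sketch of the point in this reply: MLflow `runs:/` URIs take the form `runs:/<run_id>/<artifact_path>`, so the URI used at registration has to end with the same `artifact_path` that was used when logging. The helper name `build_model_uri` and the run ID below are hypothetical and only there for illustration.

```python
def build_model_uri(run_id: str, artifact_path: str) -> str:
    """Return the runs:/ URI for a model logged under artifact_path in run_id."""
    # Strip stray slashes so the parts join with exactly one separator.
    return f"runs:/{run_id.strip('/')}/{artifact_path.strip('/')}"


# Hypothetical run ID; "gpt2" matches the artifact_path quoted in the reply.
model_uri = build_model_uri("0123456789abcdef", "gpt2")
print(model_uri)  # runs:/0123456789abcdef/gpt2
```

A string built this way is what would be passed as `model_uri` to `mlflow.register_model(model_uri, name)`.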

Akash_Wadhankar
by New Contributor III
  • 1023 Views
  • 0 replies
  • 0 kudos

Learn Databricks AI medium article series for fellow learners.

When it comes to machine learning, the platform plays a pivotal role in successful implementation. Databricks offers a best-in-class machine learning platform with cutting-edge features such as MLflow, Model Registry, Feature Store, and MLOps, which ...

Machine Learning
DatabricksML MachineLearning AI FeatureStore DecisionScience
nicobuko
by New Contributor III
  • 3462 Views
  • 13 replies
  • 2 kudos

MLflow: Connect python with Community Edition

Hello, I am new to Databricks and want to work with MLflow in the Databricks Community Edition. In Python I am using `mlflow.login()`, which asks me to enter a password. But I do not have a password, because Databricks login only requir...

Latest Reply
Walter_C
Databricks Employee
  • 2 kudos

I am currently checking with our internal teams whether this will be provided in the near future; I am still waiting for confirmation.

12 More Replies
sjohnston2
by New Contributor II
  • 3393 Views
  • 2 replies
  • 2 kudos

Resolved! XGBoost Feature Weighting

We are trying to train a predictive ML model using the XGBoost Classifier. One requirement from our business team is to implement feature weighting, because they have defined certain features as mattering more than others. We have 69 f...

Latest Reply
Walter_C
Databricks Employee
  • 2 kudos

Hello @sjohnston2, here is some information I found internally. Possible causes: Memory access issue: The segmentation fault suggests that the program is trying to access memory that it is not allowed to, which could be caused by an internal bug in XGBo...

1 More Replies
AnkurMittal008
by New Contributor III
  • 2037 Views
  • 2 replies
  • 1 kudos

Resolved! Online Feature Table : Storage

Databricks Online Feature Table is in public preview, and we have some questions about it. 1) What storage is used for the Online Feature Table's data? Our offline feature table is stored in a Unity Catalog managed S3 bucket (customer AWS). Does onli...

Latest Reply
AnkurMittal008
New Contributor III
  • 1 kudos

Thanks a lot, @Walter_C.

1 More Replies
mharrison
by New Contributor II
  • 4019 Views
  • 1 replies
  • 1 kudos

Resolved! No Spark Session Available Within Model Serving Environment

Hi, is it possible to have a Spark session, one that can be used to query Unity Catalog and so on, available within Model Serving? I have an MLflow pyfunc model that needs to get data from a Feature Table as part of its `.predict()` method. See my earlier ...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @mharrison, creating a Spark session within a Model Serving environment is not directly supported, which is why you are encountering the `Exception: No SparkSession Available!` error. This limitation arises because the serving environment does not a...

byrnesy5
by New Contributor II
  • 27317 Views
  • 3 replies
  • 0 kudos

dbfs not found

Hi, I've saved a custom pyfunc and now I'm trying to load it in a pandas_udf. It works on small samples or if I repartition everything to 1 partition, but when I try to run it on a larger sample and distribute it across my cluster it fails repeatably...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

This problem can often be attributed to the model artifacts not being available on all the executors, especially in a distributed environment. Can you try calling `dbutils.fs.refreshMounts()` in your code? If the model is small enough, broadcast it t...

2 More Replies
mharrison
by New Contributor II
  • 1530 Views
  • 2 replies
  • 0 kudos

Feature Lookup Help

Hi, Context: I'm looking for help getting Unity Catalog Feature Lookup to work with my model the way I need it to. I have a trained darts time series model that takes as input to its `.predict()` method both the history of the variable in question, and...

Latest Reply
mharrison
New Contributor II
  • 0 kudos

Thanks for your response. It sounds like the second approach is best for me: modifying the `predict()` method to perform the required history lookup. Is it possible to do this via the Feature Engineering client within that method, or should I simply quer...

1 More Replies
johndoe99012
by New Contributor II
  • 2012 Views
  • 4 replies
  • 1 kudos

How to serve a Unity Catalog ML model to external usage

Hello everyone, I am following this notebook tutorial: https://docs.databricks.com/en/machine-learning/manage-model-lifecycle/index.html#example-notebook. Now I can register a machine learning model in Unity Catalog, but the tutorial only shows how to u...

Latest Reply
filipniziol
Esteemed Contributor
  • 1 kudos

Hi @johndoe99012 If the answer resolved your question, please consider marking it as the solution. It helps others in the community find answers more easily.  

3 More Replies
TinSlim
by New Contributor III
  • 4236 Views
  • 3 replies
  • 0 kudos

Maximum wait time Databricks Model Serving

Hi, hope you are well. I deployed a model two or three months ago using Databricks Model Serving and MLflow. The model worked well using a GPU from Model Serving. I stopped using it for some months, and when I tried deploying it again, it had some errors. 1. [FIXED] A ...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Thanks, I will review it and get back to you. I'll DM you.

2 More Replies
mradassaad
by New Contributor III
  • 7082 Views
  • 3 replies
  • 1 kudos

Resolved! Tuning `CrossValidator` spark job performance

I am running a 3-fold cross validation of an ML pipeline that uses `GBTClassifier` as the final step. It takes 18 hours to run, and I am looking for feedback on how to improve the performance, as I expect this to go faster. For context, here is the...

Attached images: Random Forest Job, Random Forest Job Summary, GBT storage top half
Latest Reply
cchalc
Databricks Employee
  • 1 kudos

Hello @Assaad Mrad, this looks like a question of whether to put the pipeline in the cross validator or the cross validator in the pipeline. Since you are doing the polynomial expansion as part of the pipeline, you might want to consider putt...

2 More Replies
jonathanhodges
by New Contributor II
  • 4059 Views
  • 4 replies
  • 0 kudos

Training Job Failure (Driver Error)

We have a new model training job that was running fine for a few days and then started failing. I have attached images for more details. I am wondering if 'can't reach driver cluster' is a red herring. It says the driver is healthy right before execut...

Latest Reply
jonathanhodges
New Contributor II
  • 0 kudos

In our case, we needed to correct our dependent libraries. We had an incorrect path referenced.

3 More Replies
nikviz
by New Contributor II
  • 1618 Views
  • 2 replies
  • 0 kudos

Resolved! Vector search index stops at 45406

I am trying to create a vector search index for a table, but it stops at 45406 rows. I can see that the writeback table has all the records, but the indexing stops. Is there a hard limit on the index?

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

There are some limits that you may be hitting:
  • Row Size for Delta Sync Index: The maximum row size is 100KB.
  • Embedding Source Column Size for Delta Sync Index: The maximum size is 32764 bytes.
  • Bulk Upsert Request Size Limit for Direct Vector Index: The...
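
To see whether any source rows come near the first two limits quoted in this reply, a rough pre-check can be sketched in plain Python. This is only an approximation under stated assumptions: the function name `find_oversized_rows` is hypothetical, "100KB" is read as 100 * 1024 bytes, and row size is estimated as the summed UTF-8 length of the values, which is not necessarily how the service measures it.

```python
MAX_ROW_BYTES = 100 * 1024          # "100KB" row limit; 1 KB = 1024 bytes is an assumption
MAX_EMBEDDING_SOURCE_BYTES = 32764  # embedding source column limit quoted in the reply


def find_oversized_rows(rows, text_column):
    """Return (index, reason) pairs for rows that may exceed the quoted limits.

    rows is a list of dicts. Sizes are estimated from the UTF-8 encoding of
    each value converted to a string, so treat the result as a hint only.
    """
    problems = []
    for i, row in enumerate(rows):
        text_bytes = len(str(row.get(text_column, "")).encode("utf-8"))
        row_bytes = sum(len(str(v).encode("utf-8")) for v in row.values())
        if text_bytes > MAX_EMBEDDING_SOURCE_BYTES:
            problems.append((i, "embedding source column too large"))
        elif row_bytes > MAX_ROW_BYTES:
            problems.append((i, "row too large"))
    return problems


# Hypothetical sample data: the second row has a 40000-byte text value.
rows = [
    {"id": 1, "text": "short document"},
    {"id": 2, "text": "x" * 40000},
]
print(find_oversized_rows(rows, "text"))  # [(1, 'embedding source column too large')]
```

If indexing stops at a particular row count, checking the rows around that position with a helper like this may show whether a size limit is involved.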

1 More Replies
miahopman
by New Contributor II
  • 4539 Views
  • 2 replies
  • 1 kudos

AutoML Runs Failing

After the Data Exploration notebook runs successfully, all AutoML trials fail without providing a source notebook. I have ensured that the training data labels contain no null values and that no label has 16 or fewer occurrences. I cann...
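
The label checks described in this post can be sketched as follows. This is an illustrative helper, not AutoML's own validation: the name `check_labels` is hypothetical, the threshold of 16 is taken from the post, and only `None` is treated as null (NaN values from pandas would need separate handling).

```python
from collections import Counter

MIN_OCCURRENCES = 17  # a label needs more than 16 occurrences, per the post


def check_labels(labels):
    """Return (null_count, rare_labels) for a sequence of class labels."""
    null_count = sum(1 for label in labels if label is None)
    counts = Counter(label for label in labels if label is not None)
    # Keep only labels that appear fewer than MIN_OCCURRENCES times.
    rare_labels = {label: n for label, n in counts.items() if n < MIN_OCCURRENCES}
    return null_count, rare_labels


# Hypothetical label column: "b" appears exactly 16 times and one label is missing.
labels = ["a"] * 20 + ["b"] * 16 + [None]
print(check_labels(labels))  # (1, {'b': 16})
```

A result of `(0, {})` would match the state the poster describes.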

Latest Reply
rtreves
Contributor
  • 1 kudos

@AnNg Have there been any updates on this feature?

1 More Replies
JoeAckerman
by New Contributor II
  • 1396 Views
  • 2 replies
  • 0 kudos

Python running far slower than locally, even with large cluster and multiple workers

I have a notebook that runs extremely slowly even when I try to do pretty basic Python functions. It runs far slower than locally no matter what I try, in spite of using a 32 GB, 4-core cluster with 4-8 workers. For context, my data...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

Please share more information, for example:
  • Type of data source
  • Type of operations being executed (sharing code if possible)
  • Timings of local runs and Databricks runs

1 More Replies
