- 4903 Views
- 6 replies
- 1 kudos
Resolved! Model Serving Endpoint Creation through API
Hello, I am trying to create a model serving endpoint via the API as explained here: https://docs.databricks.com/api/workspace/servingendpoints/create. I created a trusted IAM role with access to DynamoDB for the feature store. I try to use this field, "...
If you're using the Databricks Terraform provider, make sure the role's name matches the instance-profile name. If not, use the `iam_role_arn` attribute to explicitly set the role's ARN when creating the Databricks instance profile: resource "databricks...
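The naming constraint described in the reply can be sanity-checked with plain string handling. A minimal sketch (the ARNs below are hypothetical; the point is that an instance profile ARN and its underlying role ARN share the account ID and, when Databricks infers the role, must share the name):

```python
# An AWS instance profile ARN and the IAM role it wraps look like:
#   arn:aws:iam::<account>:instance-profile/<name>
#   arn:aws:iam::<account>:role/<name>
# Databricks infers the role from the instance profile name, so if the
# names differ you must pass the role ARN explicitly via iam_role_arn.

def arn_parts(arn):
    prefix, resource = arn.rsplit(":", 1)
    rtype, name = resource.split("/", 1)
    return prefix, rtype, name

def names_match(instance_profile_arn, role_arn):
    _, _, ip_name = arn_parts(instance_profile_arn)
    _, _, role_name = arn_parts(role_arn)
    return ip_name == role_name

# Hypothetical ARNs: same account, different names -> mismatch.
ip_arn = "arn:aws:iam::123456789012:instance-profile/my-profile"
role_arn = "arn:aws:iam::123456789012:role/my-feature-store-role"
needs_explicit_role_arn = not names_match(ip_arn, role_arn)  # True
```

When the check fails, setting `iam_role_arn` on the `databricks_instance_profile` resource is the workaround the reply describes.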
- 1127 Views
- 5 replies
- 2 kudos
Resolved! What is the most efficient way of running sentence-transformers on a Spark DataFrame column?
We're trying to run the bundled sentence-transformers library from SBERT in a notebook running Databricks ML 16.4 on an AWS g4dn.2xlarge [T4] instance. However, we're experiencing out-of-memory crashes and are wondering what the optimal way is to run sentenc...
@Louis_Frolio I tried the Pandas on Spark approach. How do I get from a Delta table into a Pandas on Spark DataFrame? Is this the best way? projects_df = spark.read.table("my_catalog.my_schema.my_project_table"); projects_spdf = ps.from_pandas(projects_df.toPan...
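One way to reason about the OOM before choosing a distribution strategy is a back-of-envelope memory estimate for the encode step. A sketch with illustrative assumptions (384-dim MiniLM-class embeddings in float32; none of these numbers come from the thread):

```python
# Rough memory estimate for sentence embeddings, all numbers assumed.
EMB_DIM = 384      # e.g. a MiniLM-class sentence-transformer
BYTES_F32 = 4

def embedding_bytes(batch_size, dim=EMB_DIM):
    """Bytes needed to hold `batch_size` embeddings of `dim` float32s."""
    return batch_size * dim * BYTES_F32

# A single encode batch is tiny; the danger is materializing ALL rows
# at once on the driver (e.g. toPandas() on millions of rows):
one_batch = embedding_bytes(64)                  # 98,304 bytes
ten_million_rows = embedding_bytes(10_000_000)   # ~15.4 GB on the driver
```

Arithmetic like this is why a partition-wise approach (e.g. an iterator-style pandas UDF that writes results back to Delta per partition) is usually preferred over collecting the whole column to the driver.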
- 3631 Views
- 1 replies
- 0 kudos
AutoML "need to sample" not working as expected
tl;dr: When the AutoML run realizes it needs to do sampling because the driver / worker node memory is not enough to load / process the entire dataset, it fails. A sample weight column is NOT provided by me, but I believe somewhere in the process the...
Hey @sangramraje, sorry for the late response. I wanted to check in to see if this is still an issue with the latest release? Please let me know. Cheers, Louis.
- 3738 Views
- 1 replies
- 1 kudos
AutoML Doesn't Work Due to Not being able to generate the EDA notebook
Hi, I'm trying to run an AutoML classification experiment with a dataset that I have made, and am experiencing this issue even after I have purposely downsampled my dataset before running it through the AutoML experiment. It appears that there is no way for me ...
Hey @adoodsonruby, sorry this got lost in the shuffle. Have you tried again recently? I believe limits have been increased that would remove this impediment. Let us know, Louis.
- 3411 Views
- 1 replies
- 0 kudos
Error creating a Databricks endpoint for an online table with a 2-field primary key
I have a delta table whose primary key is composed of 2 fields (accountId, ruleModelVersionDesc), and I have also created an online table with the same primary key, but when I create a feature spec to create an endpoint I get the following error...
Hey @lchicoma, sorry for the delayed response. Thanks for sharing the error and context; this looks like a parsing issue in the feature specification rather than a problem with Delta or the runtime versions. What changed recently: there was an inci...
- 963 Views
- 1 replies
- 0 kudos
🐞 Stuck on LightGBM Distributed Training in PySpark – Hanging After Socket Communication
My setup: I'm trying to run distributed LightGBM training using synapseml.lightgbm.LightGBMRegressor in PySpark. Cluster details: Spark version 3.5.1 (compatible with PySpark 3.5.6); PySpark version 3.5.6; synapseml v0.11.1 (latest); Spark cluster: 3 Hetzn...
Hi @amanjethani, Thanks for laying out the setup and symptoms so clearly. The hang likely occurs because LightGBM's distributed network either doesn't fully form between executors or because the expected task count doesn't match the actual tasks, leadin...
- 3411 Views
- 1 replies
- 0 kudos
Can't query Legacy Serving Endpoint
Hi, I was able to deploy an endpoint using legacy serving (it's the only option we have to deploy endpoints in DB). Now I am having trouble querying the endpoint itself. When I try to query it I get the following error: Here is the code I am using ...
Hey @semsim, sorry for the delayed response. Thanks for the screenshot; this pinpoints the problem. Root cause from the error: your model's predict path is trying to create or write to /Workspace/Shared, and the serving container does not permit t...
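The fix implied by that root cause is to keep any scratch writes inside the model off read-only workspace paths. A hedged sketch (the cache-file name and the idea that the model writes a scratch file at predict time are hypothetical; `tempfile.gettempdir()` resolves to a writable location inside a serving container):

```python
import os
import tempfile

# Serving containers typically cannot write under /Workspace/...;
# any scratch files the model needs at predict time should go to the
# container's temp directory instead.

def scratch_path(filename):
    return os.path.join(tempfile.gettempdir(), filename)

# Hypothetical predict-time cache write that would have failed under
# /Workspace/Shared but succeeds in the temp dir:
path = scratch_path("feature_cache.json")
with open(path, "w") as f:
    f.write("{}")
```

The same relocation applies to any artifacts the model tries to create lazily on first request.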
- 4017 Views
- 1 replies
- 1 kudos
Multi-tenant recommendation system (Machine learning)
Hello, I am looking to build a multi-tenant machine learning recommender system in Azure Databricks. The idea is to have a single shared model, where each tenant can use the same model to train on their own unique dataset. Essentially, while the model...
@Kasen, sorry for the delayed response. Here are some things to consider regarding your question. Azure Databricks is well suited for a shared-architecture, tenant-isolated recommender system. Below is a pragmatic blueprint, the isolation model o...
- 3528 Views
- 1 replies
- 0 kudos
Determine exact location of MLflow model tracking and model registry files and the Backend Stores
I would like to determine the exact location of: 1. MLflow model tracking files, 2. model registry files (with the Workspace Model Registry), as according to the documentation it is mentioned that "All methods copy the model into a secure location managed by...
Greetings @ScyLukb, You're right that the docs say the Workspace Model Registry copies models to a "secure location" but don't name it prominently. Here's where those files actually live and how to discover the configured stores. Locations of MLflo...
- 3288 Views
- 1 replies
- 0 kudos
Hyperopt (15.4 LTS ML) ignores autologger settings
I use MLflow Experiments to store models once they leave very early tests and development. I switched lately to 15.4 LTS ML and was hit by unhinged Hyperopt behavior: it was creating experiment logs, ignoring that i) the autologger is off at the workspace level...
Hey @art1, sorry this post got lost in the shuffle. Here are some things to consider regarding your question: Thanks for flagging this; what you're seeing is expected given how Databricks integrates Hyperopt with MLflow, and there are clear ways t...
- 3353 Views
- 1 replies
- 0 kudos
Working with pyspark dataframe with machine learning libraries / statistical model libraries
Hi Team, I am working with a huge volume of data (50 GB) and I decompose the time series data using statsmodels. Having said that, the major challenge I am facing is the compatibility of the PySpark DataFrame with the machine learning algorithms. Altho...
Greetings @javeed, You're right to call out the friction between a PySpark DataFrame and many Python ML libraries like statsmodels; most Python ML stacks expect pandas, while Spark is distributed-first. Here's how to bridge that gap efficiently fo...
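The usual bridge for per-series work like seasonal decomposition is to run the single-machine model per group (e.g. with `groupBy(...).applyInPandas` in Spark), since each series fits in memory even when the whole dataset doesn't. The per-series function itself is ordinary local code. A minimal sketch of what that function computes, using plain Python instead of statsmodels (additive model, known period, window of period + 1 for the trend, all simplifying assumptions):

```python
# Sketch of the per-series work you would hand to Spark per group:
# an additive seasonal decomposition (trend + seasonal + residual),
# in the spirit of statsmodels' seasonal_decompose.

def decompose(values, period):
    n = len(values)
    half = period // 2
    # Centered moving average as the trend estimate (None at the edges).
    trend = [None] * n
    for i in range(half, n - half):
        window = values[i - half:i + half + 1]
        trend[i] = sum(window) / len(window)
    # Seasonal component: mean detrended value per phase, centered to ~0.
    phase_sums = [0.0] * period
    phase_counts = [0] * period
    for i in range(n):
        if trend[i] is not None:
            phase_sums[i % period] += values[i] - trend[i]
            phase_counts[i % period] += 1
    seasonal = [phase_sums[p] / phase_counts[p] if phase_counts[p] else 0.0
                for p in range(period)]
    mean_s = sum(seasonal) / period
    seasonal = [s - mean_s for s in seasonal]
    residual = [values[i] - trend[i] - seasonal[i % period]
                if trend[i] is not None else None
                for i in range(n)]
    return trend, seasonal, residual
```

Distributing this per key keeps Spark in charge of parallelism while the statistical library only ever sees one in-memory series at a time.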
- 3371 Views
- 1 replies
- 0 kudos
databricks-vectorsearch 0.53 unable to use similarity_search()
I have an issue with the databricks-vectorsearch package. Version 0.51 suddenly stopped working this week because it now expected me to provide azure_tenant_id in addition to the service principal's client ID and secret. After supplying the tenant ID, it showed s...
Hi @snaveedgm, This is interesting. Can you double-check that the service principal has CAN QUERY on the embedding endpoint used for ingestion and/or querying (databricks-bge-large-en in your case)? Even though your direct REST test works, double-c...
- 3329 Views
- 1 replies
- 0 kudos
ML Solution for unstructured data containing Images and videos
Hi, I have a use case of developing an entire ML solution within Databricks, starting from ingestion to inference and monitoring, but the problem is that we have unstructured data containing images and video for training the model using frameworks such...
Hi @aswinkks, This is a very broad question, but generally, when dealing with video data, you convert the videos to images and have a system in place for training and another for inference. This Databricks blog post explains how to set up a video ...
- 12671 Views
- 4 replies
- 3 kudos
Resolved! How to PREVENT mlflow's autologging from logging ALL runs?
I am logging runs from a Jupyter notebook. The cells which have `mlflow.sklearn.autolog()` behave as expected, but the cells which call the .fit() method on sklearn's estimators are also being logged as runs without explicitly mentioning `mlflo...
It looks like MLflow autologging is kicking in by default whenever you call .fit(), which is why you're seeing runs even without explicitly using mlflow.sklearn.autolog(). To fix this, you can disable the global autologging and only trigger it when ...
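The truncated fix boils down to turning the global integration off once and opting in only where runs are wanted. A configuration sketch using the standard `mlflow` API (the toggle-off-after-training step is an assumption about the workflow, not quoted from the reply):

```python
import mlflow

# Disable the autologging that Databricks enables by default, so a
# plain estimator.fit(X, y) no longer creates a run:
mlflow.autolog(disable=True)

# In the cells where you DO want logging, opt back in for sklearn:
mlflow.sklearn.autolog()
# ... estimator.fit(X, y) here is logged ...

# Assumed follow-up: switch it off again if later cells should stay silent.
mlflow.sklearn.autolog(disable=True)
```

Running the global disable at the top of the notebook is what stops the implicit runs; the per-framework calls then give cell-level control.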
- 3145 Views
- 1 replies
- 0 kudos
notebook stuck at "filtering data" or waiting to run
Hi, my data is in sparse vector representation, and it was working fine (display and training ML models). I added a few features that converted the data from sparse to dense representation, and after that anything I want to perform on the data gets stuck (display or ml...
Greetings @harry_dfe, Thanks for the details; this almost certainly stems from your data flipping from a sparse vector representation to a dense one, which explodes per-row memory and stalls actions like display, writes, and ML training. Why t...
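The memory explosion can be made concrete with a little arithmetic. A sketch (the 8-bytes-per-double and 4-bytes-per-index figures match how MLlib stores `DenseVector`/`SparseVector` payloads; the feature counts are made up for illustration):

```python
# Back-of-envelope memory cost of one feature vector, sparse vs dense.
# MLlib stores dense vectors as a float64 array (8 bytes per slot) and
# sparse vectors as parallel int32 index / float64 value arrays.

def dense_bytes(num_features):
    return 8 * num_features

def sparse_bytes(num_nonzero):
    return (4 + 8) * num_nonzero

features = 100_000   # hypothetical wide feature space
nonzero = 50         # hypothetical active features per row

per_row_dense = dense_bytes(features)     # 800,000 bytes per row
per_row_sparse = sparse_bytes(nonzero)    # 600 bytes per row
blowup = per_row_dense / per_row_sparse   # ~1333x more memory per row
```

At a million rows that is roughly 800 GB dense versus 600 MB sparse, which is why `display`, writes, and model fits that previously ran fine can hang after a transformation densifies the vectors.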