Machine Learning

by sergio-calderon • New Contributor II

01-25-2024 8:55:31 AM

2714 Views
3 replies
1 kudos

How to use Databricks secrets on MLFlow conda dependencies?

Hi! Do you know if it's correct to use the plain user and token for installing a custom dependency (an internal Python package) in an MLflow registered model? (It's the only way I can get it working; otherwise the dependency can't be installed.) It works,...

Latest Reply

Froffri
New Contributor

yesterday

1 kudos

Did someone solve this? I'm currently forced to download some libraries using an artifactory pypi mirror. However, I wouldn't want to have my secrets pasted in the conda.yaml file as plain text.
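One direction that may be worth testing is to keep only placeholders in the environment definition and let pip fill them in at install time. pip expands `${VAR}` references in requirements files; whether the pip section of a conda.yaml goes through the same expansion, and whether the variables are set in the environment where the model's dependencies get installed, is not confirmed anywhere in this thread. The sketch below only builds such a definition. The helper name `build_conda_env`, the variable names `ARTIFACTORY_USER` and `ARTIFACTORY_TOKEN`, and the host are all hypothetical.

```python
def build_conda_env(index_host, package,
                    user_var="ARTIFACTORY_USER",
                    token_var="ARTIFACTORY_TOKEN"):
    """Build a conda environment dict with ${VAR} placeholders.

    Assumption: the installer expands ${VAR} from environment variables
    at install time, so no literal credential is stored in the file.
    The variable names and host are hypothetical examples.
    """
    # Produces e.g. https://${ARTIFACTORY_USER}:${ARTIFACTORY_TOKEN}@host/simple
    index_url = f"https://${{{user_var}}}:${{{token_var}}}@{index_host}/simple"
    return {
        "name": "mlflow-env",
        "channels": ["conda-forge"],
        "dependencies": [
            "python=3.10",
            "pip",
            {"pip": [f"--index-url {index_url}", "mlflow", package]},
        ],
    }


env = build_conda_env("artifactory.example.com/api/pypi/pypi", "my-internal-pkg")
print(env["dependencies"][-1]["pip"][0])
```

A dict like this can be passed as the `conda_env` argument when logging a model with MLflow. If the placeholders are not expanded in your setup, the install will fail with an authentication error rather than leak the secret.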

by naveen_marthala • Contributor

05-02-2022 9:08:48 AM

15689 Views
7 replies
3 kudos

Resolved! How to PREVENT mlflow's autologging from logging ALL runs?

I am logging runs from a Jupyter notebook. The cells which have `mlflow.sklearn.autolog()` behave as expected, but the cells in which the .fit() method is called on sklearn's estimators are also being logged as runs without explicitly mentioning `mlflo...

Latest Reply

ericholland009
New Contributor II

2 weeks ago

3 kudos

Good question—mlflow autologging can easily capture more runs than expected if not configured properly. Managing it carefully improves experiment tracking. Similar control and optimization are important in bussid mod workflows, where users fine-tune ...
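For readers with the same problem: once autologging is switched on it stays on for the rest of the session, so later `.fit()` calls keep creating runs. A minimal sketch of one way to scope it, assuming the flavor module's `autolog()` accepts a `disable` flag (as `mlflow.sklearn.autolog` does). The helper name `autolog_only_here` is hypothetical, and the accepted answer in the thread is not visible here, so this is not necessarily what resolved it.

```python
from contextlib import contextmanager


@contextmanager
def autolog_only_here(flavor):
    """Enable autologging for one block of code, then disable it again.

    `flavor` is expected to be an MLflow flavor module such as
    `mlflow.sklearn`, whose `autolog()` takes a `disable` keyword.
    """
    flavor.autolog(disable=False)
    try:
        yield
    finally:
        # Runs even if fit() raises, so later cells are not logged.
        flavor.autolog(disable=True)


# Intended use in a notebook cell (requires mlflow and scikit-learn):
#
#     import mlflow.sklearn
#     with autolog_only_here(mlflow.sklearn):
#         model.fit(X_train, y_train)   # logged as a run
#
#     other_model.fit(X_train, y_train)  # not logged
```

Passing the module in as an argument keeps the helper independent of any one flavor, so the same pattern applies to other flavors that expose the same flag.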

by tkfm_s • New Contributor II

2 weeks ago

285 Views
2 replies
0 kudos

Memory error in LightGBM training data processing

I am developing a LightGBM model on Databricks, and I am using the Native API because it offers the widest range of options and allows me to try various approaches. The training data is loaded from a table in the Catalog as a Spark DataFrame. However,...

Latest Reply

tkfm_s
New Contributor II

2 weeks ago

0 kudos

Thank you. I will check the document. tkfm_s

by KyraHinnegan • New Contributor II

03-17-2026 7:50:55 PM

906 Views
2 replies
1 kudos

Resolved! Which types of model serving endpoints have health metrics available?

I am retrieving a list of model serving endpoints for my workspace via this API: https://docs.databricks.com/api/workspace/servingendpoints/list And then going to retrieve health metrics for each one with: https://[DATABRICKS_HOST]/api/2.0/serving-end...

Latest Reply

johandoc
New Contributor II

2 weeks ago

1 kudos

Your observation is correct; this behavior is expected. Endpoints with entity_type = FOUNDATION_MODEL_API do not expose health metrics via the /metrics endpoint, which is why you’re getting 404 responses. These endpoints are fully managed, multi-tenant...
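In practice this means filtering the endpoint list before requesting metrics, so the known 404 cases are skipped. A minimal sketch, assuming each endpoint has already been parsed into a dict with `name` and `entity_type` keys. That flat shape is an assumption for illustration (the real list response may nest the type differently), and the helper name `endpoints_with_metrics` is hypothetical.

```python
def endpoints_with_metrics(endpoints, unsupported=("FOUNDATION_MODEL_API",)):
    """Return the names of endpoints that are expected to expose metrics.

    Assumption: every item is a dict with "name" and "entity_type" keys;
    adapt the lookups to the actual shape of the list response.
    """
    return [
        ep["name"]
        for ep in endpoints
        if ep.get("entity_type") not in unsupported
    ]


# Hypothetical sample data, not real workspace output.
sample = [
    {"name": "my-custom-model", "entity_type": "CUSTOM_MODEL"},
    {"name": "shared-llm", "entity_type": "FOUNDATION_MODEL_API"},
]
print(endpoints_with_metrics(sample))
```

Keeping the excluded types in a parameter makes it easy to extend the list if other endpoint types turn out to return 404 as well.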

by MattBuck • New Contributor II

2 weeks ago

321 Views
1 reply
0 kudos

Resolved! AWS GovCloud Feature Availability Question

Hi! I'm trying to determine if Mosaic Vector Search (or is it simply called Vector Search) is available on AWS GovCloud? This shows it is not: https://docs.databricks.com/aws/en/resources/feature-region-support And it's not mentioned here: https://docs...

Latest Reply

szymon_dybczak
Esteemed Contributor III

2 weeks ago

0 kudos

Hi @MattBuck, It's not available on AWS GovCloud. 1) The first link you attached is the authoritative source for feature availability by region. If you can't find it there, it means the feature is not available in that region. 2) And I think this lim...

by nb92 • New Contributor

a month ago

328 Views
1 reply
0 kudos

Resolved! Recommended Python UDFs for On-Demand Feature Computation in Databricks

The Databricks documentation page on on-demand feature computation (https://docs.databricks.com/aws/en/machine-learning/feature-store/on-demand-features#what-are-on-demand-features) mentions using Python UDFs for computing on-demand features. What ty...

Latest Reply

aleksandra_ch
Databricks Employee

a month ago

0 kudos

Hi @nb92, only scalar Python UDFs are allowed for on-demand feature computation. This page provides the recommended approach. Best regards,
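To make "scalar" concrete: the function takes the values for one row and returns one value, with no state carried between rows. A minimal sketch of such a function body in plain Python, using a great-circle distance as an illustrative feature. The function name and the feature itself are assumptions for illustration, and registering it as a UDF is not shown here.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points.

    Scalar in the UDF sense: one set of inputs in, one float out.
    Uses a mean Earth radius of 6371 km.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 6371.0 * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))


# Example: distance between a user's location and a store location.
print(round(haversine_km(52.52, 13.405, 48.8566, 2.3522), 1))
```

A function like this fits on-demand computation because its inputs (for example, the caller's current location) are only known at request time.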

by thomasm • New Contributor III

01-07-2026 3:44:23 AM

1149 Views
7 replies
3 kudos

MLFlow Detailed Trace view doesn't work in some workspaces

I've created a Databricks Model Serving Endpoint which serves an MLFlow Pyfunc model. The model uses langchain and I'm using mlflow.langchain.autolog(). At my company we have some production(-like) workspaces where users cannot e.g. run Notebooks and ...

Latest Reply

lkt1
New Contributor III

a month ago

3 kudos

Funnily enough, the problem also disappeared on my end this morning. Previously, I saw a networking issue in my logs, but that also went away. Let's hope it stays that way!

by pfzoz • New Contributor

04-10-2026 8:02:10 AM

1204 Views
1 reply
0 kudos

Resolved! Using Qwen with vLLM

There are many conflict and dependency issues when trying to install vLLM and use the Qwen models (on serverless), even the v2 families. I tried following this guide https://docs.databricks.com/aws/en/machine-learning/sgc-examples/tutorials/sgc-raydat...

Latest Reply

anuj_lathi
Databricks Employee

04-10-2026 8:27:45 PM

0 kudos

Hi @pfzoz -- the "Model architectures failed to be inspected" error you are hitting is a well-known compatibility issue between vLLM, the transformers library, and the Qwen2/2.5-VL model family. The root cause is that vLLM's model registry subprocess...

by thomas_berry • Databricks Partner

04-07-2026 5:36:41 AM

913 Views
3 replies
0 kudos

Resolved! TrainingArguments fails

Hello, I am working on an ML project for text classification and I have a problem. The following piece of code stalls completely. It prints 'start' but never 'end'. from transformers import TrainingArguments print("start") args = TrainingArguments(outpu...

Latest Reply

thomas_berry
Databricks Partner

04-08-2026 6:39:47 AM

0 kudos

Hello @lingareddy_Alva, Thank you for your reply. I have since been given a cluster with the ML Runtime and the code now works. So I consider the problem solved.

by TomBurns • New Contributor

06-28-2023 4:19:11 PM

2341 Views
2 replies
0 kudos

Identity Resolution

Looking for best solutions for identity resolution. I already have deterministic matching. Exploring probabilistic solutions. Any advice for me?

Latest Reply

Sonal
New Contributor III

04-06-2026 10:56:13 AM

0 kudos

Check open source Zingg which runs natively within Databricks https://github.com/zinggAI/zingg
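For a feel of what moving beyond deterministic matching looks like, here is a deliberately simple stand-in: a weighted fuzzy similarity score over record fields, built on the standard library's `difflib`. It is not a true probabilistic model and not how Zingg works internally; the field names, weights, and any decision threshold are assumptions for illustration.

```python
from difflib import SequenceMatcher


def field_similarity(a, b):
    """Similarity in [0, 1] between two strings, ignoring case and padding."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_score(rec_a, rec_b, weights):
    """Weighted average of per-field similarities for two records.

    `weights` maps field name -> relative importance (hypothetical values).
    """
    total = sum(weights.values())
    return sum(
        w * field_similarity(rec_a[field], rec_b[field])
        for field, w in weights.items()
    ) / total


# Hypothetical records: an exact rule on "name" would miss this pair.
a = {"name": "Jon Smith", "email": "jon@example.com"}
b = {"name": "John Smith", "email": "jon@example.com"}
weights = {"name": 1.0, "email": 2.0}
print(round(match_score(a, b, weights), 3))
```

Pairs scoring above a chosen threshold would be treated as candidate matches. Tuning that threshold, and avoiding an all-pairs comparison on large tables, is what dedicated tools are built to handle.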

by ruia-dojo • New Contributor

03-30-2026 7:02:12 AM

298 Views
1 reply
0 kudos

Job compute fails due to BQ permissions

Hello, My Databricks workspace is associated with GCP project analytics. But my team and I mostly work on GCP project data-science, which contains the only BQ dataset that we have write access to. I'm trying to automate a pipeline to run on job compute a...

Latest Reply

MoJaMa
Databricks Employee

03-30-2026 12:04:48 PM

0 kudos

What identity is the job running as? Do you have any settings on the all-purpose cluster that you are not setting on the job-cluster? Maybe you need to provide roles/bigquery.jobUser on project analytics to the job compute service account?

by knight22-21 • New Contributor II

03-26-2026 12:16:17 PM

1178 Views
3 replies
1 kudos

Resolved! Unable to Access Azure Blob Storage from Databricks Community Edition Notebook

Hi everyone, I’m currently using the Databricks Community Edition and trying to access data stored in Azure Blob Storage from my .ipynb notebook. The storage account is part of my student free Azure subscription. However, I’m not able to establish a co...

Latest Reply

emma_s
Databricks Employee

03-27-2026 2:48:23 AM

1 kudos

Hi, I think you are referring to Databricks Free Edition, in which case this doesn't support connections to external storage such as Azure Blob Storage. Thanks, Emma

by rtglorenabasul • New Contributor

03-24-2026 7:38:10 AM

819 Views
1 reply
1 kudos

Resolved! Issue Running Job on Serverless GPU

I have a job that runs a notebook, the notebook uses serverless GPU (A10) and it keeps failing with a "Run failed with error message Cluster 'xxxxxxxxxxx' was terminated. Reason: UNKNOWN (SUCCESS)". The base environment is 'Standard v4' and I have tr...

Latest Reply

Ashwin_DSA
Databricks Employee

03-24-2026 8:59:52 AM

1 kudos

Hi @rtglorenabasul, Thanks for sharing the details. The behaviour you’re seeing is consistent with an issue in how the job is bringing up Serverless GPU compute, rather than with the notebook code itself. Having done some checks, that error usually m...

by jayshan • New Contributor III

02-24-2026 12:42:45 PM

1631 Views
4 replies
3 kudos

Resolved! Generic Spark Connect ML error. The fitted or loaded model size is too big.

When I train models in the serverless environment V4 (Premium Plan), the system occasionally returns the error message listed below, especially after running the model training code multiple times. We have tried creating new serverless sessions, whic...

Latest Reply

Ashwin_DSA
Databricks Employee

03-16-2026 12:30:46 PM

3 kudos

Hi @jayshan, I'm sorry for the delayed response to your question. And, thanks for the extra details and for sharing your workaround. This behaviour is tied to how Spark Connect ML works in serverless mode, rather than a traditional JVM/GC leak. On se...

by RodrigoE • New Contributor III

12-17-2025 7:49:49 AM

1661 Views
5 replies
3 kudos

Resolved! Vector search index initialization very slow

Hello, I am creating a vector search index and selected Compute embeddings for a Delta table with 19M records. The Delta table has only two columns: ID (selected as index) and Name (selected for embedding). The embedding model is databricks-gte-large-en. Ind...

Machine Learning

index

search

vector

vector index

Vector Search

Latest Reply

BadrErraji
New Contributor III

03-15-2026 4:57:56 AM

3 kudos

Why doesn't the Delta Sync compute the embeddings in parallel instead of sequentially? That's a major gap in the architecture, no?
