Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

ashfire
by New Contributor II
  • 29 Views
  • 1 reply
  • 0 kudos

How to store & update a FAISS Index in Databricks

I’m currently using FAISS in a Databricks notebook to perform semantic search in text data. My current workflow looks like this: encode ~10k text entries using an embedding model, build a FAISS index in memory, run similarity searches using index.search...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hello @ashfire , Here’s a practical path to scale your FAISS workflow on Databricks, along with patterns to persist indexes, incrementally add embeddings, and keep metadata aligned. Best practice to persist/load FAISS indexes on Databricks Use faiss...
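The reply is truncated, but the pattern it describes, persisting the index and keeping metadata aligned with it, is straightforward. Since faiss itself may not be importable here, this numpy sketch emulates what `IndexFlatIP.search` returns over normalized embeddings; the faiss persistence calls and the Volume path in the comments are illustrative, not taken from the thread.

```python
import numpy as np

# Sketch of the FAISS pattern with position-aligned metadata. With faiss
# installed, persistence typically looks like (path is illustrative):
#   faiss.write_index(index, "/Volumes/main/default/vol/faiss.index")
#   index = faiss.read_index("/Volumes/main/default/vol/faiss.index")
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((100, 8)).astype("float32")
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)  # cosine via inner product
metadata = [f"doc_{i}" for i in range(100)]   # row i describes embedding i

def search(query, k=5):
    """Brute-force inner-product search, mirroring IndexFlatIP.search output."""
    scores = embeddings @ query           # one score per stored vector
    idx = np.argsort(-scores)[:k]         # top-k by descending score
    return scores[idx], idx

scores, idx = search(embeddings[42])
hits = [metadata[i] for i in idx]         # metadata stays aligned by position
```

Keeping the metadata list (or a Delta table keyed by position) in lockstep with the index is what makes incremental adds safe: append the vector and its metadata row in the same operation.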

p4pratikjain
by Contributor
  • 3506 Views
  • 2 replies
  • 0 kudos

DAB - Add/remove task depending on workspace.

I use DAB for deploying jobs. I want to add a specific task in dev only, but not in staging or prod. Is there any way to achieve this using DAB?

Latest Reply
Coffee77
Contributor III
  • 0 kudos

You can define specific resources by target in DAB, as shown here. This is valid for jobs and/or tasks. For instance, in my case: I think the best option (but not available as far as I know) would be to be able to define "include" sections by target, inste...
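The screenshots in that reply did not survive, but the per-target override it describes looks roughly like this in a bundle's `databricks.yml`. This is only a sketch: the job, task, and notebook names are illustrative, and the exact merge semantics of task lists across targets should be checked against the DAB documentation.

```yaml
targets:
  dev:
    resources:
      jobs:
        my_job:
          tasks:
            - task_key: debug_task        # extra task deployed in dev only
              notebook_task:
                notebook_path: ./debug_notebook.py
  prod:
    resources:
      jobs:
        my_job: {}                        # no override: prod keeps the base task list
```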

1 More Replies
aswinkks
by New Contributor III
  • 1024 Views
  • 2 replies
  • 0 kudos

Distributed Training quits if any worker node fails

Hi, I'm training a PyTorch model in a distributed environment using PyTorch's DistributedDataParallel (DDP) library. I have spun up 10 worker nodes. The issue I'm facing is that during training, if any worker node fails and exits, the ent...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Distributed training with PyTorch’s DistributedDataParallel (DDP) is not inherently fault-tolerant—if any node fails, the whole job crashes, and, as you noted, checkpointing cannot auto-recover the process without hypervisor or application-level orch...
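The reply is cut off, but the standard mitigation it alludes to is application-level checkpointing combined with an elastic launcher (e.g. `torchrun --max-restarts`). Since torch may not be installed here, this stdlib-only sketch shows just the resume-from-latest-checkpoint loop that such orchestration relies on; the file name and failure simulation are illustrative.

```python
import json
import pathlib
import tempfile

def train(ckpt_dir, total_epochs, fail_at=None):
    """Resumable loop: every epoch is checkpointed, so a restarted process
    continues from the last completed epoch instead of starting from zero."""
    ckpt = ckpt_dir / "latest.json"
    start = json.loads(ckpt.read_text())["epoch"] + 1 if ckpt.exists() else 0
    for epoch in range(start, total_epochs):
        if fail_at is not None and epoch == fail_at:
            raise RuntimeError("simulated worker failure")
        # ... a real job would run the DDP step and save model/optimizer state ...
        ckpt.write_text(json.dumps({"epoch": epoch}))
    return json.loads(ckpt.read_text())["epoch"]

ckpt_dir = pathlib.Path(tempfile.mkdtemp())
try:
    train(ckpt_dir, total_epochs=10, fail_at=4)   # first attempt dies at epoch 4
except RuntimeError:
    pass
last = train(ckpt_dir, total_epochs=10)           # restart resumes at epoch 4
```

With real DDP, the checkpoint would hold `model.state_dict()` and the optimizer state, and only rank 0 would write it.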

1 More Replies
Kjetil
by Contributor
  • 3877 Views
  • 1 reply
  • 0 kudos

FeatureEngineeringClient and Unity Catalog

When testing this code ( fe.score_batch( df=dataset.drop("Target").limit(10), model_uri=f"models:/{model_name}/{mv.version}", ) .select("prediction") .limit(10) .display() ) I get the error: “MlflowException: The...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Your issues are tied to authentication and network/configuration differences between Unity Catalog and Workspace models in Databricks, specifically when using the FeatureEngineeringClient. Key Issues FeatureEngineeringClient + Unity Catalog: You get...

stochastic
by New Contributor
  • 3696 Views
  • 1 reply
  • 0 kudos

Why is Spark MLlib not encouraged on the platform? / Why is ML dependent on .toPandas() on Databricks?

I'm new to Spark and Databricks, and am surprised at how the Databricks tutorials for ML use pandas DF > Spark DF. In the tutorials I've seen, most data processing is done in a distributed manner, but then it's just cast to a pandas dataframe. From...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

You are noticing a common pattern in Databricks ML tutorials: data is often processed with Spark for scalability, but training and modeling are frequently done on pandas DataFrames using single-node libraries like scikit-learn. This workflow can be c...

nitinjain26
by New Contributor
  • 22 Views
  • 1 reply
  • 0 kudos

Resolved! No option for create compute in trial version

Hi, I don't see an option for "Create Compute". I have a trial version. I am trying to build a machine learning model on Databricks for the first time. Please check the attached screenshot.

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @nitinjain26! Free trials only offer serverless/SQL compute clusters (due to resource and cost controls). Please check out this post for more details: [FREE TRIAL] Missing All-Purpose Clusters Access - New Account

__paolo_c__
by Contributor II
  • 4311 Views
  • 1 reply
  • 0 kudos

Feature tables & Null Values

Hi! I was wondering if any of you has ever dealt with feature tables and null values (more specifically, via feature engineering objects rather than the feature store, although I don't think it really matters). In brief, null values are allowed to be stor...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

When dealing with feature tables and null values—especially via Databricks Feature Engineering objects (but also more broadly in Spark or feature platforms)—there are some nuanced behaviors when schema inference is required. Here are clear answers to...
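The reply is truncated, but the core issue it refers to, schema inference over columns that are entirely null, is easy to demonstrate with plain pandas: with no non-null value to look at, the column's type is ambiguous, which is why feature-table APIs prefer an explicitly declared schema. A minimal sketch (the column names are made up):

```python
import pandas as pd

# An all-null column gives type inference nothing to work with:
df = pd.DataFrame({"user_id": [1, 2, 3], "score": [None, None, None]})
inferred = df["score"].dtype          # falls back to the generic object dtype

# Declaring the type explicitly removes the ambiguity -- the same reason
# feature engineering APIs want a schema rather than inferring one from data:
df["score"] = df["score"].astype("float64")
```

The same principle applies in Spark: writing an all-null column without an explicit `StructType` leaves the sink to guess, so declare the schema up front.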

ryojikn
by New Contributor III
  • 1291 Views
  • 2 replies
  • 2 kudos

Model Serving - Shadow Deployment - Azure

Hey,I'm composing an architecture within the usage of Model Serving Endpoints and one of the needs that we're aiming to resolve is Shadow Deployment.Currently, it seems that the traffic configurations available in model serving do not allow this type...

Latest Reply
KaushalVachhani
Databricks Employee
  • 2 kudos

@ryojikn and @irtizak , you’re right. Databricks Model Serving allows splitting traffic between model versions, but it doesn’t have a true shadow deployment where live production traffic is mirrored to a new model for monitoring without affecting use...
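The reply is cut short, but the usual workaround for the missing feature is client-side mirroring: serve the user from the production endpoint and asynchronously copy the same request to the shadow model, logging (never returning) its output. The endpoint-scoring functions below are stand-ins, not Databricks APIs:

```python
import threading

shadow_log = []

def score_prod(payload):      # stand-in for calling the production serving endpoint
    return {"prediction": payload["x"] * 2}

def score_shadow(payload):    # stand-in for calling the candidate (shadow) endpoint
    return {"prediction": payload["x"] * 2 + 1}

def serve(payload):
    """Return the production answer; mirror the request to the shadow model on a
    background thread so callers never wait on (or see) the shadow's output."""
    t = threading.Thread(target=lambda: shadow_log.append(score_shadow(payload)))
    t.start()
    result = score_prod(payload)
    t.join()                  # joined here only to keep the sketch deterministic
    return result

out = serve({"x": 21})
```

In production you would fire-and-forget the shadow call (or route it through a queue) and compare the two logs offline with inference tables or monitoring.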

1 More Replies
tarunnagar
by New Contributor III
  • 98 Views
  • 4 replies
  • 1 kudos

What Are the Key Challenges in Developing ETL Pipelines Using Databricks?

I’m looking to understand the practical challenges that professionals face when building ETL (Extract, Transform, Load) pipelines on Databricks. Specifically, I’m curious about issues related to scalability, performance, data quality, integration wit...

Latest Reply
Suheb
New Contributor II
  • 1 kudos

Developing ETL pipelines in Databricks comes with challenges like managing diverse data sources, optimizing Spark performance, and controlling cloud costs. Ensuring data quality, handling errors, and maintaining security and compliance add complexity...

3 More Replies
nitinjain26
by New Contributor
  • 79 Views
  • 3 replies
  • 3 kudos

course material access

Hi, Where do I find the notebooks used in the training? I am doing the Machine Learning Practitioner Learning Plan. Regards, Nitin

Latest Reply
nitinjain26
New Contributor
  • 3 kudos

Then in the video the instructor should specify that. This (  https://partner-academy.databricks.com/learn/learning-plans/11/machine-learning-practitioner-learning-plan/courses/2343/data-preparation-for-machine-learning/lessons/17941/demo-load-and-ex...

2 More Replies
intelliconnectq
by New Contributor II
  • 154 Views
  • 2 replies
  • 2 kudos

Resolved! Model Registration and hosting

I have trained & tested a model in Databricks; now I want to register it and host it, but I am unable to do so. Please find attached a snapshot of the code & error.

Latest Reply
joelrobin
Databricks Employee
  • 2 kudos

Hi @intelliconnectq The above code will fail with AttributeError: 'NoneType' object has no attribute 'info' on the line: model_uri = f"runs:/{mlflow.active_run().info.run_id}/xgboost-model"  This happens because once the with mlflow.start_run(): bloc...
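The fix the reply describes is to capture the run handle inside the `with` block instead of calling `mlflow.active_run()` after it has exited. mlflow itself may not be importable here, so this sketch reproduces the semantics with a minimal stand-in context manager; with real mlflow the pattern is `with mlflow.start_run() as run: run_id = run.info.run_id`.

```python
import contextlib
import types

_active = None  # module-level "current run", like mlflow keeps internally

@contextlib.contextmanager
def start_run():
    """Minimal stand-in for mlflow.start_run(): a run is only 'active' while
    the with-block is open, which is exactly why active_run() is None after it."""
    global _active
    _active = types.SimpleNamespace(info=types.SimpleNamespace(run_id="abc123"))
    try:
        yield _active
    finally:
        _active = None      # exiting the block clears the active run

def active_run():
    return _active

with start_run() as run:
    run_id = run.info.run_id        # correct: captured while the run is open

after = active_run()                # None -> after.info would raise AttributeError
```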

1 More Replies
ScyLukb
by New Contributor
  • 3947 Views
  • 1 reply
  • 0 kudos

Model serving with custom pip index URL

An mlflow model was logged with a custom pip requirements file which contains package versions (mlflow==2.11.3), as well as a custom --index-url. However model serving during the "Initializing model enviroment" step tries to pip install mlflow==2.2.2...

Latest Reply
stbjelcevic
Databricks Employee
  • 0 kudos

Hi @ScyLukb , This is a common and frustrating problem that occurs when the Model Serving environment's built-in dependencies conflict with your model's specific requirements. The root cause is that the Model Serving environment tries to install its ...

Mario_D
by New Contributor III
  • 3526 Views
  • 1 reply
  • 2 kudos

Bug: MLflow recipe

I'm not sure whether this is the right place, but we've encountered a bug in datasets.py (https://github.com/mlflow/mlflow/blob/master/mlflow/recipes/steps/ingest/datasets.py). Anyone using recipes, beware of the aforementioned. def _convert_spark_df_to...

Latest Reply
stbjelcevic
Databricks Employee
  • 2 kudos

Hi @Mario_D , Thanks for bringing this to our attention, I will pass this information along to the appropriate team!

danielvdc
by New Contributor II
  • 3938 Views
  • 1 reply
  • 2 kudos

Rolling predictions with FeatureEngineeringClient

I am performing a time series analysis, using a XGBoostRegressor with rolling predictions. I am doing so using the FeatureEngineeringClient (in combination with Unity Catalog), where I create and load in my features during training and inference, as ...

Latest Reply
stbjelcevic
Databricks Employee
  • 2 kudos

You’re running into a fundamental limitation: score_batch does point‑in‑time feature lookups and batch scoring, but it doesn’t support recursive multi‑step forecasting where predictions update features for subsequent timesteps. Feature Store looks up...
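Since score_batch cannot feed predictions back into the feature lookup, the recursive loop has to live in user code: predict one step, append the prediction to the lag features, predict the next. A numpy sketch with a stand-in model (a real version would call the logged XGBoost model at each step with the same feature layout):

```python
import numpy as np

def predict_one(lags):
    """Stand-in for model.predict on lag features; a real version would invoke
    the registered XGBoost model here."""
    return 0.5 * lags[-1] + 0.5 * lags[-2]   # toy AR(2)-style rule

def recursive_forecast(history, horizon, n_lags=2):
    """Multi-step forecast where each prediction becomes a lag feature for the
    next step -- the part score_batch's point-in-time lookup cannot do for you."""
    lags = list(history[-n_lags:])
    out = []
    for _ in range(horizon):
        yhat = predict_one(np.array(lags))
        out.append(yhat)
        lags = lags[1:] + [yhat]             # roll the lag window forward
    return out

preds = recursive_forecast([1.0, 3.0], horizon=3)
```

Feature Store can still serve the training-time lookups; only this inference loop needs to bypass score_batch.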

tooooods
by New Contributor
  • 3666 Views
  • 1 reply
  • 0 kudos

TorchDistributor: installation of custom python package via wheel across all nodes in cluster

I am trying to set up a training pipeline of a distributed PyTorch model using TorchDistributor. I have defined a train_object (in my case it is a Callable) that runs my training code. However, this method requires custom code from modules that I hav...

Latest Reply
stbjelcevic
Databricks Employee
  • 0 kudos

hi @tooooods , This is a classic challenge in distributed computing, and your observation is spot on. The ModuleNotFoundError on the workers, despite the UI and API showing the library as "Installed," is the key symptom. This happens because TorchDis...

