Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

thomasm
by New Contributor II
  • 140 Views
  • 4 replies
  • 1 kudos

MLflow Detailed Trace view doesn't work in some workspaces

I've created a Databricks Model Serving Endpoint which serves an MLflow Pyfunc model. The model uses langchain and I'm using mlflow.langchain.autolog(). At my company we have some production(-like) workspaces where users cannot e.g. run Notebooks and ...

Latest Reply
thomasm
New Contributor II
  • 1 kudos

Hi Jahnavi, Thanks for your reply. I think the issues you mentioned are not the cause of the discrepancy though. I have attached a screenshot of the same trace ID when displayed in the Experiments UI (where I cannot get a detailed trace view) and in t...

  • 1 kudos
3 More Replies
tonybenzu99
by New Contributor
  • 121 Views
  • 2 replies
  • 3 kudos

Is Delta Lake deeply tested in Professional Data Engineer Exam?

I wanted to ask people who have already taken the Databricks Certified Professional Data Engineer exam whether Delta Lake is tested in depth or not. While preparing, I’m currently using the Databricks Certified Professional Data Engineer sample quest...

Latest Reply
lucafredo
New Contributor II
  • 3 kudos

Yes, Delta Lake concepts are an important part of the Databricks Professional Data Engineer exam, but they aren’t tested in extreme depth compared to core Spark transformations and data pipeline design. The exam mainly focuses on practical understand...

  • 3 kudos
1 More Replies
d_szepietowska
by New Contributor II
  • 72 Views
  • 1 reply
  • 3 kudos

Why ENABLE_MLFLOW_TRACING does not work for serving endpoint?

I would like to ask whether you have recently experienced an issue similar to mine. I trained an sklearn model and logged it with fe.log_model for automatic feature lookup. Online feature tables were published with the currently recommended approach, whi...

Latest Reply
Louis_Frolio
Databricks Employee
  • 3 kudos

Hello @d_szepietowska , I did some research on my end and found a few helpful hints/tips to help you troubleshoot.  Let’s walk through what should be happening, and then I’ll call out the most common reasons the feature lookup DataFrame doesn’t show ...

  • 3 kudos
ryojikn
by New Contributor III
  • 1686 Views
  • 3 replies
  • 2 kudos

Model Serving - Shadow Deployment - Azure

Hey, I'm designing an architecture that uses Model Serving Endpoints, and one of the needs we're aiming to address is Shadow Deployment. Currently, it seems that the traffic configurations available in model serving do not allow this type...

Latest Reply
KaushalVachhani
Databricks Employee
  • 2 kudos

@ryojikn and @irtizak , you’re right. Databricks Model Serving allows splitting traffic between model versions, but it doesn’t have a true shadow deployment where live production traffic is mirrored to a new model for monitoring without affecting use...

  • 2 kudos
2 More Replies
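
For the shadow-deployment thread above: since the reply says traffic splitting is available but true mirroring is not, one possible workaround is to mirror requests on the client side. The sketch below is only an illustration under that assumption, not a Databricks feature. `primary`, `shadow`, and `log` are hypothetical callables standing in for calls to two separate serving endpoints and a logging sink.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Tuple

# Small background pool so the mirrored call never blocks the caller.
_executor = ThreadPoolExecutor(max_workers=4)


def predict_with_shadow(
    payload: Any,
    primary: Callable[[Any], Any],   # hypothetical: calls the production endpoint
    shadow: Callable[[Any], Any],    # hypothetical: calls the candidate endpoint
    log: Callable[[Any, Any, Any], None],
) -> Tuple[Any, Future]:
    """Return the primary response and mirror the same request to the shadow."""
    result = primary(payload)

    def _mirror() -> None:
        try:
            # Record both answers side by side for offline comparison.
            log(payload, result, shadow(payload))
        except Exception as exc:
            # A failing shadow model must never affect the user-facing answer.
            log(payload, result, exc)

    # The Future is returned only so callers (or tests) can wait on it.
    return result, _executor.submit(_mirror)
```

The caller only ever sees the primary response. The shadow response is logged for comparison, at the cost of doubling request volume from the client.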
jitenjha11
by New Contributor II
  • 175 Views
  • 2 replies
  • 3 kudos

Getting error when running databricks deploy bundle command

Hi all, I am trying to implement an MLOps project using the https://github.com/databricks/mlops-stacks repo. I have created an Azure Databricks workspace with Premium (+ Role-based access controls) and am following the bundle creation and deploy steps using URL: http...

Latest Reply
iyashk-DB
Databricks Employee
  • 3 kudos

This is expected behavior with mlops-stacks and not an issue with your Terraform version or the CLI. The main problem is that your Azure Databricks workspace does not have Unity Catalog enabled or assigned. The mlops-stacks templates assume Unity Cat...

  • 3 kudos
1 More Replies
Suheb
by Contributor
  • 202 Views
  • 2 replies
  • 2 kudos

Why does my MLflow model training job fail on Databricks with an out‑of‑memory error for large datas

I am trying to train a machine learning model using MLflow on Databricks. When my dataset is very large, the training stops and gives an ‘out-of-memory’ error. Why does this happen and how can I fix it?

Latest Reply
iyashk-DB
Databricks Employee
  • 2 kudos

+1 to what @mukul1409 said. Please follow the guides below to distribute the training: https://docs.databricks.com/aws/en/machine-learning/train-model/distributed-training/spark-pytorch-d... https://docs.databricks.com/aws/en/notebooks/source/dee...

  • 2 kudos
1 More Replies
KyraHinnegan
by New Contributor II
  • 181 Views
  • 1 reply
  • 1 kudos

Resolved! Full list of serving endpoint metrics returned by api/2.0/serving-endpoints/[ENDPOINT_NAME]/metrics

Hello! Looking at the documentation for this metric endpoint: https://docs.databricks.com/aws/en/machine-learning/model-serving/metrics-export-serving-endpoint It does not include a sample API response, and the code examples given don't have the full ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hey @KyraHinnegan , I did some digging and here is what I found: Based on the Databricks documentation, GPU metrics exposed by the Serving Endpoint Metrics API follow a clear and consistent naming convention. Once you know the pattern, the response i...

  • 1 kudos
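
For the metrics thread above: one way to see every metric name an endpoint actually returns is to parse the response yourself. The sketch below assumes the metrics endpoint returns Prometheus/OpenMetrics text, and it runs on a made-up sample. The metric and label names in `SAMPLE` are placeholders, not the documented Databricks names.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical sample in Prometheus text exposition format; names are placeholders.
SAMPLE = """\
# HELP example_gpu_usage_percentage Placeholder GPU metric
# TYPE example_gpu_usage_percentage gauge
example_gpu_usage_percentage{endpoint_name="demo",gpu="0"} 41.5
example_request_count_total{endpoint_name="demo"} 128
"""

# metric_name{optional labels} value
_LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')

Sample = Tuple[str, Dict[str, str], float]


def parse_metrics(text: str) -> List[Sample]:
    """Parse sample lines into (name, labels, value); skip comments and blanks."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _LINE.match(line)
        if match is None:
            continue  # ignore lines this simple parser does not understand
        name, labels, value = match.groups()
        samples.append((name, dict(_LABEL.findall(labels or "")), float(value)))
    return samples


def metric_names(text: str) -> List[str]:
    """Sorted list of distinct metric names found in the response body."""
    return sorted({name for name, _, _ in parse_metrics(text)})
```

Running `metric_names` on a real response body, fetched with whatever HTTP client and token you already use, would list what your endpoint exposes without relying on the docs being complete.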
jitenjha11
by New Contributor II
  • 256 Views
  • 1 reply
  • 0 kudos

Getting error when running databricks deploy bundle command

Hi all, I am trying to implement an MLOps project using the https://github.com/databricks/mlops-stacks repo. I have created an Azure Databricks workspace with Premium (+ Role-based access controls) and am following the bundle creation and deploy steps using URL: http...

Latest Reply
emma_s
Databricks Employee
  • 0 kudos

Hi, the first thing to check is that the user or service principal you're running the job with has the correct permissions: the user needs to have workspace access and cluster creation access toggled on. Next, you need to check you have a metast...

  • 0 kudos
Suheb
by Contributor
  • 413 Views
  • 3 replies
  • 1 kudos

Resolved! What are the practical differences between bagging and boosting algorithms?

How are bagging and boosting different when you use them in real machine-learning projects?

Latest Reply
jameswood32
Contributor
  • 1 kudos

The practical differences between bagging and boosting mostly come down to how they build models and how they handle errors: Model Training Approach: Bagging (Bootstrap Aggregating): Builds multiple models in parallel using random subsets of the data. ...

  • 1 kudos
2 More Replies
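
For the bagging vs. boosting thread above, the contrast can be sketched in a few lines of plain Python. This is a toy illustration on 1-D regression, not production code: `fit_stump`, `bagging`, and `boosting` are names made up for this example. The boosting variant shown is simple gradient boosting on squared error, which is an assumption, since the reply's own description of boosting is cut off.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression stump and return it as a predict function."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not right:
            continue  # splitting at the maximum leaves nothing on the right
        lm, rm = mean(left), mean(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:  # all x values identical: predict a constant
        const = mean(ys)
        return lambda x: const
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def bagging(xs, ys, n_models=25, seed=0):
    """Independent stumps on bootstrap samples; predictions are averaged."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):  # each model is independent, so this could run in parallel
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(m(x) for m in models)


def boosting(xs, ys, n_rounds=25, lr=0.5):
    """Sequential stumps; each one is fit to the residuals of the ensemble so far."""
    base = mean(ys)
    preds = [base] * len(xs)
    models = []
    for _ in range(n_rounds):  # inherently sequential: round k needs round k-1
        residuals = [y - p for y, p in zip(ys, preds)]
        model = fit_stump(xs, residuals)
        models.append(model)
        preds = [p + lr * model(x) for x, p in zip(xs, preds)]
    return lambda x: base + lr * sum(m(x) for m in models)
```

The structural difference is visible in the loops: the bagging loop never looks at earlier models, while each boosting round depends on the errors left by the previous rounds.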
Suheb
by Contributor
  • 424 Views
  • 4 replies
  • 2 kudos

Resolved! How do I improve the performance of my Random Forest model on Databricks?

How can I make these people smarter or faster so the final answer is better?

Latest Reply
jameswood32
Contributor
  • 2 kudos

Improving the performance of a Random Forest model on Databricks is usually about data quality, feature engineering, and hyperparameter tuning. Some tips: Feature Engineering: Create meaningful features and remove irrelevant ones. Encode categorical var...

  • 2 kudos
3 More Replies
Suheb
by Contributor
  • 185 Views
  • 1 reply
  • 1 kudos

How do I implement and train a custom PyTorch model on Databricks using distributed training?

How can I build my own PyTorch machine-learning model and train it faster on Databricks by using multiple machines/GPUs instead of just one?

Latest Reply
KaushalVachhani
Databricks Employee
  • 1 kudos

@Suheb, you may want to look at TorchDistributor. It provides multiple distributed training options, including single-node with multiple-GPU training and multi-node training. Below are the references for you. https://docs.databricks.com/aws/en/machine-...

  • 1 kudos
RodrigoE
by New Contributor III
  • 263 Views
  • 2 replies
  • 0 kudos

Vector search index very slow

Hello, I have created a vector search index for a Delta table with 1,400 rows. Using this vector index to find matching records on a table with 52M records, the query below ran for 20 hrs and failed with: 'HTTP request failed with status: {"error_c...

Machine Learning
vector search index
Latest Reply
iyashk-DB
Databricks Employee
  • 0 kudos

Hi @RodrigoE, your LATERAL subquery calls the Vector Search function once for every row of the 52M-row table, which results in tens of millions of remote calls to the Vector Search endpoint. This is not a good pattern and will be extremely slow leadin...

  • 0 kudos
1 More Replies
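
For the slow vector search thread above: the reply's actual recommendation is cut off, so the following is only one general way to reduce per-row remote calls, not necessarily what was suggested. The sketch calls a lookup once per distinct query value and reuses the result for repeats. `search_fn` is a hypothetical stand-in for whatever function performs the remote Vector Search call.

```python
from typing import Callable, Dict, Hashable, Iterable, List, TypeVar

Q = TypeVar("Q", bound=Hashable)
R = TypeVar("R")


def lookup_distinct(queries: Iterable[Q], search_fn: Callable[[Q], R]) -> List[R]:
    """Call `search_fn` once per distinct query and fan the results back out."""
    cache: Dict[Q, R] = {}
    results: List[R] = []
    for query in queries:
        if query not in cache:
            # The only place a remote call would happen.
            cache[query] = search_fn(query)
        results.append(cache[query])
    return results
```

This only helps when the large table contains many repeated values in the column being searched. If every row is distinct, the number of remote calls stays the same.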
Suheb
by Contributor
  • 332 Views
  • 1 reply
  • 1 kudos

Resolved! What are recommended approaches for feature engineering in Databricks ML projects?

When building machine-learning models in Databricks, how should I prepare and transform my data so the model can learn better?

Latest Reply
emma_s
Databricks Employee
  • 1 kudos

Hi, this is quite a general question, so I've put together a list of bullets that will point you in the right direction: Focus on organized storage, flexible transformations, and making features easy to reuse and discover. Use Unity Catalog for govern...

  • 1 kudos
RodrigoE
by New Contributor III
  • 439 Views
  • 4 replies
  • 2 kudos

Resolved! Vector search index initialization very slow

Hello, I am creating a vector search index and selected Compute embeddings for a Delta table with 19M records. The Delta table has only two columns: ID (selected as index) and Name (selected for embedding). The embedding model is databricks-gte-large-en. Ind...

Machine Learning
index
search
vector
vector index
Vector Search
Latest Reply
RodrigoE
New Contributor III
  • 2 kudos

Your recommendation addressed the issue. I followed the instructions and index initialization took only 8 hours. Thank you!

  • 2 kudos
3 More Replies
Suheb
by Contributor
  • 201 Views
  • 1 reply
  • 1 kudos

Resolved! How do I start with MLflow on Databricks?

I am new to MLflow and Databricks. How can I begin using MLflow inside Databricks to track and manage my machine learning models?

Latest Reply
iyashk-DB
Databricks Employee
  • 1 kudos

Hi @Suheb, MLflow is already pre-installed in the ML runtime. The question is very vague. You can follow the documentation below to get started with MLflow on Databricks. 1) https://www.databricks.com/product/managed-mlflow 2) https://docs.databricks.co...

  • 1 kudos
