Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

spicysheep
by New Contributor II
  • 1249 Views
  • 3 replies
  • 1 kudos

Distributed SparkXGBRanker training: failed barrier ResultStage

I'm following a variation of the tutorial [here](https://assets.docs.databricks.com/_extras/notebooks/source/xgboost-pyspark-new.html) to train a `SparkXGBRanker` in distributed mode. However, the line `pipeline_model = pipeline.fit(data)` is throwing...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

You already mentioned that you turned off autoscaling; please try setting num_workers too.
Step 1: Disable dynamic resource allocation. Use spark.dynamicAllocation.enabled = false.
Step 2: Configure num_workers to match fixed resources. After disabling dy...
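
The two steps named in this reply can be sketched roughly as follows. This is a minimal illustration and not the tutorial's code: the helper `ranker_kwargs`, the argument `fixed_worker_count`, and the column names are hypothetical, and the actual training call is left in comments because it needs a running cluster with `xgboost` and `pyspark` installed.

```python
# Step 1 belongs in the cluster's Spark config (set before the cluster starts):
#   spark.dynamicAllocation.enabled false

def ranker_kwargs(fixed_worker_count: int, tasks_per_worker: int = 1) -> dict:
    """Build keyword arguments for a distributed XGBoost ranker.

    num_workers is the number of barrier tasks XGBoost launches. A barrier
    stage needs all of its tasks scheduled at the same time, so the value
    should stay within what the fixed-size cluster can run concurrently.
    """
    if fixed_worker_count < 1 or tasks_per_worker < 1:
        raise ValueError("worker and task counts must be positive")
    return {
        "num_workers": fixed_worker_count * tasks_per_worker,
        "features_col": "features",  # hypothetical column name
        "label_col": "label",        # hypothetical column name
        "qid_col": "qid",            # hypothetical column name
    }


# Step 2: derive num_workers from the fixed cluster size (4 is an example).
params = ranker_kwargs(fixed_worker_count=4)
print(params["num_workers"])

# On a cluster, the arguments would be passed along these lines:
# from xgboost.spark import SparkXGBRanker
# ranker = SparkXGBRanker(**params)
```

Deriving `num_workers` from the cluster size in one place keeps the two settings from drifting apart when the cluster is resized.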

2 More Replies
the_p_l
by New Contributor
  • 751 Views
  • 1 reply
  • 0 kudos

Lakehouse monitoring generates broken queries

Hi everyone, I’m setting up Databricks Lakehouse Monitoring to track my model’s performance using an inference-regression monitor. I’ve completed all the required configuration and successfully launched my first monitoring run. The quality tables are g...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hi @the_p_l, I want to confirm that I understand your situation correctly. You mentioned that you are not adding any custom code to the deployed Lakehouse Monitoring setup, and you believe the issue is related to the inline comments generated during ...

AlkaSaliss
by New Contributor II
  • 483 Views
  • 3 replies
  • 2 kudos

Unable to register scikit-learn or XGBoost model to Unity Catalog

Hello, I'm following the tutorial provided here https://docs.databricks.com/aws/en/notebooks/source/mlflow/mlflow-classic-ml-e2e-mlflow-3.html for the ML model management process using MLflow, in a Unity Catalog-enabled workspace; however, I'm facing an ...

Latest Reply
gbhatia
New Contributor II
  • 2 kudos

Maybe add the missing calls: `mlflow.set_tracking_uri("databricks")` and `mlflow.set_registry_uri("databricks")`
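
A small sketch of how the registry URI choice might be made explicit. One caveat on the suggestion above: MLflow targets the Unity Catalog registry with the URI `databricks-uc`, while `databricks` selects the workspace model registry, so the value depends on where the model should land. The helper names and the model name `main.default.my_model` are hypothetical, and the MLflow calls are left in comments because they need a configured Databricks workspace.

```python
def registry_uri(use_unity_catalog: bool) -> str:
    """Pick the MLflow registry URI for the intended model registry."""
    return "databricks-uc" if use_unity_catalog else "databricks"


def is_uc_model_name(name: str) -> bool:
    """Unity Catalog models use three-level names: catalog.schema.model."""
    parts = name.split(".")
    return len(parts) == 3 and all(parts)


uri = registry_uri(use_unity_catalog=True)
model_name = "main.default.my_model"  # hypothetical three-level name

if not is_uc_model_name(model_name):
    raise ValueError("Unity Catalog model names need catalog.schema.model")

# In a notebook or job with workspace credentials:
# import mlflow
# mlflow.set_tracking_uri("databricks")
# mlflow.set_registry_uri(uri)
# mlflow.register_model("runs:/<run_id>/model", model_name)
```

Checking the name shape before registering surfaces a common mismatch (a one-level name sent to the Unity Catalog registry) as a local error instead of a failed API call.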

2 More Replies
gbhatia
by New Contributor II
  • 505 Views
  • 3 replies
  • 1 kudos

Endpoint deployment is very slow

Hi team, I am testing some changes in a UAT / DEV environment and noticed that model endpoints are very slow to deploy. Since the environment is only for testing and does not serve any production traffic, I was wondering if there was a way to expedite this ...

Latest Reply
gbhatia
New Contributor II
  • 1 kudos

Hi @WiliamRosa, thanks for your response on this. I have been using the settings you described above, with the exception of `scale_to_zero`. PFA screenshot of the endpoint settings. My deployment is a simple PyTorch deep learning model wrapped in a `s...

2 More Replies
Edwin1
by New Contributor III
  • 824 Views
  • 4 replies
  • 4 kudos

Resolved! Distributed Optuna and MLflow

Hello All, I just tried running the following notebook (https://docs.databricks.com/aws/en/notebooks/source/machine-learning/optuna-mlflow.html) on the Databricks Free Edition platform, through Microsoft Account authentication. It takes 15 minutes ...

Latest Reply
Edwin1
New Contributor III
  • 4 kudos

Great. Thank you. That worked. I still need more compute and networking resources to make it justifiable, but this confirms that it works !!!

3 More Replies
Junqueira
by New Contributor II
  • 604 Views
  • 1 reply
  • 1 kudos

[ERROR] Worker (pid:11) was sent code 132 When deploying a Custom Model in serving

Hi, I've been developing a custom model with mlflow.pyfunc.PythonModel. Among other libs, I use ANNOY. While trying to serve the model as an endpoint in "Serving", after a few fixes my model worked fine, as did the endpoint call. Although, I tried updat...

Latest Reply
WiliamRosa
Contributor
  • 1 kudos

Great observation! The difference between Using worker: sync and Using worker: gevent typically refers to the worker class used by Gunicorn, the web server behind many MLflow model deployments (like in Databricks model serving or other MLflow-compati...
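
As a side note on the number in the thread title, the sketch below decodes a worker status such as 132 into a signal name. It assumes the value is either a raw wait status (where 128 is the core-dump flag) or a shell-style `128 + signal` exit code; both readings give the same signal number. The function name is hypothetical, and this only interprets the number, it is not a diagnosis taken from the thread.

```python
import signal


def describe_worker_code(code: int) -> str:
    """Map a worker status like 132 to the signal it most likely encodes.

    The low 7 bits carry the signal number under both assumed conventions
    (raw wait status with the core-dump bit, or 128 + signal).
    """
    signum = code & 0x7F
    if signum == 0:
        return f"no signal encoded in {code}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        return f"unknown signal {signum}"
    return f"{name} (signal {signum})"


print(describe_worker_code(132))
```

Signal 4 is SIGILL (illegal instruction). With native libraries, that can indicate a binary built for CPU instructions the serving host does not support, though whether that applies to ANNOY in this case is an assumption.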

Dnirmania
by Contributor
  • 1344 Views
  • 2 replies
  • 3 kudos

Resolved! Serving Endpoint: Container image creation

Hi Team, whenever I try to create an endpoint from a model in Databricks, the process often gets stuck at the 'Container Image Creation' step. I've tried to understand what happens during this step, but couldn't find any detailed or helpful information...

Latest Reply
Dnirmania
Contributor
  • 3 kudos

Thank you @Vidhi_Khaitan for sharing the detailed process.

1 More Replies
CelGuillau
by New Contributor III
  • 2736 Views
  • 5 replies
  • 3 kudos

Resolved! This API is disabled for users without the databricks-sql-access

Running a deploy on GitHub:
Run databricks bundle deploy
databricks bundle deploy
shell: /usr/bin/bash -e {0}
env:
  DATABRICKS_HOST: {{HOST}}
  DATABRICKS_CLIENT_ID: {{ID}}
  DATABRICKS_CLIENT_SECRET: ***
  DATABRICKS_BUNDLE_ENV: prod
Error: This API is disabled for...

Latest Reply
CelGuillau
New Contributor III
  • 3 kudos

Got it working. Yes, it was a little confusing at first: the workspace displayed at the top right is the account information, whereas the profile icon is where you can access the workspace settings. For anyone that got as confused as I did. Thank...

4 More Replies
Sachin_Amin
by New Contributor II
  • 907 Views
  • 1 reply
  • 1 kudos

Resolved! Model Inferencing

Any links or pointers for hosting a model in real time (similar to a SageMaker endpoint in AWS)? How can we host a model in DBX in real time? Any documentation, please?

Latest Reply
jamesl
Databricks Employee
  • 1 kudos

@Sachin_Amin you can find an example in our docs here: https://docs.databricks.com/aws/en/machine-learning/model-serving/model-serving-intro We also have free training courses on realtime model deployment for both classical ML (https://www.databricks...

Dharma25
by New Contributor II
  • 3220 Views
  • 2 replies
  • 2 kudos

Workflow not picking up correct host value (while working with MLflow model registry URI)

Exception: mlflow.exceptions.MlflowException: An API request to https://canada.cloud.databricks.com/api/2.0/mlflow/model-versions/list-artifacts failed due to a timeout. The error message was: HTTPSConnectionPool(host='canada.cloud.databricks.com', p...

Latest Reply
Dharma25
New Contributor II
  • 2 kudos

Thanks for the answer. I will try this solution

1 More Replies
DaPo
by New Contributor III
  • 1685 Views
  • 2 replies
  • 0 kudos

Model Serving Endpoint: Cuda-OOM for Custom Model

Hello all, I am tasked with evaluating a new LLM for some use cases. In particular, I need to build a POC for a chatbot based on that model. To that end, I want to create a custom Serving Endpoint for an LLM pulled from Hugging Face. The model itself is...

Latest Reply
sarahbhord
Databricks Employee
  • 0 kudos

Here are some suggestions:
1. Update conda.yaml. Replace the current config with this optimized version:
channels:
  - conda-forge
dependencies:
  - python=3.10  # 3.12 may cause compatibility issues
  - pip
  - pip:
    - mlflow==2.21.3
    - torch...

1 More Replies
Sri2025
by New Contributor
  • 890 Views
  • 1 replies
  • 0 kudos

Not able to run end to end ML project on Databricks Trial

I started using the Databricks trial version today. I want to explore the full end-to-end ML lifecycle on Databricks. I observed that only the 'serverless' option is available for compute. I was trying to execute the notebook posted on https://docs.datab...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

It can take up to 15 minutes for the serving endpoint to be created. Once you initiate the "create endpoint" chunk of code, go and grab a cup of coffee and wait 15 minutes. Then, before you use it, verify it is running (bottom left menu "Serving") by g...
