Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

KyraHinnegan
by New Contributor II
  • 95 Views
  • 1 replies
  • 0 kudos

Which types of model serving endpoints have health metrics available?

I am retrieving a list of model serving endpoints for my workspace via this API: https://docs.databricks.com/api/workspace/servingendpoints/list and then retrieving health metrics for each one with: https://[DATABRICKS_HOST]/api/2.0/serving-end...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hey @KyraHinnegan, I did some digging and here is what I found. Hopefully it helps you understand a bit more about what is going on. At a high level, not every endpoint type exposes infrastructure health metrics via /metrics. What you’re seeing with ...
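For readers following along, here is a minimal sketch of the loop described in the question: list the endpoints, then request `/metrics` for each one and treat an HTTP error as "no metrics available for this endpoint". The metrics path is the one spelled out in the "Full list of serving endpoint metrics" thread title further down this page. Everything else is an assumption: bearer-token auth, a list response shaped like `{"endpoints": [{"name": ...}]}`, and the helper names (`metrics_url`, `list_endpoint_names`, `fetch_metrics`), which are made up for illustration. The thread does not say which status code unsupported endpoint types return, so the sketch does not distinguish between error codes.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import List, Optional


def metrics_url(host: str, endpoint_name: str) -> str:
    """Build the per-endpoint metrics URL (endpoint name is URL-escaped)."""
    name = urllib.parse.quote(endpoint_name, safe="")
    return f"https://{host}/api/2.0/serving-endpoints/{name}/metrics"


def _get(url: str, token: str) -> bytes:
    """GET a URL with a bearer token (token auth is an assumption here)."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


def list_endpoint_names(host: str, token: str) -> List[str]:
    """Return endpoint names, assuming a body like {"endpoints": [{"name": ...}]}."""
    body = json.loads(_get(f"https://{host}/api/2.0/serving-endpoints", token))
    return [ep["name"] for ep in body.get("endpoints", [])]


def fetch_metrics(host: str, token: str, endpoint_name: str) -> Optional[str]:
    """Return the raw metrics text, or None if the request fails with an HTTP error."""
    try:
        return _get(metrics_url(host, endpoint_name), token).decode("utf-8")
    except urllib.error.HTTPError:
        return None
```

Calling `fetch_metrics` for every name from `list_endpoint_names` and collecting the names that come back as `None` gives a quick inventory of which endpoints in a workspace do not expose metrics.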

  • 0 kudos
fede_bia
by New Contributor II
  • 210 Views
  • 1 replies
  • 0 kudos

Databricks Model Serving Scaling Logic

Hi everyone, I’m seeking technical clarification on how Databricks Model Serving handles request queuing and autoscaling for CPU-intensive tasks. I am deploying a custom model for text and image extraction from PDFs (using Tesseract), and I’m struggli...

Latest Reply
AbhaySingh
Databricks Employee
  • 0 kudos

TLDR: Pre-provision min_provisioned_concurrency ≥ your peak parallel requests (in multiples of 4) with scale-to-zero disabled, and chunk large PDFs in your model code to bound per-request latency; reactive autoscaling can't help CPU-bound workloads ...
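To make the sizing rule in the TLDR concrete: round your peak number of parallel requests up to the next multiple of 4. A tiny sketch of that arithmetic follows; the function name is hypothetical and it does not call any Databricks API, it only computes the number you would then put in your endpoint configuration.

```python
import math


def min_concurrency_for(peak_parallel_requests: int, step: int = 4) -> int:
    """Round peak parallel requests up to the next multiple of `step`."""
    if peak_parallel_requests < 1:
        raise ValueError("peak_parallel_requests must be at least 1")
    return math.ceil(peak_parallel_requests / step) * step


# Example: 10 simultaneous PDF requests at peak -> provision for 12.
print(min_concurrency_for(10))
```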

  • 0 kudos
Dali1
by New Contributor III
  • 276 Views
  • 4 replies
  • 1 kudos

Params with Databricks Asset Bundles

Hello, I am using Databricks Asset Bundles to create jobs for machine learning pipelines. My problem is that I am using SparkPython tasks and defining params inside those. When the job is created, it is created with some params. When I want to run the same jo...

Latest Reply
SteveOstrowski
Databricks Employee
  • 1 kudos

Hi @Dali1, Great questions -- parameterizing ML pipelines in DABs is something a lot of people wrestle with, so let me break down the options. THE SHORT ANSWER No, you should not have to update the job definition every time you want different paramet...

  • 1 kudos
3 More Replies
fede_bia
by New Contributor II
  • 270 Views
  • 1 replies
  • 1 kudos

Resolved! Model Serving Only Shows WARNING/ERROR Logs

Hi everyone, I’m deploying a custom model using mlflow.pyfunc.PythonModel in Databricks Model Serving. Inside my wrapper code, I configured logging as follows: logging.basicConfig( stream=sys.stdout, level=logging.INFO, format='%(asctime)s ...

Latest Reply
SteveOstrowski
Databricks Employee
  • 1 kudos

@fede_bia This is worth walking through carefully, as it is a common source of confusion when deploying custom models on Databricks Model Serving. SHORT ANSWER The default root logging level for Model Serving endpoints is set to WARNING. That is why y...
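The reply is cut off before its recommendation, so the snippet below is not the reply's fix. It is a plain-Python illustration of one way a `logging.basicConfig(level=logging.INFO)` call can end up having no effect: if the hosting process has already attached a handler to the root logger and set it to WARNING, `basicConfig` does nothing unless `force=True` is passed. Whether Model Serving configures its root logger exactly this way is an assumption; the logger name `my_model` is hypothetical.

```python
import io
import logging
from typing import List


def demo() -> List[str]:
    """Simulate a pre-configured WARNING root logger and return what gets emitted."""
    root = logging.getLogger()
    saved_handlers, saved_level = root.handlers[:], root.level
    model_log = logging.getLogger("my_model")  # hypothetical logger name
    buf = io.StringIO()
    try:
        # Stand-in for a host process that configured logging before our code ran.
        root.handlers = [logging.StreamHandler(buf)]
        root.setLevel(logging.WARNING)

        # No-op: the root logger already has a handler, and force=True is not set.
        logging.basicConfig(level=logging.INFO)
        model_log.info("dropped")  # filtered out: effective level is still WARNING

        # Setting the level on the named logger lets INFO records through;
        # they propagate to the root logger's handler.
        model_log.setLevel(logging.INFO)
        model_log.info("kept")
        return buf.getvalue().splitlines()
    finally:
        # Restore global logging state so the demo has no side effects.
        model_log.setLevel(logging.NOTSET)
        root.handlers = saved_handlers
        root.setLevel(saved_level)


print(demo())
```

Only the second message reaches the stream, which matches the symptom in the thread title (INFO lines missing, WARNING/ERROR visible).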

  • 1 kudos
Dali1
by New Contributor III
  • 327 Views
  • 2 replies
  • 2 kudos

Resolved! Python environment DAB

Hello, I am building a pipeline using DAB. The first step of the DAB is to deploy my library as a wheel. The pipeline is run on a shared Databricks cluster. When I run the job, I see that the job is not using exactly the requirements I specified but it us...

Latest Reply
stbjelcevic
Databricks Employee
  • 2 kudos

Hi @Dali1, +1 to @pradeep_singh: on shared clusters, tasks inherit cluster-installed libraries, so you won’t get a clean, versioned environment. Use a job cluster (new_cluster) or switch to serverless jobs with an environment per task for isolation. ...

  • 2 kudos
1 More Replies
Dali1
by New Contributor III
  • 220 Views
  • 1 replies
  • 0 kudos

Resolved! Install library in notebook

Hello, I tried installing a custom library in my Databricks notebook, which is in a Git folder of my workspace. The installation looks successful: I saw the library in the list of libraries, but when I want to import it I have: ModuleNotFoundError: No mod...

Latest Reply
Dali1
New Contributor III
  • 0 kudos

Just found the issue: installation in editable mode doesn't work; you have to install it as a library. I don't know why.

  • 0 kudos
Deep_Blue_Whale
by New Contributor
  • 276 Views
  • 1 replies
  • 0 kudos

Error starting or creating custom model serving endpoints - 'For input string: ""'

Hi Databricks Community, I'm having issues starting or creating custom model serving endpoints. When going into Serving endpoints > Selecting the endpoint > Start, I get the error message 'For input string:'. This endpoint had worked correctly yesterday...

Latest Reply
emma_s
Databricks Employee
  • 0 kudos

Hi, sorry you're having this issue. You mentioned you've tried to recreate the endpoint with this model and other custom models but are still having the same issue. Have you tried serving one of the foundation models and seeing if that works or a really s...

  • 0 kudos
Dali1
by New Contributor III
  • 305 Views
  • 1 replies
  • 0 kudos

Resolved! Databricks SDK vs bundles

Hello, In this article: https://www.databricks.com/blog/from-airflow-to-lakeflow-data-first-orchestration I understand that if I want to create and deploy an ML pipeline in production, the recommendation is to use Databricks Asset Bundles. But by using it ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Dali1, When you deploy with Asset Bundles, DAB keeps track of what’s already been deployed and what has changed. That means: it only updates what needs updating, detects drift between your desired state and the workspace, lets you generate plans/di...

  • 0 kudos
AlkaSaliss
by New Contributor II
  • 1293 Views
  • 4 replies
  • 2 kudos

Unable to register Scikit-learn or XGBoost model to Unity Catalog

Hello, I'm following the tutorial provided here https://docs.databricks.com/aws/en/notebooks/source/mlflow/mlflow-classic-ml-e2e-mlflow-3.html for the ML model management process using MLflow, in a Unity Catalog-enabled workspace; however, I'm facing an ...

Latest Reply
joelramirezai
Databricks Employee
  • 2 kudos

You need to ensure that your Unity Catalog catalog and schema already exist, that you have the necessary permissions to use them, and that you update the code to reference your own catalog and schema names. You must also run on a classic cluster with...

  • 2 kudos
3 More Replies
Danik
by New Contributor II
  • 1085 Views
  • 2 replies
  • 3 kudos

Resolved! Population stability index (PSI) calculation in Lakehouse monitor

Hi! We are using Lakehouse monitoring for detecting data drift in our metrics. However, the exact calculation of the metrics is not documented anywhere (I couldn't find it), which raises questions about how they are computed, in our case especially PSI. I woul...

Latest Reply
iyashk-DB
Databricks Employee
  • 3 kudos

Hi @Danik, I have reviewed this. 1) Is there documentation for PSI and other metrics? Public docs list PSI in the drift table and give thresholds, but don’t detail the exact algorithm. Internally, numeric PSI uses ~1000 quantiles, equal‑height binning...
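For orientation, here is a sketch of the textbook PSI formula over equal-height (quantile) bins, PSI = sum over bins of (q_i - p_i) * ln(q_i / p_i), where p_i and q_i are the baseline and current fractions in bin i. This is not Databricks' implementation: the reply is truncated before the details, and the nearest-rank quantile rule, the `eps` floor for empty bins, and the default of 10 bins are all assumptions made for readability (the reply mentions roughly 1000 quantiles). All function names are hypothetical.

```python
import bisect
import math
from typing import List, Sequence


def quantile_edges(baseline: Sequence[float], n_bins: int) -> List[float]:
    """Interior cut points so each bin holds about the same share of the baseline."""
    s = sorted(baseline)
    return [s[(i * len(s)) // n_bins] for i in range(1, n_bins)]


def bin_fractions(values: Sequence[float], edges: List[float], eps: float = 1e-6) -> List[float]:
    """Fraction of values per bin, floored at eps so the log ratio stays finite."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect.bisect_right(edges, v)] += 1
    return [max(c / len(values), eps) for c in counts]


def psi(baseline: Sequence[float], current: Sequence[float], n_bins: int = 10) -> float:
    """Population stability index of `current` against `baseline`."""
    edges = quantile_edges(baseline, n_bins)
    p = bin_fractions(baseline, edges)
    q = bin_fractions(current, edges)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


baseline = list(range(100))
shifted = [v + 50 for v in baseline]
print(psi(baseline, baseline))  # identical distributions give exactly 0.0
print(psi(baseline, shifted))   # a shifted distribution gives a positive value
```

Because the bin edges come from the baseline only, the result depends on which window is treated as the baseline; that is one of the details worth confirming against the actual monitor output.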

  • 3 kudos
1 More Replies
thomasm
by New Contributor II
  • 512 Views
  • 4 replies
  • 1 kudos

MLflow Detailed Trace view doesn't work in some workspaces

I've created a Databricks Model Serving endpoint which serves an MLflow pyfunc model. The model uses LangChain and I'm using mlflow.langchain.autolog(). At my company we have some production(-like) workspaces where users cannot, e.g., run notebooks and ...

Latest Reply
thomasm
New Contributor II
  • 1 kudos

Hi Jahnavi, Thanks for your reply. I think the issues you mentioned are not the cause of the discrepancy though. I have attached a screenshot of the same trace ID when displayed in the Experiments UI (where I cannot get a detailed trace view) and in t...

  • 1 kudos
3 More Replies
tonybenzu99
by New Contributor II
  • 1247 Views
  • 2 replies
  • 3 kudos

Resolved! Is Delta Lake deeply tested in Professional Data Engineer Exam?

I wanted to ask people who have already taken the Databricks Certified Professional Data Engineer exam whether Delta Lake is tested in depth or not. While preparing, I’m currently using the Databricks Certified Professional Data Engineer sample quest...

Latest Reply
lucafredo
New Contributor III
  • 3 kudos

Yes, Delta Lake concepts are an important part of the Databricks Professional Data Engineer exam, but they aren’t tested in extreme depth compared to core Spark transformations and data pipeline design. The exam mainly focuses on practical understand...

  • 3 kudos
1 More Replies
KyraHinnegan
by New Contributor II
  • 1056 Views
  • 1 replies
  • 1 kudos

Resolved! Full list of serving endpoint metrics returned by api/2.0/serving-endpoints/[ENDPOINT_NAME]/metrics

Hello! Looking at the documentation for this metric endpoint: https://docs.databricks.com/aws/en/machine-learning/model-serving/metrics-export-serving-endpoint It does not include a sample API response, and the code examples given don't have the full ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hey @KyraHinnegan , I did some digging and here is what I found: Based on the Databricks documentation, GPU metrics exposed by the Serving Endpoint Metrics API follow a clear and consistent naming convention. Once you know the pattern, the response i...
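Since the docs page has no sample response, one practical way to see the full list for your own endpoint is to pull the distinct metric names out of whatever the API returns. A minimal sketch, assuming the body is Prometheus-style text (one `name{labels} value` sample per line, with `#` starting comment lines). The sample text and its metric names are invented for illustration and are not copied from a real response.

```python
import re
from typing import List

# A metric name runs up to the first "{" or whitespace on a sample line.
_NAME = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)")


def metric_names(text: str) -> List[str]:
    """Return the sorted, de-duplicated metric names found in exposition text."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        match = _NAME.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


# Invented sample, only to exercise the parser.
SAMPLE = """\
# HELP example_cpu_percent Illustrative gauge
# TYPE example_cpu_percent gauge
example_cpu_percent{endpoint="demo"} 12.5
example_requests_total{endpoint="demo"} 3
example_requests_total{endpoint="other"} 7
"""

print(metric_names(SAMPLE))
```

Running this against the real response for a CPU endpoint and a GPU endpoint and diffing the two lists shows exactly which names are GPU-specific.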

  • 1 kudos
cbossi
by New Contributor III
  • 461 Views
  • 1 replies
  • 1 kudos

Resolved! Options for sporadic (and cost-efficient) Model Serving on Databricks?

Hi all, I'm new to Databricks so would appreciate some advice. I have an ML model deployed using Databricks Model Serving. My use case is very sporadic: I only need to make 5–15 prediction requests per day (industrial application), and there can be long...

Latest Reply
KaushalVachhani
Databricks Employee
  • 1 kudos

Hi @cbossi , You are right! A 30-minute idle period precedes the endpoint's scaling down. You are billed for the compute resources used during this period, plus the actual serving time when requests are made. This is the current expected behaviour. Y...

  • 1 kudos
spearitchmeta
by Contributor
  • 549 Views
  • 1 replies
  • 1 kudos

Resolved! How does Databricks AutoML handle null imputation for categorical features by default?

Hi everyone, I’m using Databricks AutoML (classification workflow) on Databricks Runtime 10.4 LTS ML+, and I’d like to clarify how missing (null) values are handled for categorical (string) columns by default. From the AutoML documentation, I see that:...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hello @spearitchmeta, I looked internally to see if I could help with this and I found some information that will shed light on your question. Here’s how missing (null) values in categorical (string) columns are handled in Databricks AutoML on Dat...

  • 1 kudos