Hi all,
I'm hitting a (likely memory-related) error when sending POST requests to my real-time endpoint, and I can't find a hardware setting to increase memory, as suggested by the service logs (below).
Steps to reproduce:
(1) I registered a custom MLflow model with utility functions included via the code_path argument of log_model(), as described in this doc
(2) I deployed the registered model as a Serving Endpoint
(3) When I send requests to the endpoint through the `score_model()` function, I get the following response: Exception: Request failed with status 400, {"error_code":"Bad request.","message":"The model server has crashed unexpectedly. This happens e.g. if server runs out of memory. Please verify that your model can handle the volume and the type of requests with the current configuration."}
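For context, my registration and scoring code roughly follows this shape (a sketch, not my exact code; the endpoint name, workspace URL, and feature names are placeholders):

```python
import json

# Step (1), sketched: log the model with helper modules bundled via code_path.
# (Commented out here because it needs an MLflow tracking server.)
# import mlflow
# mlflow.pyfunc.log_model(
#     artifact_path="model",
#     python_model=MyModel(),
#     code_path=["utils/"],  # helper functions shipped alongside the model
# )

def build_payload(records):
    """Step (3): wrap input rows in a JSON body for the invocations API."""
    return json.dumps({"dataframe_records": records})

# Sending the request (workspace URL and token are placeholders):
# import requests
# response = requests.post(
#     "https://<workspace-url>/serving-endpoints/<endpoint-name>/invocations",
#     headers={"Authorization": f"Bearer {token}",
#              "Content-Type": "application/json"},
#     data=build_payload([{"feature_a": 1.0, "feature_b": 2.0}]),
# )
```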
Steps I have attempted to resolve this issue:
- I changed the concurrency from Small to Large, but there was no change in the response
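One other mitigation I'm considering (sketch only; the batch size of 100 is a guess, not a recommendation) is splitting the input into smaller batches so each request carries less data:

```python
def chunked(rows, batch_size):
    """Yield successive slices of at most batch_size rows."""
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]

# Each chunk would then be sent as its own scoring request:
# for batch in chunked(all_rows, 100):
#     score_model(batch)
```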
Below are my service logs:
[95wb9] [2023-10-10 00:08:42 +0000] [2] [INFO] Starting gunicorn 21.2.0
[95wb9] [2023-10-10 00:08:42 +0000] [2] [INFO] Listening at: http://0.0.0.0:8080 (2)
[95wb9] [2023-10-10 00:08:42 +0000] [2] [INFO] Using worker: sync
[95wb9] [2023-10-10 00:08:42 +0000] [5] [INFO] Booting worker with pid: 5
[95wb9] [2023-10-10 00:08:43 +0000] [6] [INFO] Booting worker with pid: 6
[95wb9] [2023-10-10 00:08:43 +0000] [7] [INFO] Booting worker with pid: 7
[95wb9] [2023-10-10 00:08:43 +0000] [8] [INFO] Booting worker with pid: 8
[95wb9] [2023-10-10 00:12:53 +0000] [2] [ERROR] Worker (pid:6) was sent SIGKILL! Perhaps out of memory?
[95wb9] [2023-10-10 00:12:53 +0000] [111] [INFO] Booting worker with pid: 111