<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Serving API endpoint failing in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/45594#M2340</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/88547"&gt;@ombhuyan&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;Thank you for posting your question in the Databricks Community.&lt;/P&gt;
&lt;P&gt;I can't tell what the issue is without seeing the code. However, could you compare your code with the example &lt;A href="https://github.com/databricks/ds-projects/blob/master/app/llm/notebooks/serving/llama2_serving_demo.py" target="_self"&gt;here&lt;/A&gt; and see what is missing?&lt;/P&gt;</description>
    <pubDate>Thu, 21 Sep 2023 21:20:22 GMT</pubDate>
    <dc:creator>Kumaran</dc:creator>
    <dc:date>2023-09-21T21:20:22Z</dc:date>
    <item>
      <title>Serving API endpoint failing</title>
      <link>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/44219#M2179</link>
      <description>&lt;P&gt;Hi Team,&lt;BR /&gt;I registered my ML model in Databricks, but creating a serving API endpoint for the model fails with the following error logs.&lt;/P&gt;&lt;DIV&gt;Service logs: There are currently no replicas in a running state.&lt;BR /&gt;Build logs: Build never started - check the event log to see if the model failed validation or contact databricks.&lt;/DIV&gt;&lt;DIV&gt;Can someone help me debug the issue?&lt;/DIV&gt;</description>
      <pubDate>Sun, 10 Sep 2023 16:22:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/44219#M2179</guid>
      <dc:creator>ombhuyan</dc:creator>
      <dc:date>2023-09-10T16:22:02Z</dc:date>
    </item>
    <item>
      <title>Re: Serving API endpoint failing</title>
      <link>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/45069#M2294</link>
      <description>&lt;P&gt;I am having the same issue on the large compute! Except my error looks like&lt;/P&gt;&lt;P&gt;[rkxn8] [2023-09-15 19:49:24 +0000] [2] [INFO] Starting gunicorn 21.2.0&lt;BR /&gt;[rkxn8] [2023-09-15 19:49:24 +0000] [2] [INFO] Listening at: &lt;A href="http://0.0.0.0:8080" target="_blank"&gt;http://0.0.0.0:8080&lt;/A&gt; (2)&lt;BR /&gt;[rkxn8] [2023-09-15 19:49:24 +0000] [2] [INFO] Using worker: sync&lt;BR /&gt;[rkxn8] [2023-09-15 19:49:24 +0000] [3] [INFO] Booting worker with pid: 3&lt;BR /&gt;[rkxn8] [2023-09-15 19:49:24 +0000] [4] [INFO] Booting worker with pid: 4&lt;BR /&gt;[rkxn8] [2023-09-15 19:49:24 +0000] [5] [INFO] Booting worker with pid: 5&lt;BR /&gt;[rkxn8] [2023-09-15 19:49:24 +0000] [6] [INFO] Booting worker with pid: 6&lt;BR /&gt;[rkxn8] [2023-09-15 19:49:57 +0000] [2] [ERROR] Worker (pid:5) was sent SIGKILL! Perhaps out of memory?&lt;BR /&gt;[rkxn8] [2023-09-15 19:49:57 +0000] [29] [INFO] Booting worker with pid: 29&lt;BR /&gt;[rkxn8] [2023-09-15 19:50:05 +0000] [2] [ERROR] Worker (pid:3) was sent SIGKILL! Perhaps out of memory?&lt;BR /&gt;[rkxn8] [2023-09-15 19:50:05 +0000] [33] [INFO] Booting worker with pid: 33&lt;BR /&gt;[rkxn8] [2023-09-15 19:50:48 +0000] [2] [ERROR] Worker (pid:6) was sent SIGKILL! Perhaps out of memory?&lt;BR /&gt;[rkxn8] [2023-09-15 19:50:48 +0000] [57] [INFO] Booting worker with pid: 57&lt;BR /&gt;[rkxn8] [2023-09-15 19:51:00 +0000] [2] [ERROR] Worker (pid:4) was sent SIGKILL! Perhaps out of memory?&lt;BR /&gt;[rkxn8] [2023-09-15 19:51:00 +0000] [63] [INFO] Booting worker with pid: 63&lt;/P&gt;&lt;P&gt;Trying to deploy a 1.5B param model.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Sep 2023 19:52:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/45069#M2294</guid>
      <dc:creator>AChang</dc:creator>
      <dc:date>2023-09-15T19:52:47Z</dc:date>
    </item>
    <item>
      <title>Re: Serving API endpoint failing</title>
      <link>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/45594#M2340</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/88547"&gt;@ombhuyan&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;Thank you for posting your question in the Databricks Community.&lt;/P&gt;
&lt;P&gt;I can't tell what the issue is without seeing the code. However, could you compare your code with the example &lt;A href="https://github.com/databricks/ds-projects/blob/master/app/llm/notebooks/serving/llama2_serving_demo.py" target="_self"&gt;here&lt;/A&gt; and see what is missing?&lt;/P&gt;</description>
      <pubDate>Thu, 21 Sep 2023 21:20:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/45594#M2340</guid>
      <dc:creator>Kumaran</dc:creator>
      <dc:date>2023-09-21T21:20:22Z</dc:date>
    </item>
    <item>
      <title>Re: Serving API endpoint failing</title>
      <link>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/50805#M2714</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/88547"&gt;@ombhuyan&lt;/a&gt;&amp;nbsp;&lt;SPAN&gt;We currently only surface logs to the user from the build phase (i.e., where we install the pip dependencies), not from the pre-build phase (i.e., where we download the model). &lt;BR /&gt;That's why you may not see clear error messages in the build logs. &lt;BR /&gt;Please create an SF case if you still see this issue.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 10 Nov 2023 09:10:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/50805#M2714</guid>
      <dc:creator>Annapurna_Hiriy</dc:creator>
      <dc:date>2023-11-10T09:10:14Z</dc:date>
    </item>
    <item>
      <title>Re: Serving API endpoint failing</title>
      <link>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/63057#M3087</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/63081"&gt;@Kumaran&lt;/a&gt;,&lt;BR /&gt;The linked code does not seem to be available. Do you know whether there is an alternative link to it?&lt;BR /&gt;Thank you!&lt;BR /&gt;Octavian&lt;/P&gt;</description>
      <pubDate>Fri, 08 Mar 2024 14:07:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/63057#M3087</guid>
      <dc:creator>Octavian1</dc:creator>
      <dc:date>2024-03-08T14:07:09Z</dc:date>
    </item>
    <item>
      <title>Re: Serving API endpoint failing</title>
      <link>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/63058#M3088</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I also ran into this issue. It would be very useful to be able to see the errors raised in the pre-build stage as well.&lt;/P&gt;&lt;P&gt;In any case, if it helps: I eventually found out through trial and error that the problem was caused by an incompatible version of one of the packages that was supposed to be installed in the container.&lt;/P&gt;&lt;P&gt;Octavian&lt;/P&gt;</description>
      <pubDate>Fri, 08 Mar 2024 14:10:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/serving-api-endpoint-failing/m-p/63058#M3088</guid>
      <dc:creator>Octavian1</dc:creator>
      <dc:date>2024-03-08T14:10:51Z</dc:date>
    </item>
  </channel>
</rss>

