<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Error at model serving for quantised models using bitsandbytes library in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/61457#M6677</link>
    <description>&lt;P&gt;Hi,&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/99801"&gt;@phi_alpaca&lt;/a&gt;&amp;nbsp;have you managed to solve this? We have a similar issue.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 22 Feb 2024 12:26:11 GMT</pubDate>
    <dc:creator>JAgreenskylake</dc:creator>
    <dc:date>2024-02-22T12:26:11Z</dc:date>
    <item>
      <title>Error at model serving for quantised models using bitsandbytes library</title>
      <link>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/60324#M6673</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I've been trying to serve registered MLflow models at a GPU Model Serving endpoint, which works except for models using the &lt;A href="https://github.com/TimDettmers/bitsandbytes" target="_self"&gt;bitsandbytes&lt;/A&gt; library. The library is used to quantise LLMs to 4-bit/8-bit precision (e.g. Mistral-7B); however, deployment fails while the model is being registered at the endpoint.&amp;nbsp;This error is shown in the service log:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="phi_alpaca_1-1708013174746.png" style="width: 623px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/6295iC00F22E5F3F61327/image-dimensions/623x217/is-moderation-mode/true?v=v2" width="623" height="217" role="button" title="phi_alpaca_1-1708013174746.png" alt="phi_alpaca_1-1708013174746.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;All required libraries are listed in the requirements.txt file. It looks like one option to fix the error is to run a bash script that helps bitsandbytes locate the right package path, but we're not able to do that at the serving endpoint.&lt;BR /&gt;&lt;BR /&gt;Has anyone successfully served a quantised LLM at Databricks Model Serving using bitsandbytes? If so, how did you get around it? Any help on the topic would be much appreciated.&lt;BR /&gt;&lt;BR /&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 15 Feb 2024 16:13:01 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/60324#M6673</guid>
      <dc:creator>phi_alpaca</dc:creator>
      <dc:date>2024-02-15T16:13:01Z</dc:date>
    </item>
    <item>
      <title>Re: Error at model serving for quantised models using bitsandbytes library</title>
      <link>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/61217#M6675</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/99801"&gt;@phi_alpaca&lt;/a&gt;&amp;nbsp;, we are facing exactly the same issue trying to serve a bitsandbytes-quantized version of Mixtral-8x7B. Have you made any progress resolving this? The answer from&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;&amp;nbsp;isn't too helpful and seems to be AI-generated...&lt;/P&gt;&lt;P&gt;As you say, the deployed container is such a black box that we can't take the diagnostic steps listed in the error output.&lt;/P&gt;</description>
      <pubDate>Tue, 20 Feb 2024 09:02:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/61217#M6675</guid>
      <dc:creator>G-M</dc:creator>
      <dc:date>2024-02-20T09:02:18Z</dc:date>
    </item>
    <item>
      <title>Re: Error at model serving for quantised models using bitsandbytes library</title>
      <link>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/61269#M6676</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/98213"&gt;@G-M&lt;/a&gt;&amp;nbsp;, thanks for sharing your experience as well. Unfortunately I haven't had any luck resolving this on my end. I'd be interested to know if you have any breakthrough down the line. Is this something Databricks might be able to put a small fix in for, please?&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Tue, 20 Feb 2024 14:24:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/61269#M6676</guid>
      <dc:creator>phi_alpaca</dc:creator>
      <dc:date>2024-02-20T14:24:15Z</dc:date>
    </item>
    <item>
      <title>Re: Error at model serving for quantised models using bitsandbytes library</title>
      <link>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/61457#M6677</link>
      <description>&lt;P&gt;Hi,&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/99801"&gt;@phi_alpaca&lt;/a&gt;&amp;nbsp;have you managed to solve this? We have a similar issue.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 22 Feb 2024 12:26:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/61457#M6677</guid>
      <dc:creator>JAgreenskylake</dc:creator>
      <dc:date>2024-02-22T12:26:11Z</dc:date>
    </item>
    <item>
      <title>Re: Error at model serving for quantised models using bitsandbytes library</title>
      <link>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/61464#M6678</link>
      <description>&lt;P&gt;Hey &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/100561"&gt;@JAgreenskylake&lt;/a&gt;&amp;nbsp;, no luck so far. I have been working around it by not using quantised models, which isn't ideal, so I really hope this becomes possible soon.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Feb 2024 13:23:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/61464#M6678</guid>
      <dc:creator>phi_alpaca</dc:creator>
      <dc:date>2024-02-22T13:23:15Z</dc:date>
    </item>
    <item>
      <title>Re: Error at model serving for quantised models using bitsandbytes library</title>
      <link>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/62014#M6679</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/99801"&gt;@phi_alpaca&lt;/a&gt;&lt;/P&gt;&lt;P&gt;We have solved it by providing a conda_env.yaml when we log the model; all we needed was to add &lt;SPAN&gt;cudatoolkit=11.8 to the dependencies.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 26 Feb 2024 18:13:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/62014#M6679</guid>
      <dc:creator>G-M</dc:creator>
      <dc:date>2024-02-26T18:13:13Z</dc:date>
    </item>
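The fix described in the reply above can be sketched as a conda_env.yaml passed when logging the model. This is a minimal illustration, not a verified file from the thread: the environment name, channel, and Python version are assumptions; only the cudatoolkit=11.8 pin comes from the post.

```yaml
# conda_env.yaml (sketch) -- environment supplied at model-logging time.
# cudatoolkit=11.8 is the dependency reported to resolve the serving error;
# everything else here is an illustrative placeholder.
name: quantised-llm-serving
channels:
  - conda-forge
dependencies:
  - python=3.10
  - cudatoolkit=11.8
  - pip
  - pip:
      - mlflow
      - bitsandbytes
```

A file like this can be supplied via the `conda_env` argument of `mlflow.pyfunc.log_model`, which accepts either a path to a YAML file or an equivalent dict.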
    <item>
      <title>Re: Error at model serving for quantised models using bitsandbytes library</title>
      <link>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/62123#M6680</link>
      <description>&lt;P&gt;Thanks so much for sharing and glad it worked out for you guys!&lt;BR /&gt;I will have a go and feed back.&lt;/P&gt;</description>
      <pubDate>Tue, 27 Feb 2024 16:36:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/62123#M6680</guid>
      <dc:creator>phi_alpaca</dc:creator>
      <dc:date>2024-02-27T16:36:34Z</dc:date>
    </item>
    <item>
      <title>Re: Error at model serving for quantised models using bitsandbytes library</title>
      <link>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/62828#M6681</link>
      <description>&lt;P&gt;I seem to have some compatibility issues with cudatoolkit=11.8, would it be possible for you share what versions you use for torch, transformers, accelerate, and bitsandbytes? Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 07 Mar 2024 08:51:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/62828#M6681</guid>
      <dc:creator>phi_alpaca</dc:creator>
      <dc:date>2024-03-07T08:51:20Z</dc:date>
    </item>
    <item>
      <title>Re: Error at model serving for quantised models using bitsandbytes library</title>
      <link>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/62829#M6682</link>
      <description>&lt;P&gt;These versions are working for us:&lt;/P&gt;&lt;P&gt;torch==1.13.1&lt;BR /&gt;transformers==4.35.2&lt;BR /&gt;accelerate==0.25.0&lt;BR /&gt;bitsandbytes==0.41.3&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 07 Mar 2024 08:56:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/error-at-model-serving-for-quantised-models-using-bitsandbytes/m-p/62829#M6682</guid>
      <dc:creator>G-M</dc:creator>
      <dc:date>2024-03-07T08:56:46Z</dc:date>
    </item>
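Combining the pin set reported to work above with the earlier conda_env fix, the environment can also be built as a dict at logging time. A minimal sketch: the helper name, environment name, and Python version are illustrative; only cudatoolkit=11.8 and the four pip pins come from this thread, and the final `log_model` call is shown commented out because it requires a real model object.

```python
# Assemble an MLflow-style conda environment dict using the version
# pins reported working in this thread, plus the cudatoolkit=11.8
# dependency that resolved the serving error. Names are illustrative.

PINNED = [
    "torch==1.13.1",
    "transformers==4.35.2",
    "accelerate==0.25.0",
    "bitsandbytes==0.41.3",
]

def make_conda_env(python_version="3.10"):
    """Return a conda env dict pinning cudatoolkit 11.8 and the pip packages."""
    return {
        "name": "quantised-llm-serving",   # illustrative name
        "channels": ["conda-forge"],
        "dependencies": [
            "python=" + python_version,
            "cudatoolkit=11.8",            # the fix reported in the thread
            "pip",
            {"pip": ["mlflow"] + PINNED},
        ],
    }

env = make_conda_env()
# Pass the dict when logging, e.g.:
# mlflow.pyfunc.log_model("model", python_model=my_model, conda_env=env)
```

`mlflow.pyfunc.log_model` accepts such a dict (or a path to an equivalent YAML file) through its `conda_env` parameter, so the same environment travels with the registered model to the serving container.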
  </channel>
</rss>

