<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Using private python packages with databricks model serving in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/using-private-python-packages-with-databricks-model-serving/m-p/39058#M26868</link>
    <description>&lt;P&gt;I am attempting to host a Python MLflow model using &lt;A href="https://docs.databricks.com/en/machine-learning/model-serving/index.html" target="_self"&gt;Databricks model serving&lt;/A&gt;. While the serving endpoint functions correctly without private Python packages, I am encountering difficulties when attempting to include them.&lt;/P&gt;&lt;H2&gt;Context:&lt;/H2&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Without Private Packages&lt;/STRONG&gt;: The serving endpoint works fine&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;With Private Packages&lt;/STRONG&gt;: I can only get it working with `--index-url` set to my private PyPI server, as detailed in this &lt;A href="https://stackoverflow.com/questions/75418655/how-to-load-private-python-package-when-loading-a-mlflow-model" target="_self"&gt;answer&lt;/A&gt;.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I wish to avoid storing my token for the private PyPI in plain text. Since init scripts are not supported with model &lt;A href="https://docs.databricks.com/en/machine-learning/model-serving/index.html#limitations" target="_self"&gt;serving&lt;/A&gt;, I don't know how to inject the token as a secret at build time. Could this be possible?&lt;/P&gt;&lt;H3&gt;Attempted Solution:&lt;/H3&gt;&lt;P&gt;Following this &lt;A href="https://docs.databricks.com/en/machine-learning/model-serving/private-libraries-model-serving.html" target="_self"&gt;tutorial&lt;/A&gt;, I built the `whl` files, uploaded them to dbfs, and listed them in `pip_requirements` in `mlflow.pyfunc.log_model`. Unfortunately, the file on dbfs cannot be found at build time, preventing the endpoint creation.&lt;/P&gt;&lt;H3&gt;Code:&lt;/H3&gt;&lt;P&gt;Here's how I'm logging the model:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;mlflow.pyfunc.log_model(
"hello-world",
python_model=model,
registered_model_name="hello-world",
signature=signature,
input_example=input_example,
pip_requirements=[
"/dbfs/FileStore/tables/private_package-0.1.10-py3-none-any.whl"
],
)&lt;/LI-CODE&gt;&lt;P&gt;I have tried different paths in pip_requirements, and the file's existence on dbfs has been verified through both the Databricks CLI and a Databricks notebook.&lt;/P&gt;&lt;P&gt;In `pip_requirements` I have tried:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;- /dbfs/FileStore...
- dbfs/FileStore...
- /dbfs:/FileStore...
- dbfs:/FileStore...&lt;/LI-CODE&gt;&lt;P&gt;Command to view package in databricks notebook:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;dbutils.fs.ls("dbfs:/FileStore/tables/private_package-0.1.10-py3-none-any.whl")&lt;/LI-CODE&gt;&lt;H2&gt;&lt;BR /&gt;Error:&lt;/H2&gt;&lt;P&gt;The build logs produce the following error.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/dbfs/FileStore/tables/private_package-0.1.10-py3-none-any.whl'
CondaEnvException: Pip failed&lt;/LI-CODE&gt;&lt;P&gt;&lt;BR /&gt;My hypothesis is that there might be a permission error, and Databricks model hosting might not have access to dbfs. Being new to Databricks, I am unsure how to debug this. Any guidance or insights on how to resolve this issue would be greatly appreciated!&lt;/P&gt;</description>
    <pubDate>Thu, 03 Aug 2023 21:50:16 GMT</pubDate>
    <dc:creator>ericcbonet</dc:creator>
    <dc:date>2023-08-03T21:50:16Z</dc:date>
    <item>
      <title>Using private python packages with databricks model serving</title>
      <link>https://community.databricks.com/t5/data-engineering/using-private-python-packages-with-databricks-model-serving/m-p/39058#M26868</link>
      <description>&lt;P&gt;I am attempting to host a Python MLflow model using &lt;A href="https://docs.databricks.com/en/machine-learning/model-serving/index.html" target="_self"&gt;Databricks model serving&lt;/A&gt;. While the serving endpoint functions correctly without private Python packages, I am encountering difficulties when attempting to include them.&lt;/P&gt;&lt;H2&gt;Context:&lt;/H2&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;Without Private Packages&lt;/STRONG&gt;: The serving endpoint works fine&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;With Private Packages&lt;/STRONG&gt;: I can only get it working with `--index-url` set to my private PyPI server, as detailed in this &lt;A href="https://stackoverflow.com/questions/75418655/how-to-load-private-python-package-when-loading-a-mlflow-model" target="_self"&gt;answer&lt;/A&gt;.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I wish to avoid storing my token for the private PyPI in plain text. Since init scripts are not supported with model &lt;A href="https://docs.databricks.com/en/machine-learning/model-serving/index.html#limitations" target="_self"&gt;serving&lt;/A&gt;, I don't know how to inject the token as a secret at build time. Could this be possible?&lt;/P&gt;&lt;H3&gt;Attempted Solution:&lt;/H3&gt;&lt;P&gt;Following this &lt;A href="https://docs.databricks.com/en/machine-learning/model-serving/private-libraries-model-serving.html" target="_self"&gt;tutorial&lt;/A&gt;, I built the `whl` files, uploaded them to dbfs, and listed them in `pip_requirements` in `mlflow.pyfunc.log_model`. Unfortunately, the file on dbfs cannot be found at build time, preventing the endpoint creation.&lt;/P&gt;&lt;H3&gt;Code:&lt;/H3&gt;&lt;P&gt;Here's how I'm logging the model:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;mlflow.pyfunc.log_model(
"hello-world",
python_model=model,
registered_model_name="hello-world",
signature=signature,
input_example=input_example,
pip_requirements=[
"/dbfs/FileStore/tables/private_package-0.1.10-py3-none-any.whl"
],
)&lt;/LI-CODE&gt;&lt;P&gt;I have tried different paths in pip_requirements, and the file's existence on dbfs has been verified through both the Databricks CLI and a Databricks notebook.&lt;/P&gt;&lt;P&gt;In `pip_requirements` I have tried:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;- /dbfs/FileStore...
- dbfs/FileStore...
- /dbfs:/FileStore...
- dbfs:/FileStore...&lt;/LI-CODE&gt;&lt;P&gt;Command to view package in databricks notebook:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;dbutils.fs.ls("dbfs:/FileStore/tables/private_package-0.1.10-py3-none-any.whl")&lt;/LI-CODE&gt;&lt;H2&gt;&lt;BR /&gt;Error:&lt;/H2&gt;&lt;P&gt;The build logs produce the following error.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/dbfs/FileStore/tables/private_package-0.1.10-py3-none-any.whl'
CondaEnvException: Pip failed&lt;/LI-CODE&gt;&lt;P&gt;&lt;BR /&gt;My hypothesis is that there might be a permission error, and Databricks model hosting might not have access to dbfs. Being new to Databricks, I am unsure how to debug this. Any guidance or insights on how to resolve this issue would be greatly appreciated!&lt;/P&gt;</description>
      <pubDate>Thu, 03 Aug 2023 21:50:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-private-python-packages-with-databricks-model-serving/m-p/39058#M26868</guid>
      <dc:creator>ericcbonet</dc:creator>
      <dc:date>2023-08-03T21:50:16Z</dc:date>
    </item>
    <item>
      <title>Re: Using private python packages with databricks model serving</title>
      <link>https://community.databricks.com/t5/data-engineering/using-private-python-packages-with-databricks-model-serving/m-p/39194#M26889</link>
      <description>&lt;P&gt;&lt;SPAN&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;,&amp;nbsp;thanks for getting back to me.&lt;/SPAN&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;The link you attached is for installing private pip packages in a notebook. As mentioned in my question, I can install my private package (that I uploaded to dbfs) in a notebook without issue. The problem I am having is installing this same package with&lt;/SPAN&gt;&lt;A href="https://docs.databricks.com/en/machine-learning/model-serving/index.html#limitations" target="_blank"&gt; &lt;SPAN&gt;model serving&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN&gt;. &lt;/SPAN&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;Running the command you gave me in a notebook results in a FileNotFoundException, even though the directory is found with dbutils (see the screenshot below).&lt;/SPAN&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="ericcbonet_0-1691310536630.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/3065i4D77962D02B1FDDB/image-size/medium?v=v2&amp;amp;px=400" role="button" title="ericcbonet_0-1691310536630.png" alt="ericcbonet_0-1691310536630.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I copy-pasted the file path from my Databricks notebook to my Python code and tried this a number of times with a number of different path combinations. I always get the same error:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Furthermore, even if I can debug the issue, i.e. work out why the model serving Docker build environment cannot find the file on dbfs (which I suspect is permission-related), I'm not super happy with this workflow: having to update private Python packages in dbfs and then update the link in the pip_requirements&amp;nbsp;argument of mlflow.pyfunc.log_model. &lt;/SPAN&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;What would make this process much easier is if a secret could be picked up by the build environment and then injected into the `conda.yaml` file via an init script. For example:&lt;/SPAN&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;# conda.yaml
channels:
  - defaults
dependencies:
  - python=3.10
  - pip
  - pip:
      - mlflow&amp;gt;=2.5.0
      - boto3&amp;gt;=1.28.18
      - company-private&amp;gt;=0.1.10
      - --index-url "https://aws:%%CODE_ARTIFACT_TOKEN%%@company-0123456789.d.codeartifact.eu-central-1.amazonaws.com/pypi/company-python-packages/simple/"
name: mlflow-serving&lt;/LI-CODE&gt;&lt;P&gt;&lt;SPAN&gt;Then a .sh init script could do the following:&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;sed -i "s/%%CODE_ARTIFACT_TOKEN%%/${{ secrets.code-artifact-token }}/g" conda.yaml &lt;/LI-CODE&gt;&lt;P&gt;&lt;SPAN&gt;I realize model serving currently does not support init scripts. Is this on the roadmap? Or can you suggest another workflow so I can use private Python packages?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 06 Aug 2023 08:38:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-private-python-packages-with-databricks-model-serving/m-p/39194#M26889</guid>
      <dc:creator>ericcbonet</dc:creator>
      <dc:date>2023-08-06T08:38:15Z</dc:date>
    </item>
  </channel>
</rss>

