<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Set up compute policy to allow installing python libraries from a private package index in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/set-up-compute-policy-to-allow-installing-python-libraries-from/m-p/93619#M9062</link>
    <description>&lt;P&gt;I figured it out: it seems that a secret can only be loaded into an environment variable if the value is the secret reference and nothing else:&lt;/P&gt;&lt;PRE&gt;"value": "{{secrets/global/arneCorpPyPI_token}}"         # this will work&lt;BR /&gt;"value": "foo {{secrets/global/arneCorpPyPI_token}} bar" # this will not&lt;/PRE&gt;&lt;P&gt;My remaining problem is that I need string interpolation to build the actual value, e.g.:&lt;/P&gt;&lt;PRE&gt;[...],&lt;BR /&gt;"spark_env_vars.TOKEN": {&lt;BR /&gt;"type": "fixed",&lt;BR /&gt;"value": "{{secrets/global/arneCorpPyPI_token}}"&lt;BR /&gt;},&lt;BR /&gt;"spark_env_vars.PIP_INDEX_URL": {&lt;BR /&gt;"type": "fixed",&lt;BR /&gt;"value": "https://arneCorpPyPI:${TOKEN}@gitlab.office.arneCorp.com/api/v4/groups/42/-/packages/pypi/simple"&lt;BR /&gt;},&lt;BR /&gt;[...]&lt;/PRE&gt;&lt;P&gt;and JSON maps are unordered. As it happens, PIP_INDEX_URL is initialized before TOKEN, so the ${TOKEN} reference is not resolved and my auth is broken. I tried a couple of other names, and it looks like a variable named TEMPORARY is consistently initialized before PIP_INDEX_URL, in which case it works. Obviously, this is not something I want to rely on in any way, shape, or form. Is there a better approach? I assume I'm not the first one to define env vars in a policy that depend on each other.&lt;/P&gt;</description>
    <pubDate>Fri, 11 Oct 2024 11:29:23 GMT</pubDate>
    <dc:creator>arne_c</dc:creator>
    <dc:date>2024-10-11T11:29:23Z</dc:date>
    <item>
      <title>Set up compute policy to allow installing python libraries from a private package index</title>
      <link>https://community.databricks.com/t5/get-started-discussions/set-up-compute-policy-to-allow-installing-python-libraries-from/m-p/93415#M9061</link>
      <description>&lt;P&gt;In our organization, we maintain a number of libraries that we use to share code. They're hosted on a private Python package index, which requires a token to allow downloads. My idea was to store the token as a secret, which would then be loaded into a cluster's environment variables using a policy. The secret itself has permissive read access, and I am also a workspace admin, so I'd expect to be able to read it if that is possible at all.&lt;/P&gt;&lt;P&gt;The relevant part of my policy definition looks like this:&lt;/P&gt;&lt;PRE&gt;[...],&lt;BR /&gt;"spark_env_vars.PIP_INDEX_URL": {&lt;BR /&gt;"type": "fixed",&lt;BR /&gt;"value": "https://arneCorpPyPI:{{secrets/global/arneCorpPyPI_token}}@gitlab.office.arneCorp.com/api/v4/groups/42/-/packages/pypi/simple"&lt;BR /&gt;},&lt;BR /&gt;[...]&lt;/PRE&gt;&lt;P&gt;If I run&lt;/P&gt;&lt;PRE&gt;databricks secrets get-secret global arneCorpPyPI_token&lt;/PRE&gt;&lt;P&gt;from my command line, I can see its value.&lt;/P&gt;&lt;P&gt;If I run&lt;/P&gt;&lt;PRE&gt;PIP_INDEX_URL="https://arneCorpPyPI:$(databricks secrets get-secret global arneCorpPyPI_token | jq -r .value)@gitlab.office.arneCorp.com/api/v4/groups/42/-/packages/pypi/simple" pip install arne-corp-library&lt;/PRE&gt;&lt;P&gt;it installs the requested library correctly from the private index.&lt;/P&gt;&lt;P&gt;However, when I start a cluster with this policy and open a shell, I get this:&lt;/P&gt;&lt;PRE&gt;$ echo $PIP_INDEX_URL&lt;BR /&gt;https://arneCorpPyPI:{{secrets/global/arneCorpPyPI_token}}@gitlab.office.arneCorp.com/api/v4/groups/42/-/packages/pypi/simple&lt;/PRE&gt;&lt;P&gt;I thought that my user should have the required permissions, and from &lt;A href="https://docs.databricks.com/en/security/secrets/secrets.html#use-a-secret-in-a-spark-configuration-property-or-environment-variable" target="_self"&gt;the secrets docs&lt;/A&gt; I assumed that the secret-reference syntax I used should work in this kind of policy config file (my test cluster had Databricks Runtime 15.4 installed), but apparently it doesn't: the placeholder is passed through literally instead of being replaced by the secret.&lt;/P&gt;&lt;P&gt;I'd like to avoid using init scripts.&lt;/P&gt;&lt;P&gt;What can I do?&lt;/P&gt;</description>
      <pubDate>Thu, 10 Oct 2024 11:38:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/set-up-compute-policy-to-allow-installing-python-libraries-from/m-p/93415#M9061</guid>
      <dc:creator>arne_c</dc:creator>
      <dc:date>2024-10-10T11:38:57Z</dc:date>
    </item>
    <item>
      <title>Re: Set up compute policy to allow installing python libraries from a private package index</title>
      <link>https://community.databricks.com/t5/get-started-discussions/set-up-compute-policy-to-allow-installing-python-libraries-from/m-p/93619#M9062</link>
      <description>&lt;P&gt;I figured it out: it seems that a secret can only be loaded into an environment variable if the value is the secret reference and nothing else:&lt;/P&gt;&lt;PRE&gt;"value": "{{secrets/global/arneCorpPyPI_token}}"         # this will work&lt;BR /&gt;"value": "foo {{secrets/global/arneCorpPyPI_token}} bar" # this will not&lt;/PRE&gt;&lt;P&gt;My remaining problem is that I need string interpolation to build the actual value, e.g.:&lt;/P&gt;&lt;PRE&gt;[...],&lt;BR /&gt;"spark_env_vars.TOKEN": {&lt;BR /&gt;"type": "fixed",&lt;BR /&gt;"value": "{{secrets/global/arneCorpPyPI_token}}"&lt;BR /&gt;},&lt;BR /&gt;"spark_env_vars.PIP_INDEX_URL": {&lt;BR /&gt;"type": "fixed",&lt;BR /&gt;"value": "https://arneCorpPyPI:${TOKEN}@gitlab.office.arneCorp.com/api/v4/groups/42/-/packages/pypi/simple"&lt;BR /&gt;},&lt;BR /&gt;[...]&lt;/PRE&gt;&lt;P&gt;and JSON maps are unordered. As it happens, PIP_INDEX_URL is initialized before TOKEN, so the ${TOKEN} reference is not resolved and my auth is broken. I tried a couple of other names, and it looks like a variable named TEMPORARY is consistently initialized before PIP_INDEX_URL, in which case it works. Obviously, this is not something I want to rely on in any way, shape, or form. Is there a better approach? I assume I'm not the first one to define env vars in a policy that depend on each other.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Oct 2024 11:29:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/set-up-compute-policy-to-allow-installing-python-libraries-from/m-p/93619#M9062</guid>
      <dc:creator>arne_c</dc:creator>
      <dc:date>2024-10-11T11:29:23Z</dc:date>
    </item>
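    <!--
The ordering problem described in the reply above might be sidestepped by not relying on shell expansion between environment variables at all. What follows is only a minimal sketch under stated assumptions: the policy sets just TOKEN from the secret (the form the author reports as working), and the index URL is composed inside the Python process that launches pip. The names build_pip_index_url and pip_environment are hypothetical helpers, not part of any Databricks or pip API, and the default user and host path are the placeholder values from the thread.

```python
import os
from urllib.parse import quote

# Placeholder values copied from the thread; adjust for a real index.
INDEX_USER = "arneCorpPyPI"
INDEX_HOST_PATH = "gitlab.office.arneCorp.com/api/v4/groups/42/-/packages/pypi/simple"


def build_pip_index_url(token, user=INDEX_USER, host_path=INDEX_HOST_PATH):
    """Compose the index URL from a token, percent-encoding the credentials."""
    if not token:
        raise ValueError("token is empty; was TOKEN set by the policy?")
    return f"https://{quote(user, safe='')}:{quote(token, safe='')}@{host_path}"


def pip_environment(environ=None):
    """Return a copy of the environment with PIP_INDEX_URL derived from TOKEN."""
    env = dict(os.environ if environ is None else environ)
    env["PIP_INDEX_URL"] = build_pip_index_url(env.get("TOKEN", ""))
    return env
```

The resulting mapping could then be passed as the env argument of subprocess.run when invoking pip (for example with sys.executable, "-m", "pip", "install", "arne-corp-library"), so that the URL is built only after TOKEN is known to be present, regardless of the order in which the cluster initializes its variables.
    -->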
    <item>
      <title>Re: Set up compute policy to allow installing python libraries from a private package index</title>
      <link>https://community.databricks.com/t5/get-started-discussions/set-up-compute-policy-to-allow-installing-python-libraries-from/m-p/110987#M9063</link>
      <description>&lt;P&gt;Hello &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/125945"&gt;@arne_c&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;I’m working on a Python package that I will host on Azure DevOps. The idea is to download the package when creating different Jobs, and the way you solved the problem is exactly what I intend to use.&lt;/P&gt;&lt;P&gt;From what I’ve seen among the proposed approaches, using compute policies seems to be the best practice for this. I wanted to ask how you resolved the issue: did you keep the TEMPORARY variable, or did you end up approaching it differently?&lt;/P&gt;&lt;P&gt;I’d like to mention that I intend to use Databricks Asset Bundles to create these Jobs.&lt;/P&gt;&lt;P&gt;Thanks in advance, and best regards.&lt;/P&gt;</description>
      <pubDate>Sun, 23 Feb 2025 12:23:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/set-up-compute-policy-to-allow-installing-python-libraries-from/m-p/110987#M9063</guid>
      <dc:creator>jorperort</dc:creator>
      <dc:date>2025-02-23T12:23:08Z</dc:date>
    </item>
  </channel>
</rss>

