In our organization, we maintain a number of libraries that we use to share code. They're hosted on a private Python package index, which requires a token for downloads. My idea was to store the token as a secret, which would then be loaded into a cluster's environment variables through a cluster policy. The secret itself has permissive read access, and I am also a workspace admin, so I'd expect to be able to read it if that is possible at all.
The relevant part in my policy definition looks like this:
[...],
"spark_env_vars.PIP_INDEX_URL": {
"type": "fixed",
"value": "https://arneCorpPyPI:{{secrets/global/arneCorpPyPI_token}}@gitlab.office.arneCorp.com/api/v4/groups/42/-/packages/pypi/simple"
},
[...]
If I run
databricks secrets get-secret global arneCorpPyPI_token
from my command line, I can see its value.
If I run
PIP_INDEX_URL="https://arneCorpPyPI:$(databricks secrets get-secret global arneCorpPyPI_token | jq -r .value)@gitlab.office.arneCorp.com/api/v4/groups/42/-/packages/pypi/simple" pip install arne-corp-library
it will install the requested library correctly from the private index.
However, when I start a cluster with this policy and open a shell, the secret reference has not been resolved:
$ echo $PIP_INDEX_URL
https://arneCorpPyPI:{{secrets/global/arneCorpPyPI_token}}@gitlab.office.arneCorp.com/api/v4/groups/42/-/packages/pypi/simple
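To rule out a display quirk in the shell, a small check like the following could confirm that the variable really still contains the literal reference. This is only a sketch: the function name `unresolved_secret_refs` is made up for illustration, and the regular expression assumes the `{{secrets/<scope>/<key>}}` syntax used in the policy above.

```python
import os
import re

# Matches the secret-reference syntax {{secrets/<scope>/<key>}} used in the policy.
# Assumption: scope and key names contain no slashes or curly braces.
SECRET_REF = re.compile(r"\{\{secrets/([^/{}]+)/([^/{}]+)\}\}")


def unresolved_secret_refs(value: str) -> list[tuple[str, str]]:
    """Return a (scope, key) pair for every literal secret reference left in value."""
    return SECRET_REF.findall(value)


if __name__ == "__main__":
    url = os.environ.get("PIP_INDEX_URL", "")
    refs = unresolved_secret_refs(url)
    if refs:
        # Only the scope and key are printed, never the URL itself,
        # so a resolved token cannot leak into the output.
        for scope, key in refs:
            print(f"unresolved secret reference: scope={scope} key={key}")
    else:
        print("no literal secret reference found in PIP_INDEX_URL")
```

Run on the cluster (for example from a notebook cell or the web terminal), this reports whether the placeholder was substituted without echoing the token.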
I thought my user had the required permissions, and from the secrets documentation I assumed that the secret-reference syntax I used would work in a policy definition like this one (my test cluster runs Databricks Runtime 15.4), but apparently it doesn't.
I'd like to avoid using init scripts.
What can I do?