01-12-2023 02:15 AM
Hi Databricks Community,
I want to set environment variables for all clusters in my workspace. The goal is to have the environment variable available in all notebooks executed on the cluster.
The environment variable is generated in a global init script and stored in `/etc/environment`, as documented here: https://community.databricks.com/s/question/0D58Y000096UKm5SAG/set-environment-variables-in-global-i...
After the init script runs, the content of `/etc/environment` looks like this:
CLUSTER_DB_HOME=/databricks
DATABRICKS_RUNTIME_VERSION=10.4
DB_HOME=/databricks
DEFAULT_DATABRICKS_ROOT_VIRTUALENV_ENV=/databricks/python3
MLFLOW_CONDA_HOME=/databricks/conda
MLFLOW_PYTHON_EXECUTABLE=/databricks/python/bin/python
MLFLOW_TRACKING_URI=databricks
PYARROW_IGNORE_TIMEZONE=1
export MY_TEST_VAR=test
The integration works for standard clusters, and I can use the variable in notebooks.
But on clusters with a custom Docker container defined, the environment variable is not visible.
By custom Docker container clusters, I mean clusters with the option "Use your own Docker container" set. On that type of cluster I can't access the environment variable. E.g., the result of the code
import os
print(os.getenv('MY_TEST_VAR'))
is empty (None).
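To narrow this down, it may help to check whether the variable is missing from the file itself or only from the notebook's process environment. Below is a minimal diagnostic sketch, not a fix. The helper names `parse_env_file`, `missing_from_process`, and `read_env_file` are made up for illustration, and the parser assumes simple `KEY=VALUE` lines with an optional leading `export `, as in the file content above.

```python
import os
from typing import Dict, List, Mapping


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; tolerate 'export ', blank lines, and # comments."""
    result: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment, skip it
        result[key.strip()] = value.strip().strip('"').strip("'")
    return result


def missing_from_process(file_vars: Mapping[str, str],
                         environ: Mapping[str, str]) -> List[str]:
    """Names defined in the file but absent from the given environment mapping."""
    return sorted(name for name in file_vars if name not in environ)


def read_env_file(path: str = "/etc/environment") -> str:
    """Return the file content, or an empty string if it cannot be read."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    except OSError:
        return ""


if __name__ == "__main__":
    file_vars = parse_env_file(read_env_file())
    print("defined in file:", sorted(file_vars))
    print("missing from os.environ:", missing_from_process(file_vars, os.environ))
```

Run in a notebook cell on both cluster types: if `MY_TEST_VAR` is listed as defined in the file but missing from `os.environ`, the file is written but not loaded into the notebook process. If it is not in the file at all, the init script's write did not reach the filesystem the notebook sees.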
Any ideas where I need to store environment variables to have them available in all cluster types?
Thank you!