
How to add environment variable

samalexg
New Contributor III

Instead of setting the AWS access key and secret key in hadoopConfiguration, I would like to set them as the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

How can I do that in Databricks?
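For context, the hadoopConfiguration approach being replaced presumably looks something like the following (the property names are the standard Hadoop s3n keys; the values are placeholders):

sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY_HERE")
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY_HERE")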

13 REPLIES

vida
Contributor II

Hi,

You can create an init script, a bash script that runs when the cluster starts, in which you can set Unix environment variables. Go to Databricks Guide >> AWS Configuration >> Init Scripts for more details.

-Vida
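A minimal sketch of such an init script, written to DBFS from a notebook cell; the script name and the credential values below are placeholders, not taken from the guide:

// Write a global init script to DBFS that exports the env vars at cluster startup.
dbutils.fs.put("dbfs:/databricks/init/setup-env-vars.sh", """
#!/bin/bash
# Append to /etc/environment so the variables apply system-wide,
# not just to the process running this script.
echo AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_HERE >> /etc/environment
echo AWS_SECRET_ACCESS_KEY=YOUR_SECRET_KEY_HERE >> /etc/environment
""", true)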

samalexg
New Contributor III

Thanks for your response. Apart from the AWS properties, I want to add one more environment variable. I created the init script as described in the documentation and restarted my cluster. But when I read the env variable in Scala with

sys.env.get(envName)

it returns nothing. How can I check whether my init script was executed or not?
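For reference, a complete check from a notebook cell might look like this (MY_ENV_VAR is a placeholder name):

// Print the variable if it is visible to the driver JVM, or report it missing.
val envName = "MY_ENV_VAR"
sys.env.get(envName) match {
  case Some(value) => println(s"$envName=$value")
  case None        => println(s"$envName is not set on the driver")
}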

kruhly
New Contributor II

The Init Scripts documentation seems to have moved to user-guide/advanced/init-scripts.html and is no longer in the AWS Configuration section.

vida
Contributor II

Hi,

The init scripts notebook has details on how to find the output:

  • Output from all init scripts is generated here: dbfs:/databricks/init/output/
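One way to inspect that directory from a notebook, as a sketch using the standard dbutils.fs calls:

// List the init script output files.
dbutils.fs.ls("dbfs:/databricks/init/output/").foreach(f => println(f.path))
// dbutils.fs.head(<path>) can then dump an individual output file.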

samalexg
New Contributor III

Yes, I've verified that, but nothing is there in dbfs:/databricks/init/output/

vida
Contributor II

Did you put an echo command or something in your script so that it will output something?
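For example, a couple of echo lines like these would confirm the script ran (the messages are illustrative only):

#!/bin/bash
echo "init script started on $(hostname) at $(date)"
# ... export the environment variables here ...
echo "init script finished"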

samalexg
New Contributor III

No, I've just set my env variables in the script. Our cluster is currently in production and I can't test it now. Once I get access I will add some echo statements and let you know the result.

cfregly
Contributor

@samalexg​: are you sure you're writing to the

/etc/environment

file as follows?

dbutils.fs.put(s"""dbfs:/databricks/init/$clusterName/setup-awscreds.sh""", """
#!/bin/bash
# Append the credentials to /etc/environment so they apply system-wide.
# Note: `sudo echo ... >> file` would fail if the shell isn't root, because
# the redirection happens outside sudo; `sudo tee -a` avoids that pitfall.
echo AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_HERE | sudo tee -a /etc/environment
echo AWS_SECRET_KEY=YOUR_SECRET_KEY_HERE | sudo tee -a /etc/environment
""", true)

Otherwise, the env vars are only set for the process that is called to run the script.

I assume you're doing this, but wanted to double check, as this has been a common mistake in the past.
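After the cluster restarts, one quick way to verify what actually landed on the driver (a sketch; %sh is the Databricks notebook shell magic):

%sh
# Show the contents of /etc/environment on the driver node.
cat /etc/environment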

samalexg
New Contributor III

Yes, I set the env variables as mentioned, but with one difference: I've not set them per cluster. I added the init script in dbfs:/databricks/init/ itself. Will that make a difference?
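For context, with the legacy DBFS-based init scripts the location determines the scope (the script and cluster names below are placeholders):

dbfs:/databricks/init/setup.sh               # global: runs on every cluster
dbfs:/databricks/init/my-cluster/setup.sh    # cluster-named: runs only on the cluster named my-cluster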

TimChan
New Contributor II

I have the same problem. I've added this script and it does not write anything to /etc/environment. The output files from the execution of these scripts are also empty.

Yes, it's not supported at the moment. We'll make sure to document that!

@Miklos Christine​, is writing variables to /etc/environment not supported, or are session/system-wide environment variables not supported at all?

jric
New Contributor II

It is possible! I was able to confirm that the following post's "Best" answer works: https://forums.databricks.com/questions/11116/how-to-set-an-environment-variable.html

FYI for @Miklos Christine​  and @Mike Trewartha​ 
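For readers who can't open the link: one approach along those lines is to set the variables in the cluster configuration UI under Advanced Options > Environment Variables, one KEY=value pair per line. Whether this is exactly what the linked answer describes is an assumption on my part:

AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_HERE
AWS_SECRET_ACCESS_KEY=YOUR_SECRET_KEY_HERE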
