
How to add environment variables

samalexg
New Contributor III

Instead of setting the AWS accessKey and secretKey in hadoopConfiguration, I would like to set them as the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

How can I do that in Databricks?

13 REPLIES

vida
Databricks Employee

Hi,

You can create an init script: a bash script that runs at cluster start and can set Unix environment variables. See the Databricks Guide >> AWS Configuration >> Init Scripts for more details.
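For example, a minimal sketch of writing such a script from a notebook (the cluster name, script name, and key values are placeholders):

dbutils.fs.put("dbfs:/databricks/init/YOUR_CLUSTER_NAME/set-aws-env.sh", """
#!/bin/bash
# Runs on each node at cluster start; appending to /etc/environment
# makes the variables visible system-wide, not just to this script.
echo "AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_HERE" >> /etc/environment
echo "AWS_SECRET_ACCESS_KEY=YOUR_SECRET_KEY_HERE" >> /etc/environment
""", true)

Restart the cluster after writing the script so it gets picked up.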

-Vida

samalexg
New Contributor III

Thanks for your response. I want to add one more environment variable apart from the AWS properties. I created the init script as described in the documentation and restarted my cluster, but when I read the env variable in Scala with

sys.env.get(envName)

it returns nothing. How can I check whether my init script was executed or not?
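For reference, sys.env.get in Scala returns an Option[String], so an unset variable comes back as None rather than an error; "returns nothing" here most likely means None. A quick way to make the result visible (the variable name is just a placeholder):

println(sys.env.getOrElse("MY_ENV_VAR", "<not set>"))

Note that a JVM's environment is fixed when the process starts, so changes made by an init script only show up after a cluster restart.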

kruhly
New Contributor II

The Init Scripts documentation seems to have moved to user-guide/advanced/init-scripts.html and is no longer in the AWS Configuration section.

vida
Databricks Employee

Hi,

The init scripts notebook has details on how to find the output:

  • Output from all init scripts is written to: dbfs:/databricks/init/output/
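A quick way to inspect that directory from a notebook (the subdirectory layout and log file names vary, so the head() path below is purely illustrative):

display(dbutils.fs.ls("dbfs:/databricks/init/output/"))
// Once you know a log file's path, print its beginning:
// println(dbutils.fs.head("dbfs:/databricks/init/output/<cluster>/<timestamp>.log"))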

samalexg
New Contributor III

Yes, I've verified that, but nothing is there in dbfs:/databricks/init/output/.

vida
Databricks Employee

Did you put an echo command or something in your script so that it will output something?
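For example, a couple of echo lines like these in the script body (the messages are arbitrary) will leave a trace in the output directory mentioned above:

#!/bin/bash
echo "setting AWS env vars"
echo "AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_HERE" >> /etc/environment
echo "finished setting AWS env vars"

If an output file exists but is empty, the script ran without printing anything; if there is no file at all, the script most likely never ran.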

samalexg
New Contributor III

No, I've just set my env variables in the script. Our cluster is currently in production and I can't test it right now. Once I get access, I'll add some echo statements and let you know the result.

cfregly
Contributor

@samalexg: are you sure you're writing to the

/etc/environment

file, as follows? Otherwise, the env vars are only set for the process that runs the script.

I assume you're doing this, but wanted to double-check, as this has been a common mistake in the past.

dbutils.fs.put(s"""dbfs:/databricks/init/$clusterName/setup-awscreds.sh""", """
#!/bin/bash
# Pipe through sudo tee -a: "sudo echo ... >> file" elevates only the echo,
# not the redirection performed by the calling shell.
echo "AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_HERE" | sudo tee -a /etc/environment
echo "AWS_SECRET_ACCESS_KEY=YOUR_SECRET_KEY_HERE" | sudo tee -a /etc/environment
""", true)

samalexg
New Contributor III

Yes, I set the env variables as mentioned, with one difference: I haven't set it per cluster. I added the init script in dbfs:/databricks/init/ itself. Will that make a difference?
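(As far as I know, with the DBFS-based init scripts of that era, a script placed directly in dbfs:/databricks/init/ acts as a global init script that runs on every cluster, while scripts under dbfs:/databricks/init/<cluster-name>/ run only on the matching cluster. So the location should change which clusters run the script, not whether it runs at all.)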

TimChan
New Contributor II

I have the same problem. I've added this script and it does not write anything to /etc/environment. The output files from the execution of these scripts are also empty.

Yes, it's not supported at the moment. We'll make sure to document that!

@Miklos Christine, is writing variables to /etc/environment not supported, or are session/system-wide environment variables not supported at all?

jric
New Contributor II

It is possible! I was able to confirm that the following post's "Best" answer works: https://forums.databricks.com/questions/11116/how-to-set-an-environment-variable.html

FYI for @Miklos Christine and @Mike Trewartha
