04-19-2023 07:21 AM
Hello, I have a Databricks account on Azure, and my goal is to compare different image tagging services from Azure, GCP, and AWS via their corresponding API calls from a Python notebook. I am having problems with the GCP Vision API calls, specifically with credentials: as far as I understand, one necessary step is to set the 'GOOGLE_APPLICATION_CREDENTIALS' environment variable in my Databricks notebook with something like
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/folder1/credentials.json'
where '/folder1/credentials.json' is where my notebook looks for the JSON file with the credentials (the notebook is in the same folder, /folder1/notebook_api_test).
I get this path via Workspace -> Copy file path in the Databricks web page. But this approach doesn't work; when the cell is executed, I get this error:
DefaultCredentialsError: File /folder1/credentials.json was not found.
What is the right way to deal with credentials to access the Google Vision API from an Azure Databricks notebook?
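For context, the call I'm trying to make is roughly the standard Vision label-detection flow (a minimal sketch; the image file name and handling are placeholders, not my exact notebook code):
from google.cloud import vision

# Raises DefaultCredentialsError if GOOGLE_APPLICATION_CREDENTIALS does not point to a readable file
client = vision.ImageAnnotatorClient()

with open('sample.jpg', 'rb') as f:  # placeholder image
    image = vision.Image(content=f.read())

response = client.label_detection(image=image)
for label in response.label_annotations:
    print(label.description, label.score)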
- Labels:
  - API
  - Azure
  - Best Way
  - Databricks notebook
  - GCP
Accepted Solutions
04-20-2023 04:21 PM
OK, here is the trick: in my case, the file with the GCP credentials is stored in the notebook workspace storage, which is not reachable through the plain file path that gets set via os.environ().
So the solution is to read the content of this file and save it to the cluster storage attached to the notebook, which is created with the cluster and erased when the cluster is gone (so this procedure has to be repeated every time the cluster is re-created). According to this link, we can read the content of the credentials JSON file stored in the notebook workspace with
with open('/Workspace/folder1/cred.json') as f:  # note that I need the full path here, for some reason
    content = f.read()
and then, according to this doc, we need to save it somewhere else in a new file (with the same name in my case, cred.json), namely on the cluster storage attached to the notebook (which is visible to OS-level file functions), with
fd = os.open("cred.json", os.O_RDWR|os.O_CREAT)
ret = os.write(fd,content.encode())
#need to add .encode(), or will get TypeError: a bytes-like object is required, not 'str'
os.close(fd)
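(Equivalently, the file can be written with a plain open() call, which also avoids the .encode() issue; this is just an alternative to the os.open/os.write variant above, not what I originally ran:)
with open("cred.json", "w") as f:
    f.write(content)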
Only after that can we continue with setting the environment variable required for GCP authentication:
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = './cred.json'
and then the API calls should work fine, without the DefaultCredentialsError.
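Putting the whole workaround together, here is a minimal end-to-end sketch (the paths are placeholders for my setup; /tmp is just one example of driver-local cluster storage):
import os
from google.cloud import vision

WORKSPACE_CRED_PATH = '/Workspace/folder1/cred.json'  # workspace copy (full path needed)
LOCAL_CRED_PATH = '/tmp/cred.json'                     # driver-local storage, recreated with the cluster

# 1. Read the credentials JSON from workspace storage.
with open(WORKSPACE_CRED_PATH) as f:
    content = f.read()

# 2. Write it to the cluster-local filesystem.
with open(LOCAL_CRED_PATH, 'w') as f:
    f.write(content)

# 3. Point the Google client libraries at the local copy.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = LOCAL_CRED_PATH

# 4. The Vision client can now authenticate without DefaultCredentialsError.
client = vision.ImageAnnotatorClient()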

