12-09-2024 07:54 AM - edited 12-09-2024 08:00 AM
It seems that nothing is being loaded into GOOGLE_APPLICATION_CREDENTIALS.
From https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/master/gcs/INSTALL.md:
# The JSON keyfile of the service account used for GCS
# access when google.cloud.auth.service.account.enable is true.
spark.hadoop.google.cloud.auth.service.account.json.keyfile=/path/to/keyfile
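Since this property takes a file path rather than the key contents (and GOOGLE_APPLICATION_CREDENTIALS does too), one way to get there from a secret is to write the key to a private file first. Here is a minimal sketch, assuming the secret holds the raw service-account JSON. `write_keyfile` is a hypothetical helper of mine, not part of any library, and whether a driver-local file is visible everywhere it needs to be depends on your cluster setup:

```python
import json
import os
import tempfile
from typing import Optional


def write_keyfile(key_json: str, directory: Optional[str] = None) -> str:
    """Validate a service-account key and write it to a private file.

    Hypothetical helper: returns the path of the file that was written.
    """
    # Fail early if the secret is not valid JSON (raises ValueError).
    json.loads(key_json)
    # mkstemp creates the file readable and writable by the owner only.
    fd, path = tempfile.mkstemp(suffix=".json", dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(key_json)
    return path


# Assumed usage in a Databricks notebook (dbutils and spark are not defined here):
# key_path = write_keyfile(dbutils.secrets.get(scope="bigquery-scope", key="secret-name"))
# spark.conf.set("spark.hadoop.google.cloud.auth.service.account.json.keyfile", key_path)
```

The point is only that the value handed to the setting is a path to a readable JSON file, not the key itself.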
I think it's worth trying this keyfile setting before raising a support ticket, as there are many details that could be playing a role here. To be honest, I think os.environ["GOOGLE_APPLICATION_CREDENTIALS"] should have worked. So I would expect the following to be a good starting point, summarizing what we've done so far:
Here is how you can modify your code to ensure it works correctly:
- Reading the Databricks Secret and Setting the Spark Configuration:

      import base64

      # Read the secret from Databricks Secrets
      cred = dbutils.secrets.get(scope="bigquery-scope", key="secret-name").encode('ascii')
      cred = base64.b64encode(cred).decode('ascii')

      # Set the credentials in Spark configuration
      spark.conf.set("spark.hadoop.google.cloud.auth.service.account.json.keyfile", cred)

- Reading Data from BigQuery:

      # Read data from BigQuery
      df = spark.read.format("bigquery") \
          .option("parentProject", "<parent-project-id>") \
          .option("viewsEnabled", "true") \
          .option("table", "<table-name>") \
          .load()

- Writing Data to BigQuery:

      # Write data to BigQuery
      df.write.format("bigquery") \
          .mode("overwrite") \
          .option("temporaryGcsBucket", "<bucket-name>") \
          .option("table", "<table-name>") \
          .option("parentProject", "<parent-project-id>") \
          .save()

About the stacktrace itself, it simply looks like GOOGLE_APPLICATION_CREDENTIALS indeed isn't set or is inaccessible, and none of the other default credential sources are available:
- I think it eventually goes through https://github.com/googleapis/google-auth-library-java/blob/v0.27.0/oauth2_http/java/com/google/auth... and attempts the other methods for a couple of seconds before bailing out.
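To tell "isn't set" apart from "is inaccessible", a quick check like the one below might help. It is only a sketch: `check_adc_env` is a hypothetical helper, and it inspects the environment of the Python process only, which is not necessarily what the JVM-side connector sees:

```python
import json
import os


def check_adc_env(env=None) -> str:
    """Return a short diagnosis of GOOGLE_APPLICATION_CREDENTIALS.

    Hypothetical helper: pass a dict as `env` to test it without
    touching the real environment.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "unset"
    if not os.path.isfile(path):
        return "missing file"
    try:
        with open(path) as handle:
            json.load(handle)
    except (OSError, ValueError):
        # Covers permission problems as well as content that is not JSON.
        return "unreadable or not JSON"
    return "ok"


print(check_adc_env())
```

If this reports anything other than "ok", the variable is either empty or points at something that is not a readable JSON keyfile.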