Exporting table to GCS bucket using job
04-29-2025 10:39 AM
Hi all,
Use case: I want to send the result of a query to a GCS bucket location in JSON format.
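For readers following along, the export step described in the use case can be sketched roughly as below. This assumes a PySpark notebook where `spark` is the active session; the function name `export_query_to_gcs`, the query, and the `gs://` path are hypothetical placeholders, not taken from the original notebook.

```python
def export_query_to_gcs(spark, query, gcs_path):
    """Run `query` and write its result as JSON files under `gcs_path`.

    `spark` is assumed to be an active SparkSession (as provided in a notebook).
    """
    df = spark.sql(query)
    # mode("overwrite") replaces any previous export at the same path.
    df.write.mode("overwrite").json(gcs_path)
    return gcs_path
```

In a notebook this would be called as `export_query_to_gcs(spark, "SELECT * FROM my_table", "gs://my-bucket/exports/result")`. The write only succeeds if the cluster can authenticate to GCS.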
Approach: From my Java-based application I create a job, and that job runs a notebook. The notebook will have something like this:
I tried something like
spark.conf.set("spark.hadoop.fs.gs.auth.type", "OAuth")
spark.conf.set("spark.hadoop.fs.gs.auth.access.token", access_token)
which didn't have any effect. I am getting the error below in my notebook:
Py4JJavaError: An error occurred while calling o476.json. : java.io.IOException: Error getting access token from metadata server at:
I'm kind of stuck on this. Any help would be appreciated.
Thanks,
Labels: Workflows
04-29-2025 07:42 PM - edited 04-29-2025 07:48 PM
Hi @aswinvishnu
GCS support in Spark via Hadoop Connectors has specific limitations, and using a raw access token (OAuth token) instead of a service account key file is tricky, especially in Databricks.
You’re trying to use access token–based authentication, but GCS's Hadoop connector (used under the hood by Spark) typically expects:
1. Service Account key file (standard)
2. Or ADC (Application Default Credentials) from the environment/metadata server (in GCP-native services like GKE or Dataproc)
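As a rough sketch of option 1, the helper below turns a service account key JSON into Spark properties. The property names are the ones I understand Databricks to document for GCS access with a service account; treat them as an assumption and check them against your connector version. The helper name `gcs_spark_conf_from_key` is hypothetical. In practice these values usually go into the cluster's Spark config, with the private key stored in a secret rather than in notebook code.

```python
import json


def gcs_spark_conf_from_key(key_json):
    """Map a service account key (JSON string) to Spark/Hadoop GCS properties.

    Property names are assumed from Databricks' GCS setup guidance;
    verify them for the GCS connector version on your cluster.
    """
    key = json.loads(key_json)
    return {
        "spark.hadoop.google.cloud.auth.service.account.enable": "true",
        "spark.hadoop.fs.gs.auth.service.account.email": key["client_email"],
        "spark.hadoop.fs.gs.project.id": key["project_id"],
        "spark.hadoop.fs.gs.auth.service.account.private.key": key["private_key"],
        "spark.hadoop.fs.gs.auth.service.account.private.key.id": key["private_key_id"],
    }
```

Each key/value pair can then be entered as one line of the cluster's Spark config.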
Databricks is not natively GCP, so it doesn't have access to the GCP metadata server, hence the error:
Error getting access token from metadata server..
Use spark.hadoop.fs.gs.auth.type=ACCESS_TOKEN (Not "OAuth")
If you insist on using an access token instead of a key file, change your auth type:
spark.conf.set("spark.hadoop.fs.gs.auth.type", "ACCESS_TOKEN")
spark.conf.set("spark.hadoop.fs.gs.auth.access.token", access_token)
This is the correct config to pass a bearer token manually (OAuth is for interactive user flows; ACCESS_TOKEN is for static token use like this).
However, this still may not work reliably in Spark unless you're using the right version of the GCS connector (>= 2.2.0). Databricks may bundle older or customized versions.
05-01-2025 08:13 PM
Hi @lingareddy_Alva,
Thanks for the reply. I tried the 'ACCESS_TOKEN' auth type too, but it didn't make any difference.
05-06-2025 01:23 AM
Consider using GCS signed URLs or access tokens for secure access.
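One way to read the access-token suggestion is to bypass the Hadoop connector and upload the JSON through the GCS JSON API directly. The sketch below only builds the HTTP request, using the standard library; it assumes the result is small enough to collect on the driver and that `access_token` is a valid OAuth 2.0 token with write access to the bucket. The function name and the bucket/object names are hypothetical.

```python
import urllib.parse
import urllib.request


def build_gcs_upload_request(bucket, object_name, payload, access_token):
    """Build a simple ("media") upload request for the GCS JSON API.

    `payload` is the JSON document as bytes; nothing is sent until the
    request is passed to urllib.request.urlopen().
    """
    query = urllib.parse.urlencode({"uploadType": "media", "name": object_name})
    url = f"https://storage.googleapis.com/upload/storage/v1/b/{bucket}/o?{query}"
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Sending it would look like `urllib.request.urlopen(build_gcs_upload_request("my-bucket", "exports/result.json", data, access_token))`, where `data` could be the query result serialized to JSON bytes.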