Exporting table to GCS bucket using job
04-29-2025 10:39 AM
Hi all,
Use case: I want to send the result of a query to a GCS bucket location in JSON format.
Approach: From my Java-based application I create a job, and that job runs a notebook. The notebook will have something like this:
```
query = "SELECT * FROM table"
df = spark.sql(query)
gcs_path = "gs://<bucket>/path/"
# Cap each output file at 100 records and overwrite any existing output
df.write.option("maxRecordsPerFile", 100).mode("overwrite").json(gcs_path)
```
I am able to provide access to my GCS bucket using a service account JSON key that has access to my GCS account. But for my use case, I can't provide the service account information to the Databricks account. Instead, I am okay with exposing an access token created from the service account.
I tried something like
```
spark.conf.set("spark.hadoop.fs.gs.auth.type", "OAuth")
spark.conf.set("spark.hadoop.fs.gs.auth.access.token", access_token)
```
which didn't have any effect. I am getting the error below in my notebook:
Py4JJavaError: An error occurred while calling o476.json. : java.io.IOException: Error getting access token from metadata server at:
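The error suggests the connector fell back to the metadata server rather than using the supplied token. One way to check the token independently of the Spark connector might be to call the Cloud Storage JSON API directly, passing the token as a bearer token. Below is a minimal sketch using only the Python standard library. The function name `build_gcs_upload_request` and the sample bucket and object names are hypothetical, and it assumes the token carries a scope that allows writing to the bucket.

```python
import json
import urllib.parse
import urllib.request


def build_gcs_upload_request(bucket, object_name, rows, access_token):
    """Build (but do not send) a Cloud Storage JSON API media upload request.

    Hypothetical helper for illustration; `rows` is a list of dicts.
    """
    # One JSON document per line, the same layout Spark's .json() writer uses
    body = "\n".join(json.dumps(row) for row in rows).encode("utf-8")
    # The object name goes in the query string and must be URL-encoded
    query = urllib.parse.urlencode({"uploadType": "media", "name": object_name})
    url = (
        "https://storage.googleapis.com/upload/storage/v1/b/"
        f"{urllib.parse.quote(bucket, safe='')}/o?{query}"
    )
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # The access token is sent as a standard OAuth 2.0 bearer token
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Example with placeholder values; sending it would be
# urllib.request.urlopen(request), which needs a valid token and network access.
request = build_gcs_upload_request(
    "my-bucket", "path/part-0.json", [{"a": 1}, {"a": 2}], "example-token"
)
```

If a request like this succeeds with the same token, the token itself is probably fine and the problem is more likely in how the Spark configuration is picked up. This only suits small results, since the rows have to be collected on the driver first.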
I'm kind of stuck on this. Any help would be appreciated.
Thanks,
Aswin
Labels:
- Workflows