Exporting table to GCS bucket using job

aswinvishnu · ‎04-29-2025

Hi all,

Use case: I want to write the result of a query to a GCS bucket location in JSON format.

Approach: From my Java-based application I create a job, and that job runs a notebook. The notebook contains something like this:

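```
# Run the query and load the result as a DataFrame
query = "SELECT * FROM table"
df = spark.sql(query)

# Target location in the GCS bucket
gcs_path = "gs://<bucket>/path/"

# Write JSON files with at most 100 records each, replacing any existing output
df.write.option("maxRecordsPerFile", 100).mode("overwrite").json(gcs_path)
```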

I can grant access to my GCS bucket using a service account JSON key that has access to my GCS account. For my use case, however, I can't provide the service account information to the Databricks account. I am okay with exposing an access token created from the service account instead.

I tried something like:

```
spark.conf.set("spark.hadoop.fs.gs.auth.type", "OAuth")
spark.conf.set("spark.hadoop.fs.gs.auth.access.token", access_token)
```

which didn't have any effect. I am getting the error below in my notebook:
Py4JJavaError: An error occurred while calling o476.json. : java.io.IOException: Error getting access token from metadata server at:

I'm kind of stuck on this. Any help would be appreciated.
Thanks,

Aswin
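
For reference, the access-token idea above can also be sketched without the Spark GCS connector settings at all, by sending the rows to the Cloud Storage JSON API with the token as a Bearer header. This is only a rough sketch, not a confirmed fix for the error: `build_upload_request` and `upload` are hypothetical helper names, the bucket and object names are placeholders, and it assumes the result is small enough to collect on the driver and that each row is JSON-serializable.

```python
import json
import urllib.parse
import urllib.request

# Simple-upload endpoint of the Cloud Storage JSON API
GCS_UPLOAD_URL = "https://storage.googleapis.com/upload/storage/v1/b/{bucket}/o"


def build_upload_request(bucket, object_name, records, access_token):
    """Build (but do not send) an upload request carrying newline-delimited JSON.

    Hypothetical helper: `records` is a list of JSON-serializable dicts and
    `access_token` is a short-lived OAuth token minted from the service account.
    """
    # One JSON object per line, similar in spirit to what df.write.json produces
    body = "\n".join(json.dumps(record) for record in records).encode("utf-8")
    query = urllib.parse.urlencode({"uploadType": "media", "name": object_name})
    url = GCS_UPLOAD_URL.format(bucket=urllib.parse.quote(bucket, safe="")) + "?" + query
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def upload(request):
    """Send the request and return the object metadata that GCS responds with."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

In the notebook this might be used as `upload(build_upload_request("<bucket>", "path/part-0000.json", [row.asDict() for row in df.collect()], access_token))`, splitting the rows into chunks of 100 beforehand to mimic `maxRecordsPerFile`. Collecting on the driver gives up Spark's parallel write, so this only makes sense for modest result sizes.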