Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Autoloader (GCP) Custom PubSub Queue

Ryan512
New Contributor III

I want to know whether what I describe below is possible with Auto Loader on Google Cloud Platform.

Problem Description:

We have GCS buckets for every client/account. Inside these buckets is a path/blob for each client's instances of our platform. A client can have 1 or many instances of our platform. Inside the path/blobs are the incremental data files we need to process for the clients. The paths look something like:

gs://<client specific bucket name>/<platform instance id>/data/<year>/<month>/<day>/datafile<some UUID>.json.gz
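To make the layout concrete, here is a minimal Python sketch that splits such a path into its parts. The regular expression, the function name `parse_data_file_path`, and the assumption that year/month/day are zero-padded digits are illustrative only; the real naming rules may differ.

```python
import re
from typing import Dict, Optional

# Assumed layout: gs://<bucket>/<instance id>/data/<yyyy>/<mm>/<dd>/datafile<uuid>.json.gz
# The digit widths for year/month/day are an assumption, not confirmed by the post.
DATA_FILE_RE = re.compile(
    r"^gs://(?P<bucket>[^/]+)/(?P<instance_id>[^/]+)/data/"
    r"(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})/"
    r"(?P<filename>datafile[^/]*\.json\.gz)$"
)


def parse_data_file_path(path: str) -> Optional[Dict[str, str]]:
    """Return the path components, or None if the path is not a data file."""
    match = DATA_FILE_RE.match(path)
    return match.groupdict() if match else None
```

A helper like this could be used to check which objects in a bucket count as data files before deciding on a filter pattern.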

I want to set up a SINGLE autoloader to load all data files across all of the buckets and paths. Is this possible?

Potential Solution:

From reading the docs it looks like I might be able to create a PubSub topic, and then set notifications on the buckets manually to send the file notifications to the created PubSub topic.
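As a rough sketch of the manual setup, the snippet below only builds the `gsutil notification create` command for each bucket; it does not run anything. The helper name `notification_command`, the project, topic, and bucket names are placeholders, and restricting the events to `OBJECT_FINALIZE` is an assumption about which events are needed.

```python
from typing import List


def notification_command(bucket: str, project: str, topic: str) -> List[str]:
    """Build a gsutil command that sends a bucket's notifications to one shared topic."""
    return [
        "gsutil", "notification", "create",
        "-t", f"projects/{project}/topics/{topic}",  # shared Pub/Sub topic
        "-f", "json",                                # JSON payload format
        "-e", "OBJECT_FINALIZE",                     # assumption: new-object events only
        f"gs://{bucket}",
    ]


# Hypothetical bucket names; one command per client bucket.
for bucket in ["client-a-bucket", "client-b-bucket"]:
    print(" ".join(notification_command(bucket, "my-project", "autoloader-files")))
```

Each printed command could then be reviewed and run by hand, or passed to `subprocess.run`, once the topic exists.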

After that I should be able to set the `cloudFiles.subscription` option to point at a subscription on the PubSub topic I created, and then set `pathGlobFilter` to match only the correct data files so we don't read every file in the bucket.

Will this work as I am expecting? I do not want Autoloader to launch notifications on every bucket we have in our account when I add `gs://*/.....` to the `pathGlobFilter`.
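For reference, the configuration I have in mind would look roughly like the sketch below. It only assembles the reader options; the function name, project ID, subscription name, and glob value are placeholders, and I am assuming `cloudFiles.subscription` takes the name of a Pub/Sub subscription on the shared topic. Whether one stream will accept events from many buckets this way is exactly what I am asking.

```python
from typing import Dict


def autoloader_options(project_id: str, subscription: str, glob: str) -> Dict[str, str]:
    """Assemble Auto Loader options for reading from an existing Pub/Sub subscription."""
    return {
        "cloudFiles.format": "json",
        "cloudFiles.useNotifications": "true",    # file notification mode
        "cloudFiles.projectId": project_id,       # GCP project holding the subscription
        "cloudFiles.subscription": subscription,  # assumption: pre-created subscription name
        "pathGlobFilter": glob,                   # assumption: pattern selecting data files
    }


opts = autoloader_options("my-project", "autoloader-files-sub", "*.json.gz")

# On a Databricks cluster (not runnable elsewhere), the options would be applied as:
# df = spark.readStream.format("cloudFiles").options(**opts).load(base_path)
# where base_path is a placeholder for whatever load path turns out to be required.
```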

2 REPLIES

Hi @Ryan Ebanks​,

Just a friendly follow-up. Do you still need help, or did the article help you resolve your question? Please let us know.

Noopur_Nigam
Databricks Employee

Hello @Ryan Ebanks, please let us know if you need more help with this.
