
Autoloader stream with EventBridge message

fhmessas
New Contributor II

Hi All,

I have a few streaming jobs running, but we have been facing an issue related to messaging. We have multiple feeds within the same root folder, i.e. logs/{accountId}/CloudWatch|CloudTrail|vpcflow/yyyy-mm-dd/logs. Because of that layout, we can only set up a single SQS notification on the root folder, and we can't use wildcards to separate the feeds.

We've tried setting up EventBridge to filter the messages before Autoloader consumes them, but with this approach Autoloader neither consumes the messages correctly nor deletes them after reading.

Is there any way to set up Autoloader with EventBridge between the SQS queue and the stream?

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions

Anonymous
Not applicable

@Fernando Messas:

Yes, you can configure Autoloader to consume messages from an SQS queue using EventBridge. Here are the steps you can follow:

  1. Create an EventBridge rule to filter messages from the SQS queue based on specific criteria (such as the feed type or account ID).
  2. Configure the rule to trigger a Lambda function when a message matching the criteria is received.
  3. Write the Lambda function to read the message, extract the relevant data (such as the S3 object key), and hand it off to Autoloader to load into your streaming job (see the sketch after this list).
  4. Configure the Lambda function to delete the message from the SQS queue after processing.
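
As a rough sketch of steps 1–3 (all queue names, environment variables, and the rule pattern below are placeholders, not from this thread): Autoloader doesn't expose a direct load API that a Lambda can call, so one concrete way to wire this up is to have the Lambda forward matching events into a dedicated per-feed SQS queue, and point Autoloader at that queue (next snippet). Also note that with an EventBridge trigger there is no queue message for the Lambda to delete; if you instead attach the Lambda to the original SQS queue with an event source mapping, a successful invocation deletes the message automatically, which covers step 4.

```python
# Hypothetical Lambda handler (all names and queues are placeholders).
# An EventBridge rule on the bus receiving S3 events delivers
# "Object Created" events, e.g. with a pattern like:
#   {"source": ["aws.s3"],
#    "detail-type": ["Object Created"],
#    "detail": {"object": {"key": [{"prefix": "logs/"}]}}}
import json
import os

import boto3

sqs = boto3.client("sqs")

# Map the feed segment of the key (logs/{accountId}/<feed>/...) to a queue.
FEED_QUEUES = {
    "CloudTrail": os.environ["CLOUDTRAIL_QUEUE_URL"],
    "CloudWatch": os.environ["CLOUDWATCH_QUEUE_URL"],
    "vpcflow": os.environ["VPCFLOW_QUEUE_URL"],
}


def handler(event, context):
    # S3 -> EventBridge events carry bucket and key under event["detail"].
    detail = event["detail"]
    bucket = detail["bucket"]["name"]
    key = detail["object"]["key"]

    # logs/{accountId}/{feed}/yyyy-mm-dd/... -> feed is the third segment.
    parts = key.split("/")
    feed = parts[2] if len(parts) > 2 else None
    queue_url = FEED_QUEUES.get(feed)
    if queue_url is None:
        return  # not a feed we route anywhere

    # Re-publish in the S3 notification shape; check the Databricks docs for
    # the exact message format Autoloader expects on a user-provided queue.
    body = {
        "Records": [{
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": bucket},
                "object": {"key": key, "size": detail["object"].get("size", 0)},
            },
        }]
    }
    sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps(body))
```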

This approach lets you use EventBridge to filter and route messages to different Lambda functions based on specific criteria, and then have Autoloader pick the data up for your streaming job.
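
On the Databricks side, you can then point Autoloader's file notification mode at one of those per-feed queues with the cloudFiles.queueUrl option; when you supply your own queue, Autoloader consumes and deletes the messages itself. Rough sketch (paths, region, and queue URL are placeholders):

```python
# PySpark sketch: Autoloader in file notification mode reading from a
# user-provided SQS queue. All paths and URLs are placeholders.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.useNotifications", "true")
    # The per-feed queue the Lambda fills; Autoloader deletes messages
    # from this queue as it processes them.
    .option("cloudFiles.queueUrl",
            "https://sqs.us-east-1.amazonaws.com/123456789012/cloudtrail-feed")
    .option("cloudFiles.region", "us-east-1")
    .option("cloudFiles.schemaLocation", "s3://my-bucket/_schemas/cloudtrail")
    .load("s3://my-bucket/logs/")
)

(
    df.writeStream.format("delta")
    .option("checkpointLocation", "s3://my-bucket/_checkpoints/cloudtrail")
    .start("s3://my-bucket/tables/cloudtrail")
)
```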

Keep in mind that you will need to handle any errors that occur during the processing of the messages, such as failed API calls or data validation errors. Additionally, you should monitor the health of your streaming job to ensure that it is processing messages correctly and not falling behind.
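
For the error-handling side, one option is to attach a dead-letter queue to each per-feed queue so that messages which repeatedly fail are parked for inspection instead of being lost. Illustrative boto3 call (queue URL and ARN are placeholders):

```python
# Sketch: attach a dead-letter queue to a per-feed SQS queue via a redrive
# policy. Queue URL and ARN are placeholders.
import json

import boto3

sqs = boto3.client("sqs")

sqs.set_queue_attributes(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/cloudtrail-feed",
    Attributes={
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": "arn:aws:sqs:us-east-1:123456789012:cloudtrail-feed-dlq",
            "maxReceiveCount": "5",  # park a message after 5 failed receives
        })
    },
)
```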


