Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

File Arrival Trigger

BaburamShrestha
New Contributor

We are using Databricks in combination with Azure platforms, specifically working with Azure Blob Storage (Gen2). We frequently mount Azure containers in the Databricks file system and leverage external locations and volumes for Azure containers.

Our use case involves building several data pipelines in Databricks, and we are currently facing an issue with setting up a file arrival trigger. The goal is to trigger a workflow whenever a new file is dropped into an Azure Blob Storage container (Gen2), and we need to pass the complete file path to the subsequent processor in the workflow.

We would appreciate guidance on how to:

  1. Set up a file arrival trigger in Databricks for Azure Blob Storage (Gen2).
  2. Capture the file path that triggered the event and pass it as a parameter to the next task in the pipeline.

Any advice or best practices to solve this issue would be greatly appreciated!

Thank you for your time and assistance.

Best regards,
Baburam Shrestha

2 Replies

szymon_dybczak
Contributor III

Hi @BaburamShrestha ,

1. To set up a file arrival trigger, you can follow the guides below:

File Arrival Triggers in Databricks Workflows (linkedin.com)

Trigger jobs when new files arrive | Databricks on AWS.
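Besides the UI flow described in those guides, the trigger can also be created through the Jobs API 2.1, which supports a `trigger.file_arrival` block. Below is a minimal sketch of such a payload; the job name, notebook path, and storage URL are hypothetical placeholders, and the dynamic value reference is the one documented by Databricks for file arrival triggers:

```python
# Sketch: build a Jobs API 2.1 create-job payload with a file arrival trigger.
# The notebook path and abfss:// URL are illustrative placeholders.

def file_arrival_job_payload(job_name: str, notebook_path: str, monitored_url: str) -> dict:
    """Build a create-job payload whose runs are triggered by file arrival."""
    return {
        "name": job_name,
        "trigger": {
            "pause_status": "UNPAUSED",
            "file_arrival": {
                # Must point at an external location or volume the workspace can access.
                "url": monitored_url,
                "min_time_between_triggers_seconds": 60,
            },
        },
        "tasks": [
            {
                "task_key": "process_new_files",
                "notebook_task": {
                    "notebook_path": notebook_path,
                    "base_parameters": {
                        # Resolved by Databricks at trigger time to the monitored location.
                        "source_location": "{{job.trigger.file_arrival.location}}"
                    },
                },
            }
        ],
    }

payload = file_arrival_job_payload(
    "bronze-ingest",
    "/Workspace/pipelines/bronze_ingest",
    "abfss://landing@mystorageaccount.dfs.core.windows.net/incoming/",
)
# POST this payload to {workspace_url}/api/2.1/jobs/create with a bearer token.
```

The same trigger block can be attached to an existing job with the jobs/update endpoint instead of creating a new one.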

2. To capture the file path that triggered the event, I think you can try passing the following value as a task-level parameter:

 

[screenshot: szymon_dybczak_0-1728037063209.png — task parameter configuration]
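One caveat worth noting: per the Databricks docs, the file arrival trigger exposes the monitored *location* (via `{{job.trigger.file_arrival.location}}`), not the path of the individual file that arrived. So the first task in the workflow may need to work out which files are new itself. Below is a minimal pure-Python sketch of that idea using a checkpoint file of already-seen names; the function name and paths are hypothetical, and on Databricks you would point them at a mounted path or a `/Volumes` path:

```python
# Sketch: given the monitored location, find files not seen on previous runs.
# A checkpoint JSON file records names already processed (paths hypothetical).
import json
import os

def new_files(location: str, checkpoint_path: str) -> list:
    """Return full paths of files in `location` not yet recorded in the checkpoint."""
    seen = set()
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            seen = set(json.load(f))
    current = {name for name in os.listdir(location)
               if os.path.isfile(os.path.join(location, name))}
    fresh = sorted(current - seen)
    # Persist the full current listing so the next run only sees newer files.
    with open(checkpoint_path, "w") as f:
        json.dump(sorted(current), f)
    return [os.path.join(location, name) for name in fresh]
```

In a notebook task, the location would come from the task parameter shown above, e.g. `dbutils.widgets.get("source_location")`, and the resulting paths can be passed onward via task values.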

 




noorbasha534
Visitor

@szymon_dybczak May I know the best way to do this for around 2,000 tables? Out of our 15,000 tables, about 2,000 Delta tables receive files in the bronze layer very rarely, yet we currently run streaming for all 15,000 throughout the day. I'd like to tune this setup by moving those 2,000 infrequently updated tables off streaming. The file locations are different for each of them. Appreciate your thoughts.
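One option, if file arrival triggers are chosen for these tables, is to generate the trigger configuration per table programmatically from pipeline metadata rather than clicking through the UI 2,000 times. A minimal sketch, assuming a hypothetical mapping of table name to landing location (real names and URLs would come from your own metadata store):

```python
# Sketch: generate one file-arrival trigger block per table from a
# table -> landing-location mapping. All names and URLs are hypothetical.

def trigger_configs(table_locations: dict) -> list:
    """Build one Jobs API job definition stub per table's landing location."""
    return [
        {
            "name": f"ingest-{table}",
            "trigger": {
                "pause_status": "UNPAUSED",
                "file_arrival": {"url": url},
            },
        }
        for table, url in sorted(table_locations.items())
    ]

configs = trigger_configs({
    "bronze.orders": "abfss://landing@acct.dfs.core.windows.net/orders/",
    "bronze.customers": "abfss://landing@acct.dfs.core.windows.net/customers/",
})
```

Note that Databricks workspaces cap how many jobs can carry a file arrival trigger, so check the current limit before scaling this to 2,000 separate jobs; grouping tables that share a landing prefix under one monitored location is one way to stay under it.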
