Databricks now supports event-driven workloads through file arrival triggers, which are especially useful for loading cloud files from external locations. Instead of mounting your storage as DBFS and polling it on a schedule, you can run a Databricks job only when new files actually arrive in your cloud storage, saving compute costs and resources. To use this feature, follow these steps:
- Add an external location for your ADLS Gen2 container,
- Make sure the storage credential you use (such as an Access Connector, service principal, or managed identity) has the Storage Blob Data Contributor role on that container,
- Make sure the account you use to run your workload has at least the READ FILES permission on the external location (the first sketch after this list shows this setup as SQL run from a notebook),
- Write a notebook that loads cloud files from the external location (see the second sketch below),
- Set a file arrival trigger for your workflow and specify the exact external location as the source (see the third sketch below).
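
Here is a rough sketch of the external location setup, run as SQL from a notebook cell. The location name, storage account, container, credential, and group below are placeholders, not values from this post, and the storage credential is assumed to already exist:

```python
# Sketch: create the external location and grant READ FILES to the principal
# that runs the workload. All names (landing_zone, mystorageaccount, landing,
# my_access_connector, data_engineers) are hypothetical placeholders.
spark.sql("""
    CREATE EXTERNAL LOCATION IF NOT EXISTS landing_zone
    URL 'abfss://landing@mystorageaccount.dfs.core.windows.net/'
    WITH (STORAGE CREDENTIAL my_access_connector)
""")

# The identity behind my_access_connector must hold the Storage Blob Data
# Contributor role on the 'landing' container in Azure.
spark.sql("GRANT READ FILES ON EXTERNAL LOCATION landing_zone TO `data_engineers`")
```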
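
The notebook itself can use Auto Loader (the cloudFiles source) to pick up new files from a path under that external location. Here is a minimal sketch; the source path, checkpoint and schema locations, and target table are hypothetical and need to be adjusted to your environment:

```python
# Minimal Auto Loader sketch for a Databricks notebook (spark is predefined there).
# Placeholder path under the external location defined above.
source_path = "abfss://landing@mystorageaccount.dfs.core.windows.net/incoming/"

df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    # Placeholder writable path (here a Unity Catalog volume) for schema tracking.
    .option("cloudFiles.schemaLocation", "/Volumes/main/default/checkpoints/ingest/_schema")
    .load(source_path)
)

(
    df.writeStream
    # Placeholder writable path for the streaming checkpoint.
    .option("checkpointLocation", "/Volumes/main/default/checkpoints/ingest")
    .trigger(availableNow=True)  # process everything that has arrived, then stop
    .toTable("main.default.landing_events")
)
```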
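
The file arrival trigger can be configured in the job's trigger settings in the Workflows UI or programmatically. Below is a sketch using the Databricks SDK for Python, assuming an existing job; the job ID and storage URL are placeholders, and the SDK class names reflect the current jobs service module:

```python
# Sketch using the Databricks SDK for Python (pip install databricks-sdk).
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()  # picks up credentials from the environment or CLI config

w.jobs.update(
    job_id=123456789,  # hypothetical ID of an existing job
    new_settings=jobs.JobSettings(
        trigger=jobs.TriggerSettings(
            pause_status=jobs.PauseStatus.UNPAUSED,
            file_arrival=jobs.FileArrivalTriggerConfiguration(
                # Path under the external location to watch for new files.
                url="abfss://landing@mystorageaccount.dfs.core.windows.net/incoming/",
                min_time_between_triggers_seconds=60,
            ),
        )
    ),
)
```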
With these steps, you can easily create and run event-driven workloads on Databricks.