File trigger for workflows
06-11-2024 12:03 PM
Does anyone know what makes a file count as "new" in workflows? Is it based on a checksum, or something else?
06-11-2024 03:47 PM
Hi @tuckera, how are you?
When you say "file in workflows", do you mean files that are added to a Structured Streaming source?
If you use Auto Loader (also known as cloudFiles) for ingestion, there are two ways it can identify new files: directory listing mode, which can take advantage of lexical ordering, and file notification mode.
For the first one, if you add FILE_1.parquet to a bucket and later add FILE_2.parquet, Auto Loader identifies FILE_2.parquet as a new file using lexical ordering.
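To make the lexical ordering idea concrete, here is a toy sketch in plain Python. It is not Auto Loader's actual implementation; the helper name `new_files_by_lexical_order` is hypothetical and only illustrates the assumption that a new file's name sorts after the names already seen.

```python
from typing import Iterable, List, Optional


def new_files_by_lexical_order(listing: Iterable[str], last_seen: Optional[str]) -> List[str]:
    """Toy model: treat every name that sorts strictly after `last_seen` as new."""
    return sorted(name for name in listing if last_seen is None or name > last_seen)


# Plain string comparison: "FILE_10" sorts BEFORE "FILE_2".
unpadded = ["FILE_1.parquet", "FILE_2.parquet", "FILE_10.parquet"]
missed = new_files_by_lexical_order(unpadded, "FILE_2.parquet")

# Zero-padded names keep upload order and lexical order aligned.
padded = ["FILE_01.parquet", "FILE_02.parquet", "FILE_10.parquet"]
found = new_files_by_lexical_order(padded, "FILE_02.parquet")
```

In this toy model the unpadded FILE_10.parquet is not picked up after FILE_2.parquet, which is why zero-padded or timestamp-prefixed names are the safer choice if you rely on ordering.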
If you would like a more reliable way of ingesting, enabling file notification mode will help you achieve it.
Please see the documentation links below for more details.
https://docs.databricks.com/en/ingestion/auto-loader/directory-listing-mode.html
https://docs.databricks.com/en/ingestion/auto-loader/file-notification-mode.html
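As a rough sketch of how the two modes differ in code, the snippet below builds the reader options for an Auto Loader stream. The helper names, the parquet format, and any paths you pass in are placeholders of mine; `cloudFiles.format`, `cloudFiles.schemaLocation`, and `cloudFiles.useNotifications` are the documented option names, but please check the linked pages for what your cloud and runtime require.

```python
from typing import Dict


def auto_loader_options(file_format: str, schema_location: str,
                        use_notifications: bool = False) -> Dict[str, str]:
    """Build reader options; directory listing is the default mode."""
    options = {
        "cloudFiles.format": file_format,
        "cloudFiles.schemaLocation": schema_location,
    }
    if use_notifications:
        # Switch from directory listing to file notification mode.
        options["cloudFiles.useNotifications"] = "true"
    return options


def read_new_files(spark, source_path: str, options: Dict[str, str]):
    """Start an Auto Loader stream; `spark` is an existing SparkSession on Databricks."""
    return spark.readStream.format("cloudFiles").options(**options).load(source_path)


listing_opts = auto_loader_options("parquet", "/tmp/example/_schema")
notify_opts = auto_loader_options("parquet", "/tmp/example/_schema", use_notifications=True)
```

In a notebook you would then call something like `read_new_files(spark, "<your-source-path>", notify_opts)` and attach a `writeStream` with a checkpoint location.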
Best,
Alessandro
06-11-2024 03:50 PM
Or are you referring to file arrival triggers?
If so, that is a Workflows trigger that runs a job when a new file arrives in an external location.
https://docs.databricks.com/en/workflows/jobs/file-arrival-triggers.html
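For reference, this is roughly what the trigger block of a job definition looks like, written as a small Python helper. The field names follow my understanding of the Jobs API 2.1 `trigger.file_arrival` settings; the helper name, the storage URL, and the 60-second values are placeholders, so treat this as a sketch and confirm against the page above.

```python
import json
from typing import Any, Dict


def file_arrival_trigger(url: str,
                         min_time_between_triggers_seconds: int = 60,
                         wait_after_last_change_seconds: int = 60) -> Dict[str, Any]:
    """Build the `trigger` section of a job settings payload (sketch, not verified against your workspace)."""
    return {
        "trigger": {
            "pause_status": "UNPAUSED",
            "file_arrival": {
                # Monitored storage location (placeholder URL supplied by the caller).
                "url": url,
                "min_time_between_triggers_seconds": min_time_between_triggers_seconds,
                "wait_after_last_change_seconds": wait_after_last_change_seconds,
            },
        }
    }


payload = file_arrival_trigger("s3://example-bucket/landing/")
payload_json = json.dumps(payload, indent=2)
```

You would merge this section into the job settings you send when creating or updating the job, or set the same values in the job's trigger UI.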