05-06-2022 04:19 AM
We currently run a streaming job (cloudFiles format) on a directory where sales transactions arrive every 5 minutes.
In this directory, the transaction files are organized as follows:
<streaming-checkpoint-root>/<transaction_date>/<transaction_hour>/transaction_x_y.json
Only TODAY's transactions are of interest; all older ones are already obsolete.
When I start the streaming job, it processes all the historical transactions, which I don't want.
Is it possible to process only NEW files that arrive after the streaming job has started?
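
One setting that looks relevant here is Auto Loader's `cloudFiles.includeExistingFiles` option, which controls whether files already present in the input path are picked up on the first run. Below is a minimal sketch of how it might be applied, not a confirmed solution. The helper names `autoloader_options` and `start_stream`, the `root` path, and the `schema` argument are hypothetical and only for illustration. As far as I understand, the option is only evaluated when a stream starts for the first time, so it would need a fresh checkpoint location to take effect.

```python
def autoloader_options() -> dict:
    """Reader options for Auto Loader (hypothetical helper)."""
    return {
        # The transaction files in the question are JSON.
        "cloudFiles.format": "json",
        # Skip files that already exist when the stream first starts.
        "cloudFiles.includeExistingFiles": "false",
    }


def start_stream(spark, root: str, schema):
    """Build the streaming DataFrame over the transaction directory.

    `spark`, `root` and `schema` are assumptions: an active SparkSession,
    the top-level directory holding the date/hour folders, and the schema
    of the transaction JSON.
    """
    return (
        spark.readStream.format("cloudFiles")
        .options(**autoloader_options())
        .schema(schema)
        .load(root)
    )
```

Pointing `load` at the top-level directory rather than a single date folder should let the same stream keep picking up new date and hour folders as they appear.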
Labels: CloudFiles, TODAY