I have a very straightforward setup between Azure Event Hubs and DLT, reading through the Kafka endpoint with Spark streaming.
There were network issues and the stream didn't pick up some events, but it still advanced (and committed) the offset for some reason.
As a result, DLT now picks up any new data coming into the event hub, but not the events that arrived before the network issue was resolved.
Is there a way to force-reset the offset of the Spark reader so that it always starts from earliest? At the moment, setting the desired starting offset does not work because there is already a committed offset that gets used instead, and I want to override that.
The alternative would be to create a new partition and move the events that were not picked up there, or to re-ingest the events that precede the committed offset, but that's really not elegant imo.
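
To make the setup concrete, here is a minimal sketch of the kind of reader configuration in question. It is not the actual pipeline code. The helper names (`eventhubs_kafka_options`, `build_reader`) and the placeholder namespace, event hub, and connection string are hypothetical. The `kafkashaded.` prefix on the login module assumes a Databricks runtime. Per the Spark Kafka source documentation, `startingOffsets` only applies when a query starts without prior progress. A query that resumes from a checkpoint continues from where it left off, which would match the behaviour described above.

```python
def eventhubs_kafka_options(namespace, eventhub, connection_string,
                            starting_offsets="earliest"):
    """Build Kafka source options for an Event Hubs Kafka endpoint.

    Hypothetical helper: all argument values are placeholders.
    """
    # Event Hubs authenticates Kafka clients with SASL PLAIN, using the
    # literal username "$ConnectionString" and the connection string as password.
    # Assumption: Databricks runtime, where the Kafka client classes are shaded.
    jaas = (
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
        'required username="$ConnectionString" '
        f'password="{connection_string}";'
    )
    return {
        "kafka.bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "subscribe": eventhub,
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
        # Only honoured when the query has no stored progress yet.
        "startingOffsets": starting_offsets,
    }


def build_reader(spark, options):
    """Create the streaming DataFrame from an existing SparkSession."""
    return spark.readStream.format("kafka").options(**options).load()


# Example with placeholder values (no Spark session needed to build the options).
opts = eventhubs_kafka_options("my-namespace", "my-eventhub", "Endpoint=sb://...")
```

With options like these, `startingOffsets` is already `earliest`, so the question is really how to make the reader ignore the progress it has already stored.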