Dear,
I am working on a real-time use case and am therefore using Auto Loader in file notification mode to ingest JSON files from an Azure Data Lake Storage Gen2 account as they arrive. Full refreshes of my table work fine, but I noticed Auto Loader was not picking up new files landing in the storage account. I checked the Queue Storage and it stays empty. However, when I manually add a file, a message is added to the queue and the file is processed as expected.
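For context, my setup looks roughly like the sketch below. The source path, schema location, and the function name `build_stream` are placeholders for illustration, not my real values, and the Azure service principal options needed for file notification are left out. The option names `cloudFiles.format`, `cloudFiles.useNotifications`, and `cloudFiles.schemaLocation` are the standard Auto Loader ones.

```python
# Minimal sketch of an Auto Loader stream in file notification mode.
# All paths below are hypothetical placeholders, not the real ones.
SOURCE_PATH = "abfss://<container>@<storage-account>.dfs.core.windows.net/<folder>"
SCHEMA_PATH = "<schema-location>"

AUTOLOADER_OPTIONS = {
    "cloudFiles.format": "json",             # incoming files are JSON
    "cloudFiles.useNotifications": "true",   # file notification instead of directory listing
    "cloudFiles.schemaLocation": SCHEMA_PATH,
}


def build_stream(spark, source_path=SOURCE_PATH, options=AUTOLOADER_OPTIONS):
    """Return a streaming DataFrame that reads source_path with Auto Loader.

    `spark` is expected to be an active SparkSession on Databricks.
    """
    return (
        spark.readStream
        .format("cloudFiles")
        .options(**options)
        .load(source_path)
    )
```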
After some digging, I found out that the external system writing the files to the storage account was writing them as a stream: when I inspect the properties of the files written by the external system, I see "application/octet-stream" as CONTENT-TYPE, whereas when I manually add a file I see "application/json". The event subscription created by Databricks does not match this event type by default.
I tried adding it to the advanced filters of the event subscription (with key-value pair data.api: CreateFile). This does generate messages in the queue. However, the Microsoft.Storage.BlobCreated event is triggered when the CopyBlob operation is initiated and no... and the Create File API call first creates the file and only afterwards adds content to it. As a result, the contentLength parameter of the corresponding message in the queue is set to 0 and Auto Loader considers the file to be empty, even though it is not.
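To make the symptom concrete, here is a small sketch of how such a queue message can be inspected. It assumes the message body has already been decoded into a JSON string following the Event Grid event schema (eventType, data.api, data.contentLength). The helper name `describe_event` and all values in the sample message are made up for illustration and are not copied from my queue.

```python
import json


def describe_event(message_body: str) -> str:
    """Summarise a decoded Event Grid message for a blob event."""
    event = json.loads(message_body)
    data = event.get("data", {})
    api = data.get("api")
    length = data.get("contentLength", 0)

    if event.get("eventType") != "Microsoft.Storage.BlobCreated":
        return "ignored: not a BlobCreated event"
    if length == 0:
        # This is the case described above: the event fires before any content is written.
        return f"{api}: contentLength is 0, file looks empty"
    return f"{api}: {length} bytes"


# Illustrative message only; the URL and values are hypothetical.
sample_message = json.dumps({
    "eventType": "Microsoft.Storage.BlobCreated",
    "data": {
        "api": "CreateFile",
        "contentType": "application/octet-stream",
        "contentLength": 0,
        "url": "https://<storage-account>.dfs.core.windows.net/<container>/<file>.json",
    },
})

print(describe_event(sample_message))
```

A message like this passes the data.api: CreateFile filter but carries no information about the final size of the file.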
Is there a solution or workaround, or is this a limitation of file notification mode? Thanks in advance!