Hi. I have a question, and I've not been able to find an answer. I'm sure there is one...I just haven't found it through searching and browsing the docs.
How much does it matter (if it's even that simple) whether the source files read by Auto Loader sit in a single folder or are organized into subfolders (e.g. YYYY/MM/DD)?
My environment is Azure Databricks with ADLS Gen2 (hierarchical namespace enabled). I have 4 "folders", each containing every file we've ever received from a particular POST API method (one folder per method). The landing process was never set up to create date-based subfolders, so each folder currently holds anywhere from under 1 million to over 5 million files, depending on the method.
I need to migrate this data, and the real question is this: is it worth the effort of copying the files into a date-based structure because it will make the Auto Loader side more efficient, or should I just dump it over as-is and carry on with life?
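To make the question concrete, here's a toy plain-Python sketch of the intuition I'm asking about. This is not Auto Loader itself, and all the file names and counts are made up; it just illustrates that with a flat folder, any listing-based discovery has to enumerate every file ever landed, while a date layout lets you list only a narrow slice (assuming you can point the stream at a narrower path or rely on lexically ordered incremental listing):

```python
import os
import tempfile

def make_files(root, layout, n_days=10, files_per_day=100):
    """Create dummy files in either a flat folder or YYYY/MM/DD subfolders.
    Names and counts are arbitrary, purely for illustration."""
    for d in range(n_days):
        for f in range(files_per_day):
            if layout == "flat":
                path = os.path.join(root, f"day{d:02d}_file{f:03d}.json")
            else:  # date-partitioned: YYYY/MM/DD-style subfolders
                sub = os.path.join(root, "2024", "01", f"{d:02d}")
                os.makedirs(sub, exist_ok=True)
                path = os.path.join(sub, f"file{f:03d}.json")
            open(path, "w").close()

with tempfile.TemporaryDirectory() as flat, tempfile.TemporaryDirectory() as nested:
    make_files(flat, "flat")
    make_files(nested, "nested")

    # Flat layout: finding "today's" files still means scanning everything.
    flat_scanned = len(os.listdir(flat))

    # Date layout: only the latest day's folder needs to be listed.
    latest = os.path.join(nested, "2024", "01", "09")
    nested_scanned = len(os.listdir(latest))

    print(flat_scanned, nested_scanned)  # prints: 1000 100
```

With 1-5 million files per folder the flat-listing cost obviously dwarfs this toy example, which is why I'm wondering whether the migration effort pays off, or whether something like file notification mode would make the layout question moot.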