I have an Azure Function that receives files (not volumes) and dumps them to cloud storage; roughly one to five files arrive per second. I want to create a partitioned table in Databricks on top of this data. How should I set this up? For example: register the container as an external location, then create a bundle that creates a table and continuously triggers on the arrival of new files, appending the data in Databricks? What would such code look like, or is there something else I should do instead? I need something that runs continuously. (Moving the logic from the Azure Function into Databricks is not an option.) Should the table be external or managed?
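To make the question concrete, this is roughly what I imagine the ingestion side would look like: an Auto Loader stream reading from the external location and writing to a partitioned Delta table. This is an untested sketch; the storage paths, the catalog/schema/table names, the `json` file format, and the `ingest_date` partition column are all placeholders, and `spark` is assumed to be the session predefined in a Databricks notebook/job.

```python
# Untested sketch: Auto Loader -> partitioned managed Delta table.
# All paths and names below are placeholders for my real ones.
from pyspark.sql import functions as F

source_path = "abfss://incoming@mystorageaccount.dfs.core.windows.net/files/"
checkpoint = "abfss://incoming@mystorageaccount.dfs.core.windows.net/_checkpoints/raw_files/"

stream = (
    spark.readStream.format("cloudFiles")             # Auto Loader
    .option("cloudFiles.format", "json")              # assuming JSON; adjust to the real format
    .option("cloudFiles.schemaLocation", checkpoint)  # where inferred schema is tracked
    .load(source_path)
    .withColumn("ingest_ts", F.current_timestamp())
    .withColumn("ingest_date", F.current_date())      # placeholder partition column
)

(
    stream.writeStream
    .option("checkpointLocation", checkpoint)
    .partitionBy("ingest_date")
    .trigger(processingTime="30 seconds")             # or availableNow=True for batch-style runs
    .toTable("mycatalog.myschema.raw_files")          # managed table in Unity Catalog
)
```

Is this the right general shape, or would you structure the ingestion differently?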
I also have a similar case with much less data, so partitioning is not required. Should a managed table, an external table, or a view be created there? What are the pros/cons of each in this case?
I would be very happy if someone could provide code, especially code that works as a continuous job in Databricks (deployed through asset bundles).
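For reference, this is roughly how I imagine the bundle job definition. Again an untested sketch with placeholder names; I'm assuming the `continuous` job setting is the right way to keep the stream running, and the cluster settings are just example values.

```yaml
# Sketch of a bundle job resource (e.g. resources/ingest_job.yml); names are placeholders.
resources:
  jobs:
    ingest_raw_files:
      name: ingest-raw-files
      continuous:
        pause_status: UNPAUSED        # keep the streaming task running continuously
      tasks:
        - task_key: autoloader_ingest
          notebook_task:
            notebook_path: ../src/ingest_raw_files.py  # the Auto Loader code above
          new_cluster:                # example compute; adjust to real needs
            spark_version: "15.4.x-scala2.12"
            node_type_id: "Standard_DS3_v2"
            num_workers: 1
```

Is a continuous job like this the idiomatic approach here, or would a triggered job (e.g. with `availableNow`) be a better fit at this file rate?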