We are in the process of implementing a data mesh in our organization. When helping the different teams produce raw data, the vast majority want to do this through their APIs. We tried an implementation that made web requests directly from Databricks using UDFs, but unfortunately this did not work very well.
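For reference, this is roughly the pattern we tried, one blocking HTTP call per row inside a pandas UDF (the endpoint URL and column names here are just made up for illustration, and `spark` is the session you get in a Databricks notebook):

```python
import requests
import pandas as pd
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import StringType

@pandas_udf(StringType())
def fetch_payload(ids: pd.Series) -> pd.Series:
    # One synchronous HTTP request per row -- this is the part that scaled poorly for us.
    def call(record_id: str) -> str:
        resp = requests.get(f"https://example.com/api/records/{record_id}", timeout=30)
        resp.raise_for_status()
        return resp.text
    return ids.apply(call)

df = spark.createDataFrame([("1",), ("2",)], ["record_id"])
result = df.withColumn("payload", fetch_payload("record_id"))
```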
Dealing with APIs must be pretty common, but I find it hard to find resources on the subject. We are considering recommending that teams set up a cloud function that dumps the API responses into a storage bucket of some sort, and then ingesting the files from there (see the sketch below). However, I feel like there should be a better or easier way.
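On the Databricks side, what I have in mind for picking up those dumped files is something like Auto Loader reading the landing path incrementally (paths and table names below are placeholders, not our actual setup):

```python
# Incrementally ingest the JSON files that the cloud function writes to the bucket.
raw = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/landing/_schemas/team_x")
    .load("/mnt/landing/team_x/")  # landing path the cloud function writes to
)

(
    raw.writeStream
    .option("checkpointLocation", "/mnt/landing/_checkpoints/team_x")
    .trigger(availableNow=True)  # process whatever is new, then stop
    .toTable("bronze.team_x_raw")
)
```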
We are looking for ways to minimize the work a team has to do to get their data onto our platform. So I wonder: what is the best way to handle API data ingestion into Databricks?