Hi there, while reading "Common data loading patterns > Enable flexible semi-structured data pipelines", I noticed this interesting code snippet:
```python
(spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    # schemaHints ensures that the headers column gets processed as a map
    .option("cloudFiles.schemaHints",
            "headers map<string,string>, statusCode SHORT")
    .load("/api/requests")
    .writeStream
    .option("mergeSchema", "true")
    .option("checkpointLocation", "<path-to-checkpoint>")
    .start("<path_to_target>"))
```
This may be a bit of a leap, but does anyone know whether Auto Loader supports pulling data directly from an API (as opposed to incrementally loading files from a designated "landing" path)? I may be reading too much into it, but the `schemaHints` and the `/api/requests` path look awfully close to literal API calls, and it would be an interesting use case if we could store both the raw JSON payload and the API status code in the same target table.
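To make the question concrete: my current understanding is that Auto Loader only watches file paths, so today I'd have to glue the API call and the landing step together myself. Something like this sketch is what I'm imagining — the endpoint, landing path, and helper names here are all hypothetical placeholders, not anything from the docs:

```python
# Hypothetical glue script (not Auto Loader itself): poll an API, then write
# each response to a landing path that a cloudFiles stream watches.
import json
import time
import urllib.request

LANDING_PATH = "/landing/api-requests"  # hypothetical cloud-storage mount


def build_record(headers: dict, status: int, body: str) -> dict:
    """Shape one HTTP response to match the schemaHints above:
    headers map<string,string>, statusCode SHORT, plus the raw JSON body."""
    return {
        "headers": {k: str(v) for k, v in headers.items()},
        "statusCode": status,
        "body": body,
    }


def land_response(url: str) -> str:
    """Call the API and drop the response as a single JSON file into the
    landing path, where the Auto Loader snippet above could pick it up."""
    with urllib.request.urlopen(url) as resp:
        record = build_record(
            dict(resp.headers), resp.status, resp.read().decode("utf-8")
        )
    out_file = f"{LANDING_PATH}/{int(time.time() * 1000)}.json"
    with open(out_file, "w") as f:
        json.dump(record, f)
    return out_file
```

If Auto Loader could do the fetch itself, this whole intermediate step would go away — hence the question.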