Hi @RodrigoE,
A bit more information would help me recommend the best option for your scenario:
- Who owns the REST API?
- Is it under your control?
- Can the source push data to Databricks, or should you pull on a schedule?
If the source can push the data, consider Zerobus. This is the cleanest, most scalable Databricks-native pattern if the producer is under your control.
If you have no control over the source, you can build a custom Python data source wrapping their REST API and run it as a Databricks job/stream. While the pattern will work for your volumes, the bottleneck is usually the API’s own throughput/limits, not Databricks.
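To make the pull option a bit more concrete, here is a minimal sketch of the fetch loop such a data source would wrap. It is not tied to your API: the page shape (`items`, `next_cursor`), the `RateLimited` exception, and the `fetch_page` callable are all assumptions for illustration, and you would replace them with whatever the real API returns.

```python
import time
from typing import Callable, Iterator, Optional


class RateLimited(Exception):
    """Hypothetical signal: raise this from fetch_page when the API returns HTTP 429."""


def pull_all_pages(
    fetch_page: Callable[[Optional[str]], dict],
    max_retries: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield every record from a cursor-paginated API, backing off on rate limits.

    Assumes each page looks like {"items": [...], "next_cursor": <str or None>}.
    """
    cursor: Optional[str] = None
    while True:
        attempt = 0
        while True:
            try:
                page = fetch_page(cursor)
                break
            except RateLimited:
                attempt += 1
                if attempt > max_retries:
                    raise  # give up and let the job fail visibly
                # Exponential backoff: base, 2x base, 4x base, ...
                sleep(base_delay_s * 2 ** (attempt - 1))
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            return  # no more pages
```

In a real job, `fetch_page` would make the HTTP call, and the yielded records could be handed to `spark.createDataFrame` or returned from the reader of a PySpark Python Data Source (`pyspark.sql.datasource`). The backoff is there because the API's limits, not the cluster, usually set the pace.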
If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.
Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***