
Expose low-latency APIs from Delta Lake for mobile apps and microservices

jcapplefields88
New Contributor II

My company is using Delta Lake to extract customer insights and run batch scoring with ML models. I need to expose this data to some microservices through gRPC and REST APIs. How can I do this? I'm thinking of building Spark pipelines to extract the data, storing it in a fast cache like Redis, and putting some APIs in front of it. Is there a faster solution?
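Roughly what I have in mind for the batch path (a minimal sketch; the table path, Redis host, and key layout are just placeholders):

```python
# Read scored results from a Delta table and push them into Redis so the
# gRPC/REST layer can serve point lookups. Connection and naming details
# are placeholders.
import json

import redis
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-to-redis").getOrCreate()

scores = spark.read.format("delta").load("/mnt/gold/customer_scores")

def write_partition(rows):
    # One Redis connection per partition avoids per-row connection overhead.
    client = redis.Redis(host="redis.internal", port=6379)
    pipe = client.pipeline()
    for row in rows:
        pipe.set(f"customer:{row['customer_id']}", json.dumps(row.asDict(), default=str))
    pipe.execute()

scores.foreachPartition(write_partition)
```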

1 REPLY

stefnhuy
New Contributor III

Hey everyone 😃

It's awesome that your company is using Delta Lake for extracting customer insights and running batch scoring with ML models. I can definitely relate to the excitement and the challenges of integrating that data with microservices and mobile apps.

Now, onto your question, jcapplefields88. Your plan of using Spark pipelines to extract the data, caching it in Redis, and putting APIs in front sounds solid. It's a common approach and works well for many use cases. However, if you're aiming for even lower latency and smoother integration, you might want to explore Apache Kafka or Apache Pulsar: streaming the scored records out as they land gives you near-real-time updates and can be more efficient than periodic batch refreshes of the cache.
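For example, a streaming refresh could look roughly like this (a minimal sketch, assuming the scores land in a Delta table and a Kafka topic feeds whatever updates your cache or services; paths, brokers, and the topic name are placeholders):

```python
# Stream new rows from a Delta table into a Kafka topic that downstream
# services or a cache-loader consume, instead of reloading everything on a
# schedule. Paths, brokers, and the topic name are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, struct, to_json

spark = SparkSession.builder.appName("delta-to-kafka").getOrCreate()

updates = (
    spark.readStream.format("delta")
    .load("/mnt/gold/customer_scores")
    .select(
        col("customer_id").cast("string").alias("key"),
        to_json(struct("*")).alias("value"),
    )
)

query = (
    updates.writeStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka.internal:9092")
    .option("topic", "customer-scores")
    .option("checkpointLocation", "/mnt/checkpoints/customer_scores_kafka")
    .start()
)
```

That way the API layer only ever reads from the cache (or the topic's consumers), and the stream keeps it fresh without full batch reloads.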

Also, if you're heavily invested in the Delta Lake ecosystem, you could have your services read the Delta tables directly instead of going through an intermediate cache. Delta Lake's ACID guarantees mean readers always see a consistent snapshot, which simplifies the architecture, though whether the latency is low enough for mobile traffic will depend on table size, file layout, and partitioning.
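As a rough illustration of that direct-read option (a minimal sketch using the open-source delta-rs Python bindings and FastAPI; the table path and column names are assumptions on my part, not something from your setup):

```python
# A REST endpoint that reads the Delta table directly via delta-rs, with no
# intermediate cache. The table path and column names are placeholders, and
# latency will depend on table size, file layout, and partitioning.
from deltalake import DeltaTable
from fastapi import FastAPI, HTTPException

app = FastAPI()
TABLE_PATH = "/mnt/gold/customer_scores"

@app.get("/scores/{customer_id}")
def get_score(customer_id: str):
    table = DeltaTable(TABLE_PATH)
    # Filter at scan time so only the matching rows come back.
    df = table.to_pandas(filters=[("customer_id", "=", customer_id)])
    if df.empty:
        raise HTTPException(status_code=404, detail="customer not found")
    # Convert numpy scalars to plain Python types for JSON encoding.
    record = df.iloc[0].to_dict()
    return {k: (v.item() if hasattr(v, "item") else v) for k, v in record.items()}
```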

From my own experience, I've worked on similar setups before. Integrating diverse data sources, especially with real-time requirements, can be a bit of a puzzle, but it's incredibly rewarding once you get it right.

Keep in mind that every solution has its trade-offs. Depending on the scale of your data and the speed you require, choose the approach that aligns best with your project's goals.
