Databricks Community

User16826994223 · ‎06-25-2021

I have a gold-layer Delta table that holds the final aggregated data from the silver layer. I want to access this data through a web interface.

I think I need to write a web script that runs Spark SQL behind the scenes to get the data. I can then write the result set to a store such as MongoDB and show it in the web UI.

Is there a known best practice solution?

User16826994223 · ‎06-25-2021

The real answer depends on your requirements regarding latency, the amount of data, where the data is located (HDFS/S3/...), etc. Possible approaches are:

Read the data directly using the Delta Standalone Reader library for the JVM, or via the delta-rs library, which works with Rust/Python/Ruby.
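For the delta-rs route from Python, a minimal sketch might look like the following. It assumes the `deltalake` package (the Python binding for delta-rs) is installed and that the web process has read access to the table's storage. The function names and the table location in the usage note are hypothetical placeholders, not anything from the original setup.

```python
import json


def load_gold_rows(table_uri, columns=None, storage_options=None):
    """Read a Delta table without a Spark cluster and return a list of dicts."""
    # Imported lazily so the rest of the module works without the package.
    from deltalake import DeltaTable

    table = DeltaTable(table_uri, storage_options=storage_options)
    # Only the requested columns are read from the underlying Parquet files.
    return table.to_pyarrow_table(columns=columns).to_pylist()


def rows_to_json(rows):
    """Serialize rows for a web response; non-JSON types fall back to str."""
    return json.dumps(rows, default=str)
```

A web handler could then call something like `rows_to_json(load_gold_rows("/mnt/gold/sales_summary", columns=["region", "total"]))`, where the path and column names are made up for illustration. Whether this is fast enough depends on the table size and on the latency you need.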

Altay · ‎03-17-2023

Is there any update on this topic?

Thank you

stefanhieslas11 · ‎08-15-2023

Hey everyone 🙂

I totally get the frustration of dealing with these complex data layers, but don't worry, you're in the right place for some guidance! Accessing that final delta table through a web interface can indeed be a bit tricky, but it's not an unsolvable puzzle.

Your plan to use a web script running Spark SQL sounds pretty solid. This way, you can leverage the power of Spark to handle the heavy lifting and get the aggregated data. Storing the results in a database like MongoDB and then displaying it in the web UI is a sensible approach too, as it can help with faster retrieval and a smoother user experience.

However, if you're looking for a best practice solution, you might want to consider a microservices architecture. This could involve building a dedicated service that handles the interaction between your Spark cluster and the web UI. By decoupling these components, you could achieve better scalability and maintainability.
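To make the idea of a dedicated service a bit more concrete, here is a rough sketch using only the Python standard library. Everything in it is an assumption for illustration: the `/gold` route, the `fetch_gold_rows` placeholder (which returns made-up rows where a real service would query the gold table), and the port.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def fetch_gold_rows():
    """Placeholder for the real query against the gold table (made-up data)."""
    return [{"region": "EU", "total": 1200}, {"region": "US", "total": 3400}]


def build_response(path, fetch=fetch_gold_rows):
    """Map a request path to (status code, JSON body); kept pure for testing."""
    if path == "/gold":
        return 200, json.dumps(fetch())
    return 404, json.dumps({"error": "not found"})


class GoldHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = build_response(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Start the service; the web UI would call GET /gold."""
    HTTPServer(("127.0.0.1", port), GoldHandler).serve_forever()
```

The web UI only ever talks to this service, so the way the data is fetched (Spark SQL, a staging database, or a direct Delta read) can change without touching the front end.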

User16826994223, about your question, kudos for your initiative! If you're going the Spark SQL route, remember to optimize your queries for performance, and perhaps consider caching if applicable. Also, explore visualization libraries to make your web interface more user-friendly.
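On the caching point, one simple option is a time-based cache in front of the query, so the web UI does not trigger a new Spark SQL run on every page load. This is only a sketch: the `TTLCache` class and the 300-second default are hypothetical, and it assumes slightly stale aggregates are acceptable.

```python
import time


class TTLCache:
    """Cache one query result for ttl_seconds before re-running the query."""

    def __init__(self, loader, ttl_seconds=300, clock=time.monotonic):
        self._loader = loader        # function that runs the actual query
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the cache can be tested
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        # Reload on first use or once the cached value has expired.
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value
```

You would wrap the query function once, for example `cache = TTLCache(run_gold_query)` with `run_gold_query` being whatever function fetches the data, and call `cache.get()` from the request handler.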

shadowinc · ‎11-02-2024

@stefanhieslas11 Thanks for your input. However, are there other approaches that avoid staging databases such as Cosmos DB or MongoDB as the input to web applications and instead fetch directly from Delta tables while maintaining read performance? Much appreciated.