Connect Delta Lake to OData API?
05-17-2021 01:23 PM
I'd like to expose Delta Lake data to external customers via OData v4 APIs. What's the best way to do that?
05-17-2021 01:28 PM
The Open Data Protocol (OData) is a data access protocol built on core protocols like HTTP and commonly accepted methodologies like REST. Think of OData as an HTTP/REST version of JDBC/ODBC. The spec is pretty thorough and complex.
This can be implemented by creating an intermediate service that handles OData requests and queries the Delta Lake. That service needs to expose a REST API for serving OData. For reading Delta, the service has two options:
- Either use the native Delta reader via a Databricks SQL Endpoint, in which case the request pipeline would look like: client -> HTTP -> JDBC/ODBC -> Spark/Databricks,
- Or use the Rust/Python/Ruby/Golang APIs from delta-rs: https://github.com/delta-io/delta-rs
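To make the intermediate service idea a bit more concrete, here is a minimal sketch of its translation layer, assuming only a small subset of OData query options (`$select`, `$orderby`, `$top`, `$skip`) and a SQL backend that accepts `LIMIT`/`OFFSET`. The `EXPOSED` allow-list, the `orders` table, its columns, and the `api.example.com` URL are all hypothetical; a real service would also need `$filter`, `$metadata`, paging links, and authentication.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical allow-list: entity set (table) -> columns exposed to API clients.
EXPOSED = {"orders": ["order_id", "customer", "amount"]}


def odata_to_sql(url: str) -> str:
    """Translate a small OData subset ($select, $orderby, $top, $skip) to SQL."""
    parts = urlsplit(url)
    table = parts.path.rstrip("/").split("/")[-1]
    if table not in EXPOSED:
        raise ValueError(f"unknown entity set: {table!r}")
    allowed = EXPOSED[table]
    # parse_qs also decodes %20 and + into spaces.
    opts = {k: v[0] for k, v in parse_qs(parts.query).items()}

    def check(col: str) -> str:
        # Only allow-listed names ever reach the SQL string.
        col = col.strip()
        if col not in allowed:
            raise ValueError(f"column not exposed: {col!r}")
        return col

    def non_negative(name: str) -> int:
        value = int(opts[name])
        if value < 0:
            raise ValueError(f"{name} must be non-negative")
        return value

    cols = [check(c) for c in opts.get("$select", ",".join(allowed)).split(",")]
    sql = f"SELECT {', '.join(cols)} FROM {table}"

    if "$orderby" in opts:
        terms = []
        for term in opts["$orderby"].split(","):
            name, _, direction = term.strip().partition(" ")
            direction = direction.strip().lower() or "asc"
            if direction not in ("asc", "desc"):
                raise ValueError(f"bad sort direction: {direction!r}")
            terms.append(f"{check(name)} {direction.upper()}")
        sql += " ORDER BY " + ", ".join(terms)

    if "$top" in opts:
        sql += f" LIMIT {non_negative('$top')}"
    if "$skip" in opts:
        sql += f" OFFSET {non_negative('$skip')}"
    return sql


def odata_payload(service_root: str, table: str, rows: list) -> dict:
    """Wrap result rows in the OData v4 JSON envelope for an entity collection."""
    return {"@odata.context": f"{service_root}/$metadata#{table}", "value": rows}


if __name__ == "__main__":
    print(odata_to_sql(
        "https://api.example.com/odata/orders"
        "?$select=order_id,amount&$orderby=amount%20desc&$top=10"
    ))
```

The generated SQL string would then be handed to whichever reader the service uses (a JDBC/ODBC connection to a Databricks SQL Endpoint, or a delta-rs binding), and the rows wrapped with `odata_payload` before being returned over HTTP. Checking every column and table against an allow-list is one way to keep client-supplied names out of the SQL text.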
11-16-2022 12:32 PM
@Joseph Bradley,
Is the best answer to this still to implement the OData intermediate service yourself? Or is there a better way now?
06-02-2023 02:38 AM
Is the best answer to this still to implement the OData intermediate service yourself? Or is there a better way now?
06-02-2023 07:55 AM
I believe it's still the best option. That said, it would be good to know what the OData API is needed for. When I added the original answer, Databricks SQL was nowhere near where it is today, and it's now easy to connect Databricks SQL directly to Power BI or other tools which might otherwise "need" OData APIs. For Power BI, see https://learn.microsoft.com/en-us/azure/databricks/partners/bi/power-bi

