Simba ODBC driver // .Net Core
04-29-2025 07:00 AM
Hi,
I have been looking into the Simba Spark ODBC driver to see if it can simplify our integration with .NET Core. The first results were promising, but when I started to process larger queries I noticed out-of-memory exceptions in the container. The code is simple: it runs a query against the sample data, in this case the table samples.tpch.lineitem, and selects all rows. With version 2.8.0 of the Simba driver it works as expected. With 2.8.2 or 2.9.1 the container runs out of memory.
Reading the release notes for versions 2.8.2 and 2.9.1, I can't see any significant configuration changes that I should account for in the ODBC settings. Is there something that I'm missing?
I have a sample repository with the code and a Dockerfile. Any help would be appreciated.
GitHub repository: https://github.com/Sondergaard/simba-poc/tree/main
04-29-2025 10:23 AM
Something we're considering for a similar purpose (.NET Core service pulling data from Databricks) is the ADO.NET connector from CData: Databricks Driver: ADO.NET Provider | Create & integrate .NET apps
04-29-2025 10:58 AM
Hi @Rjdudley,
thanks for bringing my attention to CData's solution.
I'm still curious about the reason for the different behaviour we are experiencing between driver versions.