Simba ODBC driver // .Net Core

sondergaard
New Contributor II

Hi,

I have been looking into the Simba Spark ODBC driver to see if it can simplify our integration with .NET Core. The first results were promising, but when I started to process larger queries I began to notice out-of-memory exceptions in the container. The code is simple: it runs a query against the sample data, in this case the table samples.tpch.lineitem, and selects all rows. If I use version 2.8.0 of the Simba driver, it works as expected. If I use 2.8.2 or 2.9.1, the container runs out of memory.

Reading the release notes for versions 2.8.2 and 2.9.1, I can't see any significant configuration changes that I should account for in the ODBC settings. Is there something that I'm missing?

I have a sample repository with the code and a Dockerfile. Any help would be appreciated.

GitHub repository: https://github.com/Sondergaard/simba-poc/tree/main

Rjdudley
Honored Contributor

Something we're considering for a similar purpose (a .NET Core service pulling data from Databricks) is the ADO.NET connector from CData ("Databricks Driver: ADO.NET Provider").

Hi @Rjdudley,

Thanks for bringing CData's solution to my attention.

I'm still curious about the reason for the different behaviour we are experiencing.