Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Simba ODBC driver // .Net Core

sondergaard
New Contributor II

Hi,

I have been looking into the Simba Spark ODBC driver to see if it can simplify our integration with .NET Core. The first results were promising, but when I started to run larger queries I noticed out-of-memory exceptions in the container. The code is simple: it runs a query against the sample data, in this case the table samples.tpch.lineitem, and selects all rows. With version 2.8.0 of the Simba driver it works as expected. With 2.8.2 or 2.9.1 the container runs out of memory.

In the release notes for versions 2.8.2 and 2.9.1 I can't see any significant configuration changes that I should account for in the ODBC settings. Is there something I'm missing?

I have a sample repository with the code and a Dockerfile. Any help would be appreciated.

GitHub repository: https://github.com/Sondergaard/simba-poc/tree/main
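
For readers without the repository at hand, the access pattern described above (select every row of a large table and read through the result) can be sketched roughly as follows. This is a Python sketch against the generic DB-API interface, not the .NET code from the repository. The function name `count_rows_streaming`, the batch size, and the in-memory SQLite stand-in table are all assumptions made for illustration. With a real ODBC setup, the connection would instead come from an ODBC binding such as pyodbc pointed at the Simba driver.

```python
import sqlite3


def count_rows_streaming(conn, query, batch_size=10_000):
    """Run `query` and count its rows, fetching at most `batch_size` rows per call.

    `conn` is any DB-API 2.0 connection. The client code never keeps more than
    one batch, so memory growth beyond that points at the driver or its
    configuration rather than at this loop.
    """
    cursor = conn.cursor()
    try:
        cursor.execute(query)
        total = 0
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:  # an empty list means the result set is exhausted
                break
            total += len(batch)
        return total
    finally:
        cursor.close()


if __name__ == "__main__":
    # Stand-in for the real connection (assumption): a tiny in-memory table
    # named after the sample table, so the sketch runs without Databricks.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lineitem (l_orderkey INTEGER, l_quantity REAL)")
    conn.executemany(
        "INSERT INTO lineitem VALUES (?, ?)",
        [(i, i * 1.5) for i in range(25)],
    )
    print(count_rows_streaming(conn, "SELECT * FROM lineitem", batch_size=10))
```

Running the same loop against each driver version while watching the container's memory would show whether the growth happens even when the application discards every batch.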

2 REPLIES

Rjdudley
Honored Contributor

Something we're considering for a similar purpose (a .NET Core service pulling data from Databricks) is the ADO.NET connector from CData: "Databricks Driver: ADO.NET Provider | Create & integrate .NET apps".

Hi @Rjdudley,

Thanks for bringing CData's solution to my attention.

I'm still curious about the reason for the different behaviour we are seeing between driver versions.
