Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to connect a Databricks database with a Spring Boot application using JPA

satishnavik
New Contributor II

We are facing an issue integrating our Spring Boot JPA application with Databricks.

Below are the steps and settings we used for the integration.

When we start the Spring Boot application, we get the following warning:

HikariPool-1 - Driver does not support get/set network timeout for connections. 
([Databricks][JDBC](10220) Driver does not support this optional feature.) 

and the application starts anyway.

When interacting with the DB for any CRUD operation, we get the exception: Caused by: java.sql.SQLFeatureNotSupportedException: [Databricks][JDBC](10220) Driver does not support this optional feature.

Driver used: com.databricks.client.jdbc.Driver

Url: jdbc:databricks://************************ t:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/778046939806498/0307-074105-********;AuthMech=3;UID=token;PWD=*****************

Dialect: org.hibernate.dialect.MySQLDialect

Below are the other drivers we tried as well, as suggested in the databricks-jdbc-driver-install-and-configuration-guide:

com.databricks.client.jdbc.DataSource
com.databricks.client.jdbc42.Driver
com.databricks.client.jdbc42.DataSource

We would also like to highlight that we are able to make a single connection and perform CRUD operations with custom queries, but not with Spring Data JPA.
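For reference, the connection URL from the question follows a fixed shape, which can be assembled from its parts. This is a minimal sketch; the host, HTTP path, and token arguments are placeholders, not real workspace values:

```java
public class DatabricksJdbcUrl {
    // Assemble a Databricks JDBC URL in the shape shown above.
    // AuthMech=3 means username/password auth, with UID fixed to "token"
    // and the personal access token passed as PWD.
    public static String of(String host, String httpPath, String token) {
        return "jdbc:databricks://" + host + ":443/default"
            + ";transportMode=http;ssl=1"
            + ";httpPath=" + httpPath
            + ";AuthMech=3;UID=token;PWD=" + token;
    }
}
```

In practice this string would be set as `spring.datasource.url` (or on a `HikariConfig` directly), alongside `spring.datasource.driver-class-name=com.databricks.client.jdbc.Driver`.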

5 REPLIES

Hi Kaniz,

I have followed the steps mentioned above, but it didn't solve the problem. Now I can retrieve data using JPA, but the transactional methods like save and saveAll are failing; these are what JPA uses for insert and update queries. However, I was able to run explicit insert and update queries. I am now getting the error below.

org.springframework.dao.InvalidDataAccessApiUsageException: No EntityManager with actual transaction available for current thread - cannot reliably process 'persist' call; nested exception is javax.persistence.TransactionRequiredException: No EntityManager with actual transaction available for current thread - cannot reliably process 'persist' call

I wasn't able to switch auto-commit to false:

spring.datasource.hikari.auto-commit=false

It threw the error below when I tried to switch auto-commit to false. I tried it in the properties file as well as when making the connection.

java.sql.SQLFeatureNotSupportedException: [Databricks][JDBC](10220) Driver does not support this optional feature.

It would be highly appreciated if you could help me connect Java Spring Boot JPA with Delta Lake.
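One way to keep startup from depending on this optional feature is to treat setAutoCommit as best-effort. The sketch below is an illustration, not part of the thread: the helper swallows SQLFeatureNotSupportedException (the same condition behind the "(10220)" warning), and the stub connection simulates the driver's behavior so the pattern can be demonstrated without a real Databricks connection:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;

public class AutoCommitGuard {
    // Try to disable auto-commit; return true on success, false when the
    // driver rejects the optional feature instead of propagating the error.
    public static boolean tryDisableAutoCommit(Connection conn) {
        try {
            conn.setAutoCommit(false);
            return true;
        } catch (SQLFeatureNotSupportedException e) {
            // The condition reported as "[Databricks][JDBC](10220) Driver
            // does not support this optional feature." -- degrade gracefully.
            return false;
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    // Stand-in Connection that mimics a driver rejecting setAutoCommit,
    // for demonstration only.
    public static Connection unsupportedConnection() {
        InvocationHandler h = (proxy, method, args) -> {
            if (method.getName().equals("setAutoCommit")) {
                throw new SQLFeatureNotSupportedException(
                    "[Databricks][JDBC](10220) Driver does not support this optional feature.");
            }
            return null;
        };
        return (Connection) Proxy.newProxyInstance(
            Connection.class.getClassLoader(),
            new Class<?>[]{Connection.class}, h);
    }
}
```

Note this only avoids the startup failure; it does not give you real transactions, so @Transactional persist/save calls will still fail as described above.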

@satishnavik - The error message indicates that Spring JPA is unable to manage transactions with Delta Lake because the Delta Lake JDBC driver doesn't support them.

The Issue:

  • Spring JPA relies on JTA (Java Transaction API) for managing transactions across different data sources.
  • Delta Lake JDBC driver, while allowing basic CRUD operations, doesn't support features like auto-commit or manual transaction control through JTA.

Possible Solutions:

Since JPA transactions won't work with Delta Lake, here are alternative approaches:

  1. Native Delta Lake API:

    • Use the Databricks Delta Lake API for Java or Scala. This API provides full control over data manipulation, including inserts, updates, and deletes. However, it requires writing code specifically for Delta Lake operations, bypassing Spring JPA.
  2. Separate Layers:

    • Develop a separate data access layer using the Delta Lake API. This layer can handle data manipulation in Delta Lake. Spring JPA can then interact with this data access layer for retrieving data but wouldn't be used for insert/update operations.
  3. Spring Data JPA with another Datasource:

    • Consider using Spring Data JPA with a traditional relational database like MySQL or PostgreSQL for managing entities and transactions. This database can then act as a staging area for data before being bulk loaded into Delta Lake using separate Delta Lake specific operations.

Choosing the Right Approach:

The best approach depends on your specific needs. Here's a breakdown:

  • Native Delta Lake API: If you primarily focus on Delta Lake operations and don't require full JPA functionality, this might be a good choice. It offers flexibility but requires separate data access code.
  • Separate Layers: This approach separates data manipulation logic from data retrieval using JPA. It keeps a clear separation of concerns but adds complexity with managing two data access layers.
  • Spring Data JPA with another Datasource: This is suitable if you need full JPA functionality with transactions for some data and want to leverage Delta Lake for bulk data storage and analytics. It requires managing data flow between the relational database and Delta Lake.
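As a sketch of the third option, one common Spring Boot layout keeps two datasource configurations side by side: a relational database for JPA entities and transactions, and a separate JPA-free connection to Databricks for bulk reads/writes (e.g. via JdbcTemplate). The property prefixes below are hypothetical (they would be bound to DataSource beans via @ConfigurationProperties), and every URL component is a placeholder:

```properties
# Hypothetical two-datasource layout: JPA entities/transactions on MySQL,
# Databricks reached separately for bulk operations and analytics.
app.datasource.mysql.url=jdbc:mysql://localhost:3306/staging
app.datasource.mysql.username=app
app.datasource.mysql.password=<secret>

app.datasource.databricks.url=jdbc:databricks://<host>:443/default;transportMode=http;ssl=1;httpPath=<http-path>;AuthMech=3;UID=token;PWD=<token>
app.datasource.databricks.driver-class-name=com.databricks.client.jdbc.Driver
```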

@satishnavik I am also facing the same issue with my Spring app. Can I ask what versions of the Databricks JDBC driver and Spring Boot you are using?

My app starts, but I get a series of errors stemming from log4j:

ERROR StatusLogger Unable to create Lookup for bundle
 java.lang.ClassCastException: class org.apache.logging.log4j.core.lookup.ResourceBundleLookup
	at java.base/java.lang.Class.asSubclass(Class.java:3924)
	at com.databricks.client.jdbc42.internal.apache.logging.log4j.core.lookup.Interpolator.<init>(Interpolator.java:84)
	at com.databricks.client.jdbc42.internal.apache.logging.log4j.core.lookup.Interpolator.<init>(Interpolator.java:105)[...]
ERROR StatusLogger Unable to create Lookup for ctx
 java.lang.ClassCastException: class org.apache.logging.log4j.core.lookup.ContextMapLookup
	at java.base/java.lang.Class.asSubclass(Class.java:3924)
	at com.databricks.client.jdbc42.internal.apache.logging.log4j.core.lookup.Interpolator.<init>(Interpolator.java:84)
	at com.databricks.client.jdbc42.internal.apache.logging.log4j.core.lookup.Interpolator.<init>(Interpolator.java:105)


SpringBoot
New Contributor II

Thanks @SanjayTS for the response.

Unfortunately, none of the approaches seems very promising given the dependencies and effort required to make these changes.
In the current scenario, we are looking for ready-to-use drivers or options that minimize the new effort.

We have already tried multiple versions of the Databricks JDBC driver, and we also tried setting the auto-commit flag to false, but the Databricks driver does not support this feature at present.

A separate data layer is also not an option, as it doesn't support update/delete operations, apart from the additional overhead.

Spring Data JPA with another datasource: this approach was not approved by our architecture team, as it results in additional cost and duplication of data.

Native Delta Lake API:
We are exploring integration with the Native Delta Lake API, but this approach requires a significant amount of code change at the repository layer.
Using the JdbcTemplate approach, we are able to create a test connection and interact with Databricks Delta Lake tables by rewriting the existing JPA code as native queries. However, there are issues related to Spring Boot security and sessions that result in connection problems, as well as significant performance degradation.
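The "native queries" rewrite described above amounts to replacing derived Spring Data methods with explicit SQL executed over JDBC. A minimal sketch of what one such method becomes (the table and column names here are invented for illustration, not from the thread):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class CustomerQueries {
    // Native SQL replacing a derived Spring Data method such as
    // findByCountry(String country); schema is hypothetical.
    static final String FIND_BY_COUNTRY =
        "SELECT id, name FROM customers WHERE country = ?";

    public static List<String> findNamesByCountry(Connection conn, String country)
            throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(FIND_BY_COUNTRY)) {
            ps.setString(1, country);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString("name"));
                }
            }
        }
        return names;
    }
}
```

With Spring's JdbcTemplate the same query would be a one-liner (`jdbcTemplate.queryForList(FIND_BY_COUNTRY, String.class, country)`), but either way no JPA transaction support is required, which is why this path works against the Databricks driver.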

172036
New Contributor II

Was there any resolution to this?  Is Spring datasource supported now?
