How to connect Databricks Database with Springboot application using JPA

satishnavik
New Contributor II

We are facing an issue integrating our Spring Boot JPA application with Databricks.

Below are the steps and settings we used for the integration.

When we start the Spring Boot application, we get the following warning:

HikariPool-1 - Driver does not support get/set network timeout for connections. 
([Databricks][JDBC](10220) Driver does not support this optional feature.) 

and the application starts.

When interacting with the DB for any CRUD operation, we get this exception: Caused by: java.sql.SQLFeatureNotSupportedException: [Databricks][JDBC](10220) Driver does not support this optional feature.

Driver used: com.databricks.client.jdbc.Driver

Url: jdbc:databricks://************************ t:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/778046939806498/0307-074105-********;AuthMech=3;UID=token;PWD=*****************

Dialect: org.hibernate.dialect.MySQLDialect
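
For readers reproducing this setup, the URL above can be assembled from its parts. The following is a minimal sketch only: `DatabricksJdbcUrl` and its `build` method are hypothetical names introduced for illustration, not part of the Databricks driver, and the property list simply mirrors the one shown in the post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of the Databricks driver) that assembles a JDBC URL. */
final class DatabricksJdbcUrl {

    private DatabricksJdbcUrl() {}

    /**
     * Builds a URL with the same shape as the one in the post:
     * jdbc:databricks://HOST:443/default;transportMode=http;ssl=1;httpPath=...;AuthMech=3;UID=token;PWD=...
     * The host, HTTP path, and token are supplied by the caller.
     */
    static String build(String host, String httpPath, String accessToken) {
        // LinkedHashMap keeps the properties in insertion order.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("transportMode", "http");
        props.put("ssl", "1");
        props.put("httpPath", httpPath);
        props.put("AuthMech", "3");
        props.put("UID", "token");
        props.put("PWD", accessToken);

        StringBuilder url = new StringBuilder("jdbc:databricks://");
        url.append(host).append(":443/default");
        for (Map.Entry<String, String> entry : props.entrySet()) {
            url.append(';').append(entry.getKey()).append('=').append(entry.getValue());
        }
        return url.toString();
    }
}
```

In a real application the token would come from a secret store or environment variable rather than source code.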

Below are other drivers we tried, as suggested in the databricks-jdbc-driver-install-and-configuration-guide:

com.databricks.client.jdbc.DataSource
com.databricks.client.jdbc42.Driver
com.databricks.client.jdbc42.DataSource

We would also like to highlight that we can open a single connection and perform CRUD operations with custom queries, but not with Spring Data JPA.

Kaniz
Community Manager

Hi @satishnavik, it seems you’re encountering issues while integrating your Spring Boot JPA application with Databricks.

Let’s address the warnings and exceptions you’re facing.

  1. Warning: Driver Does Not Support Network Timeout for Connections

    • The warning message you’re seeing, “HikariPool-1 - Driver does not support get/set network timeout for connections,” is related to the Databricks JDBC driver.
    • This warning indicates that the driver does not support setting network timeouts for connections.
    • While this warning doesn’t prevent your application from starting, it’s essential to understand its implications.
  2. Exception: Driver Does Not Support Optional Feature

    • When performing CRUD (Create, Read, Update, Delete) operations, you encounter the exception: “Caused by: java.sql.SQLFeatureNotSupportedException: [Databricks][JDBC](10220) Driver does not support this optional feature.”
    • This exception occurs because the Databricks JDBC driver lacks support for certain features that Spring Data JPA relies on.
  3. Possible Solutions:

    • AutoCommit Configuration:
    • Use the Correct JDBC Driver:
    • Check Dialect Configuration:
      • Ensure that your Hibernate dialect configuration (org.hibernate.dialect.MySQLDialect) is compatible with Databricks.
      • Verify that the dialect settings match the Databricks environment.
  4. Custom Queries vs. Spring Data JPA:

    • You mentioned that you can make a single connection and perform CRUD operations with custom queries but not with Spring Data JPA.
    • This suggests that the issue lies specifically with Spring Data JPA.
    • Double-check your Spring Data JPA configuration, including entity classes, repositories, and transaction management.

Remember to adjust your configuration based on the suggestions above, and ensure that your Spring Boot application can seamlessly interact with Databricks using Spring Data JPA. If you encounter any further issues, feel free to ask for additional assistance! 🌟

 

satishnavik
New Contributor II

Hi Kaniz,

I followed the steps mentioned above, but they didn't solve the problem. I can now retrieve data using JPA, but the transactional methods such as save and saveAll, which perform inserts and updates through JPA, are failing. However, I was able to run explicit INSERT and UPDATE queries. I am now getting the error below.

org.springframework.dao.InvalidDataAccessApiUsageException: No EntityManager with actual transaction available for current thread - cannot reliably process 'persist' call; nested exception is javax.persistence.TransactionRequiredException: No EntityManager with actual transaction available for current thread - cannot reliably process 'persist' call

I wasn't able to switch auto-commit to false:

spring.datasource.hikari.auto-commit=false

Setting it threw the error below. I tried it in the properties file as well as when making the connection.

java.sql.SQLFeatureNotSupportedException: [Databricks][JDBC](10220) Driver does not support this optional feature.

It would be highly appreciated if you could help me connect a Java Spring Boot JPA application to Delta Lake.

 

 

@satishnavik - The error message indicates that Spring JPA is unable to manage transactions with Delta Lake because the Delta Lake JDBC driver doesn't support them.

The Issue:

  • Spring Data JPA runs write operations such as save inside a transaction. By default this is a resource-local JDBC transaction, which requires turning auto-commit off on the connection; JTA comes into play only when it is explicitly configured.
  • The Delta Lake JDBC driver, while allowing basic CRUD operations, doesn't support disabling auto-commit or manual transaction control.
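
The failure mode described here can be reproduced without a cluster. The sketch below is a simulation under stated assumptions: `AutoCommitProbe` is a hypothetical class, and the proxy connection is a stand-in that mimics the behaviour reported in this thread (rejecting `setAutoCommit(false)`), not the actual Databricks driver. The probe method uses only standard `java.sql` calls, so it could in principle be pointed at a real connection as well.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;

/** Hypothetical probe illustrating why a transaction cannot be started. */
final class AutoCommitProbe {

    private AutoCommitProbe() {}

    /** Stand-in connection (NOT the Databricks driver) that rejects setAutoCommit(false). */
    static Connection autoCommitOnlyConnection() {
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "setAutoCommit": {
                    boolean enable = (Boolean) args[0];
                    if (!enable) {
                        // Mimics the error reported in the thread.
                        throw new SQLFeatureNotSupportedException(
                                "Driver does not support this optional feature.");
                    }
                    return null;
                }
                case "getAutoCommit":
                    return true;
                case "close":
                    return null;
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (Connection) Proxy.newProxyInstance(
                AutoCommitProbe.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                handler);
    }

    /** Returns true if auto-commit can be switched off, i.e. a manual transaction could begin. */
    static boolean supportsManualTransactions(Connection connection) {
        try {
            connection.setAutoCommit(false); // the call a resource-local transaction needs
            connection.setAutoCommit(true);  // restore the default
            return true;
        } catch (SQLFeatureNotSupportedException e) {
            return false;
        } catch (SQLException e) {
            throw new IllegalStateException("Unexpected JDBC failure", e);
        }
    }
}
```

Calling `AutoCommitProbe.supportsManualTransactions(AutoCommitProbe.autoCommitOnlyConnection())` returns false, which corresponds to the situation where save and saveAll fail while plain auto-committed statements still work.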

Possible Solutions:

Since JPA transactions won't work with Delta Lake, here are alternative approaches:

  1. Native Delta Lake API:

    • Use the Databricks Delta Lake API for Java or Scala. This API provides full control over data manipulation, including inserts, updates, and deletes. However, it requires writing code specifically for Delta Lake operations, bypassing Spring JPA.
  2. Separate Layers:

    • Develop a separate data access layer using the Delta Lake API. This layer can handle data manipulation in Delta Lake. Spring JPA can then interact with this data access layer for retrieving data but wouldn't be used for insert/update operations.
  3. Spring Data JPA with another Datasource:

    • Consider using Spring Data JPA with a traditional relational database like MySQL or PostgreSQL for managing entities and transactions. This database can then act as a staging area for data before being bulk loaded into Delta Lake using separate Delta Lake specific operations.

Choosing the Right Approach:

The best approach depends on your specific needs. Here's a breakdown:

  • Native Delta Lake API: If you primarily focus on Delta Lake operations and don't require full JPA functionality, this might be a good choice. It offers flexibility but requires separate data access code.
  • Separate Layers: This approach separates data manipulation logic from data retrieval using JPA. It keeps a clear separation of concerns but adds complexity with managing two data access layers.
  • Spring Data JPA with another Datasource: This is suitable if you need full JPA functionality with transactions for some data and want to leverage Delta Lake for bulk data storage and analytics. It requires managing data flow between the relational database and Delta Lake.

SpringBoot
New Contributor II

Thanks @SanjayTS for the response.

Unfortunately, none of the approaches seems very promising, given the dependencies and effort required to make these changes.
In the current scenario, we are looking for ready-to-use drivers or options that minimize the new effort.

We have already tried multiple versions of the Databricks JDBC driver, and we also tried setting the auto-commit flag to false, but the Databricks driver does not support this feature at present.

A separate data layer is also not an option, as it doesn't support update/delete operations and adds overhead.

Spring Data JPA with another Datasource: This approach is not approved by the architecture team, as it results in additional cost and duplication of data.

Native Delta Lake API:
We are exploring integration through the Native Delta Lake API approach, but it requires a significant amount of code changes at the repository layer.
Using the JdbcTemplate approach, we are able to create a test connection and interact with Databricks Delta Lake tables by rewriting the existing JPA code as native queries. However, there are issues related to Spring Boot security and sessions that cause connection problems as well as significant performance degradation.
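
As a rough illustration of the native-query route, here is a minimal sketch using only standard `java.sql` calls. `NativeInsert` and its methods are hypothetical names, not code from this thread, and the sketch assumes the driver accepts parameterized statements on a connection left in auto-commit mode. It does not address the security, session, or performance issues mentioned above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical repository-style helper that issues native INSERT statements. */
final class NativeInsert {

    private NativeInsert() {}

    /**
     * Builds "INSERT INTO table (c1, c2) VALUES (?, ?)".
     * Table and column names are concatenated as-is, so they must be trusted identifiers.
     */
    static String insertSql(String table, List<String> columns) {
        StringJoiner names = new StringJoiner(", ", "(", ")");
        StringJoiner marks = new StringJoiner(", ", "(", ")");
        for (String column : columns) {
            names.add(column);
            marks.add("?");
        }
        return "INSERT INTO " + table + " " + names + " VALUES " + marks;
    }

    /**
     * Runs a single insert. The connection is left in auto-commit mode,
     * so no begin/commit calls are made and each statement commits on its own.
     */
    static int insert(Connection connection, String table,
                      List<String> columns, List<Object> values) throws SQLException {
        if (columns.size() != values.size()) {
            throw new IllegalArgumentException("columns and values must have the same size");
        }
        try (PreparedStatement statement = connection.prepareStatement(insertSql(table, columns))) {
            for (int i = 0; i < values.size(); i++) {
                statement.setObject(i + 1, values.get(i)); // JDBC parameters are 1-based
            }
            return statement.executeUpdate();
        }
    }
}
```

Because nothing here spans more than one statement, a multi-row save has no rollback if a later insert fails; that is the trade-off of giving up JPA transactions.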

172036
New Contributor II

Was there any resolution to this? Is Spring datasource supported now?