Data Engineering
How to use a library installed on the Databricks DBR cluster from a standalone Spark JAR running in IntelliJ IDEA?

Anonymous
Not applicable

Hello,

I have tried, without success, to use several libraries we installed on the Databricks 9.1 cluster (not provided by default in the DBR) from a standalone Spark application run from IntelliJ IDEA.

For instance, connecting to Redshift works only with the embedded PostgreSQL driver. Attempts to use the Redshift driver or the SAP HANA driver were unsuccessful.
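For context, the working path (the embedded PostgreSQL driver) looks roughly like the sketch below. This is a minimal illustration, not code from the post: the host, database, table name, and environment variables are all placeholders, and it assumes a SparkSession obtained via Databricks Connect.

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: endpoint, table, and credentials are hypothetical placeholders.
object RedshiftViaPostgres {
  def main(args: Array[String]): Unit = {
    // With Databricks Connect configured, getOrCreate() attaches to the remote cluster.
    val spark = SparkSession.builder().getOrCreate()

    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://example-cluster.redshift.amazonaws.com:5439/dev")
      .option("driver", "org.postgresql.Driver")  // the embedded PostgreSQL driver that works
      .option("dbtable", "public.some_table")     // hypothetical table
      .option("user", sys.env("REDSHIFT_USER"))
      .option("password", sys.env("REDSHIFT_PASSWORD"))
      .load()

    df.show(5)
  }
}
```

Swapping `driver` to `com.amazon.redshift.jdbc42.Driver` or `com.sap.db.jdbc.Driver` is where the ClassNotFoundException described below appears.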

The following libraries have been installed as JARs on the Databricks cluster:

  • SAP HANA: ngdbc-2.10.15.jar
  • Redshift: redshift-jdbc42-1.2.16.1027-330d8.jar

When I run the app in Debug or normal mode from IntelliJ IDEA, I get a ClassNotFoundException:

Exception in thread "main" java.lang.ClassNotFoundException: com.sap.db.jdbc.Driver

The SAP driver is present in the pom.xml dependencies, and I tried adding it to the IntelliJ module dependencies as well:

<dependency>
  <groupId>com.sap.cloud.db.jdbc</groupId>
  <artifactId>ngdbc</artifactId>
  <version>2.10.15</version>
  <type>jar</type>
</dependency>

The Databricks Connect JARs are declared correctly in the IntelliJ dependencies: basic Spark commands run fine from the standalone IntelliJ app against the cluster.
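The kind of "basic Spark command" that does work in this setup can be sketched as follows; this is an illustrative smoke test, not code from the post, and it assumes Databricks Connect is configured on the developer machine.

```scala
import org.apache.spark.sql.SparkSession

// Sanity check that Databricks Connect reaches the remote cluster.
object ConnectSmokeTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().getOrCreate()
    // A trivial job executed on the cluster; prints 100 if the session attaches.
    println(spark.range(100).count())
  }
}
```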

Technical environment:

  • DBR 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12)
  • Intellij Idea 2021.2.2 (from Windows 10)
  • AWS Redshift driver or Sap Hana driver

Thank you for your support.

ACCEPTED SOLUTION

Anonymous
Not applicable

In addition to my previous message: I am now able to run the job properly with the Redshift driver as a Databricks job. Just add the Redshift library JAR to the job and it works.

However, this is not very convenient for debugging.

When I tried to run the app from IntelliJ against the remote Databricks cluster, it failed with: Exception in thread "main" java.lang.ClassNotFoundException: Could not load an Amazon Redshift JDBC driver; see the README for instructions on downloading and configuring the official Amazon driver.

Things I tried:

  • adding the driver JAR to the Databricks Connect JAR path: no success
  • using sc.addJar: no success
  • using sc.addJar with the DBFS path of the driver: no success
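The attempts above would look roughly like this sketch (the paths are hypothetical placeholders, not from the post). One plausible reason they fail: SparkContext.addJar ships the JAR to the executors but does not put the class on the driver-side classpath, so a driver-side Class.forName still throws ClassNotFoundException.

```scala
import org.apache.spark.sql.SparkSession

// Sketch of the addJar attempts; file names and paths are placeholders.
object AddJarAttempts {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().getOrCreate()
    val sc = spark.sparkContext

    // Local path to the driver JAR on the developer machine:
    sc.addJar("C:/drivers/redshift-jdbc42-1.2.16.1027.jar")

    // DBFS path to the same driver already uploaded to the cluster:
    sc.addJar("dbfs:/FileStore/jars/redshift-jdbc42-1.2.16.1027.jar")

    // Either way, the JAR is distributed to executors only;
    // driver-side class loading is not affected.
  }
}
```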

Is there a way to run a Scala app in debug mode against a Databricks cluster, or not?

Thank you


7 REPLIES

Kaniz
Community Manager
Community Manager

Hi @vanausloos! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer first; otherwise, I will get back to you soon. Thanks.

Anonymous
Not applicable

Dear Kaniz,

Would you please provide an answer? It is pretty urgent, as we need to rewrite Databricks notebooks as standalone applications for better maintainability.

Many thanks for your feedback.


Anonymous
Not applicable

Dear Fatma,

Sorry, but no. That was not the question.

My customer (a big German company) will raise this with the Databricks support team directly. Please close this thread.

Anonymous
Not applicable

@Xavier VAN AUSLOOS - Would you be able to post the answer and mark it as best, so the community can find the solution more easily if they run into this?

Anonymous
Not applicable

Unfortunately, I did not find a solution. We have to package the JAR and run it as a Databricks job for testing and debugging. Not efficient, but no solution for remote debugging has been found or provided.
