Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How can a standalone Spark JAR run from IntelliJ IDEA use a library installed on the Databricks cluster (DBR)?

Anonymous
Not applicable

Hello,

I have tried, without success, to use several libraries that we installed on the Databricks 9.1 cluster (not provided by default in DBR) from a standalone Spark application run from IntelliJ IDEA.

For instance, connecting to Redshift works only through the embedded PostgreSQL driver.

Trying to use the dedicated Redshift driver or the SAP HANA driver fails.
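
To illustrate, here is roughly what I am running (a sketch only; hosts, credentials and table names are placeholders for my actual values):

import org.apache.spark.sql.SparkSession

object JdbcReadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().getOrCreate()

    // Works: reading Redshift through the PostgreSQL driver bundled with DBR.
    spark.read.format("jdbc")
      .option("url", "jdbc:postgresql://my-redshift-host:5439/dev")
      .option("driver", "org.postgresql.Driver")
      .option("dbtable", "public.my_table")
      .option("user", "my_user")
      .option("password", "my_password")
      .load()
      .show()

    // Fails from IntelliJ: the dedicated driver class cannot be loaded.
    spark.read.format("jdbc")
      .option("url", "jdbc:redshift://my-redshift-host:5439/dev")
      .option("driver", "com.amazon.redshift.jdbc42.Driver")
      .option("dbtable", "public.my_table")
      .option("user", "my_user")
      .option("password", "my_password")
      .load()
      .show()
  }
}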

The following libraries have been installed as JARs on the Databricks cluster:

  • SAP HANA: ngdbc_2_10_15.jar
  • Redshift: redshift_jdbc42_1_2_16_1027-330d8.jar

When I run the application in debug or normal mode from IntelliJ IDEA, I get a ClassNotFoundException:

Exception in thread "main" java.lang.ClassNotFoundException: com.sap.db.jdbc.Driver

The SAP driver is declared in the pom.xml dependencies, and I tried adding it to the IntelliJ module dependencies as well:

<dependency>
    <groupId>com.sap.cloud.db.jdbc</groupId>
    <artifactId>ngdbc</artifactId>
    <version>2.10.15</version>
    <type>jar</type>
</dependency>

The Databricks Connect JARs are declared correctly in the IntelliJ dependencies: basic Spark commands run fine from the standalone IntelliJ app against the cluster.
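
As a sanity check (just a sketch, not a fix), I verify whether the SAP driver class is visible on the local classpath used by the IntelliJ run, since the exception above is thrown in the local "main" thread before anything reaches the cluster:

import org.apache.spark.sql.SparkSession

object DriverClasspathCheck {
  def main(args: Array[String]): Unit = {
    // With Databricks Connect, the plain builder attaches to the remote cluster
    // configured via `databricks-connect configure`.
    val spark = SparkSession.builder().getOrCreate()

    // Basic commands against the cluster work fine, as noted above.
    spark.range(5).show()

    // Check whether the SAP HANA driver class is on the local (IDE) classpath;
    // this is where the ClassNotFoundException originates.
    Class.forName("com.sap.db.jdbc.Driver")
    println("SAP HANA driver found on the local classpath")
  }
}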

Technical environment:

  • DBR 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12)
  • IntelliJ IDEA 2021.2.2 (on Windows 10)
  • AWS Redshift driver and SAP HANA driver

Thank you for your support,

1 ACCEPTED SOLUTION

Anonymous
Not applicable

Following up on my previous message: I am now able to run the job properly with the Redshift driver as a Databricks job. Simply attaching the Redshift library JAR to the job makes it work.

However, that is not very convenient for debugging.

When I try to run the app from IntelliJ against the remote Databricks cluster, I get: Exception in thread "main" java.lang.ClassNotFoundException: Could not load an Amazon Redshift JDBC driver; see the README for instructions on downloading and configuring the official Amazon driver.

I tried:

  • adding the driver JAR to the Databricks Connect JAR path: no success
  • sc.addJar: no success
  • sc.addJar with the DBFS path of the driver: no success (attempts sketched after this list)
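
Roughly what those attempts look like in code (paths are placeholders; note that SparkContext exposes addJar):

val spark = org.apache.spark.sql.SparkSession.builder().getOrCreate()
val sc = spark.sparkContext

// Attempt: ship the driver JAR from the local machine to the cluster.
sc.addJar("C:/drivers/redshift_jdbc42_1_2_16_1027-330d8.jar")

// Attempt: reference the JAR already uploaded to DBFS.
sc.addJar("dbfs:/FileStore/jars/redshift_jdbc42_1_2_16_1027-330d8.jar")

// Neither attempt made the ClassNotFoundException above go away.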

Is there a way to run a Scala app in debug mode against a Databricks cluster, or not?

Thank you


5 REPLIES

Anonymous
Not applicable

Dear Kaniz,

Would you please provide an answer? It is quite urgent, as we need to rewrite our Databricks notebooks as standalone applications for better maintainability.

Many thanks for your feedback.


Anonymous
Not applicable

Dear Fatma,

Sorry, but no, that was not the question.

My customer (a big German company) will raise this with the Databricks support team directly. Please close this.

Anonymous
Not applicable

@Xavier VAN AUSLOOS - Would you be able to post the answer and mark it as best, so the community will be able to find the solution more easily if they run into this?

Anonymous
Not applicable

Unfortunately, I did not find a solution. We have to package the JAR and run it as a Databricks job for testing and debugging. That is not efficient, but no solution for remote debugging has been found or provided.
