How to use/access in a python notebook a scala library installed from JAR file?

blackcoffeeAR
Contributor

I'm using the Azure Event Hubs Connector https://github.com/Azure/azure-event-hubs-spark to connect to an Event Hub.

When I install this library from Maven, everything works and I can access the library classes through the JVM:

connection_string = "<connection_string>"
sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connection_string)

However, for certain reasons I have to install the same library from a previously downloaded JAR file. The file was downloaded from https://search.maven.org/artifact/com.microsoft.azure/azure-eventhubs-spark_2.12/2.3.22/jar. With that installation I cannot access the library classes:

ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/clientserver.py", line 516, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty
 
During handling of the above exception, another exception occurred:
 
Traceback (most recent call last):
  File "/databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1038, in send_command
    response = connection.send_command(command)
  File "/databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/clientserver.py", line 539, in send_command
    raise Py4JNetworkError(
py4j.protocol.Py4JNetworkError: Error while sending or receiving
Py4JError: org.apache.spark.eventhubs.EventHubsUtils.encrypt does not exist in the JVM
---------------------------------------------------------------------------
Py4JError                                 Traceback (most recent call last)
<command-110542307469722> in <cell line: 17>()
     15 
     16 connectionString = ""
---> 17 sc._gateway.jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)
 
/databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py in __getattr__(self, name)
   1545                     answer, self._gateway_client, self._fqn, name)
   1546         else:
-> 1547             raise Py4JError(
   1548                 "{0}.{1} does not exist in the JVM".format(self._fqn, name))
   1549 
 
Py4JError: org.apache.spark.eventhubs.EventHubsUtils.encrypt does not exist in the JVM

I tried importing the package with java_import, but it did not help:

from py4j.java_gateway import java_import
java_import(sc._gateway.jvm,"org.apache.spark.eventhubs")

Thanks for any hints.


17 REPLIES

Kaniz
Community Manager

Hi @blackcoffee AR, all Event Hubs settings go in your Event Hubs configuration dictionary. The configuration dictionary must contain an Event Hubs connection string:

connectionString = "YOUR.CONNECTION.STRING"  
 
ehConf = {}  
 
# For versions before 2.3.15, set the connection string without encryption  
 
ehConf['eventhubs.connectionString'] = connectionString  
 
# For the 2.3.15 version and above, the configuration dictionary requires that the connection string be encrypted.  
 
ehConf['eventhubs.connectionString'] = sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)  
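
The version rule in the comments above can be captured in a small helper. This is only a sketch: build_eh_conf and its parameters are hypothetical names, not part of the connector, and the encrypt argument is assumed to be a callable that wraps EventHubsUtils.encrypt reached through sc._jvm. The version string is assumed to be a plain dotted number.

```python
def build_eh_conf(connection_string, connector_version, encrypt):
    """Return an Event Hubs configuration dict for the given connector version.

    build_eh_conf is a hypothetical helper, not part of the connector.
    connector_version is assumed to be a plain dotted number such as "2.3.22".
    """
    version = tuple(int(part) for part in connector_version.split("."))
    if version >= (2, 3, 15):
        # 2.3.15 and above: the connection string must be encrypted.
        value = encrypt(connection_string)
    else:
        # Before 2.3.15: the connection string is passed as-is.
        value = connection_string
    return {"eventhubs.connectionString": value}


# In a notebook where `sc` is defined, this could be called as:
# ehConf = build_eh_conf(
#     connectionString,
#     "2.3.22",
#     lambda s: sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(s),
# )
```

Passing the encryption step in as a callable keeps the helper independent of a running Spark session, so the version logic can be checked on its own.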

For more details, refer to the Structured Streaming + Event Hubs Integration Guide for PySpark.

I hope this will help.

I know how to create the Event Hubs configuration dictionary. The configuration is not the problem. Here is a larger code snippet:

connectionString = "<example>" # the value does not matter
ehConf = {
    "eventhubs.connectionString" : sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)
}

The main problem is here:

Py4JError: org.apache.spark.eventhubs.EventHubsUtils.encrypt does not exist in the JVM
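
One way to narrow this error down is to check what Py4J returns for the class name itself, before calling encrypt. Py4J hands back a JavaPackage placeholder for a dotted name it cannot resolve to a class, and a JavaClass when the class is found. The sketch below uses hypothetical helper names (resolve_jvm_name, is_jvm_class_visible) and compares type names instead of importing Py4J, so treat it as an illustration, not a definitive diagnostic.

```python
from functools import reduce


def resolve_jvm_name(jvm, fqn):
    """Walk a dotted name through a Py4J JVM view, one attribute at a time."""
    return reduce(getattr, fqn.split("."), jvm)


def is_jvm_class_visible(jvm, fqn):
    """Return True if Py4J resolved fqn to a class rather than a package.

    Hypothetical helper: it relies on Py4J returning a JavaPackage object
    for names it cannot resolve to a class, and a JavaClass otherwise.
    """
    return type(resolve_jvm_name(jvm, fqn)).__name__ == "JavaClass"


# In a notebook where `sc` is defined, this could be called as:
# is_jvm_class_visible(sc._jvm, "org.apache.spark.eventhubs.EventHubsUtils")
```

If this returns False, the JAR is probably not on the driver's classpath. If it returns True while encrypt still fails, the class was found and the cause may lie elsewhere, for example in dependencies that a Maven install resolves automatically but a single uploaded JAR does not bring along. That is an assumption to verify, not a confirmed diagnosis.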

Anonymous
Not applicable

Hi @blackcoffee AR​ 

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 
