Maven Package install failing on DBR 11.3 LTS

ABVectr
New Contributor III

Hi Databricks Community,

I ran into the following issue when setting up a new cluster with the latest LTS Databricks runtime (11.3). When trying to install the package with the coordinates com.microsoft.azure.kusto:kusto-spark_3.0_2.12:3.1.4 from Maven, the install fails and I get the following error:

Library installation attempted on the driver node of cluster 1013-133741-jzb53a8t and failed. Please refer to the following error message to fix the library or contact Databricks support. Error Code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error Message: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File file:/local_disk0/tmp/clusterWideResolutionDir/maven/ivy/jars/io.netty_netty-transport-native-kqueue-4.1.59.Final.jar does not exist

This issue occurs for more recent versions of the package as well, e.g., com.microsoft.azure.kusto:kusto-spark_3.0_2.12:3.1.10.
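For reference, the failing install expressed as a call to the Databricks Libraries API would look roughly like the sketch below (the workspace URL and token are placeholders; the cluster ID is the one from the error message):

```python
# Minimal sketch of the failing install expressed against the Databricks
# Libraries API. Workspace URL and token are placeholders; the cluster ID is
# the one from the error message above.
import requests

DATABRICKS_HOST = "https://<your-workspace>.azuredatabricks.net"  # placeholder
TOKEN = "<personal-access-token>"                                  # placeholder
CLUSTER_ID = "1013-133741-jzb53a8t"

payload = {
    "cluster_id": CLUSTER_ID,
    "libraries": [
        {
            "maven": {
                "coordinates": "com.microsoft.azure.kusto:kusto-spark_3.0_2.12:3.1.4"
            }
        }
    ],
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/libraries/install",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)
resp.raise_for_status()
```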

What have I already tried?

  • When using an older runtime (10.4 LTS), this issue does not occur and the package installs successfully.
  • When installing the package from a JAR file taken from the project's releases page (https://github.com/Azure/azure-kusto-spark/releases), the install succeeds regardless of the package version or runtime used, which makes me think the issue is not with the package itself (see the sketch after this list).
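To be explicit about the JAR-based workaround: it amounts to uploading the release JAR to DBFS and installing it as a cluster library. A rough sketch via the same Libraries API (the DBFS path and file name below are illustrative):

```python
# Rough sketch of the JAR-based install: assumes the JAR from the releases
# page has already been uploaded to DBFS (path and file name are illustrative).
import requests

DATABRICKS_HOST = "https://<your-workspace>.azuredatabricks.net"  # placeholder
TOKEN = "<personal-access-token>"                                  # placeholder
CLUSTER_ID = "1013-133741-jzb53a8t"

payload = {
    "cluster_id": CLUSTER_ID,
    "libraries": [
        {"jar": "dbfs:/FileStore/jars/kusto-spark_3.0_2.12-3.1.4.jar"}
    ],
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/libraries/install",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)
resp.raise_for_status()
```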

Any help in resolving this issue is greatly appreciated.

Kind regards,

Andrei


6 REPLIES

-werners-
Esteemed Contributor III

can you check if the dependencies are installed?

https://github.com/Azure/azure-kusto-spark#dependencies

ABVectr
New Contributor III

I hadn't installed these dependencies manually before because I assumed they would be resolved automatically. I tried installing them manually on the new cluster, and two of the three installs failed with an error similar to the one in the original post:

Library installation attempted on the driver node of cluster 1013-133741-jzb53a8t and failed. Please refer to the following error message to fix the library or contact Databricks support. Error Code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error Message: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File file:/local_disk0/tmp/clusterWideResolutionDir/maven/ivy/jars/io.netty_netty-transport-native-epoll-4.1.78.Final.jar does not exist

Seems like some dependency files are getting lost during the install?
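One thing I have not tried yet, so this is purely an untested idea: the Maven library spec in the Libraries API also accepts an exclusions list, so the native netty transport artifacts that keep going missing could in principle be excluded from resolution. A sketch (whether the connector still works without them is unverified):

```python
# Untested idea: exclude the native netty transport artifacts (the files that
# go missing during resolution) via the Maven "exclusions" field of the
# Libraries API. Whether the connector works without them is unverified.
import requests

DATABRICKS_HOST = "https://<your-workspace>.azuredatabricks.net"  # placeholder
TOKEN = "<personal-access-token>"                                  # placeholder
CLUSTER_ID = "1013-133741-jzb53a8t"

payload = {
    "cluster_id": CLUSTER_ID,
    "libraries": [
        {
            "maven": {
                "coordinates": "com.microsoft.azure.kusto:kusto-spark_3.0_2.12:3.1.4",
                "exclusions": [
                    "io.netty:netty-transport-native-kqueue",
                    "io.netty:netty-transport-native-epoll",
                ],
            }
        }
    ],
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/libraries/install",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)
resp.raise_for_status()
```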

-werners-
Esteemed Contributor III

bummer.

I guess MS did not take recent DBR versions into account.

Debayan
Esteemed Contributor III

Hi, I think the issue is compatibility with the latest DBR version. But the error shows something different; is the path correct?

Anonymous
Not applicable

Hi @Andrei Bondarenko,

Hope all is well! Just wanted to check in to see whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? If not, please let us know if you need more help.

We'd love to hear from you.

Thanks!

ABVectr
New Contributor III

Hi @Vidula Khanna,

No, we were not able to resolve this issue.

As our goal was to connect to Azure Data Explorer, we worked around it by using the PyPI package azure-kusto-data instead.
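Very roughly, the workaround looks like the sketch below, assuming the azure-kusto-data package is installed on the cluster (e.g. %pip install azure-kusto-data) and AAD application key authentication is used; the cluster URL, credentials, database, and query are placeholders:

```python
# Rough sketch of the workaround: query Azure Data Explorer directly with the
# azure-kusto-data PyPI package instead of the Spark connector.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder
from azure.kusto.data.helpers import dataframe_from_result_table

cluster_url = "https://<adx-cluster>.<region>.kusto.windows.net"  # placeholder
kcsb = KustoConnectionStringBuilder.with_aad_application_key_authentication(
    cluster_url,
    "<aad-app-id>",      # placeholder
    "<aad-app-secret>",  # placeholder
    "<aad-tenant-id>",   # placeholder
)

client = KustoClient(kcsb)
response = client.execute("<database>", "MyTable | take 10")  # placeholder query

# Convert the primary result set to pandas, and to Spark if needed.
pdf = dataframe_from_result_table(response.primary_results[0])
sdf = spark.createDataFrame(pdf)  # `spark` is the session available in Databricks notebooks
```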

Kind regards,

Andrei
