05-28-2024 08:09 AM
We tried upgrading to JDK 17.
Using Spark version 3.5.0 and runtime 14.3 LTS.
Getting this exception when using `parallelStream()`.
With Java 17 I am not able to process different partitions in parallel. When there is more than one partition to process, Java 11 (which allows parallel processing) takes ~75 minutes, while Java 17 takes ~150 minutes. The exception I get in Java 17 when I use `list.parallelStream().forEach()` is:
SecurityException: java.lang.SecurityException: setContextClassLoader
Caused by: SecurityException: setContextClassLoader
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
	at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:562)
	at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:591)
	at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:689)
	at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:173)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
	at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:765)
05-28-2024 01:53 PM - edited 05-28-2024 01:55 PM
Hello @prith,
The Databricks Runtime (DBR) is bundled with Java already. For DBR 14.3 the system environment is:
Operating System: Ubuntu 22.04.3 LTS
Java: Zulu 8.74.0.17-CA-linux64
Scala: 2.12.15
Python: 3.10.12
R: 4.3.1
Delta Lake: 3.1.0
Changing the DBR Java version is not supported.
05-28-2024 02:26 PM
Well we use spark_version: "14.3.x-scala2.12" along with
spark_env_vars:
JNAME: "zulu17-ca-arm64"
This is a pure Java, JAR-based workflow; the entire codebase and JAR are Java, nothing to do with Scala or Python.
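For later readers: the `JNAME` setting above is passed as a cluster environment variable. An illustrative (not complete) fragment of a cluster definition carrying it might look like the following; other required fields such as node type and size are omitted, and the `JNAME` value must match the node's CPU architecture (e.g. `zulu17-ca-amd64` on x86 nodes, `zulu17-ca-arm64` on ARM):

```json
{
  "spark_version": "14.3.x-scala2.12",
  "spark_env_vars": {
    "JNAME": "zulu17-ca-arm64"
  }
}
```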
05-28-2024 02:30 PM
@prith, Databricks DBR includes a specific Java version, and altering it is not a scenario supported by Databricks. I hope this information is helpful.
05-28-2024 03:08 PM
I'm sorry, I don't understand. We are not trying to alter the JDK version. Is JDK 17 supported or not?
05-28-2024 03:11 PM
Hello @prith ,
Currently, Java 17 is not available on Databricks Runtimes (DBR). If JDK 17 becomes available on Databricks in the future, it will be included in the DBR. You can refer to the Databricks Runtime release notes versions and compatibility.
05-28-2024 04:14 PM
Anyway, thanks for your response. We found a workaround for this error, and JDK 17 is actually working; it appears faster than JDK 8.
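For later readers: the original poster did not share their workaround, but a commonly reported way around this exception is to avoid the common `ForkJoinPool`. When a security manager is installed, the common pool's workers are `InnocuousForkJoinWorkerThread`s, which throw `SecurityException` on `setContextClassLoader`; submitting the parallel stream to an explicitly created `ForkJoinPool` uses plain worker threads that allow it. A hedged sketch (not the poster's actual fix), with a hypothetical `process()` standing in for the real per-partition work:

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ForkJoinPool;

public class ParallelPartitions {

    // Hypothetical per-partition work; stands in for whatever the real job does.
    // In the failing setup, code running here (or in libraries it calls)
    // invokes Thread.currentThread().setContextClassLoader(...), which the
    // common pool's InnocuousForkJoinWorkerThread rejects.
    static String process(String partition) {
        return partition.toUpperCase();
    }

    // Run the parallel stream inside a dedicated ForkJoinPool instead of the
    // shared common pool. A parallel stream executed from inside a ForkJoinPool
    // task runs its subtasks in that pool, whose workers are ordinary
    // ForkJoinWorkerThreads and permit setContextClassLoader.
    public static Queue<String> processAll(List<String> partitions) throws Exception {
        ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors());
        Queue<String> results = new ConcurrentLinkedQueue<>();
        try {
            pool.submit(() -> partitions.parallelStream()
                                        .map(ParallelPartitions::process)
                                        .forEach(results::add))
                .get(); // block until the whole stream finishes
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```

The parallelism of the dedicated pool can also be tuned independently of `java.util.concurrent.ForkJoinPool.common.parallelism`, which is a side benefit when partition counts are known.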
07-22-2024 06:28 PM
Hi @prith,
I'm also trying to use Java 17 or 11 in Databricks clusters. Are you using the environment variable `JNAME=zulu17-ca-amd64` as mentioned in https://docs.databricks.com/en/dev-tools/sdk-java.html#create-a-cluster-that-uses-jdk-17? Could you share your experience and workaround? Much appreciated!