How to circumvent Py4JSecurityException for spark-nlp : Constructor public com.johnsnowlabs.nlp.***(java.lang.String) is not whitelisted.
10-12-2022 03:21 PM
I'm running into the following error on our company's cluster:
py4j.security.Py4JSecurityException: Constructor public com.johnsnowlabs.nlp.DocumentAssembler(java.lang.String) is not whitelisted.
It occurs for the following code (which is just tutorial code from the spark-nlp page):
# Imports the tutorial snippet relies on; `spark` is the notebook's SparkSession.
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler, Finisher
from sparknlp.annotator import SentenceDetector, Tokenizer, Stemmer, Normalizer

df = spark.createDataFrame([("Yeah, I get that. is the",)], ["comment"])
# Wrap the raw text column as a spark-nlp document annotation.
document_assembler = DocumentAssembler() \
    .setInputCol("comment") \
    .setOutputCol("document")
sentence_detector = SentenceDetector() \
    .setInputCols(["document"]) \
    .setOutputCol("sentence") \
    .setUseAbbreviations(True)
tokenizer = Tokenizer() \
    .setInputCols(["sentence"]) \
    .setOutputCol("token")
stemmer = Stemmer() \
    .setInputCols(["token"]) \
    .setOutputCol("stem")
normalizer = Normalizer() \
    .setInputCols(["stem"]) \
    .setOutputCol("normalized")
# Convert the annotations back into a plain array column.
finisher = Finisher() \
    .setInputCols(["normalized"]) \
    .setOutputCols(["ntokens"]) \
    .setOutputAsArray(True) \
    .setCleanAnnotations(True)
nlp_pipeline = Pipeline(stages=[
    document_assembler, sentence_detector, tokenizer,
    stemmer, normalizer, finisher,
])
nlp_model = nlp_pipeline.fit(df)
processed = nlp_model.transform(df).persist()
processed.count()
processed.show()
When I tried adding this to the Spark config:
spark.databricks.pyspark.enablePy4JSecurity false
it was rejected with:
spark.databricks.pyspark.enablePy4JSecurity is not allowed when choosing access mode
I would appreciate any help. It seems others at my company have run into the same issue with other packages.
Thank you
- Labels: Public, Spark config
10-18-2022 03:56 AM
That error is common on high-concurrency / shared access mode clusters, where Py4J security restricts which JVM classes Python code may call. Please test it on a single-user / standard cluster instead.
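If the same failure shows up with other packages, it may help to pull the blocked constructor or method out of the error text first, so you know which library is affected before moving the job to another cluster. Below is a minimal sketch, assuming the message format quoted in the question; other runtime versions may word it differently. `blocked_jvm_member` is a hypothetical helper written for this thread, not part of py4j, spark-nlp, or Databricks.

```python
import re

# Pattern based on the message quoted in this thread (assumption: other
# Databricks runtime versions may phrase the error differently).
_NOT_WHITELISTED = re.compile(
    r"Py4JSecurityException: (Constructor|Method) (.+?) is not whitelisted"
)


def blocked_jvm_member(exc):
    """Return (kind, signature) if exc looks like a Py4J whitelist error, else None.

    Hypothetical helper: it only inspects the exception text, so it works
    with whatever exception type the notebook surfaces.
    """
    match = _NOT_WHITELISTED.search(str(exc))
    if match is None:
        return None
    return match.group(1), match.group(2)


# Example with the message from the question.
message = (
    "py4j.security.Py4JSecurityException: Constructor public "
    "com.johnsnowlabs.nlp.DocumentAssembler(java.lang.String) is not whitelisted."
)
print(blocked_jvm_member(Exception(message)))
# ('Constructor', 'public com.johnsnowlabs.nlp.DocumentAssembler(java.lang.String)')
```

In a notebook you could wrap the failing call in `try` / `except Exception as exc` and pass `exc` to the helper. This only diagnoses the problem; it does not bypass the restriction.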
My blog: https://databrickster.medium.com/
11-27-2022 04:50 AM
Hi @Kenan Spruill
Hope all is well! Just wanted to check in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.
We'd love to hear from you.
Thanks!
01-25-2023 09:42 AM
Hi @Vidula Khanna,
I would like to know more about the suggested solution to the problem above. I have upgraded my cluster to 11.3 LTS (Unity Catalog enabled) in shared access mode, but one of the Java functions I am using raises the whitelisting error. Could you please suggest a possible solution that still keeps the shared cluster access mode?