How to use the Spark NLP library and John Snow Labs Maven coordinates on a cluster that is not connected to the internet

ssy
New Contributor II

Hi,

I am trying the Spark NLP library for the first time. The cluster I'm using is a corporate one and cannot be connected to the internet. I can only install packages that are provided to us, or that I upload as jar files.

I have three questions:

  1. Which jar files do I need to install the Spark NLP library for NLP work? I will need BERT transformers and encoders, as well as the other packages required for NER work with Spark NLP.
  2. How can I add the proper John Snow Labs Maven coordinates and jar file to my cluster when it's not connected to the internet?
  3. Once the packages are installed on the cluster, how can I reference them in a notebook running on that cluster?

Thanks!