By far the most popular and comprehensive library, to my knowledge, for Spark-native distributed NLP, is spark-nlp from John Snow Labs. https://nlp.johnsnowlabs.com/ It is open source (but with commercial support options) and has a whole lot of functionality.
You can also use spacy, nltk, and other non-Spark NLP libraries with Spark, by writing pandas UDFs that leverage these libraries, then applying them to data with Spark.