
Unable to Use VectorAssembler in PySpark 3.5.0 Due to Whitelisting

Ritchie
New Contributor

Hi,
I am using PySpark 3.5.0 on my Databricks cluster. I set the required configuration with spark.conf.set("spark.databricks.ml.whitelist", "true"), but I still cannot use VectorAssembler from PySpark MLlib.

When I try to import it using the statement "from pyspark.ml.feature import VectorAssembler", I receive the following error:

Py4JError: An error occurred while calling None.org.apache.spark.ml.feature.VectorAssembler.
py4j.security.Py4JSecurityException: Constructor public org.apache.spark.ml.feature.VectorAssembler(java.lang.String) is not whitelisted.

It appears that the class is not whitelisted despite enabling the necessary configuration. Kindly assist in resolving this issue so that I can proceed with my Spark MLlib operations.

1 REPLY

Alberto_Umana
Databricks Employee

Hi @Ritchie,

Can you run the following and confirm that it prints "true"?

print(spark.conf.get("spark.databricks.ml.whitelist"))
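
If it helps, here is a slightly fuller sketch of the same check. It reads the flag with a default, so an unset key does not raise, and then tries to construct a VectorAssembler separately. The traceback above names a constructor, which suggests the failure happens when the object is created rather than at the import line. The helper names check_whitelist_flag and try_vector_assembler, the "unset" placeholder, and the column names are made up for illustration. Whether this flag actually controls the Py4J whitelist on your cluster is not something this snippet can confirm.

```python
def check_whitelist_flag(spark, key="spark.databricks.ml.whitelist"):
    """Return True only if the session config `key` is set to the string "true"."""
    # spark.conf.get returns a string; passing a default avoids an exception
    # when the key has never been set on this session.
    value = spark.conf.get(key, "unset")
    print(f"{key} = {value}")
    return str(value).lower() == "true"


def try_vector_assembler(input_cols, output_col="features"):
    """Try to construct a VectorAssembler; return it, or None if construction fails."""
    # Imported inside the function so the flag check above does not depend on it.
    from pyspark.ml.feature import VectorAssembler

    try:
        return VectorAssembler(inputCols=input_cols, outputCol=output_col)
    except Exception as exc:  # e.g. the Py4JError shown in the question
        print(f"VectorAssembler could not be constructed: {exc}")
        return None
```

In a notebook, where spark is already defined, you could call check_whitelist_flag(spark) and then try_vector_assembler(["col_a", "col_b"]) with your own column names. If the flag prints true and the constructor still fails, please share that output here.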
