
Azure Shared Clusters - Py4J Security Exception on non-whitelisted classes

XavierPereVives
New Contributor II

When I try to use a third-party JAR on an Azure shared cluster (it is installed via Maven and I can import it successfully), I get the following message:

 

 

py4j.security.Py4JSecurityException: Method public static org.apache.spark.sql.Column com.databricks.spark.xx.yy.zz() is not whitelisted on class class com.databricks.spark.xx.yy

 

How do I whitelist third-party library code?

3 REPLIES

Kaniz_Fatma
Community Manager

Hi @XavierPereVives, to whitelist third-party library code, you must ensure that the method you are trying to access is marked as safe for clusters. If it is not, you might encounter a Py4JSecurityException, an error indicating that the method is not whitelisted.

However, the information provided does not contain a specific process to whitelist a method or a third-party library in Databricks.

It seems that the issue you're facing is related to running a structured streaming query on a cluster that has table access control enabled.

Streaming is not supported on clusters with table access control because it requires user interaction to validate and refresh credentials, and streaming queries run continuously.

The recommended solution is to use a cluster that does not have table access control enabled for streaming queries.

Sources:
- [readStream() is not whitelisted error when running a query](https://kb.databricks.com/streaming/readstream-is-not-whitelisted)

XavierPereVives
New Contributor II

Thanks Kaniz.

I must use a shared cluster because I'm reading from a DLT table stored in a Unity Catalog.

https://docs.databricks.com/en/data-governance/unity-catalog/compute.html

My understanding is that shared clusters are enforcing the Py4J policy I referenced.  I am not sure if this is the same as what you refer to as "table access control", but also I am not trying to use readStream().  Rather I'm trying to use code from a third-party library that isn't included in the base cluster runtime. I've installed this library by supplying Maven coordinates in the compute configuration.

So I am wondering if it's possible to, as a customer that must use a shared cluster under the circumstances I described, allowlist third party code that I choose.  Otherwise, how is one to use third-party code that hasn't yet been allowlisted while reading from DLT in Unity?

Hi @XavierPereVives,

• It is unclear whether third-party libraries can be allowlisted on a shared cluster while reading from a DLT table in Unity Catalog.
• The use of third-party libraries may be subject to security policies and access controls enforced by the shared cluster and Unity Catalog.
• Shared clusters do not have default credentials to access Unity Catalog tables.
• Short-lived URLs from the Unity Catalog service are provided to the cluster if the user can access the external table.
• Instance profiles cannot be used with Unity Catalog (UC) Shared mode clusters.
• If an instance profile is desired, Single User or Service principal enabled clusters must be used.
• The catalog creator owns all data objects under a catalog and can manage permissions for them.
• Privileges are inherited downward, and users granted the SELECT privilege on the catalog will have it on all schemas and tables unless it is revoked.
• Table data under a shared catalog is read-only, allowing read operations like DESCRIBE, SHOW, and SELECT.
• Consult Databricks support by filing a support ticket for the specific capabilities and limitations of shared clusters and Unity Catalog related to third-party libraries.
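If you do file a ticket, it may help to state exactly which method and class were blocked. The snippet below is only a rough sketch of how to pull both out of the exception text. It assumes the message has the same wording as the one quoted in the original post, and the helper name `parse_py4j_security_message` is made up for illustration; it is not part of Py4J or Databricks.

```python
import re

# Assumption: the message follows the wording quoted in the original post.
# Other runtime versions may phrase it differently, in which case this
# pattern simply will not match.
_PATTERN = re.compile(
    r"Method (?P<signature>.+?) is not whitelisted on class class (?P<cls>\S+)"
)


def parse_py4j_security_message(message):
    """Return (method_signature, class_name), or None if the text does not match.

    Hypothetical helper for illustration only.
    """
    match = _PATTERN.search(message)
    if match is None:
        return None
    return match.group("signature"), match.group("cls")


# The message from the original post, with its placeholders left as-is.
msg = (
    "py4j.security.Py4JSecurityException: Method public static "
    "org.apache.spark.sql.Column com.databricks.spark.xx.yy.zz() "
    "is not whitelisted on class class com.databricks.spark.xx.yy"
)
parsed = parse_py4j_security_message(msg)
print(parsed)
```

This only reads the error text. It does not change what the cluster allows, so it is useful for reporting the problem, not for working around it.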
