Unity Catalog + Streaming error: method public X is not whitelisted on class DataStreamReader

data_boy_2022
New Contributor III

I want to import data with Auto Loader from an S3 bucket into a table that is managed in Unity Catalog.

Right now, I run the code in a notebook on an interactive cluster. In the future, the code should run on a job cluster.

The error I get is the following:

py4j.security.Py4JSecurityException: Method public org.apache.spark.sql.streaming.DataStreamReader org.apache.spark.sql.streaming.DataStreamReader.format(java.lang.String) is not whitelisted on class class org.apache.spark.sql.streaming.DataStreamReader

What have I tried so far:

Enabling credential passthrough on the cluster:

[Screenshot 2022-09-09 at 6.15.00 PM] => This doesn't work, since Unity Catalog can't be used with this option.

I also tried setting up an external location as described here:

https://docs.databricks.com/data-governance/unity-catalog/manage-external-locations-and-credentials....

On top of that, I found this article, but the solution is not actionable for me:

https://kb.databricks.com/en_US/streaming/readstream-is-not-whitelisted

Can anybody help?

1 ACCEPTED SOLUTION

Tian
New Contributor III

Hi!

Databricks recently released the documentation on using Unity Catalog with Structured Streaming: https://docs.databricks.com/structured-streaming/unity-catalog.html

According to that documentation, you must use single user clusters for Structured Streaming on Unity Catalog, for both interactive notebooks and scheduled jobs. Python and Scala are supported. Could you verify that the cluster's access mode is single user?
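
Since the code is meant to move to a job cluster later, here is a minimal sketch of what a single user job cluster definition might look like. It assumes the Databricks Clusters/Jobs API fields `data_security_mode` and `single_user_name`; the user name, runtime version, and node type below are placeholders, not values from this thread.

```python
import json


def single_user_cluster_spec(user_name, spark_version, node_type_id, num_workers=1):
    """Build a `new_cluster` block that requests single user access mode.

    Every argument is a placeholder the caller must supply; nothing here
    is taken from the original question.
    """
    return {
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        # Single user access mode, as the answer above recommends.
        "data_security_mode": "SINGLE_USER",
        # The one user (or principal) allowed to run commands on the cluster.
        "single_user_name": user_name,
    }


# Hypothetical values, for illustration only.
spec = single_user_cluster_spec(
    user_name="someone@example.com",
    spark_version="11.3.x-scala2.12",
    node_type_id="i3.xlarge",
)
print(json.dumps(spec, indent=2))
```

On an interactive cluster the same setting is chosen in the cluster UI under the access mode dropdown, so no code is needed there.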

data_boy_2022
New Contributor III

Works with single user mode. Thank you!
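
For anyone landing here later, a rough sketch of the kind of Auto Loader stream discussed in this thread, meant to run on a single user cluster. The original code was never posted, so the bucket paths, the table name, and the JSON file format are all assumptions; `spark` is the session that Databricks notebooks provide.

```python
def autoloader_options(file_format, schema_location):
    """Auto Loader reader options: the input file format and the
    location where the inferred schema is tracked."""
    return {
        "cloudFiles.format": file_format,
        "cloudFiles.schemaLocation": schema_location,
    }


def full_table_name(catalog, schema, table):
    """Unity Catalog tables use a three-level name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def ingest_to_unity_catalog(spark, source_path, target_table,
                            checkpoint_path, file_format="json"):
    """Start an Auto Loader stream from cloud storage into a table.

    All paths and names are supplied by the caller; the JSON default
    is an assumption, not something stated in the question.
    """
    stream = (
        spark.readStream.format("cloudFiles")
        # Reuse the checkpoint path for schema tracking to keep state together.
        .options(**autoloader_options(file_format, checkpoint_path))
        .load(source_path)
    )
    return (
        stream.writeStream
        .option("checkpointLocation", checkpoint_path)
        # Process everything currently available, then stop.
        .trigger(availableNow=True)
        .toTable(target_table)
    )
```

In a notebook this could be called as `ingest_to_unity_catalog(spark, "s3://my-bucket/landing/", full_table_name("main", "default", "events"), "s3://my-bucket/_checkpoints/events")`, where every path and name is hypothetical. The `availableNow` trigger suits scheduled jobs; drop it for a continuously running stream.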
