
Unity Catalog + Streaming error: method public X is not whitelisted on class DataStreamReader

data_boy_2022
New Contributor III

I want to import data from an S3 bucket with Auto Loader into a table that is managed in Unity Catalog.

Right now, I run the code in a notebook on an interactive cluster. In the future, the code should run on a job cluster.

The error I get is the following:

py4j.security.Py4JSecurityException: Method public org.apache.spark.sql.streaming.DataStreamReader org.apache.spark.sql.streaming.DataStreamReader.format(java.lang.String) is not whitelisted on class class org.apache.spark.sql.streaming.DataStreamReader

What have I tried so far:

Enabling credential passthrough on the cluster:

=> Doesn't work, since Unity Catalog can't be used with this option.

I also tried setting up an external location as described here:

https://docs.databricks.com/data-governance/unity-catalog/manage-external-locations-and-credentials....

On top of that, I found this article, but its solution is not actionable for me:

https://kb.databricks.com/en_US/streaming/readstream-is-not-whitelisted

Can anybody help?

Accepted Solution

Tian
New Contributor III

Hi!

Databricks recently released the documentation on using Unity Catalog with Structured Streaming: https://docs.databricks.com/structured-streaming/unity-catalog.html

According to that documentation, Structured Streaming on Unity Catalog requires a cluster with single user access mode, for both interactive notebooks and scheduled jobs. Python and Scala are supported. Could you verify that your cluster's access mode is single user?
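
For the scheduled-job case, here is a rough sketch of what a single user job cluster definition might look like when built in Python. It assumes the `new_cluster` field names `data_security_mode` and `single_user_name` from the Clusters/Jobs API; the helper name `single_user_job_cluster` and the runtime version, node type, and user name are placeholders that do not come from this thread, so check them against your own workspace.

```python
import json


def single_user_job_cluster(user_name, spark_version, node_type_id, num_workers=1):
    """Build a `new_cluster` spec for a job cluster in single user access mode.

    Hypothetical helper for illustration; the field names are assumed to match
    the Databricks Clusters/Jobs API.
    """
    return {
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        # Single user access mode, as required in the answer above.
        "data_security_mode": "SINGLE_USER",
        # The one user (or service principal) allowed to run on the cluster.
        "single_user_name": user_name,
    }


if __name__ == "__main__":
    # Placeholder values; replace them with ones valid in your workspace.
    spec = single_user_job_cluster(
        user_name="someone@example.com",
        spark_version="11.3.x-scala2.12",
        node_type_id="i3.xlarge",
    )
    print(json.dumps(spec, indent=2))
```

The resulting dictionary would go into the `new_cluster` section of a job definition. On an interactive cluster, the same setting is chosen through the access mode option in the cluster UI.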

data_boy_2022
New Contributor III

Works with single user mode. Thank you!
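
For reference, a minimal sketch of the kind of Auto Loader stream discussed in this thread. The original code was never posted, so this is an assumption about its general shape: the function name, the file format, and all paths and table names are hypothetical. The `spark.readStream.format(...)` call is the one named in the `Py4JSecurityException` above, so a stream like this would be expected to run only on a single user cluster.

```python
def autoload_to_unity_table(spark, source_path, table_name,
                            checkpoint_path, schema_path, file_format="json"):
    """Start an Auto Loader stream from cloud storage into a Unity Catalog table.

    `spark` is the SparkSession that a Databricks notebook or job provides.
    """
    stream = (
        spark.readStream
        .format("cloudFiles")                              # Auto Loader source
        .option("cloudFiles.format", file_format)          # format of the incoming files
        .option("cloudFiles.schemaLocation", schema_path)  # where the inferred schema is tracked
        .load(source_path)
    )
    return (
        stream.writeStream
        .option("checkpointLocation", checkpoint_path)     # stream progress is stored here
        .toTable(table_name)                               # three-level name: catalog.schema.table
    )


# Example call in a notebook (all values are placeholders):
# query = autoload_to_unity_table(
#     spark,
#     source_path="s3://my-bucket/landing/",
#     table_name="main.default.events",
#     checkpoint_path="s3://my-bucket/_checkpoints/events/",
#     schema_path="s3://my-bucket/_schemas/events/",
# )
```

The S3 paths would need to be covered by an external location that the cluster's single user is allowed to read from and write to.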
