I want to ingest data from an S3 bucket into a Unity Catalog managed table using Auto Loader.
Right now I run the code in a notebook on an interactive cluster; in the future the code should run on a job cluster. A rough sketch of what I'm running is shown below.
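For context, this is roughly the Auto Loader code in the notebook; the bucket paths, checkpoint location, and catalog/schema/table names are placeholders:

```python
# Minimal Auto Loader stream from S3 into a Unity Catalog managed table.
# All paths and the target table name below are placeholders.
(
    spark.readStream
        .format("cloudFiles")                                         # this is the call that raises the exception
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "s3://my-bucket/_schemas/events")
        .load("s3://my-bucket/raw/events")
    .writeStream
        .option("checkpointLocation", "s3://my-bucket/_checkpoints/events")
        .trigger(availableNow=True)
        .toTable("main.bronze.events")
)
```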
The error I get is the following:
py4j.security.Py4JSecurityException: Method public org.apache.spark.sql.streaming.DataStreamReader org.apache.spark.sql.streaming.DataStreamReader.format(java.lang.String) is not whitelisted on class class org.apache.spark.sql.streaming.DataStreamReader
What I have tried so far:
Enabling credential passthrough on the cluster:
=> Doesn't work, since Unity Catalog can't be used together with this option.
I also tried setting up an external location as described here (a rough sketch of what I ran follows the link):
https://docs.databricks.com/data-governance/unity-catalog/manage-external-locations-and-credentials....
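This is approximately the setup I ran from the notebook via spark.sql; the external location name, bucket path, and storage credential name are placeholders, and the storage credential itself was created beforehand in the Catalog Explorer UI:

```python
# Sketch of the external location setup (placeholder names and paths).
# Assumes the storage credential "my_s3_credential" already exists.
spark.sql("""
    CREATE EXTERNAL LOCATION IF NOT EXISTS my_s3_location
    URL 's3://my-bucket/raw'
    WITH (STORAGE CREDENTIAL my_s3_credential)
""")

# Grant read access on the location to the workspace users (placeholder principal).
spark.sql("GRANT READ FILES ON EXTERNAL LOCATION my_s3_location TO `account users`")
```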
On top of that, I found this article, but the solution is not actionable for me:
https://kb.databricks.com/en_US/streaming/readstream-is-not-whitelisted
Can anybody help?