Databricks Community

sha · ‎02-08-2024

Environment details:

Databricks on Azure, 13.3 LTS, Unity Catalog, Shared Cluster mode.

In my current environment, we run imports from S3 with code like:

spark.read.option('inferSchema', 'true').json(s3_path)

When running on a cluster in Shared Mode with Unity Catalog enabled, I get this error:

"Import for <table> failed with error: An error occurred while calling o453.json. : org.apache.spark.SparkSecurityException: [INSUFFICIENT_PERMISSIONS] Insufficient privileges: User does not have permission SELECT on any file."

There's a proposed workaround, but it isn't an option for me: I don't have admin access, and the admins don't want to bypass all the security controls provided by Unity Catalog. Running the code in Single User mode works with no issues, but maintaining a bunch of Single User mode clusters to support my team isn't a feasible solution.

My basic question is: what mechanisms, if any, can be used to import S3 data into a Unity Catalog enabled Shared Cluster environment without resorting to being a cluster admin?

BR_DatabricksAI · ‎02-09-2024

Hello Sha,

We usually see this error when working in shared cluster mode. Assuming this is your dev environment, the simplest way to avoid the error is to use different clusters.

However, as an alternative, if you would like to keep the shared cluster, you can create a group, add the relevant users to it, and then ask an admin to grant that group the necessary SELECT privileges on the specific catalog, schema, views, and tables.
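As a rough sketch of what that request to the admin might look like, the helper below builds the Unity Catalog GRANT statements for a group. The catalog, schema, table, and group names (main, imports, events, data-team) and the build_grants helper itself are hypothetical placeholders, not anything from your environment. Note that these grants cover tables and views registered in Unity Catalog; whether they resolve a path-based read like the one in the question depends on how the S3 location is governed.

```python
def build_grants(catalog, schema, tables, group):
    """Return the GRANT statements that let `group` read the given tables.

    All names passed in are placeholders chosen by the caller; the group
    name is wrapped in backticks so names containing hyphens are accepted.
    """
    principal = f"`{group}`"
    statements = [
        # The group must be able to use the parent catalog and schema
        # before SELECT on individual tables has any effect.
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {principal}",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {principal}",
    ]
    for table in tables:
        statements.append(
            f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO {principal}"
        )
    return statements


# Hypothetical names, for illustration only.
for statement in build_grants("main", "imports", ["events"], "data-team"):
    print(statement)
    # An admin with sufficient privileges could run each statement in a
    # notebook with: spark.sql(statement)
```

The statements are plain strings, so they can be reviewed by the admin before anything is executed.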

Importing data from S3 to Azure DataBricks Cluster with Unity Catalog in Shared Mode
