
Reading CSV file with Spark throws [insufficient privilege] error

chari
Contributor

Hello Community,

I have some CSV files saved in my Databricks workspace and want to read them with Spark. I use the command

df = spark.read.format('csv').load(r'filepath')
 
However, it throws this error:
org.apache.spark.SparkSecurityException: [INSUFFICIENT_PERMISSIONS] Insufficient privileges:
 
I own these files, as they reside in my workspace. Could you please help me understand the cause of
the error and how to fix it?
 
Thanks
3 REPLIES

Ajay-Pandey
Esteemed Contributor III

Hi @chari 

Where did you save the files: in DBFS, or in an external location such as ADLS or S3? Also, please confirm whether Unity Catalog (UC) is enabled on your workspace.
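If it is unclear which kind of storage a path points to, a rough prefix check can help answer the question above. This is only a heuristic sketch: `classify_storage` is a hypothetical helper, not a Databricks API, and the example paths are made up.

```python
def classify_storage(path: str) -> str:
    """Guess the storage type from the path prefix (heuristic only)."""
    p = path.strip()
    lowered = p.lower()
    if lowered.startswith(("abfss://", "abfs://")):
        return "ADLS"
    if lowered.startswith(("s3://", "s3a://")):
        return "S3"
    # Check volumes before DBFS so a volume path is never reported as DBFS.
    if p.startswith("/Volumes/"):
        return "Unity Catalog volume"
    if lowered.startswith("dbfs:/") or p.startswith("/dbfs/"):
        return "DBFS"
    if p.startswith("/Workspace/") or lowered.startswith("file:/workspace/"):
        return "workspace files"
    return "unknown"


# Hypothetical example paths, for illustration only.
for example in (
    "abfss://data@myaccount.dfs.core.windows.net/raw/sales.csv",
    "dbfs:/FileStore/sales.csv",
    "/Workspace/Users/someone@example.com/sales.csv",
):
    print(example, "->", classify_storage(example))
```

A result of "unknown" just means the prefix was not recognized by this sketch; it says nothing about permissions.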

Kaniz
Community Manager

Hi @chari, the INSUFFICIENT_PERMISSIONS error you’re encountering in Databricks is related to access control and permissions.

  • On shared clusters, especially with Unity Catalog (UC), access control works differently and needs to be considered separately.
  • UC with shared clusters enforces user isolation, which prevents unauthorized data access.
  • DBFS (Databricks File System) lacks fine-grained access control, and ADLS (Azure Data Lake Storage) provides access control only at the file level.
  • Recommendations:
    • ADLS via abfss:
      • Create external locations for ADLS data.
      • Grant users the necessary permissions on these locations.
    • Unity Catalog Volumes:
      • Use UC Volumes instead of DBFS for unstructured data, configuration files, libraries, etc.
      • Migrate data from DBFS to UC Volumes using single-user clusters (a one-time activity).
    • Avoid DBFS:
      • DBFS is not recommended for non-temporary data.
      • UC Volumes provide better control and isolation.
    • Ensure your cluster configuration aligns with your requirements.
    • Switching to SINGLE_USER mode might be tempting, but keep in mind that your setup is shared by multiple users and notebooks.
    • Balance isolation against access.
    •  If you’re migrating to Unity Catalog, follow the recommendations above to ensure secure and efficien...
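As a rough illustration of the UC Volumes recommendation, here is a minimal sketch of reading a CSV from a volume path. It assumes the file has already been copied into a volume and that paths follow the `/Volumes/<catalog>/<schema>/<volume>/...` layout. The helpers `volume_path` and `read_csv_from_volume`, and the names `main`, `default`, and `landing`, are hypothetical placeholders, not values from this thread.

```python
def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build a /Volumes/<catalog>/<schema>/<volume>/... path."""
    for name in (catalog, schema, volume):
        if not name or "/" in name:
            raise ValueError(f"invalid name component: {name!r}")
    # Strip stray slashes so the joined path has single separators.
    cleaned = [part.strip("/") for part in parts if part.strip("/")]
    return "/".join(["/Volumes", catalog, schema, volume, *cleaned])


def read_csv_from_volume(spark, path: str, header: bool = True):
    """Read a CSV with an existing SparkSession (e.g. `spark` in a notebook)."""
    return spark.read.format("csv").option("header", header).load(path)


# Placeholder names; replace with your own catalog, schema, and volume.
csv_path = volume_path("main", "default", "landing", "sales.csv")
print(csv_path)
```

In a notebook, `df = read_csv_from_volume(spark, csv_path)` would then perform the read, provided the user has been granted read access to that volume.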

Lakshay
Esteemed Contributor

If this is a UC-enabled workspace, you need to grant the right access.
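For example, granting read access in Unity Catalog is typically done with SQL GRANT statements. The sketch below only builds the statements as strings; the helper names, the principal, and the object names are hypothetical, and the exact set of privileges a user needs depends on your setup.

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier or principal in backticks, escaping embedded ones."""
    return "`" + name.replace("`", "``") + "`"


def volume_read_grants(catalog: str, schema: str, volume: str, principal: str) -> list:
    """Statements that let a principal read files from a UC volume."""
    c, s, v, p = (quote_ident(x) for x in (catalog, schema, volume, principal))
    return [
        f"GRANT USE CATALOG ON CATALOG {c} TO {p}",
        f"GRANT USE SCHEMA ON SCHEMA {c}.{s} TO {p}",
        f"GRANT READ VOLUME ON VOLUME {c}.{s}.{v} TO {p}",
    ]


def external_location_read_grant(location: str, principal: str) -> str:
    """Statement that lets a principal read files under an external location."""
    return (
        f"GRANT READ FILES ON EXTERNAL LOCATION {quote_ident(location)} "
        f"TO {quote_ident(principal)}"
    )


# Placeholder names and principal, for illustration only.
statements = volume_read_grants("main", "default", "landing", "user@example.com")
statements.append(external_location_read_grant("adls_raw", "user@example.com"))
for stmt in statements:
    print(stmt)
```

Someone with sufficient privileges on those objects could then run each statement, for instance with `spark.sql(stmt)` in a notebook.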
