Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Reading CSV file with Spark throws [insufficient privileges] error

chari
Contributor

Hello Community,

I have some CSV files saved in my Databricks workspace and want to read them with Spark. I use the following command:

df = spark.read.format('csv').load(r'filepath')
 
However, it throws the following error:
org.apache.spark.SparkSecurityException: [INSUFFICIENT_PERMISSIONS] Insufficient privileges:
 
I own these files, since they reside in my workspace. Could you please help me understand the cause
of the error and how to fix it?
 
Thanks
3 REPLIES

Ajay-Pandey
Esteemed Contributor III

Hi @chari 

Where did you save the files: in DBFS or in an external location such as ADLS or S3? Also, please confirm whether UC is enabled on your workspace.
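
A quick way to answer the question above is to look at the prefix of the path passed to load(). The following is only a minimal sketch, assuming the usual Databricks path conventions (/Volumes/ for Unity Catalog Volumes, dbfs:/ for DBFS, abfss:// or s3:// for cloud storage, /Workspace/ for workspace files). The function name, the labels, and the example paths are hypothetical and not part of any Databricks API.

```python
def classify_path(path: str) -> str:
    """Return a rough label for where a Databricks path points.

    Hypothetical helper: it only inspects the string prefix and
    does not contact the workspace or check any permissions.
    """
    p = path.strip()
    # Unity Catalog Volumes are addressed as /Volumes/<catalog>/<schema>/<volume>/...
    if p.startswith(("/Volumes/", "dbfs:/Volumes/")):
        return "unity_catalog_volume"
    # Cloud object storage URIs (ADLS Gen2, S3, GCS)
    if p.startswith(("abfss://", "s3://", "s3a://", "gs://")):
        return "cloud_object_storage"
    # DBFS, either as a URI or through the /dbfs FUSE mount
    if p.startswith(("dbfs:/", "/dbfs/")):
        return "dbfs"
    # Workspace files (what "saved in my workspace" usually means)
    if p.startswith(("/Workspace/", "file:/Workspace/")):
        return "workspace_files"
    return "unknown"


# Example paths are made up for illustration only.
for example in (
    "/Workspace/Users/someone@example.com/data.csv",
    "dbfs:/FileStore/data.csv",
    "/Volumes/main/default/raw/data.csv",
    "abfss://container@account.dfs.core.windows.net/data.csv",
):
    print(example, "->", classify_path(example))
```

Knowing which of these categories the path falls into tells you which permissions to check next.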

Ajay Kumar Pandey

Kaniz_Fatma
Community Manager

Hi @chari, the INSUFFICIENT_PERMISSIONS error you’re encountering in Databricks is related to access control and permissions.

  • When using shared clusters in Databricks, especially with Unity Catalog (UC), you need to consider access control differently.
  • UC + shared clusters provide robust user isolation, preventing unauthorized data access.
  • DBFS (Databricks File System) lacks fine-grained access control, and ADLS (Azure Data Lake Storage) provides access control only at the file level.
  • Recommendations:
    • ADLS via abfss:
      • Create external locations for ADLS data.
      • Grant users the necessary permissions on these locations.
    • Unity Catalog Volumes:
      • Instead of DBFS, use UC Volumes for unstructured data, configuration files, libraries, etc.
      • Migrate data from DBFS to UC Volumes using single-user clusters (a one-time activity).
    • Avoid DBFS:
      • DBFS is not recommended for non-temporary data.
      • UC Volumes provide better control and isolation.
    • Ensure your cluster configuration aligns with your requirements.
    • Switching to SINGLE_USER mode might be tempting, but consider that your setup is shared by multiple users/notebooks.
    • Balancing isolation and access is crucial.
    •  If you’re migrating to Unity Catalog, follow the recommendations above to ensure secure and efficien...
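
The recommendations above (grant permissions on external locations, and use UC Volumes instead of DBFS) can be sketched as follows. This is an illustration under stated assumptions, not a definitive setup: the helper names, the catalog/schema/volume names, and the principal are hypothetical. The SQL follows the Unity Catalog GRANT syntax for the READ FILES and READ VOLUME privileges. Reading from a volume also requires USE CATALOG and USE SCHEMA on its parents, which the sketch does not cover.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote a SQL identifier, escaping embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def grant_read_files(external_location: str, principal: str) -> str:
    """Build a GRANT statement for reading files under an external location."""
    return (
        f"GRANT READ FILES ON EXTERNAL LOCATION {quote_ident(external_location)} "
        f"TO {quote_ident(principal)}"
    )


def grant_read_volume(catalog: str, schema: str, volume: str, principal: str) -> str:
    """Build a GRANT statement for reading files in a Unity Catalog Volume."""
    full_name = ".".join(quote_ident(part) for part in (catalog, schema, volume))
    return f"GRANT READ VOLUME ON VOLUME {full_name} TO {quote_ident(principal)}"


def volume_path(catalog: str, schema: str, volume: str, relative: str) -> str:
    """Build the /Volumes/... path that a reader such as spark.read can load."""
    return f"/Volumes/{catalog}/{schema}/{volume}/{relative.lstrip('/')}"


# All names below are placeholders; substitute your own objects and principal.
print(grant_read_files("my_adls_location", "user@example.com"))
print(grant_read_volume("main", "default", "raw_files", "user@example.com"))
print(volume_path("main", "default", "raw_files", "data.csv"))
```

In a notebook, a user with sufficient privileges could pass each generated statement to spark.sql(...). After that, the grantee should be able to call spark.read.format('csv').load(...) with the volume path, assuming the file has been copied there.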

Lakshay
Esteemed Contributor

If this is a UC-enabled workspace, you need to grant the right access.
