Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Databricks Upload local files (Create/Modify table)

VJ3
New Contributor III

Hello Team,

I believe Databricks recently released a feature to create or modify a table using file upload for files smaller than 2 GB (CSV, TSV, JSON, Avro, Parquet, or text files can be used to create or overwrite a managed Delta Lake table) in a self-serve workspace. (https://learn.microsoft.com/en-us/azure/databricks/ingestion/add-data/upload-data)

I am looking for your guidance on the following:

- How do we ensure that a file uploaded by one user cannot be shared with another user?

- Do we know if the Databricks local file upload abides by the Bell–LaPadula model? Here is information on the Bell–LaPadula model: https://en.wikipedia.org/wiki/Bell%E2%80%93LaPadula_model

- What are the best practices for complying with least privilege, need to know, and segregation of duties for file upload on a Databricks self-serve workspace?

- Can a user overwrite the data (table) uploaded by another user?

- Can we use file upload on a non-secure cluster?

 

Thank you

2 REPLIES

NandiniN
Honored Contributor

Hi @VJ3 ,

 

The "Imported files are uploaded to a secure internal location within your account which is garbage collected daily."

I created a new table and tried to check the path from the details but was not able to access the underlying file.

Unity Catalog should help you with permissions on the tables if you do not want other users to overwrite them.
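
For illustration, here is a minimal sketch (run in a Databricks notebook where `spark` is predefined; the catalog, schema, table, and principal names are hypothetical placeholders) of using Unity Catalog grants so another user can read an uploaded table but not overwrite it:

```python
# Minimal sketch: restricting an uploaded table with Unity Catalog grants.
# Assumes a Databricks notebook where `spark` is predefined; the catalog,
# schema, table, and principal names are hypothetical placeholders.

# Allow a specific user to read the table.
spark.sql("GRANT SELECT ON TABLE main.sales.uploaded_orders TO `analyst@example.com`")

# Remove any broader MODIFY grant so non-owners cannot overwrite the table.
spark.sql("REVOKE MODIFY ON TABLE main.sales.uploaded_orders FROM `account users`")

# Verify who ends up with which privileges.
spark.sql("SHOW GRANTS ON TABLE main.sales.uploaded_orders").show(truncate=False)
```

In Unity Catalog, overwriting a table requires ownership or the MODIFY privilege (plus USE CATALOG and USE SCHEMA on the parent objects), so keeping other users at SELECT is the practical control against them replacing the data.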

For access control, we follow the documentation below; there is no explicit mention of the Bell–LaPadula model: https://docs.databricks.com/en/data-governance/table-acls/table-acl.html#enable-table-access-control...

On "Can we use file upload on a non-secure cluster?": are you facing any issue? Note that:

  • You can upload data to the staging area without connecting to compute resources, but you must select an active compute resource to preview and configure your table.
  • You must have access to a running compute resource and permissions to create tables in a target schema.
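
As a concrete illustration of the second point, here is a hypothetical sketch of the grants a schema owner might issue so that a user can create tables in a target schema through the upload UI (catalog, schema, and user names are placeholders):

```python
# Minimal sketch: grants typically needed to create tables in a target schema.
# Assumes a Databricks notebook with `spark` predefined; names are hypothetical.
spark.sql("GRANT USE CATALOG ON CATALOG main TO `uploader@example.com`")
spark.sql("GRANT USE SCHEMA, CREATE TABLE ON SCHEMA main.sales TO `uploader@example.com`")
```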

Thanks!

VJ3
New Contributor III

Hello Nandini,

Thank you for the reply, and apologies for the delay. Let's say I uploaded a CSV file containing PII data using the upload feature available in the Databricks UI. Will I be able to share that file with another user who should not have access to the PII data elements? Can a user modify a table not owned by them? What is required to mask PII data before sharing the CSV file with another user (one possible masking sketch follows at the end of this thread)? How do we ensure that a user cannot upload the file to the DBFS root, which is accessible to all users?

Thank you

Vijay
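
On the PII-masking question above, one common pattern (sketched here with hypothetical object and group names, assuming a Databricks notebook where `spark` is predefined) is to load the raw file into a restricted table and share only a dynamic view that redacts the sensitive columns for anyone outside an authorized group:

```python
# Minimal sketch: mask a PII column behind a dynamic view so consumers never
# see the raw values. All object and group names are hypothetical placeholders.
spark.sql("""
    CREATE OR REPLACE VIEW main.sales.customers_masked AS
    SELECT
        customer_id,
        CASE
            WHEN is_account_group_member('pii_readers') THEN email
            ELSE 'REDACTED'
        END AS email,
        country
    FROM main.sales.customers_raw
""")
# Consumers are then granted SELECT on the view only (as in the earlier GRANT
# sketch), while the raw table remains restricted to its owner.
```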
