Databricks Community

dev_puli · ‎11-29-2023

Hi!

I have been carrying out a POC: I created a CSV file in my workspace and tried to read its content in a Python notebook using the two techniques below, but neither worked.

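Option 1:

import shutil

repo_file = "/Workspace/Users/u1@org.com/csv files/f1.csv"
# Flatten the source path into a single temp file name
tmp_file_name = repo_file.replace(".","").replace("/","").replace(" ","")
file_location = f"/FileStore/tmp/{tmp_file_name}"
# Recreate the temp directory on DBFS (dbutils is provided by the notebook)
dbutils.fs.rm("/FileStore/tmp/", True)
dbutils.fs.mkdirs("/FileStore/tmp/")
# Copy through the local /dbfs mount
shutil.copyfile(repo_file, f"/dbfs{file_location}")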

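Option 2:

repo_file = "/Workspace/Users/u1@org.com/csv files/f1.csv"
# Flatten the source path into a single temp file name
tmp_file_name = repo_file.replace(".","").replace("/","").replace(" ","")
file_location = f"/FileStore/tmp/{tmp_file_name}"
# Recreate the temp directory on DBFS (dbutils is provided by the notebook)
dbutils.fs.rm("/FileStore/tmp/", True)
dbutils.fs.mkdirs("/FileStore/tmp/")
# Copy with dbutils instead of shutil
dbutils.fs.cp(repo_file, f"/dbfs{file_location}")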

Both options throw the same exception, java.io.FileNotFoundException. I realized the problem is with the source file path: the code above works fine if I read the file from Repos instead of my workspace.

As an alternative, I uploaded the CSV file to a blob storage account and was able to read it without any issues.

I am curious, as I believe there must be a way to read the CSV file from my workspace as well.

I would be glad if you could post here how to do so.

Thanks!

dev_puli · ‎11-29-2023

I tried the repo_file values below in both options. However, I continue to see the same exception.

repo_file = os.path.abspath("./csv files/f1.csv")

repo_file = os.path.abspath("/Workspace/Users/u1@org.com/csv files/f1.csv")
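Note that os.path.abspath only normalizes a path string; it does not check that the file exists, and an already absolute path comes back essentially unchanged. A quick way to narrow the problem down is to ask the driver's local filesystem what it sees at each candidate path. The following is a minimal sketch: describe_path is a hypothetical helper name, and whether /Workspace paths are visible to ordinary Python file APIs depends on the cluster and runtime, so the output will vary.

```python
import os


def describe_path(path: str) -> str:
    """Report how the driver's local filesystem sees `path` (hypothetical helper)."""
    if os.path.isfile(path):
        return f"file ({os.path.getsize(path)} bytes)"
    if os.path.isdir(path):
        return "directory"
    return "missing"


# Candidate paths taken from the posts above; results depend on the cluster.
for candidate in [
    "/Workspace/Users/u1@org.com/csv files/f1.csv",
    os.path.abspath("./csv files/f1.csv"),
]:
    print(candidate, "->", describe_path(candidate))
```

If a path is reported as missing here, any local-file copy such as shutil.copyfile from that path can be expected to fail as well.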

Krishnamatta · ‎11-29-2023

Hi @Dev

You can read it as shown below on recent clusters (runtime 12.2 and above). Note the file:/ prefix:

df = spark.read.option("header", "true").csv("file:/Workspace/Users/u1@org.com/csv files/f1.csv")

df.display()
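The read above can be wrapped in a small helper so the prefix is not forgotten. This is only a sketch, not an official API: to_file_uri and read_workspace_csv are hypothetical names, and the code assumes a spark session like the one a Databricks notebook provides.

```python
def to_file_uri(workspace_path: str) -> str:
    """Prefix an absolute workspace path with file: (hypothetical helper)."""
    if workspace_path.startswith("file:/"):
        return workspace_path  # already carries the scheme
    if not workspace_path.startswith("/"):
        raise ValueError(f"expected an absolute path, got {workspace_path!r}")
    return "file:" + workspace_path


def read_workspace_csv(spark, workspace_path: str, header: bool = True):
    """Read a workspace CSV through Spark using the file:/ prefix."""
    return (
        spark.read
        .option("header", "true" if header else "false")
        .csv(to_file_uri(workspace_path))
    )


# In a notebook, where `spark` is predefined (assumption):
# df = read_workspace_csv(spark, "/Workspace/Users/u1@org.com/csv files/f1.csv")
# df.display()
```

The helper leaves spaces in the path untouched, matching the path string used in the reply above.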

wfowler · ‎08-05-2024

This worked for me, thanks: adding file:/ before Workspace.

dev_puli · ‎11-29-2023

@Retired_mod I copied the file path and used it as is, but it didn't help. It works fine if I copy the file path from Repos, but not from the user's workspace area.

xiangzhu · ‎11-30-2024

For Unity Catalog enabled clusters with the default security permissions, I think we cannot access files like this anymore.

MujtabaNoori · ‎12-02-2024

Hi @Dev ,

Generally, Spark reader APIs resolve paths against DBFS by default. To read a file from the user workspace, we need to prepend 'file:/' to the path.

Thanks
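To make the scheme rule in this reply concrete, here is a small illustrative sketch. It is not Spark's actual resolution logic: effective_scheme is a hypothetical helper that simply assumes, as stated above, that a path without a scheme is treated as a DBFS path.

```python
def effective_scheme(path: str, default: str = "dbfs") -> str:
    """Return the scheme a reader would use for `path` under the assumed rule."""
    head, sep, _ = path.partition(":/")
    if sep and head.isalpha():
        return head.lower()  # explicit scheme such as file:/ or dbfs:/
    return default  # assumption from the reply: no scheme means DBFS


for p in [
    "/Workspace/Users/u1@org.com/csv files/f1.csv",
    "file:/Workspace/Users/u1@org.com/csv files/f1.csv",
    "dbfs:/FileStore/tmp/f1.csv",
]:
    print(f"{effective_scheme(p):5} <- {p}")
```

Under that assumption, the bare /Workspace path from the original question would be looked up in DBFS, which is consistent with the FileNotFoundException reported there.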

how to read the CSV file from users workspace
