Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Issue reading a text file in Databricks using the Local File API instead of the Spark API

RiyazAli
Valued Contributor

I'm trying to read a small txt file which is added as a table to the default db on Databricks. While trying to read the file via Local File API, I get a `FileNotFoundError`, but I'm able to read the same file as Spark RDD using SparkContext.

Please find the code below:

with open("/FileStore/tables/boringwords.txt", "r") as f_read:
  for line in f_read:
    print(line)

The error I get is:

FileNotFoundError                         Traceback (most recent call last)
<command-2618449717515592> in <module>
----> 1 with open("dbfs:/FileStore/tables/boringwords.txt", "r") as f_read:
      2   for line in f_read:
      3     print(line)
 
FileNotFoundError: [Errno 2] No such file or directory: 'dbfs:/FileStore/tables/boringwords.txt'

However, I have no problem reading the file using SparkContext:

boring_words = sc.textFile("/FileStore/tables/boringwords.txt")
set(i.strip() for i in boring_words.collect())

And as expected, I get the result for the above block of code:

Out[4]: {'mad',
 'mobile',
 'filename',
 'circle',
 'cookies',
 'immigration',
 'anticipated',
 'editorials',
 'review'}

I also referred to the DBFS documentation to understand the Local File API's limitations, but found no lead on the issue. Any help would be greatly appreciated. Thanks!

Accepted Solutions

Hi @Riyaz Ali​ ,

In the Community Edition, on DBR 7 and above, the `/dbfs` local mount is disabled.

If you're using the Community Edition, please run your code in a notebook attached to a cluster with a DBR version below 7. It should work there.
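
If switching the DBR version is not an option, a commonly suggested alternative is to copy the file from DBFS to the driver's local disk first and then read it with the Local File API. The sketch below assumes it runs in a Databricks notebook where `dbutils` is available and is passed in explicitly; the helper name `read_words_via_driver_copy` and the `/tmp/boringwords.txt` target are hypothetical, chosen only for illustration. It has not been verified against every Community Edition runtime.

```python
def read_words_via_driver_copy(dbutils, dbfs_uri, local_path="/tmp/boringwords.txt"):
    """Copy a DBFS file to the driver's local disk, then read it with open().

    `dbutils` is the notebook-provided utility object (assumed to exist).
    `local_path` is a hypothetical scratch location on the driver.
    """
    # The "file:" scheme tells dbutils the target is the driver's local filesystem.
    dbutils.fs.cp(dbfs_uri, "file:" + local_path)

    # From here on, plain Python file I/O works because the file is truly local.
    with open(local_path, "r") as f_read:
        return {line.strip() for line in f_read}
```

In a notebook this would be called as `read_words_via_driver_copy(dbutils, "dbfs:/FileStore/tables/boringwords.txt")`.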


9 Replies

-werners-
Esteemed Contributor III

Can you try with /dbfs/FileStore/tables/boringwords.txt?
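
For context: Python's `open()` does not understand the `dbfs:` scheme that Spark accepts, so the Local File API needs the path as it appears under the `/dbfs` mount. A minimal sketch of that mapping follows; the helper name `to_local_path` is hypothetical, and it only rewrites the string. It does not check that the mount or the file exists.

```python
def to_local_path(path: str, mount_root: str = "/dbfs") -> str:
    """Map a Spark-style DBFS path to the path open() needs under the /dbfs mount."""
    # Drop the URI scheme that Spark APIs accept but open() does not.
    if path.startswith("dbfs:"):
        path = path[len("dbfs:"):]

    # Already a mount-relative local path: leave it unchanged.
    if path == mount_root or path.startswith(mount_root + "/"):
        return path

    # Otherwise prefix the mount root, keeping exactly one separator.
    return mount_root + "/" + path.lstrip("/")
```

With this, both `"dbfs:/FileStore/tables/boringwords.txt"` and `"/FileStore/tables/boringwords.txt"` map to `"/dbfs/FileStore/tables/boringwords.txt"`.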

RiyazAli
Valued Contributor

Hey there @Werner Stinckens​ ! Thanks for your response!

I've tried your suggestion and I still get the same error!

PFA the snip below:

[image: error_snip]

Moreover, I've realized that adding `/dbfs` to the path is optional, as I've stored the data in the default database. As in the original post, I'm creating an RDD by passing the path `"/FileStore/Tables/filename.txt"` to `sc.textFile`.

Thank you!

-werners-
Esteemed Contributor III

You forgot a "/" as the first character in your file path.

RiyazAli
Valued Contributor

Hello @Werner Stinckens​ !

You're right! I missed the '/' earlier.

But nothing changed after adding the '/' before dbfs. Below is the snip:

[image]

Moreover, when I tried the same path notation with SparkContext, it threw an error:

[image]

I'm starting to wonder if this is the right way to provide the absolute path.

In contrast, when I gave SparkContext the path "dbfs:/FileStore/tables/boringwords.txt", it worked:

[image]

But again, this doesn't work for reading the file with the Local File API.

-werners-
Esteemed Contributor III

No, that should work.

I just tested it on my environment.

Also:

https://docs.microsoft.com/en-us/azure/databricks/data/databricks-file-system#python

https://community.databricks.com/s/question/0D53f00001HKHS7CAP/python-open-function-is-unable-to-det...

But maybe you use the community edition of Databricks? If I recall correctly, the dbfs mounting is limited. So the local file interface might not work.

(See https://community.databricks.com/s/question/0D53f00001HKIFjCAP/where-is-dbfs-mounted-with-community-...), not sure though.

If not: all I could think of is that the file is not there (so incorrect path), but SC can find it so that won't be it.
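
The two possibilities above (no usable `/dbfs` mount versus a wrong path) can be told apart with a quick check from the notebook. This is only a sketch: the helper name `diagnose_local_read` and its return strings are hypothetical, and it assumes that a missing mount shows up as `/dbfs` not being a directory on the driver.

```python
import os

def diagnose_local_read(local_path: str, mount_root: str = "/dbfs") -> str:
    """Report why a Local File API read might fail for a path under the mount."""
    # If the mount root itself is absent, no path below it can be opened.
    if not os.path.isdir(mount_root):
        return "mount_missing"

    # The mount exists, so a failure here points at the path itself.
    if not os.path.isfile(local_path):
        return "file_missing"

    return "ok"
```

For example, `diagnose_local_read("/dbfs/FileStore/tables/boringwords.txt")` returning `"mount_missing"` would point to the environment rather than to the path.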

Proof it works:

[image]

RiyazAli
Valued Contributor

hey @Werner Stinckens​ ,

My apologies! Forgot to mention that I'm using the Databricks community edition. Thanks for the references, much appreciated!!

RiyazAli
Valued Contributor

Thank you for the help @Kaniz Fatma​ !! Appreciate it. 😀

@Riyaz Ali​ , I'm happy to know that it helped you 😊.
