Data Engineering

Issue while trying to read a text file in Databricks using the Local File API instead of the Spark API

RiyazAli
Valued Contributor

I'm trying to read a small txt file that was added as a table to the default db on Databricks. When I try to read the file via the Local File API, I get a `FileNotFoundError`, but I'm able to read the same file as a Spark RDD using SparkContext.

Please find the code below:

with open("/FileStore/tables/boringwords.txt", "r") as f_read:
  for line in f_read:
    print(line)

The error I get is:

FileNotFoundError                         Traceback (most recent call last)
<command-2618449717515592> in <module>
----> 1 with open("dbfs:/FileStore/tables/boringwords.txt", "r") as f_read:
      2   for line in f_read:
      3     print(line)
 
FileNotFoundError: [Errno 2] No such file or directory: 'dbfs:/FileStore/tables/boringwords.txt'

Whereas I have no problem reading the file using SparkContext:

boring_words = sc.textFile("/FileStore/tables/boringwords.txt")
set(i.strip() for i in boring_words.collect())

And as expected, I get the result for the above block of code:

Out[4]: {'mad',
 'mobile',
 'filename',
 'circle',
 'cookies',
 'immigration',
 'anticipated',
 'editorials',
 'review'}

I also went through the DBFS documentation to understand the Local File API's limitations, but found no lead on this issue. Any help would be greatly appreciated. Thanks!

7 REPLIES

-werners-
Esteemed Contributor III

Can you try with /dbfs/FileStore/tables/boringwords.txt?
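For context on this suggestion: Spark APIs address DBFS with paths like `dbfs:/FileStore/...` or `/FileStore/...`, while Python's `open()` only sees DBFS through the `/dbfs` mount on the driver. A minimal sketch of that translation follows. The helper name `to_local_path` is hypothetical (it is not part of any Databricks API), and the result is only usable on clusters where the `/dbfs` mount is actually available.

```python
def to_local_path(path: str) -> str:
    """Map a Spark-style DBFS path to the /dbfs path that open() expects.

    Hypothetical helper for illustration; assumes the cluster exposes /dbfs.
    """
    if path.startswith("/dbfs/"):
        return path  # already a local-mount path
    if path.startswith("dbfs:"):
        path = path[len("dbfs:"):]  # drop the URI scheme
    if not path.startswith("/"):
        path = "/" + path  # make the path absolute
    return "/dbfs" + path


print(to_local_path("dbfs:/FileStore/tables/boringwords.txt"))
print(to_local_path("/FileStore/tables/boringwords.txt"))
```

Both calls print `/dbfs/FileStore/tables/boringwords.txt`, which is the form to pass to `open()`.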

RiyazAli
Valued Contributor

Hey there @Werner Stinckens! Thanks for your response!

I've tried your suggestion and I still get the same error!

Please find the snip attached below:

[image: error_snip]

Moreover, I've realized that adding `/dbfs` to the path is optional, as I've stored the data in the default database. As shown in the OP, I'm creating an RDD by passing the path `"/FileStore/Tables/filename.txt"` to `sc.textFile`.

Thank you!

-werners-
Esteemed Contributor III

You forgot a "/" as the first character in your file path.

RiyazAli
Valued Contributor

Hello @Werner Stinckens!

You're right! I missed the '/' earlier.

But nothing changed after adding the '/' before dbfs. Below is the snip:

[image]

Moreover, when I tried the same path notation with SparkContext, it threw an error:

[image]

I'm starting to wonder if this is the right way to provide the absolute path.

In contrast, when I gave SparkContext the path "dbfs:/FileStore/tables/boringwords.txt", it worked.

[image]

But again, this doesn't work for reading the file through the Local File API.

-werners-
Esteemed Contributor III

No, that should work.

I just tested it on my environment.

Also:

https://docs.microsoft.com/en-us/azure/databricks/data/databricks-file-system#python

https://community.databricks.com/s/question/0D53f00001HKHS7CAP/python-open-function-is-unable-to-det...

But maybe you are using the Community Edition of Databricks? If I recall correctly, DBFS mounting is limited there, so the Local File API might not work.

(See https://community.databricks.com/s/question/0D53f00001HKIFjCAP/where-is-dbfs-mounted-with-community-...), not sure though.

If not, all I can think of is that the file is not there (an incorrect path), but SparkContext can find it, so that can't be it.

Proof that it works:

[image]
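The two failure modes raised in this reply (no `/dbfs` mount on the driver versus a wrong path under the mount) can be told apart before calling `open()`. A rough sketch using only the standard library; the function name `diagnose_local_read` and its return labels are made up for illustration.

```python
import os


def diagnose_local_read(local_path, mount_root="/dbfs"):
    """Report why open(local_path) might fail for a path under the DBFS mount.

    Hypothetical helper; the labels it returns are illustrative only.
    """
    if not os.path.isdir(mount_root):
        # The driver has no local mount at all, so no /dbfs/... path can work.
        return "mount_missing"
    if not os.path.isfile(local_path):
        # The mount exists, so the path itself (spelling, case) is suspect.
        return "file_missing"
    return "ok"


print(diagnose_local_read("/dbfs/FileStore/tables/boringwords.txt"))
```

A result of `mount_missing` points at the environment rather than the path, which is the situation suspected above for the Community Edition.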

RiyazAli
Valued Contributor

Hey @Werner Stinckens,

My apologies! I forgot to mention that I'm using the Databricks Community Edition. Thanks for the references, much appreciated!!
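For readers who hit the same limitation: a workaround often used when the `/dbfs` mount is unavailable is to copy the file to the driver's local disk with `dbutils.fs.cp` and open the copy. This is not confirmed anywhere in this thread, so treat it as a sketch under assumptions: it expects to run in a Databricks notebook where `dbutils` is predefined, and both the helper name `read_lines_via_driver_copy` and the `/tmp` destination are hypothetical.

```python
def read_lines_via_driver_copy(dbutils, dbfs_uri, local_path):
    """Copy a DBFS file to the driver's local disk, then read it with open().

    Hypothetical helper; `dbutils` is the object Databricks notebooks provide.
    """
    # The "file:" prefix tells dbutils the destination is the driver's own disk.
    dbutils.fs.cp(dbfs_uri, "file:" + local_path)
    with open(local_path, "r") as f_read:
        return [line.strip() for line in f_read]


# In a notebook, `dbutils` already exists, so the call would look like:
# boring_words = set(read_lines_via_driver_copy(
#     dbutils, "dbfs:/FileStore/tables/boringwords.txt", "/tmp/boringwords.txt"))
```

Passing `dbutils` in as an argument keeps the function importable outside a notebook. For a file this small, the `sc.textFile(...).collect()` approach from the original post works just as well.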

RiyazAli
Valued Contributor

Thank you for the help @Kaniz Fatma!! Appreciate it. 😀
