Issue reading a text file in Databricks using the Local File API instead of the Spark API

RiyazAliM
Honored Contributor

I'm trying to read a small txt file that was added as a table to the default database on Databricks. When I read the file via the Local File API, I get a `FileNotFoundError`, but I can read the same file as a Spark RDD using SparkContext.

Please find the code below:

with open("dbfs:/FileStore/tables/boringwords.txt", "r") as f_read:
  for line in f_read:
    print(line)

The error I get is:

FileNotFoundError                         Traceback (most recent call last)
<command-2618449717515592> in <module>
----> 1 with open("dbfs:/FileStore/tables/boringwords.txt", "r") as f_read:
      2   for line in f_read:
      3     print(line)
 
FileNotFoundError: [Errno 2] No such file or directory: 'dbfs:/FileStore/tables/boringwords.txt'

However, I have no problem reading the file using SparkContext:

boring_words = sc.textFile("/FileStore/tables/boringwords.txt")
set(i.strip() for i in boring_words.collect())

And as expected, I get the result for the above block of code:

Out[4]: {'mad',
 'mobile',
 'filename',
 'circle',
 'cookies',
 'immigration',
 'anticipated',
 'editorials',
 'review'}

I also went through the DBFS documentation to understand the Local File API's limitations, but found no lead on the issue. Any help would be greatly appreciated. Thanks!

Riz
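
For anyone comparing the two access paths: Spark resolves `/FileStore/...` and `dbfs:/FileStore/...` against DBFS, while Python's built-in `open()` only sees the driver's local filesystem. On clusters where DBFS is exposed through the `/dbfs` FUSE mount, the same file would be reachable locally under a `/dbfs/` prefix. Below is a minimal sketch of that path mapping. The helper names `dbfs_uri_to_local_path` and `read_lines` are hypothetical (not part of any Databricks API), and the sketch assumes the `/dbfs` mount exists, which may not hold on every cluster type or edition.

```python
def dbfs_uri_to_local_path(path: str, mount: str = "/dbfs") -> str:
    """Hypothetical helper: map a DBFS path to its local FUSE-mount path.

    Assumes DBFS is mounted at `mount` on the driver (an assumption,
    not guaranteed on every cluster).
    """
    # Drop the Spark-style scheme prefix if present.
    if path.startswith("dbfs:"):
        path = path[len("dbfs:"):]
    # Leave paths that already point into the mount unchanged.
    if path == mount or path.startswith(mount + "/"):
        return path
    return mount + "/" + path.lstrip("/")


def read_lines(dbfs_path: str) -> list:
    """Read a DBFS text file with the local file API, one stripped line per item."""
    with open(dbfs_uri_to_local_path(dbfs_path), "r") as f_read:
        return [line.strip() for line in f_read]
```

With this sketch, `read_lines("dbfs:/FileStore/tables/boringwords.txt")` would open `/dbfs/FileStore/tables/boringwords.txt`. If that still raises `FileNotFoundError`, the mount is probably not available on the cluster, and copying the file to the driver's local disk first (for example with `dbutils.fs.cp`) may be an alternative.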