K_Anudeep
Databricks Employee

Hello @Mahesh_rathi__ ,

SparkContext.addFile is for shipping small side files to executors, not for creating an input path that you can pass to sc.textFile("file://...").

On a single-node cluster the driver and executor share the same machine, so the driver's local path happens to work. On a multi-node cluster each executor has its own userFiles-<uuid> directory, so the file:///local_disk0/... path computed on the driver does not exist on the other nodes, hence the FileNotFoundException.
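
For comparison, here is a rough sketch of the pattern addFile is meant for: the file is shipped once, and each task resolves its own local copy with SparkFiles.get. It assumes sc is an existing SparkContext (as in a Databricks notebook); the function name, the side file, and the values are hypothetical and only there for illustration.

```python
import os


def filter_with_side_file(sc, side_file, values):
    """Keep only the values listed in a small side file shipped via addFile.

    `sc` is an existing SparkContext; `side_file` and `values` are
    hypothetical inputs used only for illustration.
    """
    sc.addFile(side_file)                    # made available on every node
    file_name = os.path.basename(side_file)  # SparkFiles.get takes the bare file name

    def keep_known(partition):
        # Imported and resolved inside the task, so the path points at this
        # executor's own local copy rather than the driver's directory.
        from pyspark import SparkFiles
        with open(SparkFiles.get(file_name)) as fh:
            known = {line.strip() for line in fh}
        return [v for v in partition if v in known]

    return sc.parallelize(values).mapPartitions(keep_known).collect()
```

The key point is that SparkFiles.get is called inside the task, so the path is never computed on the driver and then handed to the executors.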

 

You can skip addFile and the file:// scheme entirely. Read the file from DBFS directly using a dbfs:/ path, and Spark will parallelise the read across executors. Does that work for you?
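
A minimal sketch of the direct read, assuming sc is the notebook's SparkContext and the file already lives in DBFS. The helper name and the example path are hypothetical, so substitute your own location.

```python
def read_lines_from_dbfs(sc, dbfs_uri):
    """Return an RDD of lines read straight from a DBFS location.

    `sc` is an existing SparkContext; `dbfs_uri` must use the dbfs:/ scheme
    so that every executor resolves the same shared path.
    """
    if not dbfs_uri.startswith("dbfs:/"):
        raise ValueError("expected a dbfs:/ path, got: " + dbfs_uri)
    # No addFile and no file:// path: each executor reads its own splits.
    return sc.textFile(dbfs_uri)


# Example call with a hypothetical path:
# rdd = read_lines_from_dbfs(sc, "dbfs:/path/to/input.txt")
# print(rdd.count())
```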

 

Anudeep