Copy Local file using a Shared Cluster

Kaizen
Valued Contributor

Hi, 

I am saving some files locally on my cluster and moving them after my job finishes. These are log files from my process, so I can't reference a DBFS location directly.

However, the dbutils.fs.cp command does not work on the shared cluster, although it does work on an individual cluster. I believe this is related to how the clusters are split among users.

File location: "/home/spark-daed4064-233f-446c-b9f2-5b/log.txt"

Copy command:

import os

# path is set to /home/spark-daed4064-233f-446c-b9f2-5b/
path = os.getcwd()

new_path = f"{path}/logs.txt"

# printed output -> /home/spark-4c17311c-654a-4c71-b551-2e/logs.txt
print(new_path)

dbutils.fs.cp(new_path, "dbfs:/databricks/scripts/logs.txt")

Kaizen
Valued Contributor

For reference, when doing this on a single-user (personal) cluster, the file is stored in:

/databricks/driver/logs.txt


This path can be accessed and copied to DBFS with the dbutils commands without any issue.
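
One detail worth checking when copying a driver-local file: dbutils.fs resolves a path without a scheme against the DBFS root, so a local file generally needs an explicit file: prefix. Below is a minimal sketch under that assumption. local_file_uri is a hypothetical helper (not part of dbutils), and the dbutils call is left as a comment because dbutils only exists inside a Databricks notebook.

```python
import os


def local_file_uri(filename, base_dir=None):
    """Build a 'file:' URI for a driver-local file.

    Hypothetical helper for illustration only. base_dir defaults to the
    current working directory, as in the original snippet.
    """
    base = base_dir if base_dir is not None else os.getcwd()
    # Driver nodes run Linux, so a plain '/' separator is assumed here.
    return f"file:{base.rstrip('/')}/{filename}"


# Usage inside a notebook (assumption: single-user cluster):
# dbutils.fs.cp(local_file_uri("logs.txt"), "dbfs:/databricks/scripts/logs.txt")
```

With base_dir set to /databricks/driver, the helper returns file:/databricks/driver/logs.txt.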

Kaizen
Valued Contributor

Hi @Retired_mod - thanks for mentioning that. The issue is accessing the local file on the cluster, not the DBFS location.

But it is still, as you said, a cluster config issue:
org.apache.spark.api.python.PythonSecurityException: Path 'file:/home/spark-c989284b-a795-4ca0-858e-84/logs.txt' uses an untrusted filesystem 'com.databricks.backend.daemon.driver.WorkspaceLocalFileSystem'
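
Since the exception is raised when dbutils reads the local path, one possible workaround is to skip dbutils for the copy and use Python's standard shutil module to write to a POSIX-style destination such as a Unity Catalog volume. This is only a sketch. It assumes the cluster exposes volumes under /Volumes and that the destination directory already exists. copy_log_with_shutil and the volume path in the comment are hypothetical names, not taken from this thread.

```python
import os
import shutil


def copy_log_with_shutil(src_path, dest_dir):
    """Copy a local file into dest_dir with shutil and return the new path.

    Hypothetical helper for illustration only. dest_dir is assumed to exist
    and to be writable through ordinary file APIs.
    """
    dest_path = os.path.join(dest_dir, os.path.basename(src_path))
    # copyfile copies contents only, without permission bits or metadata.
    shutil.copyfile(src_path, dest_path)
    return dest_path


# Usage inside a notebook (assumption: this volume exists and is writable):
# copy_log_with_shutil(f"{os.getcwd()}/logs.txt", "/Volumes/main/default/logs")
```

Whether this works depends on the cluster's access mode and runtime version, so it is worth trying on a small file first.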