When I try to convert a notebook into a job, I frequently run into an issue with writing to the local filesystem. For this particular example, all of my notebook testing read small files into an in-memory bytestream. When I ran it as a job, I switched to the method I already had for saving the download to disk, but I keep getting a `FileNotFoundError`.
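For context, this is roughly the in-memory pattern that worked in the notebook. A minimal sketch, assuming `sftp` is an already-connected paramiko `SFTPClient`; the function name and the unused `key` are just illustrative, mirroring the functions below:

```python
import io

def sftp_read_inmem(sftp_object, prefix):
    key = f'{prefix}/{sftp_object}'
    buf = io.BytesIO()
    # paramiko's getfo() streams the remote file into any file-like object
    sftp.getfo(sftp_object, buf)
    buf.seek(0)
    # do stuff with buf
```

The disk-based versions below are what I've tried when running as a job; both fail: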
```python
# method 1: download to the instance-local disk
import os

def sftp_read(sftp_object, prefix):
    # sftp is a paramiko SFTPClient opened earlier in the notebook
    key = f'{prefix}/{sftp_object}'
    if not os.path.exists('/local_disk0/tmp/'):
        os.makedirs('/local_disk0/tmp/')
    sftp.get(sftp_object, f'/local_disk0/tmp/{sftp_object}')
    # do stuff
    os.remove(f'/local_disk0/tmp/{sftp_object}')
```
```python
# method 2: same flow, but using dbutils.fs for the directory handling
def sftp_read(sftp_object, prefix):
    key = f'{prefix}/{sftp_object}'
    dbutils.fs.mkdirs('file:/tmp/')
    sftp.get(sftp_object, f'file:/tmp/{sftp_object}')
    # do stuff
    dbutils.fs.rm(f'file:/tmp/{sftp_object}')
```
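My understanding from the docs (an assumption on my part) is that `dbutils.fs` wants URI-style `file:/` paths for the driver's local disk, while plain Python `open()` (which, per the traceback below, is what paramiko's `get()` calls) wants a bare POSIX path. A quick illustration of the distinction I mean, with a hypothetical directory name:

```python
import os

# URI form for dbutils; creates /tmp/sftp_dl on the driver's local filesystem
dbutils.fs.mkdirs('file:/tmp/sftp_dl')

# POSIX form for ordinary Python I/O against the same directory
os.path.isdir('/tmp/sftp_dl')  # True if the mkdirs above worked
```

Here is the traceback I get from method 1 when it runs as a job: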
```
FileNotFoundError                         Traceback (most recent call last)
<command-3394785040378964> in <cell line: 1>()
      3 if dryrun:
      4     print(sftp_object)
----> 5 sftp_read(sftp_object, prefix)

<command-3394785040378909> in sftp_read(sftp_object, prefix)
     57 if not os.path.exists('/local_disk0/tmp/'):
     58     os.makedirs('/local_disk0/tmp/')
---> 59 sftp.get(sftp_object, f'/local_disk0/tmp/{sftp_object}')
     60 # do stuff
     61 os.remove(f'/local_disk0/tmp/{sftp_object}')

/local_disk0/.ephemeral_nfs/envs/pythonEnv-1e9ce7e1-d7d5-4473-b8d6-dbe59be12302/lib/python3.9/site-packages/paramiko/sftp_client.py in get(self, remotepath, localpath, callback, prefetch)
    808             Added the ``prefetch`` keyword argument.
    809         """
--> 810         with open(localpath, "wb") as fl:
    811             size = self.getfo(remotepath, fl, callback, prefetch)
    812             s = os.stat(localpath)

FileNotFoundError: [Errno 2] No such file or directory: '/local_disk0/tmp/path/to/file.ext'
```
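One thing I notice in the traceback: the path that fails, `/local_disk0/tmp/path/to/file.ext`, contains nested subdirectories, and my code only ever creates `/local_disk0/tmp/` itself, never the `path/to/` part. I don't know if that's the whole story, but here is a sketch of what I mean by creating the parents first (stdlib calls only; otherwise the same as method 1):

```python
import os

def sftp_read(sftp_object, prefix):
    key = f'{prefix}/{sftp_object}'
    local_path = f'/local_disk0/tmp/{sftp_object}'
    # create every intermediate directory in the local path, not just /local_disk0/tmp/
    os.makedirs(os.path.dirname(local_path), exist_ok=True)
    sftp.get(sftp_object, local_path)
    # do stuff
    os.remove(local_path)
```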
I have also referenced the DBFS local files documentation: https://docs.databricks.com/files/index.html
Any suggestions, or is there something I should know about jobs running differently from notebooks?