
Paramiko SFTP Get fails on Databricks file system

Kayla
Valued Contributor

I have an SFTP server that I need to routinely download Excel files from and put into GCP cloud storage buckets.
Every variation of the file path, whether to my GCP path or to the built-in DBFS file system, gives the error "[Errno 2] No such file or directory". Has anyone else encountered this before?

import paramiko

# host, port, username, and password are defined elsewhere in the notebook
client = paramiko.SSHClient()
client.load_system_host_keys()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(host, port=port, username=username, password=password)
sftp = client.open_sftp()

remote_path = "/Upload/filename.txt"
local_path = "/dbfs/filename.txt"

# The download itself is what raises "[Errno 2] No such file or directory"
sftp.get(remote_path, local_path)

Accepted Solution

Kayla
Valued Contributor

I've discovered the cause of this issue. The path '/dbfs/{mount-name}/filename.file_extension' is functional; what I was actually running into is a restriction on clusters set to "Shared" access mode:

  • Cannot use R, RDD APIs, or clients that directly read the data from cloud storage, such as DBUtils.

Changing the cluster access mode resolved the issue.
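For reference, here is a minimal sketch of the working pattern, continuing from the connection code in the question. The mount name is a placeholder (not taken from the thread), and the cluster is assumed to be on a non-Shared access mode:

# Sketch only: <mount-name> stands in for an existing DBFS mount backed by
# the GCS bucket; requires a cluster that is not in Shared access mode.
remote_path = "/Upload/filename.txt"
local_path = "/dbfs/<mount-name>/filename.txt"   # FUSE path into the mounted bucket

sftp.get(remote_path, local_path)   # Paramiko writes through the local /dbfs path

sftp.close()
client.close()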


Replies

Kaniz_Fatma
Community Manager

Hi @Kayla, the error message [Errno 2] No such file or directory indicates that the file or directory you're trying to access doesn't exist or that the path is incorrect.
- You're attempting to download files from an SFTP server and put them into GCP cloud storage buckets.

- Recommendations:
 - Check the remote and local file paths for correctness.
 - Verify the file's existence on the SFTP server.
 - Use the gs:// prefix for paths in GCP (see the sketch after this list).
 - Ensure Databricks has permissions to read from and write to your GCP bucket.
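As a rough illustration of these recommendations (the bucket name below is a placeholder, and dbutils is the utility object available in Databricks notebooks), one option is to download to the driver's local disk first and then copy the file into the bucket with dbutils.fs.cp:

# Sketch only: paths and the bucket name are illustrative placeholders.
print(sftp.listdir("/Upload"))               # confirm the file exists on the SFTP server

local_tmp = "/tmp/filename.txt"              # driver-local path
sftp.get("/Upload/filename.txt", local_tmp)  # download from the SFTP server

# Copy from driver-local storage into the GCS bucket using the gs:// prefix;
# assumes the cluster has permission to write to the bucket.
dbutils.fs.cp("file:/tmp/filename.txt", "gs://my-bucket/filename.txt")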


jose_gonzalez
Moderator

Thank you for sharing the solution. Many other users will find this information very useful.
