Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

notebook for SFTP server connectivity without password.

s3
New Contributor II

I am trying to write a Python script in a notebook that accesses an SFTP server without a password, using valid public/private keys. However, I cannot find any such example; all the examples I have seen use a password. Can I get some help?

Accepted Solution

Hubert-Dudek
Esteemed Contributor III

@soumen sarangi, there is a Spark SFTP library: https://github.com/springml/spark-sftp

It depends on what your data there looks like. Sometimes a better idea is to set up a copy activity (as in Azure Data Factory) so that the data lands in ADLS/Blob storage mounted in Databricks.
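For the key-based case in the question, here is a minimal, untested sketch. It assumes the spark-sftp reader accepts a `pem` option (the path to a private key file) and an optional `pemPassphrase`, as I recall from the library's README; please verify against the version you attach. The host, user, and key path are hypothetical placeholders, and `build_sftp_options` and `read_sftp` are illustrative helper names, not part of the library.

```python
def build_sftp_options(host, username, pem_path, file_type, pem_passphrase=None):
    """Build reader options for key-based auth; no "password" option is set.

    Assumption: the library reads the private key from the "pem" option and
    an optional passphrase from "pemPassphrase".
    """
    options = {
        "host": host,
        "username": username,
        "pem": pem_path,
        "fileType": file_type,
    }
    if pem_passphrase is not None:
        options["pemPassphrase"] = pem_passphrase
    return options


def read_sftp(spark, remote_path, options):
    """Load a remote file through the spark-sftp data source."""
    return (
        spark.read.format("com.springml.spark.sftp")
        .options(**options)
        .load(remote_path)
    )


# Hypothetical values, for illustration only.
example_options = build_sftp_options(
    host="sftp.example.com",
    username="etl_user",
    pem_path="/dbfs/keys/id_rsa.pem",
    file_type="csv",
)
# In a notebook, where `spark` already exists:
# df = read_sftp(spark, "/remote/dir/data.csv", example_options)
```

Keeping the options in a plain dict makes it easy to confirm that no password is being sent before the reader is ever called.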


4 REPLIES

I am currently using this library in Databricks, and I am having trouble loading data from SFTP:

local_path = '/Workspace/Users/<user.email>/Scard_etl/SFTP_LOCAL/JDL/'

sftp_df = spark.read.format("com.springml.spark.sftp")\
        .option("host", jdl_sftp.host)\
        .option("username", jdl_sftp.user)\
        .option("password", jdl_sftp.password)\
        .option("fileType", "txt")\
        .load("/dsdata/jdl/files/jdl_settlements.txt")
sftp_df.write.mode('overwrite').text(local_path)

I am getting the error below:
Py4JJavaError: An error occurred while calling o504.load. : java.lang.NoClassDefFoundError: scala/Product$class at com.springml.spark.sftp.DatasetRelation.<init>(DatasetRelation.scala:24)

From my research, it looks like I may need to upgrade the library version. If so, where can I find the upgraded version? I need to attach it to my cluster in Databricks.
com.springml:spark-sftp_2.11:1.0.3
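
A note on reading that error: `NoClassDefFoundError: scala/Product$class` usually means a library compiled for Scala 2.11 is running on a Scala 2.12 or newer runtime, and the `_2.11` suffix in the coordinate above is the Scala binary version the artifact was built for. The sketch below compares that suffix with a runtime Scala version. The helper names are hypothetical, and the runtime version string is passed in by hand here; in a notebook it can typically be read with `spark.sparkContext._jvm.scala.util.Properties.versionNumberString()`, which I have not verified on every runtime.

```python
import re


def scala_suffix(maven_coordinate):
    """Return the Scala binary version encoded in the artifact name, or None."""
    _group, artifact, _version = maven_coordinate.split(":")
    match = re.fullmatch(r".+_(\d+\.\d+)", artifact)
    return match.group(1) if match else None


def matches_runtime(maven_coordinate, runtime_scala_version):
    """True if the artifact's Scala suffix equals the runtime's major.minor."""
    suffix = scala_suffix(maven_coordinate)
    if suffix is None:
        # No Scala suffix in the artifact name: nothing to compare.
        return True
    runtime_binary = ".".join(runtime_scala_version.split(".")[:2])
    return suffix == runtime_binary


coordinate = "com.springml:spark-sftp_2.11:1.0.3"
print(scala_suffix(coordinate))                # Scala version the jar targets
print(matches_runtime(coordinate, "2.12.15"))  # example runtime version string
```

If the check fails, the library needs a build whose suffix matches the cluster's Scala version, or the cluster needs a runtime that matches the library.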

Atanu
Databricks Employee

Again, I have not tested this in my lab, so it is still unofficial. 🙂
