How to write to Salesforce from Databricks using the spark salesforce library

Gauthy1825
New Contributor II

Hi, I'm facing an issue while writing to a Salesforce sandbox from Databricks. I have installed the "spark-salesforce_2.12-1.1.4" library and my code is as follows:

df_newLeads.write \
    .format("com.springml.spark.salesforce") \
    .option("username", "<<my username>>") \
    .option("password", "<<my password+security token>>") \
    .option("login", "https://test.salesforce.com") \
    .option("sfObject", "Lead") \
    .option("upsert", True) \
    .option("version", "56.0") \
    .save()

I'm getting the following error:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 13207.0 failed 4 times, most recent failure: Lost task 0.3 in stage 13207.0 (TID 10757) (10.139.64.4 executor 0): java.lang.IllegalArgumentException: Can not instantiate Stax reader for XML source type class org.codehaus.stax2.io.Stax2ByteArraySource (unrecognized type)

Meanwhile, I'm able to read from Salesforce using the following code:

soql = "select id, name, amount from opportunity"

tests = spark.read \
    .format("com.springml.spark.salesforce") \
    .option("username", "<<my username>>") \
    .option("password", "<<my password + security token>>") \
    .option("soql", soql) \
    .option("version", "56.0") \
    .option("login", "https://test.salesforce.com") \
    .load()



Anonymous
Not applicable

Hi @Gautham Ranjit,

Hope all is well! Just wanted to check in to see whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!

Unfortunately, I haven't been able to resolve this issue; I have tried installing different libraries as well.

Anonymous
Not applicable

Hi @Gautham Ranjit,

 I'm sorry you could not find a solution to your problem in the answers provided.

Our community strives to provide helpful and accurate information, but sometimes an immediate solution may not be available for every issue.

I suggest providing more information about your problem, such as specific error messages, error logs or details about the steps you have taken. This can help our community members better understand the issue and provide more targeted solutions.

Alternatively, you can consider contacting the support team for your product or service. They may be able to provide additional assistance or escalate the issue to the appropriate section for further investigation.

Thank you for your patience and understanding, and please let us know if there is anything else we can do to assist you.

addy
New Contributor III

I am facing a similar issue. I am able to read from a Salesforce table but unable to write to it. Our Databricks workspace has also been whitelisted in Salesforce. I am using the same library, "com.springml.spark.salesforce".

The error I am getting is not the same, but it's quite similar:

Error writing to Salesforce: An error occurred while calling o3894.save.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 77.0 failed 4 times, most recent failure: Lost task 0.3 in stage 77.0 (TID 301) (10.42.102.202 executor 4): java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.DeserializationContext.handleSecondaryContextualization(Lcom/fasterxml/jackson/databind/JsonDeserializer;Lcom/fasterxml/jackson/databind/BeanProperty;)Lcom/fasterxml/jackson/databind/JsonDeserializer;

Any resolution on this will be helpful.

sasi2
New Contributor II

Could you please provide the path to the jar file for "com.springml.spark.salesforce"?
Please also share a PySpark code snippet to read data from Salesforce.

sasi2
New Contributor II

soql = "select id, name, amount from opportunity"

tests = spark.read \
    .format("com.springml.spark.salesforce") \
    .option("username", "<<my username>>") \
    .option("password", "<<my password + security token>>") \
    .option("soql", soql) \
    .option("version", "56.0") \
    .option("login", "https://test.salesforce.com") \
    .load()

I have tried this code, but I'm getting the error java.lang.NoClassDefFoundError: com/springml/salesforce/wave/api/APIFactory.

Please help me with this.
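One hedged note on this error: a `NoClassDefFoundError` for `com.springml.salesforce.wave.api.APIFactory` usually means the connector jar was installed on its own, without its transitive dependency (the salesforce-wave-api jar). Installing the library by Maven coordinates instead (Cluster → Libraries → Install new → Maven) lets Databricks resolve dependencies automatically. The coordinate below is inferred from the jar name the original poster gave ("spark-salesforce_2.12-1.1.4"); verify it against your cluster's Scala version before relying on it.

```
com.springml:spark-salesforce_2.12:1.1.4
```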

addy
New Contributor III

Actually, we were not able to solve it, so we started using the simple-salesforce package for this instead. It works smoothly.

addy
New Contributor III

I made a function that used the code below and returned url, connectionProperties, and sfwrite:

from simple_salesforce import Salesforce, SalesforceLogin

url = "https://login.salesforce.com/"
dom = url.split('//')[1].split('.')[0]  # "login" (use "test" for a sandbox)
session_id, instance = SalesforceLogin(
    username=connectionProperties['name'],
    password=connectionProperties['pass'],
    domain=dom)
sfwrite = Salesforce(instance=instance, session_id=session_id)

By using the connection properties, a schema, and a query, you can read any object from Salesforce.
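The write side (the original question) can be sketched the same way. This is a minimal, hedged sketch assuming simple-salesforce is installed and `sfwrite` is the connection built above; the batching helper, the `External_Id__c` field name, and the idea of collecting the Spark DataFrame to the driver are illustrative assumptions, not part of the posts above.

```python
def chunk(records, size=200):
    """Split a list of record dicts into batches; 200 matches the classic
    SObject Collections limit (the Bulk API accepts much larger batches)."""
    return [records[i:i + size] for i in range(0, len(records), size)]

def upsert_leads(sfwrite, rows, external_id_field="External_Id__c"):
    """Upsert Lead dicts through simple-salesforce's Bulk API.

    `sfwrite` is the Salesforce connection from the snippet above; the
    external-ID field name is a placeholder -- use one defined on your org.
    """
    for batch in chunk(rows):
        sfwrite.bulk.Lead.upsert(batch, external_id_field)
```

For modest volumes you can feed it rows collected from the DataFrame, e.g. `upsert_leads(sfwrite, [r.asDict() for r in df_newLeads.collect()])`. Note that `collect()` pulls everything to the driver, so this is only suitable for small writes.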
