06-08-2023 11:15 PM
I have 3 questions listed below.
Q1. I need to install a third-party library on a Unity Catalog-enabled shared cluster, but the install fails: the cluster is not accepting the DBFS path dbfs:/FileStore/jars/.
Q2. I have a requirement to load data into Salesforce from S3 files. I am using the simple_salesforce library to read from and write to Salesforce from Databricks. Per its documentation, the write function expects the data as dictionaries, but when I try to convert the PySpark dataframe I get the error below.
from pyspark.sql.types import StructType, StructField, StringType

data2 = [
    ("Test_Conv1", "testmailconv1@yopmail.com", "Olivia", "A", "3000000000"),
    ("Test_Conv2", "testmailconv2@yopmail.com", "Jack", "B", "4000000000"),
    ("Test_Conv3", "testmailconv3@yopmail.com", "Williams", "C", "5000000000"),
    ("Test_Conv4", "testmailconv4@yopmail.com", "Jones", "D", "6000000000"),
    ("Test_Conv5", "testmailconv5@yopmail.com", "Brown", None, "9000000000"),
]

schema = StructType([
    StructField("LastName", StringType(), True),
    StructField("Email", StringType(), True),
    StructField("FirstName", StringType(), True),
    StructField("MiddleName", StringType(), True),
    StructField("Phone", StringType(), True),  # phone numbers as strings to match the schema
])

df = spark.createDataFrame(data=data2, schema=schema)

# This line triggers the exception below on the shared cluster
df_contact = df.rdd.map(lambda row: row.asDict()).collect()
sf.bulk.Contact.insert(df_contact, batch_size=20000, use_serial=True)
Error message:
py4j.security.Py4JSecurityException: Method public org.apache.spark.rdd.RDD org.apache.spark.api.java.JavaRDD.rdd() is not whitelisted on class class org.apache.spark.api.java.JavaRDD
Could you please help me convert the dataframe to dictionaries?
Q3. Even if there is a way to convert the dataframe to dictionaries, it could hurt performance for large data sets. Is there a more optimized way to load the data into Salesforce?
06-09-2023 03:45 AM
1. https://docs.databricks.com/dbfs/unity-catalog.html
To interact with files directly using DBFS, you must have ANY FILE permissions granted (a grant example is sketched after this list).
2. Can you try one of these methods? One option that avoids df.rdd is sketched after this list.
3. Depending on the size of the data this will have an impact, but I think the bottleneck will be on the Salesforce side.
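For point 1, a minimal sketch of what that grant looks like, assuming a hypothetical user principal (an admin has to run it; MODIFY can be granted the same way if writes to DBFS are needed):

# Hypothetical principal; replace with the actual user or group that runs the job.
spark.sql("GRANT SELECT ON ANY FILE TO `user@example.com`")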
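For points 2 and 3, a minimal sketch of building the dictionaries without touching df.rdd (the RDD API is what the shared-cluster allowlist blocks), assuming the same df and an authenticated simple_salesforce connection sf as in the question:

# Collect the rows to the driver and convert each Row to a dict; no RDD methods involved,
# so this avoids the Py4JSecurityException on a Unity Catalog shared cluster.
df_contact = [row.asDict() for row in df.collect()]

# simple_salesforce's bulk API splits the list into batches of batch_size itself.
results = sf.bulk.Contact.insert(df_contact, batch_size=10000, use_serial=True)

This still collects everything to the driver, so for very large volumes you would iterate in chunks (for example over df.toLocalIterator()) and insert slice by slice instead of materialising the whole table at once; either way, Salesforce Bulk API limits are usually the tighter constraint.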
06-12-2023 11:03 AM
This is not a permission issue. I have uploaded the third-party libraries to Databricks, but the cluster is not accepting the JAR paths.
06-13-2023 05:51 AM
The third-party libs are in DBFS, so it might still be that issue.
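On a Unity Catalog shared cluster the source of the library can also matter on its own: library installation is more restricted than on single-user clusters (JAR libraries in particular have been limited or require allowlisting, and dbfs:/ sources are being phased out), so a path that works elsewhere can be rejected. If the package is also available as a Python wheel, a hedged sketch of one workaround, assuming a hypothetical Unity Catalog volume path on a recent runtime (for JARs, the equivalent is pointing the cluster library at a Volumes or workspace path instead of dbfs:/FileStore):

# Hypothetical volume path and wheel name; upload the package to a UC volume (or workspace files) first.
%pip install /Volumes/main/default/libs/my_library-1.0-py3-none-any.whl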
06-13-2023 11:13 PM
Hi @SK ASIF ALI
We haven't heard from you since the last response from @werners (Customer). Kindly share the requested information with us, and we will provide you with the necessary solution.
Thanks and Regards