<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Issue in Converting Pyspark Dataframe to dictionary in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/issue-in-converting-pyspark-dataframe-to-dictionary/m-p/3349#M116</link>
    <description>&lt;P&gt;Hi @SK ASIF ALI&lt;/P&gt;&lt;P&gt;We haven't heard from you since the last response from @werners (Customer). Kindly share the requested information with us so that we can help you find a solution.&lt;/P&gt;&lt;P&gt;Thanks and Regards&lt;/P&gt;</description>
    <pubDate>Wed, 14 Jun 2023 06:13:04 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2023-06-14T06:13:04Z</dc:date>
    <item>
      <title>Issue in Converting Pyspark Dataframe to dictionary</title>
      <link>https://community.databricks.com/t5/machine-learning/issue-in-converting-pyspark-dataframe-to-dictionary/m-p/3345#M112</link>
      <description>&lt;P&gt;I have three questions, listed below.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Q1. I need to install a third-party library on a Unity Catalog-enabled shared cluster, but the installation fails: the cluster does not accept the DBFS path &lt;B&gt;dbfs:/FileStore/jars/&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Q2. I need to load data from S3 files into Salesforce. I am using the simple-salesforce library to read from and write to Salesforce from Databricks. According to its documentation, the write function expects the data as dictionaries. When I try to convert the PySpark DataFrame, I get the error below.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql.types import StructType,StructField, StringType, IntegerType
data2 = [("Test_Conv1","testmailconv1@yopmail.com","Olivia","A",'3000000000'),
    ("Test_Conv2","testmailconv2@yopmail.com","Jack","B",4000000000),
    ("Test_Conv3","testmailconv3@yopmail.com","Williams","C",5000000000),
    ("Test_Conv4","testmailconv4@yopmail.com","Jones","D",6000000000),
    ("Test_Conv5","testmailconv5@yopmail.com","Brown",None,9000000000)
  ]
schema = StructType([ \
    StructField("LastName",StringType(),True), \
    StructField("Email",StringType(),True), \
    StructField("FirstName",StringType(),True), \
    StructField("MiddleName", StringType(), True), \
    StructField("Phone", StringType(), True)
  ])
df = spark.createDataFrame(data=data2,schema=schema)
df_contact = df.rdd.map(lambda row: row.asDict()).collect()
sf.bulk.Contact.insert(df_contact,batch_size=20000,use_serial=True)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Error message:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;py4j.security.Py4JSecurityException: Method public org.apache.spark.rdd.RDD org.apache.spark.api.java.JavaRDD.rdd() is not whitelisted on class class org.apache.spark.api.java.JavaRDD&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Could you please help me convert the DataFrame to a dictionary?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Q3. Even if there is a way to convert the DataFrame to a dictionary, the conversion could hurt performance on large datasets. Is there a more efficient way to load the data into Salesforce?&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jun 2023 06:15:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/issue-in-converting-pyspark-dataframe-to-dictionary/m-p/3345#M112</guid>
      <dc:creator>Databricks3</dc:creator>
      <dc:date>2023-06-09T06:15:21Z</dc:date>
    </item>
    <item>
      <title>Re: Issue in Converting Pyspark Dataframe to dictionary</title>
      <link>https://community.databricks.com/t5/machine-learning/issue-in-converting-pyspark-dataframe-to-dictionary/m-p/3346#M113</link>
      <description>&lt;P&gt;1. &lt;A href="https://docs.databricks.com/dbfs/unity-catalog.html" target="test_blank"&gt;https://docs.databricks.com/dbfs/unity-catalog.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;To interact with files directly using DBFS, you must have ANY FILE permissions granted.&lt;/P&gt;&lt;P&gt;2. Can you try one of &lt;A href="https://www.geeksforgeeks.org/convert-pyspark-dataframe-to-dictionary-in-python/" alt="https://www.geeksforgeeks.org/convert-pyspark-dataframe-to-dictionary-in-python/" target="_blank"&gt;these&lt;/A&gt; methods?&lt;/P&gt;&lt;P&gt;3. Depending on the size of the data, the conversion will have an impact, but I think the bottleneck will be on the Salesforce side.&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jun 2023 10:45:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/issue-in-converting-pyspark-dataframe-to-dictionary/m-p/3346#M113</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2023-06-09T10:45:53Z</dc:date>
    </item>
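    <!--
The error in the question is raised by the df.rdd call, so one option is to skip the RDD API and build the dictionaries on the driver from collected rows. The snippet below is a minimal sketch of that idea, not a confirmed fix from the thread: it assumes DataFrame.collect() is allowed on the shared cluster and that the data fits in driver memory. rows_to_dicts and FakeRow are hypothetical names introduced only for illustration; FakeRow stands in for pyspark.sql.Row so the sketch runs without a Spark session.

```python
def rows_to_dicts(rows):
    """Turn collected Row objects into a list of plain dictionaries.

    Each element only needs an asDict() method, which pyspark.sql.Row provides.
    """
    return [row.asDict() for row in rows]


class FakeRow:
    """Hypothetical stand-in for pyspark.sql.Row (assumption, for local runs only)."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def asDict(self):
        # Return a copy so callers cannot mutate the row's own state.
        return dict(self._fields)


# On a cluster the call would be: records = rows_to_dicts(df.collect())
sample_rows = [
    FakeRow(LastName="Test_Conv1", Email="testmailconv1@yopmail.com", Phone="3000000000"),
    FakeRow(LastName="Test_Conv5", Email="testmailconv5@yopmail.com", Phone=None),
]
records = rows_to_dicts(sample_rows)
```

The resulting list of dictionaries has the same shape as df_contact in the question, so it could be passed to the same insert call. Collecting every row to the driver is only reasonable for modest data sizes.
    -->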
    <item>
      <title>Re: Issue in Converting Pyspark Dataframe to dictionary</title>
      <link>https://community.databricks.com/t5/machine-learning/issue-in-converting-pyspark-dataframe-to-dictionary/m-p/3347#M114</link>
      <description>&lt;P&gt;This is not a permission issue. I have uploaded the third-party libraries to Databricks, but the cluster does not accept the JAR paths.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jun 2023 18:03:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/issue-in-converting-pyspark-dataframe-to-dictionary/m-p/3347#M114</guid>
      <dc:creator>Databricks3</dc:creator>
      <dc:date>2023-06-12T18:03:17Z</dc:date>
    </item>
    <item>
      <title>Re: Issue in Converting Pyspark Dataframe to dictionary</title>
      <link>https://community.databricks.com/t5/machine-learning/issue-in-converting-pyspark-dataframe-to-dictionary/m-p/3348#M115</link>
      <description>&lt;P&gt;Third-party libs are not in DBFS, so it might still be that issue.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Jun 2023 12:51:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/issue-in-converting-pyspark-dataframe-to-dictionary/m-p/3348#M115</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2023-06-13T12:51:37Z</dc:date>
    </item>
    <item>
      <title>Re: Issue in Converting Pyspark Dataframe to dictionary</title>
      <link>https://community.databricks.com/t5/machine-learning/issue-in-converting-pyspark-dataframe-to-dictionary/m-p/3349#M116</link>
      <description>&lt;P&gt;Hi @SK ASIF ALI&lt;/P&gt;&lt;P&gt;We haven't heard from you since the last response from @werners (Customer). Kindly share the requested information with us so that we can help you find a solution.&lt;/P&gt;&lt;P&gt;Thanks and Regards&lt;/P&gt;</description>
      <pubDate>Wed, 14 Jun 2023 06:13:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/issue-in-converting-pyspark-dataframe-to-dictionary/m-p/3349#M116</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-06-14T06:13:04Z</dc:date>
    </item>
  </channel>
</rss>

