10-17-2021 11:52 AM
I am converting a data frame to a nested dict/JSON. One of the columns, "Problematic__c", is boolean. For some reason json does not accept this data type and raises the error: "Object of type bool_ is not JSON serializable"
I need to keep it boolean because this JSON is later injected into Salesforce via the API. I could easily make it a string, but the destination object accepts booleans only.
Here is the Python code:

import json

all_rows = len(data)
y = []
for i in range(all_rows):
    x = dict(data.iloc[i, 2:])
    x["Account__r"] = dict(data.iloc[i, :1])
    x["Product_Master__r"] = dict(data.iloc[i, 1:2])
    y.append(x)
y = json.dumps(y)
and this is expected output:
[
{
"Recommended_Action__c":"Take action Z",
"Extra_Information_JSON__c":"[{\"name\":\"Action\",\"value\":\"Verifier remplissage\"},{\"name\":\"Stock Disponible\",\"value\":\"18\"}]",
"Flag__c":"Rupture Ponctuel",
"Problematic__c":true,
"Value__c":5800.0,
"Source_Id__c":"538.0",
"Batch__c":"a2e7Y110000dO6WQAU",
"Account__r":{
"Code__c":"00001-B"
},
"Product_Master__r":{
"EAN_Code__c":"1111111111.0"
}
},
{
....
},
.....
]
The data frame "data" has the structure below, with sample values:
"Code__c":"00001-B"
"EAN_Code__c":"1111111111.0"
"Recommended_Action__c":"Take action Z",
"Extra_Information_JSON__c":"[{\"name\":\"Action\",\"value\":\"Verifier remplissage\"},{\"name\":\"Stock Disponible\",\"value\":\"18\"}]",
"Flag__c":"Rupture Ponctuel",
"Problematic__c":True,
"Value__c":5800.0,
"Source_Id__c":"538.0",
"Batch__c":"a2e7Y110000dO6WQAU",
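For reference, the error comes from NumPy's `bool_` scalar type, which the standard `json` module does not recognize. A minimal sketch of one workaround, using the `default` hook of `json.dumps` (column name taken from the question, data values made up):

```python
import json
import numpy as np
import pandas as pd

data = pd.DataFrame({"Problematic__c": [True, False]})

# dict(data.iloc[0]) holds np.bool_ values, not Python bools,
# so a plain json.dumps raises:
# TypeError: Object of type bool_ is not JSON serializable

def to_native(obj):
    # Convert any NumPy scalar (bool_, int64, float64, ...) to its
    # native Python equivalent; defer to json's own error otherwise.
    if isinstance(obj, np.generic):
        return obj.item()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

payload = json.dumps(dict(data.iloc[0]), default=to_native)
print(payload)  # the boolean survives as a real JSON true
```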
Labels:
- Pandas Python
- Python
Accepted Solutions
10-18-2021 11:01 AM
You can just use `to_json` to achieve this. Here is an example:

from pyspark.sql import Row
from pyspark.sql.functions import to_json

data = [(1, Row(Code__c="00001-B",
                EAN_Code__c="1111111111.0",
                Extra_Information_JSON__c="[{\"name\":\"Action\",\"value\":\"Verifier remplissage\"},{\"name\":\"Stock Disponible\",\"value\":\"18\"}]",
                Flag__c="Rupture Ponctuel",
                Problematic__c=True))]
df = spark.createDataFrame(data, ("key", "value"))
display(df.select(to_json(df.value).alias("json")))

This is just an example to point you in the right direction; you may need to adapt it to your specific input format. It is meant to run in a Databricks notebook, otherwise the final `display` will not work.
10-18-2021 03:31 AM
Hi, I had a similar problem with booleans, but when exporting to a different data format.
- Please try writing the JSON directly from the dataframe, without dicts and looping (all the needed transformations can be done in the dataframe):

df2 = df1.select(df1.Account__r, df1.Product_Master__r)
df2.coalesce(1).write.format('json').save('/path/file_name.json')

- You can also write a Spark dataframe directly to Salesforce; please check https://github.com/springml/spark-salesforce
10-18-2021 05:51 AM
Thanks, but I'm not sure how to "write JSON directly from the dataframe without dicts and looping".
df1.Account__r or df1.Product_Master__r simply won't work, as there are no such columns as "Account__r" or "Product_Master__r" in the dataframe. That's why I used dicts to create them.
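If staying in plain pandas is acceptable, one way to sidestep the `bool_` problem while building the nested objects is `DataFrame.to_dict(orient="records")`, which converts NumPy scalars to native Python types; the nested Account__r/Product_Master__r objects can then be spliced in afterwards. A sketch, with column names from the question and made-up sample values:

```python
import json
import pandas as pd

data = pd.DataFrame({
    "Code__c": ["00001-B"],
    "EAN_Code__c": ["1111111111.0"],
    "Recommended_Action__c": ["Take action Z"],
    "Problematic__c": [True],
    "Value__c": [5800.0],
})

records = []
for row in data.to_dict(orient="records"):
    # to_dict() already returns native Python bools/floats,
    # so json.dumps accepts the result as-is.
    row["Account__r"] = {"Code__c": row.pop("Code__c")}
    row["Product_Master__r"] = {"EAN_Code__c": row.pop("EAN_Code__c")}
    records.append(row)

payload = json.dumps(records)
```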
10-18-2021 07:39 AM
You can achieve it by transforming the dataframe using built-in Spark functions (e.g. `struct` to build the nested objects).
11-03-2021 06:51 PM
Thanks
10-22-2021 01:16 AM
Thanks Dan, that makes sense!

