Object of type bool_ is not JSON serializable

Braxx
Contributor II

I am converting a data frame to a nested dict/JSON. One of the columns, "Problematic__c", is of boolean type.

For some reason json does not accept this data type and returns the error: "Object of type bool_ is not JSON serializable"

I need this to stay boolean because the JSON is later injected into Salesforce via the API. I could easily make it a string, but the destination object accepts booleans only.

Here is the Python code:

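import json

y = []

for i in range(len(data)):
    # Columns from position 2 onwards become the top-level fields
    x = dict(data.iloc[i, 2:])

    # The first two columns are nested under their relationship names
    x["Account__r"] = dict(data.iloc[i, :1])
    x["Product_Master__r"] = dict(data.iloc[i, 1:2])
    y.append(x)

# Serialize once, after the loop; this is the call that raises the error
y = json.dumps(y)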

and this is the expected output:

[
   {
      "Recommended_Action__c":"Take action Z",
      "Extra_Information_JSON__c":"[{\"name\":\"Action\",\"value\":\"Verifier remplissage\"},{\"name\":\"Stock Disponible\",\"value\":\"18\"}]",
      "Flag__c":"Rupture Ponctuel",
      "Problematic__c":true,
      "Value__c":5800.0,
      "Source_Id__c":"538.0",
      "Batch__c":"a2e7Y110000dO6WQAU",
      "Account__r":{
         "Code__c":"00001-B"
      },
      "Product_Master__r":{
         "EAN_Code__c":"1111111111.0"
      }
    },
    {  
        ....
    }.
    .....
 ]

The data frame called "data" has the structure below, with sample values:

"Code__c":"00001-B"
"EAN_Code__c":"1111111111.0"
"Recommended_Action__c":"Take action Z",
"Extra_Information_JSON__c":"[{\"name\":\"Action\",\"value\":\"Verifier remplissage\"},{\"name\":\"Stock Disponible\",\"value\":\"18\"}]",
"Flag__c":"Rupture Ponctuel",
"Problematic__c":True,
"Value__c":5800.0,
"Source_Id__c":"538.0",
"Batch__c":"a2e7Y110000dO6WQAU",
"Code__c":"00001-B"
"EAN_Code__c":"1111111111.0"
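
For context, here is a minimal sketch of why this error appears and one possible way around it, assuming the values come from a pandas/NumPy-backed data frame. `numpy.bool_` is not a subclass of Python's `bool`, so `json.dumps` rejects it unless it is told how to convert it. The helper name `to_native` and the sample record are hypothetical; `default=` is the standard `json.dumps` hook for objects it cannot serialize on its own.

```python
import json

import numpy as np


def to_native(value):
    """Fallback for json.dumps: unwrap NumPy scalars into plain Python values."""
    if isinstance(value, np.generic):
        return value.item()
    # Keep the usual behaviour for anything that is still unsupported
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


# Hypothetical record with the kind of values a data frame row can hold
record = {"Problematic__c": np.bool_(True), "Value__c": np.float64(5800.0)}

# json.dumps(record) alone raises the TypeError from the question;
# with the fallback, the boolean is written as a real JSON boolean
payload = json.dumps(record, default=to_native)
print(payload)
```

With this approach the field stays a boolean in the JSON rather than becoming a string, which is what the destination object requires.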

1 ACCEPTED SOLUTION

Dan_Z
Honored Contributor

You can just use `to_json` to achieve this. Here is an example:

from pyspark.sql import Row
from pyspark.sql.types import *
from pyspark.sql.functions import to_json
 
data = [(1, Row(Code__c="00001-B", 
                EAN_Code__c="1111111111.0",
                Extra_Information_JSON__c="[{\"name\":\"Action\",\"value\":\"Verifier remplissage\"},{\"name\":\"Stock Disponible\",\"value\":\"18\"}]",
                Flag__c="Rupture Ponctuel",
                Problematic__c=True))]
 
df = spark.createDataFrame(data, ("key", "value"))
display(df.select(to_json(df.value).alias("json")))

This is just an example to point you in the right direction; you may need to adapt it to your specific input format. It is meant to run in a Databricks notebook; elsewhere the final `display` will not work.
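
If the data is a pandas data frame, as the `data.iloc` calls in the question suggest, a similar effect can be sketched without Spark. This is only an illustration under that assumption: the sample frame is a cut-down, hypothetical version of the one in the question, and the nesting step still uses a loop, but over plain Python dicts rather than data frame rows. `DataFrame.to_json` handles the NumPy-backed values itself, and `json.loads` hands back native Python types.

```python
import json

import pandas as pd

# Hypothetical one-row frame using a subset of the columns from the question
data = pd.DataFrame({
    "Code__c": ["00001-B"],
    "EAN_Code__c": ["1111111111.0"],
    "Flag__c": ["Rupture Ponctuel"],
    "Problematic__c": [True],
    "Value__c": [5800.0],
})

# Round-trip through pandas' own JSON writer to get plain Python values
rows = json.loads(data.to_json(orient="records"))

nested_keys = ("Code__c", "EAN_Code__c")
records = []
for row in rows:
    # Top-level fields: everything except the two lookup columns
    record = {key: value for key, value in row.items() if key not in nested_keys}
    # Lookup columns are nested under their relationship names
    record["Account__r"] = {"Code__c": row["Code__c"]}
    record["Product_Master__r"] = {"EAN_Code__c": row["EAN_Code__c"]}
    records.append(record)

payload = json.dumps(records)
print(payload)
```

Because the values are already native Python types at this point, the final `json.dumps` call needs no custom handler.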

7 REPLIES

Kaniz
Community Manager

Hi @Braxx! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Otherwise I will get back to you soon. Thanks.

Hubert-Dudek
Esteemed Contributor III

Hi, I had a similar problem with booleans, but when exporting to a different data format.

  • Please try writing the JSON directly from the dataframe, without the dict and the loop (all the needed transformations can be done in the dataframe):
df2 = df1.select(df1.Account__r, df1.Product_Master__r)
df2.coalesce(1).write.format('json').save('/path/file_name.json')

Thanks, but I am not sure how to "write json directly from dataframe without dict and looping".

df1.Account__r or df1.Product_Master__r simply won't work, as there are no such objects as "Account__r" or "Product_Master__r" in the dataframe. That's why I used dict to create them.

Hubert-Dudek
Esteemed Contributor III

You can achieve it by transforming the dataframe using built-in Spark functions.

Anonymous
Not applicable

Thanks

Braxx
Contributor II

Thanks Dan, that makes sense!
