Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Object of type bool_ is not JSON serializable

Braxx
Contributor II

I am converting a data frame to a nested dict/JSON. One of the columns, called "Problematic__c", is of boolean type.

For some reason json does not accept this data type, raising the error: "Object of type bool_ is not JSON serializable"

I need this field to stay boolean, as the JSON is later injected into Salesforce via API. I could easily make it a string, but the destination object accepts boolean only.

Here is the Python code:

import json

all_rows = len(data)
y = []

for i in range(all_rows):
    x = dict(data.iloc[i, 2:])
    x["Account__r"] = dict(data.iloc[i, :1])
    x["Product_Master__r"] = dict(data.iloc[i, 1:2])
    y.append(x)

y = json.dumps(y)  # raises: Object of type bool_ is not JSON serializable
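One stdlib-only workaround (an assumption on my part, not something stated in the thread) is to give `json.dumps` a `default` hook that casts NumPy scalars such as `bool_` to native Python types. A minimal sketch, with hypothetical sample values mirroring the column layout described below:

```python
import json

import numpy as np
import pandas as pd


def to_native(obj):
    # Fallback for json.dumps: cast NumPy scalars to native Python types.
    if isinstance(obj, np.bool_):
        return bool(obj)
    if isinstance(obj, np.integer):
        return int(obj)
    if isinstance(obj, np.floating):
        return float(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Hypothetical single-row frame with the same column order as the question
data = pd.DataFrame([{
    "Code__c": "00001-B",
    "EAN_Code__c": "1111111111.0",
    "Recommended_Action__c": "Take action Z",
    "Problematic__c": True,
    "Value__c": 5800.0,
}])

records = []
for i in range(len(data)):
    rec = dict(data.iloc[i, 2:])
    rec["Account__r"] = dict(data.iloc[i, :1])
    rec["Product_Master__r"] = dict(data.iloc[i, 1:2])
    records.append(rec)

# Serialize once, after the loop; `default` handles any NumPy scalars.
payload = json.dumps(records, default=to_native)
print(payload)
```

The hook only runs for objects the encoder cannot handle, so native Python booleans and floats are untouched.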

and this is the expected output:

[
   {
      "Recommended_Action__c":"Take action Z",
      "Extra_Information_JSON__c":"[{\"name\":\"Action\",\"value\":\"Verifier remplissage\"},{\"name\":\"Stock Disponible\",\"value\":\"18\"}]",
      "Flag__c":"Rupture Ponctuel",
      "Problematic__c":True,
      "Value__c":5800.0,
      "Source_Id__c":"538.0",
      "Batch__c":"a2e7Y110000dO6WQAU",
      "Account__r":{
         "Code__c":"00001-B"
      },
      "Product_Master__r":{
         "EAN_Code__c":"1111111111.0"
      }
    },
    {  
        ....
    },
    .....
 ]

The data frame called "data" has the structure below, with sample values:

"Code__c":"00001-B"
"EAN_Code__c":"1111111111.0"
"Recommended_Action__c":"Take action Z",
"Extra_Information_JSON__c":"[{\"name\":\"Action\",\"value\":\"Verifier remplissage\"},{\"name\":\"Stock Disponible\",\"value\":\"18\"}]",
"Flag__c":"Rupture Ponctuel",
"Problematic__c":True,
"Value__c":5800.0,
"Source_Id__c":"538.0",
"Batch__c":"a2e7Y110000dO6WQAU",
"Code__c":"00001-B"
"EAN_Code__c":"1111111111.0"

1 ACCEPTED SOLUTION

Accepted Solutions

Dan_Z
Honored Contributor

You can just use `to_json` to achieve this. Here is an example:

from pyspark.sql import Row
from pyspark.sql.types import *
from pyspark.sql.functions import to_json
 
data = [(1, Row(Code__c="00001-B", 
                EAN_Code__c="1111111111.0",
                Extra_Information_JSON__c="[{\"name\":\"Action\",\"value\":\"Verifier remplissage\"},{\"name\":\"Stock Disponible\",\"value\":\"18\"}]",
                Flag__c="Rupture Ponctuel",
                Problematic__c=True))]
 
df = spark.createDataFrame(data, ("key", "value"))
display(df.select(to_json(df.value).alias("json")))

This is just an example to point you in the right direction; you may need to adapt it to your specific input format. It is meant to run in a Databricks notebook, otherwise the final `display` call will not work.


7 REPLIES

Kaniz_Fatma
Community Manager

Hi @Braxx! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer first; otherwise I will get back to you soon. Thanks.

Hubert-Dudek
Esteemed Contributor III

Hi, I had a similar problem with booleans, but when exporting to a different data format.

  • Please try to write the JSON directly from the dataframe, without the dict and the looping (all the needed transformations can be done on the dataframe):
df2 = df1.select(df1.Account__r, df1.Product_Master__r)
df2.coalesce(1).write.format('json').save('/path/file_name.json')
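In the pandas world the asker is working in, the same "write JSON directly from the dataframe" idea could look like `DataFrame.to_json`, which serializes boolean columns natively and so sidesteps the `bool_` error entirely. A sketch with made-up values (assuming a flat frame like the one in the question):

```python
import pandas as pd

# Hypothetical flat frame; pandas' own serializer emits native JSON
# booleans, so no "bool_ is not JSON serializable" error on this path.
data = pd.DataFrame([{
    "Flag__c": "Rupture Ponctuel",
    "Problematic__c": True,
    "Value__c": 5800.0,
}])

out = data.to_json(orient="records")
print(out)
```

This does not build the nested `Account__r` objects by itself, but it shows that letting the dataframe library serialize avoids the type issue.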

Thanks, but I'm not sure how to "write json directly from dataframe without dict and looping".

df1.Account__r or df1.Product_Master__r simply won't work, as there are no such objects as "Account__r" or "Product_Master__r" in the dataframe. That's why I used dict to create them.

Hubert-Dudek
Esteemed Contributor III

You can achieve it by transforming the dataframe using built-in Spark functions, etc.


Anonymous
Not applicable

Thanks

Braxx
Contributor II

Thanks Dan, that makes sense!
