
Object of type bool_ is not JSON serializable

Braxx
Contributor II

I am converting a data frame to a nested dict/JSON. One of the columns, "Problematic__c", is of boolean type.

For some reason json does not accept this data type and raises the error: "Object of type bool_ is not JSON serializable"

I need this to stay boolean because the JSON is later pushed to Salesforce via API. I could easily make it a string, but the destination object accepts booleans only.

Here is the Python code:

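import json

all_rows = len(data)
y = []

for i in range(all_rows):
    # Columns from index 2 onward stay at the top level of the record
    x = dict(data.iloc[i, 2:])

    # The first two columns become nested lookup objects
    x["Account__r"] = dict(data.iloc[i, :1])
    x["Product_Master__r"] = dict(data.iloc[i, 1:2])
    y.append(x)

# Serialize once, after the loop; this is the call that raises the bool_ error
y = json.dumps(y)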

and this is the expected output:

[
   {
      "Recommended_Action__c":"Take action Z",
      "Extra_Information_JSON__c":"[{\"name\":\"Action\",\"value\":\"Verifier remplissage\"},{\"name\":\"Stock Disponible\",\"value\":\"18\"}]",
      "Flag__c":"Rupture Ponctuel",
      "Problematic__c":true,
      "Value__c":5800.0,
      "Source_Id__c":"538.0",
      "Batch__c":"a2e7Y110000dO6WQAU",
      "Account__r":{
         "Code__c":"00001-B"
      },
      "Product_Master__r":{
         "EAN_Code__c":"1111111111.0"
      }
    },
    {  
        ....
    }.
    .....
 ]

The data frame called "data" has the structure below, with sample values:

"Code__c":"00001-B"
"EAN_Code__c":"1111111111.0"
"Recommended_Action__c":"Take action Z",
"Extra_Information_JSON__c":"[{\"name\":\"Action\",\"value\":\"Verifier remplissage\"},{\"name\":\"Stock Disponible\",\"value\":\"18\"}]",
"Flag__c":"Rupture Ponctuel",
"Problematic__c":True,
"Value__c":5800.0,
"Source_Id__c":"538.0",
"Batch__c":"a2e7Y110000dO6WQAU",
"Code__c":"00001-B"
"EAN_Code__c":"1111111111.0"
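
For context on the error itself: the name `bool_` suggests the values are NumPy scalars (pandas typically stores boolean columns that way), and the standard `json` module only knows built-in Python types. One possible workaround, sketched below, is to pass a `default=` fallback to `json.dumps` that unwraps such scalars with `.item()`. The helper name `to_builtin` and the `FakeNumpyBool` class are hypothetical; the class is only a stand-in so the sketch runs without NumPy installed.

```python
import json


def to_builtin(obj):
    """Fallback for json.dumps: unwrap NumPy-style scalars via .item()."""
    # NumPy scalars (numpy.bool_, numpy.int64, ...) expose .item(), which
    # returns the equivalent built-in Python value.
    if hasattr(obj, "item"):
        return obj.item()
    raise TypeError(
        f"Object of type {type(obj).__name__} is not JSON serializable"
    )


class FakeNumpyBool:
    """Hypothetical stand-in for numpy.bool_ (assumption for this sketch)."""

    def __init__(self, value):
        self._value = bool(value)

    def item(self):
        return self._value


# A record shaped like one row of the question's data frame
record = {"Problematic__c": FakeNumpyBool(True), "Value__c": 5800.0}

# json.dumps calls to_builtin only for values it cannot serialize itself
payload = json.dumps([record], default=to_builtin)
print(payload)
```

With real data, the same `default=to_builtin` argument could be added to the `json.dumps(y)` call from the question, so the value stays a JSON boolean rather than becoming a string.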

Accepted Solutions

Dan_Z
Databricks Employee

You can just use `to_json` to achieve this. Here is an example:

from pyspark.sql import Row
from pyspark.sql.functions import to_json

# One row: an integer key plus a struct column built from a Row
data = [(1, Row(Code__c="00001-B",
                EAN_Code__c="1111111111.0",
                Extra_Information_JSON__c="[{\"name\":\"Action\",\"value\":\"Verifier remplissage\"},{\"name\":\"Stock Disponible\",\"value\":\"18\"}]",
                Flag__c="Rupture Ponctuel",
                Problematic__c=True))]

df = spark.createDataFrame(data, ("key", "value"))
# to_json serializes the struct column to a JSON string
display(df.select(to_json(df.value).alias("json")))

This is just an example to point you in the right direction; you may need to adapt it to your specific input format. It is meant to run in a Databricks notebook; elsewhere the final `display` call will not work.


Hubert-Dudek
Esteemed Contributor III

Hi, I had a similar problem with booleans, but when exporting to a different data format.

  • Please try writing the JSON directly from the DataFrame, without dicts and looping (all the necessary transformations can be done in the DataFrame):
df2 = df1.select(df1.Account__r, df1.Product_Master__r)
df2.coalesce(1).write.format('json').save('/path/file_name.json')

Braxx
Contributor II

Thanks, but I am not sure how to "write json directly from dataframe without dict and looping".

df1.Account__r or df1.Product_Master__r simply won't work, as there are no such objects as "Account__r" or "Product_Master__r" in the dataframe. That's why I used dict to create them.
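
To illustrate the nesting step being described, here is a minimal plain-Python sketch that moves the two lookup columns under their parent keys by column name rather than by position. It assumes each row is already available as a flat dict; the `PARENTS` mapping and the `nest_record` helper are hypothetical names, with the mapping read off the expected output in the question.

```python
import json

# Assumed column-to-parent mapping, taken from the question's expected output
PARENTS = {"Code__c": "Account__r", "EAN_Code__c": "Product_Master__r"}


def nest_record(flat):
    """Move lookup columns under their parent key; keep the rest top-level."""
    nested = {}
    for column, value in flat.items():
        parent = PARENTS.get(column)
        if parent is None:
            nested[column] = value
        else:
            # Create the parent object on first use, then add the column
            nested.setdefault(parent, {})[column] = value
    return nested


# A flat row using sample values from the question
flat_row = {
    "Code__c": "00001-B",
    "EAN_Code__c": "1111111111.0",
    "Flag__c": "Rupture Ponctuel",
    "Problematic__c": True,
}

nested_row = nest_record(flat_row)
print(json.dumps([nested_row], indent=2))
```

Keying on column names instead of `iloc` positions means the result does not depend on column order in the data frame.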

Hubert-Dudek
Esteemed Contributor III

You can achieve it by transforming the DataFrame with Spark's built-in functions.

Anonymous
Not applicable

Thanks

Braxx
Contributor II

Thanks Dan, that makes sense!
