Hello everyone.
I'm trying to read a JSON file that contains backslashes, and PySpark fails to read it.
I've tried a lot of options but haven't solved it yet. My idea was to read the whole JSON file as text and replace every "\" with "/", but PySpark fails to read it as text too.
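For context, here is a minimal pure-Python sketch of that replace-then-parse idea (the sample string is illustrative, not my actual file), assuming the raw file contains single backslashes, which are not valid JSON escape sequences:

```python
import json

# Raw text as it might appear in the file: "\d" is not a valid JSON
# escape, so strict parsing raises a JSONDecodeError.
raw = '{"fname": "max", "path": "c\\dir1\\dir2"}'

try:
    json.loads(raw)
    parsed = True
except json.JSONDecodeError:
    parsed = False
# parsed is False: the raw text is not valid JSON

# Workaround: replace backslashes with forward slashes, then parse.
fixed = raw.replace("\\", "/")
record = json.loads(fixed)
# record["path"] is now "c/dir1/dir2"
```

In PySpark the same replacement could presumably be applied per line with an RDD map before handing the cleaned strings to spark.read.json, if the file can be read as text at all.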
Example JSON:
{
  "fname": "max",
  "lname": "tom",
  "path": "c\\dir1\\dir2"
}
Code that I tried:
df = spark.read.option('mode','PERMISSIVE').option('columnNameOfCorruptRecord', '_corrupt_record').json('path_to_json', multiLine=True)
df = spark.read.text('path_to_json')
With the first code example, if I don't specify a schema I get an "unable to infer schema" error, and if I do specify one I get "Query returned no result".
With the second code example I also get "Query returned no result".
The file at that path does contain the JSON data, but because of the "path" field PySpark fails to read it as valid JSON.
(If there is a way to drop the "path" field while reading the JSON, I don't mind doing that, but I didn't find any information on how to achieve it.)
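If it helps, here is a rough sketch of dropping the "path" field with a regex before parsing, again in plain Python on a hypothetical one-line record (the pattern assumes the field's value contains no embedded quotes):

```python
import json
import re

# Hypothetical raw record; the single backslashes in "path" break
# strict JSON parsing.
raw = '{"fname": "max", "lname": "tom", "path": "c\\dir1\\dir2"}'

# Remove the "path" key/value pair (and the comma before it), then parse.
cleaned = re.sub(r',\s*"path"\s*:\s*"[^"]*"', "", raw)
record = json.loads(cleaned)
# record is {"fname": "max", "lname": "tom"} -- no "path" key
```

The same substitution could in principle run inside an RDD map over the raw text lines before calling spark.read.json on the result.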
Hope someone can help me out.
Thanks!