-werners-
Esteemed Contributor III

Did you enable the multiLine option while reading the JSON? That could be the cause.

See also here.

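If you can, try the single-line format (one JSON record per line), because then Spark can split the file and you really leverage its parallel processing power. With multiLine enabled, each file has to be parsed as a whole.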
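If your source file is a single pretty-printed JSON array, something like the following could convert it to single-line format first. This is only a rough sketch using the Python standard library: the function name `to_json_lines` and the file names are made up for illustration, and it loads the whole file into memory, so it only suits files that fit on one machine.

```python
import json
import os
import tempfile


def to_json_lines(src_path, dst_path):
    """Rewrite a file holding one JSON array (or one object) as one record per line."""
    with open(src_path, encoding="utf-8") as src:
        data = json.load(src)  # reads the whole document into memory

    # A top-level object is treated as a single record.
    records = data if isinstance(data, list) else [data]

    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in records:
            # json.dumps without indent keeps each record on a single line
            dst.write(json.dumps(record) + "\n")
    return len(records)


# Small demo with hypothetical sample data
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "multiline.json")
dst_path = os.path.join(tmp_dir, "single_line.json")

with open(src_path, "w", encoding="utf-8") as f:
    json.dump([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}], f, indent=2)

count = to_json_lines(src_path, dst_path)
```

The converted file can then be read with a plain `spark.read.json(path)`, without `.option("multiLine", True)`.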