Data Engineering

Forum Posts

kk007
by New Contributor III
  • 1659 Views
  • 4 replies
  • 4 kudos

Photon engine throws error "JSON document exceeded maximum allowed size 400.0 MiB"

I am reading an 83 MB JSON file using "spark.read.json(storage_path)". When I display the data it seems to display fine, but when I try the command-line count, it complains about the file size being more than 400 MB, which is not true. Photon JSON reader erro...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

@Kamal Kumar: The error message suggests that the JSON document size exceeds the maximum allowed size of 400 MiB. This could be caused by one or more documents in your JSON file being larger than this limit. It is not a bug, but a limitation set ...

3 More Replies
sanjay
by Valued Contributor II
  • 8237 Views
  • 20 replies
  • 17 kudos

Resolved! How to limit number of files in each batch in streaming batch processing

Hi, I am running a batch job that processes incoming files. I am trying to limit the number of files in each batch, so I added the maxFilesPerTrigger option. But it's not working: it processes all incoming files at once. (spark.readStream.format("delta").lo...

Latest Reply
Anonymous
Not applicable
  • 17 kudos

Hi @Sanjay Jain, hope everything is going great. Just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so w...

19 More Replies
User16826992666
by Valued Contributor
  • 746 Views
  • 1 replies
  • 0 kudos
Latest Reply
sean_owen
Honored Contributor II
  • 0 kudos

There shouldn't be. Generally speaking, models will be serialized according to their 'native' format for well-known libraries like TensorFlow, xgboost, sklearn, etc. Custom models will be saved with pickle. The files exist on distributed storage as ar...
