admo
New Contributor III

The philosophy for the job would be something like this in Scala :

feature_dataset.foreachPartition { block =>
   block.grouped(10000).foreach { chunk =>
   run_inference_and_write_to_db(chunk)
}

Would you know how to do this with pyspark and rdds ?