ML inference server (with REST API): how can I specify a GPU cluster?

alonisser
Contributor II

Is there an API for that? I couldn't find a way to do this through the UI.

I'm on Classic serving for now (I didn't get access to the new "serverless" offering).

alonisser
Contributor II

Any clues?