Hello Databricks community!
We have a strong need to serve some public models and our own private models on GPU clusters, and we have several requirements:
1) We'd like to be able to start and stop the endpoints (ideally on a schedule) to avoid excess consumption
2) We'd like the endpoint to have a static address
3) (optional) We'd like to be able to run several models on one cluster (to use the GPU more efficiently)
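For context, this is roughly what we have in mind for requirement 1, assuming the Model Serving REST API (`PUT /api/2.0/serving-endpoints/{name}/config`) with scale-to-zero; the model and endpoint names below are just illustrative placeholders:

```python
import json

# Hypothetical sketch: build the config payload we'd send to the Databricks
# serving-endpoints REST API to enable scale-to-zero on a GPU endpoint,
# so it stops consuming GPU capacity when idle. All names are placeholders.
def build_endpoint_config(model_name: str, model_version: str) -> dict:
    return {
        "served_entities": [
            {
                "entity_name": model_name,        # e.g. a Unity Catalog model path
                "entity_version": model_version,
                "workload_type": "GPU_SMALL",     # GPU workload type (assumed)
                "workload_size": "Small",
                "scale_to_zero_enabled": True,    # release GPU when idle
            }
        ]
    }

config = build_endpoint_config("my_catalog.models.my_llm", "1")
print(json.dumps(config, indent=2))
```

We could then drive this from a scheduled job to stop and start the endpoint outside working hours, if that is a supported pattern.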
As far as we know, you offer GPU clusters and Databricks Container Services. The question: is it possible to run a Docker container (or a group of containers) on such a cluster and expose it as an endpoint?
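To illustrate what we mean, here is a sketch of the cluster spec we imagine sending to the Clusters API (`POST /api/2.1/clusters/create`) with a custom image via Container Services; the image URL, node type, and runtime version are assumptions, not something we have verified:

```python
import json

# Hedged sketch: a Clusters API payload that uses Databricks Container
# Services ("docker_image") to run a custom Docker image on a GPU node type.
# All concrete values (image URL, node type, runtime) are placeholders.
def build_gpu_cluster_spec(image_url: str) -> dict:
    return {
        "cluster_name": "gpu-serving-poc",            # placeholder name
        "spark_version": "13.3.x-gpu-ml-scala2.12",   # example GPU ML runtime
        "node_type_id": "Standard_NC6s_v3",           # example Azure GPU node
        "num_workers": 0,                             # single node for a PoC
        "docker_image": {
            "url": image_url,
            # "basic_auth": {...},  # needed if the registry is private
        },
    }

spec = build_gpu_cluster_spec("myregistry.azurecr.io/serving:latest")
print(json.dumps(spec, indent=2))
```

What we don't know is whether a container started this way can listen on a port and be reached at a stable address from outside the cluster, which is the part we'd like advice on.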
We know that most of the GPU services are either in preview or in beta; still, we would appreciate any advice. Right now we use Databricks on Azure for purposes other than ML, but we would love to start hosting our ML models on your platform as well.
Could you suggest possible approaches based on your experience?
Thank you 🙂