Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Model serving with GPU cluster

alisher_pwc
New Contributor II

Hello Databricks community!

We have a strong need to serve some public models and our own private models on GPU clusters, and we have several requirements:

1) We'd like to be able to start and stop the endpoints (ideally on a schedule) to avoid unnecessary consumption

2) We'd like to have a static address of the endpoint

3) (optional) We'd like to be able to run several models on one cluster (to use GPU more efficiently)

As far as we know, you offer GPU clusters and Container Services. The question: is it possible to run a Docker container (or a group of containers) and expose it?

We know that most of the GPU services are either in preview or in beta. However, we would like to hear any advice from you. Right now we are using Databricks on Azure for purposes other than ML, but we would love to start using your platform to host our ML models.

Please suggest possible approaches based on your experience.

Thank you 🙂

2 REPLIES

Debayan
Databricks Employee

Hi,

You can use Databricks Container Services on clusters with GPUs to create portable deep learning environments with customized libraries. See Customize containers with Databricks Container Services for instructions.

To create custom images for GPU clusters, you must select a standard runtime version instead of Databricks Runtime ML for GPU. When you select Use your own Docker container, you can choose GPU clusters with a standard runtime version. The custom images for GPU clusters are based on the official CUDA containers, which is different from Databricks Runtime ML for GPU.

When you create custom images for GPU clusters, you cannot change the NVIDIA driver version, because it must match the driver version on the host machine.

Docker Hub contains example base images with GPU capability. The Dockerfiles used to generate these images are located in the example containers GitHub repository, which also has details on what the example images provide and how to customize them.
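As a rough illustration of the points above, here is a minimal Python sketch that assembles a cluster specification combining a GPU node type, a standard (non-ML) runtime, and a custom Docker image. It only builds the JSON payload and does not call any API. The field names (`spark_version`, `node_type_id`, `docker_image`, `autotermination_minutes`) follow the Clusters API as I understand it, but please verify them against the current API reference. The helper name `build_gpu_cluster_spec`, the image URL, the runtime key, and the node type are placeholders, and the `-ml-` check assumes that ML runtime keys contain that substring.

```python
import json


def build_gpu_cluster_spec(cluster_name, image_url, spark_version, node_type_id,
                           num_workers=1, autotermination_minutes=30,
                           registry_user=None, registry_password=None):
    """Build a cluster spec dict for a GPU cluster that uses a custom image.

    Hypothetical helper: it only assembles the payload, nothing is sent.
    """
    # Custom images need a standard runtime, not Databricks Runtime ML.
    # Assumption: ML runtime keys contain "-ml-" (e.g. "...-gpu-ml-scala2.12").
    if "-ml-" in spark_version:
        raise ValueError(
            "Custom containers need a standard runtime, got: " + spark_version
        )

    docker_image = {"url": image_url}
    # Registry credentials are only needed for private registries.
    if registry_user is not None and registry_password is not None:
        docker_image["basic_auth"] = {
            "username": registry_user,
            "password": registry_password,
        }

    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        # Terminate the cluster after this many idle minutes to limit cost.
        "autotermination_minutes": autotermination_minutes,
        "docker_image": docker_image,
    }


if __name__ == "__main__":
    # All values below are placeholders; substitute your own.
    spec = build_gpu_cluster_spec(
        cluster_name="gpu-serving",
        image_url="example.azurecr.io/my-gpu-image:latest",
        spark_version="13.3.x-scala2.12",
        node_type_id="Standard_NC6s_v3",
    )
    print(json.dumps(spec, indent=2))
```

The resulting dictionary could be used as the request body when creating a cluster, or pasted into the JSON view of the cluster configuration. The auto-termination setting is one simple way to limit idle GPU cost, although it is not the same as scheduled start/stop.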

Please refer to: https://docs.databricks.com/clusters/gpu.html#databricks-container-services-on-gpu-clusters

Please let us know if this helps. 

Also, please tag @Debayan in your next response so that I am notified. Thank you!

Vartika
Databricks Employee

Hi @Alisher Akh​ 

Does @Debayan Mukherjee​'s answer help? If yes, would you be happy to mark the answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you further. 

Cheers!
