Machine Learning

Model serving with GPU cluster

alisher_pwc
New Contributor II

Hello Databricks community!

We have a strong need to serve some public models and some of our own private models on GPU clusters, and we have several requirements:

1) We'd like to be able to start/stop the endpoints (ideally on a schedule) to avoid unnecessary consumption

2) We'd like to have a static address of the endpoint

3) (optional) We'd like to be able to run several models on one cluster (to use GPU more efficiently)

As far as we know, you have GPU clusters and Container Services. The question: is it possible to run a Docker container (or a group of containers) and expose it?

We know that most of the GPU services are either in preview or in beta; however, we would like to hear any advice from you. Right now we are using Databricks on Azure for purposes other than ML, but we would love to start using your platform to host our ML models.

Please suggest possible approaches based on your experience.

Thank you 🙂

2 REPLIES

Debayan
Esteemed Contributor III

Hi,

You can use Databricks Container Services on clusters with GPUs to create portable deep learning environments with customized libraries. See Customize containers with Databricks Container Services for instructions.

To create custom images for GPU clusters, you must select a standard runtime version instead of Databricks Runtime ML for GPU. When you select Use your own Docker container, you can choose GPU clusters with a standard runtime version. The custom images for GPU clusters are based on the official CUDA containers, which is different from Databricks Runtime ML for GPU.

When you create custom images for GPU clusters, you cannot change the NVIDIA driver version, because it must match the driver version on the host machine.

Docker Hub contains example base images with GPU capability. The Dockerfiles used to generate these images are located in the example containers GitHub repository, which also has details on what the example images provide and how to customize them.
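For illustration, here is a minimal sketch of how such a cluster could be described as a JSON payload for the Clusters API. The field names (`spark_version`, `node_type_id`, `num_workers`, `docker_image.url`) follow the Clusters API as I understand it. The helper name, the image URL, the runtime version string, the Azure node type, and the check for "-ml" in the version string are all assumptions made for this example, so please verify them against your own workspace before relying on them.

```python
import json


def build_gpu_container_cluster_spec(
    cluster_name: str,
    image_url: str,
    spark_version: str,
    node_type_id: str,
    num_workers: int = 1,
) -> dict:
    """Build a cluster spec that pairs a GPU node type with a custom Docker image.

    Hypothetical helper: it only assembles a dictionary and makes no API call.
    """
    # Assumption: Databricks Runtime ML version strings contain "-ml".
    # Custom images on GPU clusters need a standard runtime, so reject ML ones.
    if "-ml" in spark_version:
        raise ValueError(
            "Custom GPU images require a standard runtime version, "
            f"not an ML runtime: {spark_version}"
        )
    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        # The custom image; the NVIDIA driver still comes from the host machine.
        "docker_image": {"url": image_url},
    }


# Example values below are placeholders, not recommendations.
example_spec = build_gpu_container_cluster_spec(
    cluster_name="gpu-serving",
    image_url="myregistry.example.com/gpu-serving:latest",
    spark_version="13.3.x-scala2.12",
    node_type_id="Standard_NC6s_v3",
)
print(json.dumps(example_spec, indent=2))
```

The printed JSON is the kind of body you would send when creating the cluster. A private registry would also need credentials, which this sketch leaves out.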

Please refer to: https://docs.databricks.com/clusters/gpu.html#databricks-container-services-on-gpu-clusters

Please let us know if this helps.

Also, please tag @Debayan in your next response so that I am notified. Thank you!

Vartika
Moderator

Hi @Alisher Akh​ 

Does @Debayan Mukherjee​'s answer help? If yes, would you be happy to mark the answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you further. 

Cheers!
