<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Building DBRX class Custom LLMs with Mosaic AI Training in Databricks TV</title>
    <link>https://community.databricks.com/t5/databricks-tv/building-dbrx-class-custom-llms-with-mosaic-ai-training/ba-p/72530</link>
    <description>&lt;P&gt;&lt;IFRAME src="https://www.youtube.com/embed/-5FOZHNwhaE?si=cMbSiTfjOqExEi4T" width="560" height="315" frameborder="0" allowfullscreen="" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin"&gt;&lt;/IFRAME&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 14 Jun 2024 17:25:16 GMT</pubDate>
    <dc:creator>lara_rachidi</dc:creator>
    <dc:date>2024-06-14T17:25:16Z</dc:date>
    <item>
      <title>Building DBRX class Custom LLMs with Mosaic AI Training</title>
      <link>https://community.databricks.com/t5/databricks-tv/building-dbrx-class-custom-llms-with-mosaic-ai-training/ba-p/72530</link>
      <description>&lt;P&gt;&lt;SPAN&gt;DBRX was trained, fine-tuned, and evaluated using Mosaic AI Training, scaling training to 3072 NVIDIA H100s and processing more than 12 trillion tokens in the process. Mosaic AI Training is available today for Databricks customers to build custom models on their own enterprise data that are tailored to a specific business context, language and domain, and can efficiently power key business use cases. We discuss a blog post that details Mosaic AI Training's core capabilities and how they were critical to the successful training of DBRX. Training LLMs and other large AI models requires the integration of numerous components. To simplify this complexity and to deliver an experience that “just works", Mosaic AI Training offers an optimized training stack that handles all aspects of large-scale distributed training. The stack supports multiple GPU cloud providers (AWS, Azure, OCI, Coreweave, to name a few), is configured with the latest GPU drivers including NVIDIA CUDA and AMD ROCm, and includes core neural network and training libraries (PyTorch, MegaBlocks, Composer, Streaming). Lastly, battle-tested scripts for training, fine-tuning, and evaluating LLMs are available in LLMFoundry, enabling customers to start training their own LLMs immediately. Link to blog: &lt;A href="https://www.databricks.com/blog/mosaic-ai-training-capabilities?utm_source=bambu&amp;amp;utm_medium=social&amp;amp;utm_campaign=advocacy" target="_blank"&gt;https://www.databricks.com/blog/mosaic-ai-training-capabilities?utm_source=bambu&amp;amp;utm_medium=social&amp;amp;utm_campaign=advocacy&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jun 2024 17:25:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/databricks-tv/building-dbrx-class-custom-llms-with-mosaic-ai-training/ba-p/72530</guid>
      <dc:creator>lara_rachidi</dc:creator>
      <dc:date>2024-06-14T17:25:16Z</dc:date>
    </item>
  </channel>
</rss>

