Question about response time of Llama 3.3 70B
Hey everyone! So I'm new to Databricks and I'm learning about the possibilities offered by Mosaic AI Foundation Model Serving. I'm mostly following Azure's documentation to learn about it. In my testing, I've created 4 Unity Catalog functions vi...
- 628 Views
- 1 reply
- 1 kudos
Latest Reply
Llama 3.3 normally offers faster inference than earlier versions. It provides approximately 40% faster responses and reduced batch processing time. However, the usual performance of Mosaic AI Model Serving is also influenced by configu...
- 1 kudos