November 18, 2024: Cerebras Boosts LLM Inference to Record Speeds - Cerebras Systems has upgraded its AI inference service, reaching a record 969 tokens per second on Meta's Llama 3.1 405B model and surpassing GPU-based benchmarks. Its unique silicon architecture enables faster, more cost-effective AI processing and significantly outpaces competitors such as OpenAI and Anthropic. The new service, set to launch in early 2025, offers competitive pricing and is attracting clients such as GlaxoSmithKline. Additionally, Cerebras hardware has set world records in molecular dynamics simulations, running 700x faster than the Frontier supercomputer.