DeepSeek-V3, ultra-large open-source AI, outperforms Llama and Qwen on launch

December 26, 2024: DeepSeek-V3: Outperforms Rivals with Efficient AI - Chinese startup DeepSeek has launched DeepSeek-V3, a 671B-parameter model that surpasses Meta's Llama-3.1 and OpenAI's GPT-4o on several reported benchmarks. Built on a mixture-of-experts architecture, it activates only 37B parameters per token, which keeps inference efficient relative to the model's total size. Innovations such as auxiliary-loss-free load balancing and multi-token prediction improve training efficiency and model quality, and a sharply reduced training cost adds to its appeal.
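To make the mixture-of-experts idea concrete, here is a minimal sketch of top-k expert routing with a per-expert bias that nudges selection toward underused experts. The 671B and 37B figures come from the article; everything else (the function names, the toy sizes of 8 experts and 2 active experts, the random scores, and the bias update rule) is an assumption for illustration and is not DeepSeek's actual implementation.

```python
import random

TOTAL_PARAMS_B = 671   # total parameters in billions, from the article
ACTIVE_PARAMS_B = 37   # parameters activated per token, from the article


def active_fraction(active_b, total_b):
    """Share of the model's parameters used for a single token."""
    return active_b / total_b


def route_top_k(scores, bias, k):
    """Return the indices of the k experts with the highest score + bias.

    The bias only influences which experts are picked. It is a
    hypothetical stand-in for a load-balancing term.
    """
    ranked = sorted(range(len(scores)),
                    key=lambda i: scores[i] + bias[i],
                    reverse=True)
    return sorted(ranked[:k])


def update_bias(bias, load, step=0.01):
    """Lower the bias of overloaded experts and raise it for underloaded ones."""
    mean_load = sum(load) / len(load)
    return [b - step if n > mean_load else b + step if n < mean_load else b
            for b, n in zip(bias, load)]


if __name__ == "__main__":
    rng = random.Random(0)
    num_experts, k = 8, 2  # toy sizes, not DeepSeek-V3's configuration
    bias = [0.0] * num_experts
    load = [0] * num_experts

    # Route 1000 toy tokens using random scores in place of a learned router.
    for _ in range(1000):
        scores = [rng.random() for _ in range(num_experts)]
        for expert in route_top_k(scores, bias, k):
            load[expert] += 1

    bias = update_bias(bias, load)
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"active share per token: {share:.1%}")
    print("tokens routed per expert:", load)
```

The point of the sketch is that each token touches only a few experts, so the compute per token tracks the active parameters (about 5.5% of the total here) rather than the full model size. Balancing through a selection bias, as opposed to an extra loss term, is one plausible reading of "auxiliary-loss-free".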

DeepSeek-V3 scores highly on benchmarks, particularly on Chinese-language and math tasks. As an open-source model, it narrows the gap with closed models, increases competition in the industry, and gives enterprises a customizable alternative. The release strengthens DeepSeek's position as a notable player in the AI landscape.
