The affordability of DeepSeek is a myth: The revolutionary AI actually cost $1.6 billion to develop
DeepSeek's surprisingly inexpensive AI model challenges industry giants. The Chinese startup claims to have trained its powerful DeepSeek V3 neural network for a mere $6 million using only 2048 GPUs, significantly undercutting competitors. However, this figure is misleading: it covers only pre-training and leaves out the infrastructure and staff behind the model.
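DeepSeek V3 relies on several innovative technologies: Multi-token Prediction (MTP), which trains the model to predict several upcoming tokens at once, improving accuracy and efficiency; Mixture of Experts (MoE), which splits the model into 256 expert sub-networks and activates only a few of them for each token, accelerating training; and Multi-head Latent Attention (MLA), an attention variant that compresses the stored keys and values to save memory while the model focuses on the most relevant parts of the input.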
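To make the Mixture of Experts idea concrete, here is a minimal Python sketch of generic top-k expert routing. It is not DeepSeek's actual implementation: the function name `top_k_route`, the random router scores, the softmax weighting, and the choice of 8 active experts are all assumptions made for illustration. Only the figure of 256 experts comes from the article.

```python
import math
import random


def top_k_route(scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Pick the indices of the k largest router scores.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores only, so the weights sum to 1.
    peak = max(scores[i] for i in top)
    exps = [math.exp(scores[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


random.seed(0)
NUM_EXPERTS = 256  # figure quoted in the article
TOP_K = 8          # assumption for illustration only

# Hypothetical router scores for a single token.
scores = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routing = top_k_route(scores, TOP_K)

# Share of experts that actually run for this token.
active_fraction = TOP_K / NUM_EXPERTS
print(f"{len(routing)} of {NUM_EXPERTS} experts active ({active_fraction:.1%})")
```

Under these assumptions, only a small share of the expert sub-networks does any work for a given token, which is why a model of this kind can be large overall yet comparatively cheap to train per token.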
An analysis by SemiAnalysis points to a far more substantial investment. DeepSeek operates a massive infrastructure of approximately 50,000 Nvidia GPUs, valued at roughly $1.6 billion, with annual operational costs nearing $944 million. This includes substantial salaries, with some researchers earning over $1.3 million annually. The company's self-funded nature, however, allows for agile innovation.
While DeepSeek's $6 million pre-training cost is a fraction of competitors' expenses (e.g., the roughly $100 million reported for GPT-4), the company's overall investment exceeds $500 million. Its success stems from substantial funding, technological advancements, and a highly skilled team rather than from a revolutionary cost-cutting approach. Even so, its operational costs still significantly undercut those of its rivals.
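The gap between the headline figure and the wider spending is easier to see as simple arithmetic. The short Python sketch below uses only the round numbers quoted in this article. The variable names are illustrative, and the ratios are rough comparisons rather than audited accounting, since the figures measure different things (a single training run, hardware value, and yearly running costs).

```python
# All figures are the article's round numbers, in US dollars.
PRETRAINING_COST = 6e6          # DeepSeek V3 pre-training claim
GPT4_TRAINING_COST = 100e6      # figure quoted for GPT-4
GPU_INFRASTRUCTURE = 1.6e9      # estimated value of about 50,000 GPUs
ANNUAL_OPERATING_COST = 944e6   # estimated yearly operating costs


def share(part, whole):
    """Return part as a fraction of whole."""
    return part / whole


vs_gpt4 = share(PRETRAINING_COST, GPT4_TRAINING_COST)
vs_infrastructure = share(PRETRAINING_COST, GPU_INFRASTRUCTURE)
vs_operations = share(PRETRAINING_COST, ANNUAL_OPERATING_COST)

print(f"Share of the GPT-4 training figure: {vs_gpt4:.1%}")
print(f"Share of GPU infrastructure value:  {vs_infrastructure:.3%}")
print(f"Share of annual operating costs:    {vs_operations:.2%}")
```

On these numbers the pre-training run comes to about 6% of the GPT-4 figure, but well under 1% of either the hardware value or a single year of operating costs.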
DeepSeek's example highlights the potential of a well-funded, independent AI company to compete effectively. However, the narrative of exceptionally low development costs requires careful scrutiny.