DeepSeek V3: A Game-Changing Open AI Model Developed by Chinese Firm

December 27, 2024

The Chinese AI firm DeepSeek has unveiled DeepSeek V3, one of the most powerful open AI models to date. Released under a permissive license, the model can be downloaded and modified by developers for most applications, including commercial use. DeepSeek V3 is designed to handle a wide range of text-based tasks, such as coding, translating, and generating content like essays and emails from descriptive prompts.
In DeepSeek’s internal benchmark testing, DeepSeek V3 outperformed both open, downloadable models and closed AI models that are accessible only via API. In coding competitions hosted on Codeforces, DeepSeek V3 surpassed models from Meta, OpenAI, and Alibaba, including Meta’s Llama 3.1 405B, OpenAI’s GPT-4o, and Alibaba’s Qwen 2.5 72B. The model also came out ahead on Aider Polyglot, a test measuring the ability to integrate new code into existing code, which suggests the gains carry over to practical coding work.
DeepSeek V3 was trained on a dataset of 14.8 trillion tokens. Tokens are the units of raw data a model processes; 1 million tokens equate to about 750,000 words. The model is also very large, with 671 billion parameters (or 685 billion as listed on the AI platform Hugging Face). That is a little over 1.6 times the size of Llama 3.1 405B, which has 405 billion parameters. Larger models like DeepSeek V3 often outperform smaller ones, though they also require significant hardware resources to run efficiently. An unoptimized version of DeepSeek V3 would need a high-end GPU setup to process queries at a reasonable speed.
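The figures above can be checked with some quick arithmetic. The Python sketch below is illustrative only: the function names are made up for this example, and the bytes-per-parameter values (2 for 16-bit weights, 1 for 8-bit weights) are assumptions used to show why the hardware requirement is large, not figures published by DeepSeek. The memory estimate covers the weights alone and ignores everything else a running model needs.

```python
# Back-of-the-envelope arithmetic for the numbers quoted in the article.
# All helper names are hypothetical; they exist only for this illustration.

WORDS_PER_TOKEN = 0.75  # article's rule of thumb: 1 million tokens ~ 750,000 words


def tokens_to_words(tokens: float) -> float:
    """Convert a token count to an approximate word count."""
    return tokens * WORDS_PER_TOKEN


def size_ratio(params_a: float, params_b: float) -> float:
    """How many times larger model A is than model B, by parameter count."""
    return params_a / params_b


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in gigabytes (10^9 bytes).

    bytes_per_param is an assumption: 2 for 16-bit weights, 1 for 8-bit.
    """
    return params * bytes_per_param / 1e9


DEEPSEEK_V3_PARAMS = 671e9   # from the article
LLAMA_405B_PARAMS = 405e9    # from the article
TRAINING_TOKENS = 14.8e12    # from the article

training_words = tokens_to_words(TRAINING_TOKENS)
ratio = size_ratio(DEEPSEEK_V3_PARAMS, LLAMA_405B_PARAMS)
mem_16bit = weight_memory_gb(DEEPSEEK_V3_PARAMS, 2)
mem_8bit = weight_memory_gb(DEEPSEEK_V3_PARAMS, 1)

print(f"Training data: about {training_words / 1e12:.1f} trillion words")
print(f"Size vs. Llama 3.1 405B: {ratio:.2f}x")
print(f"Weights at 16-bit (assumed): {mem_16bit:.0f} GB")
print(f"Weights at 8-bit (assumed): {mem_8bit:.0f} GB")
```

Under these assumptions the training set works out to about 11.1 trillion words, and the weights alone would occupy hundreds of gigabytes at either precision, far more than a single GPU holds.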
The size and performance of DeepSeek V3 speak to the firm’s capabilities. DeepSeek trained the model in a data center of Nvidia H800 GPUs in about two months, a feat made more notable by recent U.S. restrictions on Chinese companies acquiring high-end GPUs. The company claims it spent only $5.5 million on training DeepSeek V3, a fraction of the development cost of models like OpenAI’s GPT-4.
However, the model does have limitations. As with many AI systems developed in China, DeepSeek V3’s responses are influenced by local regulations, which require AI models to align with the country’s “core socialist values.” For instance, the model does not respond to queries about sensitive topics like the Tiananmen Square protests, a reflection of the censorship that applies to Chinese AI models.
Despite these political constraints, DeepSeek V3 is a significant achievement in the AI landscape. It showcases the growing capabilities of Chinese AI firms and highlights the increasing global competition in AI development.
