DeepSeek V3: A Game-Changing Open AI Model Developed by Chinese Firm

Chinese AI firm DeepSeek has unveiled DeepSeek V3, one of the most powerful open AI models to date. Released under a permissive license, the model can be downloaded and modified by developers for most applications, including commercial use. DeepSeek V3 is designed to handle a wide range of text-based tasks, such as coding, translating, and generating essays and emails from descriptive prompts.
According to DeepSeek’s internal benchmark testing, DeepSeek V3 outperforms both open, downloadable models and closed AI models that are accessible only via API. In coding competitions hosted on Codeforces, it surpassed Meta’s Llama 3.1 405B, OpenAI’s GPT-4o, and Alibaba’s Qwen 2.5 72B. The model also excelled on Aider Polyglot, a test that measures a model’s ability to integrate new code into existing code.
DeepSeek V3 was trained on a massive dataset of 14.8 trillion tokens. In data science, tokens are representations of raw data; 1 million tokens equate to about 750,000 words. The model is also enormous, with 671 billion parameters (or 685 billion as listed on the AI platform Hugging Face). That is roughly 1.7 times the size of Llama 3.1 405B, which has 405 billion parameters. Larger models such as DeepSeek V3 often outperform smaller ones, but they also require significant hardware resources to run efficiently. An unoptimized version of DeepSeek V3 would need a high-end GPU setup to process queries at a reasonable speed.
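To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above, plus one labeled assumption: the memory estimate supposes a hypothetical 16-bit (2 bytes per parameter) copy of the weights, which is an illustration and not a figure reported by DeepSeek. The function names are made up for this example.

```python
# Back-of-the-envelope arithmetic using the figures quoted in the article.

TRAINING_TOKENS = 14.8e12             # 14.8 trillion tokens
WORDS_PER_TOKEN = 750_000 / 1_000_000  # rule of thumb: 1M tokens ~ 750K words
DEEPSEEK_V3_PARAMS = 671e9            # 671 billion parameters
LLAMA_31_405B_PARAMS = 405e9          # 405 billion parameters


def approx_words(tokens, words_per_token=WORDS_PER_TOKEN):
    """Convert a token count to an approximate word count."""
    return tokens * words_per_token


def size_ratio(params_a, params_b):
    """How many times larger model A is than model B, by parameter count."""
    return params_a / params_b


def weight_memory_gb(params, bytes_per_param=2):
    """Memory needed for the weights alone, in gigabytes (10^9 bytes).

    ASSUMPTION: bytes_per_param=2 models a hypothetical 16-bit copy of the
    weights. It ignores activations, caches, and any lower-precision formats.
    """
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    print(f"Approx. training words: {approx_words(TRAINING_TOKENS):.3e}")
    print(f"Size vs. Llama 3.1 405B: "
          f"{size_ratio(DEEPSEEK_V3_PARAMS, LLAMA_31_405B_PARAMS):.2f}x")
    print(f"Weights at 16-bit (assumed): "
          f"{weight_memory_gb(DEEPSEEK_V3_PARAMS):,.0f} GB")
```

Under that 16-bit assumption the weights alone come to well over a terabyte, far more than any single GPU holds, which is consistent with the article's point that the model needs a high-end multi-GPU setup.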
The size and performance of DeepSeek V3 are a testament to the firm’s capabilities. DeepSeek trained the model in a data center of Nvidia H800 GPUs in about two months, a feat made more impressive by recent U.S. restrictions on Chinese companies acquiring high-end GPUs. The company claims it spent only $5.5 million training DeepSeek V3, a fraction of the development cost of models like OpenAI’s GPT-4.
However, the model does have limitations. As with many AI systems developed in China, DeepSeek V3’s responses are influenced by local regulations, which require AI models to align with the country’s “core socialist values.” For instance, the model does not respond to queries about sensitive topics like the Tiananmen Square protests, a reflection of the censorship that applies to Chinese AI models.
Despite these political constraints, DeepSeek V3 is a significant achievement in the AI landscape. It showcases the growing capabilities of Chinese AI firms and highlights the increasing global competition in AI development.
