10+ DeepSeek Facts and Statistics You Need to Know


DeepSeek has quickly become one of China’s most talked-about AI labs. Spun out of High-Flyer Quant, a hedge fund known for its data-driven strategies, it entered the AI race with serious ambition.

There’s been plenty of debate about its access to hardware, the efficiency of its models, and the talent driving its research. While DeepSeek has positioned itself as a contender in AI development, people are still wondering about its long-term strategy and whether it can really compete on a global scale.

Here’s a look at the facts and stats we know about DeepSeek so far: its origins, achievements, and the challenges ahead.

What is DeepSeek?

DeepSeek is both an AI research lab and the free AI-powered chatbot it built, which closely resembles ChatGPT in appearance, functionality, and performance. The lab emerged from Fire-Flyer, the deep-learning division of High-Flyer, a Chinese quantitative hedge fund.

High-Flyer, founded in 2015, became well known for its use of advanced computing to analyze financial data. In 2023, its founder, Liang Wenfeng, shifted resources toward AI research, leading to the creation of DeepSeek with the goal of building cutting-edge AI models.

DeepSeek has also built a strong research team. Many of its employees are graduates from leading Chinese universities such as Peking University and Tsinghua University. The company has focused on recruiting talent with expertise in deep learning and large-scale AI model training, aligning with its ambitions to compete with global AI leaders.

Why is everyone talking about DeepSeek? 

Much of the buzz came from the revelation that the AI model is far cheaper to train and run than its rivals. DeepSeek reports a cost of just USD 5.6 million for its base model, although some experts question whether this figure reflects the true expenditure.
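
For context, the USD 5.6 million figure appears to trace back to DeepSeek’s own V3 technical report, which prices roughly 2.788 million H800 GPU hours at an assumed rental rate of USD 2 per GPU hour and explicitly excludes earlier research and ablation runs. A quick back-of-the-envelope check, treating those report figures as assumptions rather than audited costs:

```python
# Sanity check of the headline USD 5.6M training cost, using the
# figures DeepSeek's V3 technical report cites for its final run.
# Both inputs are the report's own assumptions, not audited costs.
gpu_hours = 2_788_000          # reported H800 GPU hours, final run only
usd_per_gpu_hour = 2.0         # report's assumed rental price per hour
cost_musd = gpu_hours * usd_per_gpu_hour / 1e6
print(f"Estimated final-run cost: ${cost_musd:.2f}M")  # -> $5.58M
```

Note what the calculation leaves out: prior experiments, failed runs, data, and salaries, which is precisely why some experts argue the true expenditure is much higher.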

This stood in stark contrast to the expenses incurred by competitors. OpenAI's GPT-4 is estimated to have cost between USD 100 million and USD 150 million to develop, and Google's Gemini reportedly cost around USD 191 million. Even more staggering, Meta's Llama 3 training is estimated to have cost anywhere from USD 720 million to USD 1 billion.

This clearly did not go unnoticed by the world: the app shot to the top of Apple's App Store download charts and shook global stock markets. In a historic single-day US stock market drop, Nvidia lost almost USD 600 billion in market value.

People are paying close attention to what this means for the global AI race. According to a recent study from the China Academy of Information and Communications Technology, a state-affiliated research institute, the global tally of AI large language models stands at 1,328, with China responsible for 36% of these advancements. This places China as the second-largest contributor to AI innovations worldwide, just behind the United States.


12 must-know DeepSeek statistics and facts

Everyone is talking about the emergence of DeepSeek, and for good reason. Here’s what this global phenomenon looks like in numbers.

DeepSeek’s hardware and infrastructure

Before the US imposed strict export controls on advanced AI chips, DeepSeek procured an estimated 10,000 Nvidia A100 GPUs. This stockpile allowed DeepSeek to build a robust infrastructure for AI model development, and other estimates suggest the company could now have around 50,000 GPUs on hand.

In an interview with 36Kr in July 2024, Liang Wenfeng acknowledged the challenges posed by US sanctions. He stated:

“We [most Chinese companies] have to consume twice the computing power to achieve the same results. Combined with data efficiency gaps, this could mean needing up to four times more computing power. Our goal is to continuously close these gaps.”

Early in 2024, experts predicted that the U.S. export restrictions might backfire, accelerating China’s efforts to develop its own AI expertise and reduce dependence on Western technology. While U.S. sanctions may have hampered Chinese firms in the short term, they have also created an incentive for Chinese companies to out-innovate their Western competitors in the long term.

Key stats:

  1. It’s still unclear which semiconductors DeepSeek used to train its models, but experts argue that its infrastructure may be powered by 10,000 (or more) Nvidia A100 GPUs secured before the U.S. export restrictions.
  2. DeepSeek may also have incorporated Chinese-designed semiconductors, such as Huawei’s Ascend 910C, into its infrastructure.
  3. Beijing has declared its determination to leapfrog the West in every facet of the semiconductor supply chain. However, expert Chris Miller cautions that China’s chances of narrowing the gap in chipmaking tools are limited over the next five years.
  4. Despite these challenges, China is pouring hundreds of billions of dollars into developing its own semiconductor industry, signaling a long-term commitment to self-reliance.

Meanwhile, U.S. companies like Nvidia and Intel have designed chips for sale to China that fall just within the current export regulations, indicating their intent to maintain a foothold in the Chinese market despite sanctions.

DeepSeek’s model innovations

DeepSeek’s innovations have not only advanced the field, but according to experts, they have also set a new standard for efficiency and cost-effectiveness.

Key stats: 

  1. DeepSeek v3 is currently state-of-the-art among open-source models. According to a report by Epoch AI, it achieved its benchmarks using only 2.8 million H800 hours of training hardware time, equivalent to approximately 4e24 FLOPs. That’s roughly one-tenth the training compute of Meta’s Llama 3.1, a significant achievement in cost-effectiveness (see the back-of-the-envelope check after this list).
  2. DeepSeek has open-sourced its flagship models, along with six smaller variants, ranging from 1.5 billion to 70 billion parameters. According to Perplexity CEO Aravind Srinivas, one of these variants outperforms OpenAI’s o1-mini on certain benchmarks.
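
A rough way to sanity-check these compute figures is the standard 6ND heuristic, which approximates training FLOPs as 6 × active parameters × training tokens. The sketch below plugs in parameter and token counts from the two models’ public reports; the heuristic is only approximate, so landing in the same ballpark as the figures above is the most it can show:

```python
# 6ND heuristic for transformer training compute: FLOPs ~ 6 * N * D,
# with N = parameters active per token and D = training tokens.
# Inputs are from the DeepSeek-V3 and Llama 3.1 reports; the rule
# itself is an approximation, not a measured value.
def train_flops(active_params: float, tokens: float) -> float:
    return 6 * active_params * tokens

deepseek_v3 = train_flops(37e9, 14.8e12)   # ~3.3e24 FLOPs
llama_31 = train_flops(405e9, 15.6e12)     # ~3.8e25 FLOPs
print(f"DeepSeek-V3: {deepseek_v3:.1e} FLOPs")
print(f"Llama 3.1:   {llama_31:.1e} FLOPs")
print(f"Ratio: ~{llama_31 / deepseek_v3:.0f}x")  # ~12x, near the ~10x above
```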

DeepSeek’s approach to model optimization is rooted in its ability to reduce memory usage and accelerate calculations without sacrificing accuracy. The company has also pioneered techniques such as multi-head latent attention (MLA), a fine-grained Mixture-of-Experts (MoE) architecture, and long chain-of-thought (CoT) reasoning; a minimal sketch of the MoE idea follows below.
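
To make the MoE idea concrete, here is a minimal, hypothetical top-k routing sketch in NumPy. Each token is sent to only a few “expert” sub-networks, which is how MoE models keep per-token compute far below their total parameter count. The function names and shapes are illustrative, not DeepSeek’s code, and real implementations add refinements such as shared experts and load balancing:

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x:       (tokens, d_model) token activations
    gate_w:  (d_model, n_experts) router weights
    experts: list of callables, each mapping (d_model,) -> (d_model,)
    """
    logits = x @ gate_w                        # (tokens, n_experts)
    topk = np.argsort(logits, axis=1)[:, -k:]  # top-k expert indices per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        scores = logits[t, topk[t]]
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()               # softmax over the chosen experts
        for w, e in zip(weights, topk[t]):
            out[t] += w * experts[e](x[t])     # only k experts run per token
    return out

# Toy usage: 4 tokens, 8-dim model, 4 experts, 2 active per token
rng = np.random.default_rng(0)
experts = [lambda v, W=rng.normal(size=(8, 8)): v @ W for _ in range(4)]
y = moe_layer(rng.normal(size=(4, 8)), rng.normal(size=(8, 4)), experts)
print(y.shape)  # (4, 8)
```

The efficiency win is that only k experts execute per token: DeepSeek-V3, for instance, activates roughly 37 billion of its 671 billion total parameters on any given token.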

DeepSeek’s model vulnerabilities

Despite its advancements, DeepSeek’s models have raised concerns about security vulnerabilities. Independent researchers have highlighted issues with guardrails and the potential for misuse.

Key stats:

  1. A study by Palo Alto Networks found that DeepSeek’s models are “fairly easy” to jailbreak into providing tips for writing code that could be used in hacking, phishing, or social engineering attacks.
  2. Enkrypt AI reported that DeepSeek’s R1 model is four times more likely to write malware and insecure code than OpenAI’s o1.
  3. Wiz Security identified a breach in DeepSeek’s systems that exposed more than a million lines of chat logs and other sensitive information, although DeepSeek had fixed the exposure before Wiz released its findings.

It is worth noting that these vulnerabilities are not unique to DeepSeek; they underscore the broader challenges of ensuring the responsible deployment of AI technologies.

DeepSeek's environmental impact

AI development is notoriously energy-intensive, and DeepSeek is no exception. However, the company claims to have made strides in reducing its environmental footprint compared to U.S.-based models.

Key stats:

  1. While the exact energy consumption of DeepSeek’s models isn’t fully disclosed, the company emphasizes its efforts to optimize efficiency.
  2. U.S. technology investors are concerned about DeepSeek’s ability to replicate high-performance models with significantly less energy consumption.
  3. As the world grapples with the environmental costs of AI, DeepSeek’s progress in reducing computational requirements could set an important precedent.

Final words

Much about DeepSeek remains unclear, including whether its cost claims hold up under scrutiny. What is certain is that the company has secured essential hardware and developed efficient AI models, proving its adaptability and technical strength.

Still, challenges remain. Security concerns, regulatory pressures, and the broader implications of AI development all shape its trajectory. DeepSeek’s story isn’t just about technology—it’s about the intersection of innovation, geopolitics, and competition in an increasingly divided AI landscape.

Whether it thrives or stumbles, DeepSeek has made itself impossible to ignore.