DeepSeek, a Chinese artificial intelligence startup, recently launched a free app of the same name. Within a short time of reaching the App Store in the United States and other regions, it attracted a large number of downloads. The company claims that its open-source model, DeepSeek V3, outperforms Meta's Llama 3.1 and performs comparably to Anthropic's Claude-3.5 and OpenAI's GPT-4o. At the same time, the hardware computing power required to train the model is far lower than that of competing models, and the development cost is said to be less than 6 million US dollars.
DeepSeek was founded in April 2023. Its founder, Liang Wenfeng, is also the founder of the quantitative hedge fund High-Flyer. This means DeepSeek's operations can be funded by the hedge fund rather than by outside investors, unlike most other artificial intelligence startups, and it gives the company more flexibility in its operational decisions.
DeepSeek's first artificial intelligence model, DeepSeek Coder, is free for researchers to use and can also be used commercially. The company subsequently launched its first large language model, DeepSeek LLM, followed by its second, DeepSeek-V2, in May last year. DeepSeek-V2 was pitched as offering higher performance at lower cost, and it prompted Chinese technology companies such as ByteDance, Tencent, Baidu, and Alibaba to cut the prices of their own artificial intelligence models to avoid losing their existing users.
The recently launched third large language model, DeepSeek-V3, is said to scale up to 671 billion parameters, and its performance is claimed to exceed that of the 405-billion-parameter version of Meta's Llama 3.1. It was reportedly trained on only 2,048 NVIDIA H800 GPUs in about 2 months, at a cost of only about US$5.6 million, far below the training costs reported by other technology companies.
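As a rough sanity check on the scale of that training cost, the GPU count and training time can be turned into a back-of-the-envelope estimate. The sketch below is illustrative only: the 60-day duration is a rounding of "2 months", and the rental price of US$2 per GPU-hour is an assumption that does not come from this article.

```python
# Back-of-the-envelope training cost estimate.
# GPU count and duration come from the article; the hourly price is an assumption.

def training_cost_usd(num_gpus: int, days: float, usd_per_gpu_hour: float) -> float:
    """Return the estimated rental cost of running num_gpus for the given number of days."""
    gpu_hours = num_gpus * days * 24
    return gpu_hours * usd_per_gpu_hour

NUM_GPUS = 2048           # from the article
TRAINING_DAYS = 60        # "2 months", rounded to 60 days (assumption)
USD_PER_GPU_HOUR = 2.0    # hypothetical rental price, not from the article

cost = training_cost_usd(NUM_GPUS, TRAINING_DAYS, USD_PER_GPU_HOUR)
print(f"{NUM_GPUS * TRAINING_DAYS * 24:,} GPU-hours, about US${cost / 1e6:.1f} million")
# prints: 2,949,120 GPU-hours, about US$5.9 million
```

Under these assumptions the estimate lands in the single-digit millions of US dollars. The result scales linearly with the assumed hourly price, so it indicates an order of magnitude, not an exact figure.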
DeepSeek's artificial intelligence models can be used on the web, in apps, or through API calls. Its DeepSeek-R1 model is released under the MIT license, a permissive and widely used software license, and is advertised as usable for all kinds of commercial purposes, which has led many companies in the industry to adopt it.
While other technology companies buy large quantities of GPUs and other acceleration hardware and invest billions of dollars in training artificial intelligence models, the emergence of DeepSeek shows that such models do not have to be built with massive spending and can instead be built at much lower cost. This has also caused technology stocks such as NVIDIA to fall sharply.
DeepSeek's sudden rise not only suggests that higher-performance AI models can be built with less hardware and less money. The company also claims to be able to build AI models with NVIDIA's A100 accelerator, which means that even under the current US government's technology export ban, it can still develop its AI technology without being held back.
At the same time, DeepSeek's progress points to the feasibility of building higher-performance artificial intelligence technology at low cost, and it suggests that the level of investment in artificial intelligence driven by many large US technology companies may be unreasonable. This in turn may push more companies toward low-cost, fast, and effective artificial intelligence development, and may even affect the current competition in artificial intelligence development in the United States.
