Has China achieved AI breakthrough with DeepSeek?


For over two years, San Francisco-based OpenAI has dominated artificial intelligence (AI) with its generative pre-trained language models. The startup’s chatbot penned poems, wrote long-format stories, found bugs in code, and helped search the Internet (albeit with a cutoff date). Its ability to flawlessly generate coherent sentences baffled users around the world.

Far away, across the Pacific Ocean, in Beijing, China made its first attempt to counter America’s dominance in AI. In March 2023, Baidu received the government’s approval to launch its AI chatbot, Ernie bot. Ernie was touted as China’s answer to ChatGPT after the bot received over 30 million user sign-ups within a day of its launch.

But the initial euphoria around Ernie gradually ebbed as the bot fumbled and dodged questions about China’s President Xi Jinping, the Tiananmen Square crackdown and the human rights violations against the Uyghur Muslims. In response to questions on these topics, the bot replied: “Let’s talk about something else.”

Late to the AI party

As the hype around Ernie met the reality of Chinese censorship, several experts pointed out the difficulty of building large language models (LLMs) in the communist country. Google’s former CEO and chairman, Eric Schmidt, in a talk at the Harvard Kennedy School of Government in October 2023, said: “They [China] were late to the party. They didn’t get to this [LLM] AI space early enough.” Mr. Schmidt further pointed out that a lack of training data on language and China’s unfamiliarity with open-source ideas may make the Chinese fall behind in the global AI race.

As the Chinese tech giants trailed, their U.S. counterparts marched forward with advances in LLMs. Microsoft-backed OpenAI cultivated a new crop of reasoning chatbots with its ‘O’ series that were better than ChatGPT. These AI models were the first to introduce inference-time scaling, which refers to improving a model’s answers by letting it spend more computation while it is generating them.

AI trader turned AI builder

While the Chinese tech giants languished, a Zhejiang-based hedge fund, High-Flyer, that used AI for trading, set up its own AI lab, DeepSeek, in April 2023. Within a year, the AI spin-off developed the DeepSeek-v2 model, which performed well on several benchmarks and was offered at a significantly lower cost than other Chinese LLMs.

When DeepSeek-v3 was launched in December, it stunned AI companies. The Mixture-of-Experts (MoE) model was pre-trained on 14.8 trillion tokens and has 671 billion total parameters, of which 37 billion are activated for each token.

A MoE model uses different “experts” or sub-models that specialise in different aspects of language or tasks. Each expert is activated only when it is relevant to a particular task. This makes the model more efficient, saves resources and speeds up processing.
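
The routing idea described above can be illustrated with a minimal sketch. This is a generic top-k gating example, not DeepSeek’s implementation: the toy experts, the gate scores and the choice of k=2 are all hypothetical, and only the 37 billion and 671 billion parameter figures come from the article.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))

# Hypothetical toy experts: simple scalar functions standing in for sub-models.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2, lambda x: -x]
# Hypothetical gate scores; in a real model a learned router produces these per token.
gate_scores = [0.1, 2.0, 1.5, -1.0]

selected = route_top_k(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)

# Share of parameters active per token, using the figures quoted in the article.
active_fraction = 37 / 671
print(selected, output, f"{active_fraction:.1%}")
```

With the figures quoted for DeepSeek-v3, only about 5.5% of the parameters are active for any one token, which is where the efficiency gain comes from.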

Training despite American sanctions

According to the technical paper released on December 26, DeepSeek-v3 was trained for 2.78 million GPU hours using Nvidia’s H800 GPUs. Meta’s Llama 3.1 training, which used Nvidia’s H100 chips, took about 30.8 million GPU hours, so DeepSeek-v3 needed roughly 28 million fewer GPU hours.
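
As a rough check on the comparison above, the arithmetic can be worked through as below. It assumes that 30.8 million GPU hours is the total training budget of the largest Llama 3.1 model, and it ignores the fact that H800 and H100 chips differ, so the ratio is indicative rather than a like-for-like measure.

```python
# Figures as quoted in the article; variable names are illustrative.
deepseek_v3_gpu_hours = 2.78e6  # H800 GPU hours
llama_3_1_gpu_hours = 30.8e6    # assumption: total H100 GPU hours for Llama 3.1

difference = llama_3_1_gpu_hours - deepseek_v3_gpu_hours
ratio = llama_3_1_gpu_hours / deepseek_v3_gpu_hours

print(f"Difference: {difference / 1e6:.2f} million GPU hours")
print(f"Llama 3.1 used about {ratio:.1f} times as many GPU hours")
```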

After seeing early success with DeepSeek-v3, High-Flyer built its most advanced reasoning models, DeepSeek-R1-Zero and DeepSeek-R1, which have potentially disrupted the AI industry by becoming some of the most cost-efficient models in the market.

When compared to OpenAI’s o1, DeepSeek’s R1 slashes costs by a staggering 93% per API call. This is a huge advantage for businesses and developers looking to integrate AI without breaking the bank. 

The savings don’t stop there. Unlike older models, R1 can run on high-end local computers, so there is no need for costly cloud services or dealing with pesky rate limits. This gives users the freedom to run AI tasks faster and cheaper without relying on third-party infrastructure.

Plus, R1 is designed to be memory efficient: it requires only a fraction of the RAM one would expect for an AI of its calibre. Separately, by batching (processing multiple tasks at once) and leveraging the cloud, the model further lowers costs and speeds up performance, making it even more accessible for a wide range of users.

A close contest

While R1 may not be quite as advanced as OpenAI’s o3, it still offers quality comparable to o1. According to benchmark data on both models on LiveBench, when it comes to overall performance, o1 edges out R1 with a global average score of 75.67 compared to the Chinese model’s 71.38. OpenAI’s o1 continues to perform well on reasoning tasks with a nearly nine-point lead over its competitor, making it a go-to choice for complex problem-solving, critical thinking and language-related tasks.

When it comes to coding, mathematics and data analysis, the competition is much tighter. Specifically, in data analysis, R1 proves to be a better choice for analysing large datasets.

One important area where R1 fails miserably, much like Ernie Bot, is on topics that are censored in China. For instance, to any question on the Chinese President Xi Jinping, the Tiananmen Square protest, and the Uyghur Muslims, the bot tells its users: “Let’s talk about something else.”

Unlike Ernie, this time around, despite the reality of Chinese censorship, DeepSeek’s R1 has soared in popularity globally. It has already surpassed major competitors like ChatGPT, Gemini, and Claude to become the number one downloaded app in the U.S. (In India, DeepSeek is at the third spot under productivity, followed by Gmail and ChatGPT apps.) This meteoric rise in popularity highlights just how quickly the AI community is embracing R1’s promise of affordability and performance.

Smaller models rise

While OpenAI’s o3 continues to be the state-of-the-art AI model out there, it is only a matter of time before other models take the lead in building super intelligence.

DeepSeek shows that, through its distillation process, it can effectively transfer the reasoning patterns of larger models into smaller models. This means that, instead of training smaller models from scratch using reinforcement learning (RL), which can be computationally expensive, the knowledge and reasoning abilities acquired by a larger model can be transferred to smaller models, resulting in better performance.
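
To make the idea of distillation concrete, here is a minimal sketch of the classic soft-label form, in which a student model is trained to match a teacher’s temperature-softened output distribution. This is a generic illustration and not DeepSeek’s recipe; its technical paper describes fine-tuning smaller models on outputs generated by R1, which is a different form of distillation. The logits and the temperature below are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened probabilities; a higher temperature flattens them."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Hypothetical logits over three candidate next tokens.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]  # close to the teacher
poor_student = [0.5, 1.0, 4.0]  # prefers a different token

loss_good = distillation_loss(teacher, good_student)
loss_poor = distillation_loss(teacher, poor_student)
print(loss_good, loss_poor)
```

Minimising this loss pulls the student’s predictions towards the teacher’s, so a student that already agrees with the teacher scores a much lower loss than one that does not.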

In its technical paper, DeepSeek compares the performance of distilled models with models trained using large-scale RL. The results indicate that the distilled ones outperformed smaller models that were trained with large-scale RL without distillation. Specifically, a 32 billion parameter base model trained with large-scale RL achieved performance on par with QwQ-32B-Preview, while the distilled version, DeepSeek-R1-Distill-Qwen-32B, performed significantly better across all benchmarks. (Qwen is a family of LLMs from Alibaba Cloud.)

This, in essence, would mean that inference could shift to the edge, changing the landscape of AI infrastructure companies as more efficient models could reduce reliance on centralised data centres. 

The future of the AI race

While distillation is a powerful method for enabling smaller models to achieve high performance, it has limits. For instance, as distilled models will be tied to the “teacher” model, the limitations of the larger models will also be transferred to the smaller ones. Also, distilled models may not be able to replicate the full range of capabilities or nuances of the larger model. This can affect the distilled model’s performance in complex or multi-faceted tasks.

Distillation is an effective tool for transferring existing knowledge, but it may not be the path to major paradigm shifts in AI on its own. That means the need for GPUs will only increase as companies build more powerful, intelligent models.

DeepSeek’s R1 and OpenAI’s o1 are the first reasoning models that are actually working. And R1 is the first successful demo of using RL for reasoning. From here, more compute power will be needed for training, running experiments, and exploring advanced methods for creating agents. There are many ways to leverage compute to improve performance, and right now, American companies are in a better position to do this, thanks to their larger scale and access to more powerful chips.


