Open-source DeepSeek-R1 uses pure reinforcement learning to match OpenAI o1 — at 95% less cost


Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More


Chinese AI startup DeepSeek, known for challenging leading AI vendors with open-source technologies, just dropped another bombshell: a new open reasoning LLM called DeepSeek-R1.

Based on the recently introduced DeepSeek V3 mixture-of-experts model, DeepSeek-R1 matches the performance of o1, OpenAI’s frontier reasoning LLM, across math, coding and reasoning tasks. The best part? It does this at a much more tempting cost, proving to be 90-95% more affordable than the latter.

The release marks a major leap forward in the open-source arena. It showcases that open models are further closing the gap with closed commercial models in the race to artificial general intelligence (AGI). To show the prowess of its work, DeepSeek also used R1 to distill six Llama and Qwen models, taking their performance to new levels. In one case, the distilled version of Qwen-1.5B outperformed much bigger models, GPT-4o and Claude 3.5 Sonnet, in select math benchmarks.

These distilled models, along with the main R1, have been open-sourced and are available on Hugging Face under an MIT license.

What does DeepSeek-R1 bring to the table?

The focus is sharpening on artificial general intelligence (AGI), a level of AI that can perform intellectual tasks like humans. A lot of teams are doubling down on enhancing models’ reasoning capabilities. OpenAI made the first notable move in the domain with its o1 model, which uses a chain-of-thought reasoning process to tackle a problem. Through RL (reinforcement learning, or reward-driven optimization), o1 learns to hone its chain of thought and refine the strategies it uses — ultimately learning to recognize and correct its mistakes, or try new approaches when the current ones aren’t working. 

Now, continuing the work in this direction, DeepSeek has released DeepSeek-R1, which uses a combination of RL and supervised fine-tuning to handle complex reasoning tasks and match the performance of o1. 

When tested, DeepSeek-R1 scored 79.8% on AIME 2024 mathematics tests and 97.3% on MATH-500. It also achieved a 2,029 rating on Codeforces — better than 96.3% of human programmers. In contrast, o1-1217 scored 79.2%, 96.4% and 96.6% respectively on these benchmarks. 

It also demonstrated strong general knowledge, with 90.8% accuracy on MMLU, just behind o1’s 91.8%. 

Performance of DeepSeek-R1 vs OpenAI o1 and o1-mini

The training pipeline

DeepSeek-R1’s reasoning performance marks a big win for the Chinese startup in the US-dominated AI space, especially as the entire work is open-source, including how the company trained the whole thing. 

However, the work isn’t as straightforward as it sounds.

According to the paper describing the research, DeepSeek-R1 was developed as an enhanced version of DeepSeek-R1-Zero — a breakthrough model trained solely from reinforcement learning. 

https://twitter.com/DrJimFan/status/1881353126210687089

The company first used DeepSeek-V3-base as the base model, developing its reasoning capabilities without employing supervised data, essentially focusing only on its self-evolution through a pure RL-based trial-and-error process. Developed intrinsically from the work, this ability ensures the model can solve increasingly complex reasoning tasks by leveraging extended test-time computation to explore and refine its thought processes in greater depth.

“During training, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors,” the researchers note in the paper. “After thousands of RL steps, DeepSeek-R1-Zero exhibits super performance on reasoning benchmarks. For instance, the pass@1 score on AIME 2024 increases from 15.6% to 71.0%, and with majority voting, the score further improves to 86.7%, matching the performance of OpenAI-o1-0912.”

However, despite showing improved performance, including behaviors like reflection and exploration of alternatives, the initial model did show some problems, including poor readability and language mixing. To fix this, the company built on the work done for R1-Zero, using a multi-stage approach combining both supervised learning and reinforcement learning, and thus came up with the enhanced R1 model.

“Specifically, we begin by collecting thousands of cold-start data to fine-tune the DeepSeek-V3-Base model,” the researchers explained. “Following this, we perform reasoning-oriented RL like DeepSeek-R1- Zero. Upon nearing convergence in the RL process, we create new SFT data through rejection sampling on the RL checkpoint, combined with supervised data from DeepSeek-V3 in domains such as writing, factual QA, and self-cognition, and then retrain the DeepSeek-V3-Base model. After fine-tuning with the new data, the checkpoint undergoes an additional RL process, taking into account prompts from all scenarios. After these steps, we obtained a checkpoint referred to as DeepSeek-R1, which achieves performance on par with OpenAI-o1-1217.”

Far more affordable than o1

In addition to enhanced performance that nearly matches OpenAI’s o1 across benchmarks, the new DeepSeek-R1 is also very affordable. Specifically, where OpenAI o1 costs $15 per million input tokens and $60 per million output tokens, DeepSeek Reasoner, which is based on the R1 model, costs $0.55 per million input and $2.19 per million output tokens. 

https://twitter.com/EMostaque/status/1881310721746804810

The model can be tested as “DeepThink” on the DeepSeek chat platform, which is similar to ChatGPT. Interested users can access the model weights and code repository via Hugging Face, under an MIT license, or can go with the API for direct integration.



Source link

Share

Latest Updates

Frequently Asked Questions

Related Articles

How to Spend Less Time on Social Media (or Leave It Altogether)

Have you been spending too many hours on your social feeds feeling anxious,...

Intel names tech veteran Lip-Bu Tan as its next CEO

Intel has named tech veteran Lip-Bu Tan the company’s next chief executive officer,...

Nous Research just launched an API that gives developers access to AI models that OpenAI and Anthropic won’t build

Join our daily and weekly newsletters for the latest updates and exclusive content...

Top 5 camera smartphones to capture your happiness this Holi

Holi is a festival of colors, joy, and unforgettable moments—and what better way...
PORN VIDEO
PORN VIDEO
PORN VIDEO
SULTAN88
SULTANSLOT
RAJA328
JOIN88
GFC88
HOKIBET
RUSIASLOT88
TAHU69
BONANZA99
PRAGMABET
MEGA55
LUXURY777
LUXURY333
BORJU89
QQGAMING
KEDAI168
MEGA777
NAGASLOT777
TAKSU787
KKSLOT777
MAS77TOTO
bandar55
BOS303
HOKI99
NUSA365
YUHUSLOT
KTP168
GALAXY138
NEXIA138
PETIR33
BOOM138
MEGA888
CABE888
FOSIL777
turbospin138
KAPAKBET
SUPERJP
sultankoin99
dragon88
raffi888
kenzobet
aladin666
rgo365
ubm4d
GERCEP88
VIVA99
CR777
VOXY88
delman567
intan69
CABE888
RNR303
LOGO303
PEMBURUGACOR
mpo383
cermin4d
bm88
ANGKA79
WOWHOKI
ROKET303
MPOXL
GURITA168
SUPRASLOT
SGCWIN
DESA88
ARWANA388
DAUNEMAS
ALADDIN666
BIOWIN69
SKY77
DOTA88
NAGA138
API5000
y200m
PLAYBOOK88
LUXURY12
A200M
MPO700
KENANGAN4D
cakrabola
PANDAGENDUT
MARVEL77
UG300
HOKI178
MONTE77
JASABOLA
UNTAR4D
LIDO88
MAFIABOLA77
GASPOL189
mpo999
untung138
TW88
JAGUAR33
MPOBOS
SHIO88
VIVO4D
MPOXL
JARISAKTI
BBO303
AONCASH
ANGKER4D
LEVIS4D
JAGO88
REPUBLIK365
BOSDEAL88
BOLA168
akunjp
WARTEGBET
EZEBET
88PULSA
KITAB4D
BOSDEAL88
STUDIOBET
MESINKOIN
BIMA88
PPNUSA
ABGBET88
TOP77
BAYAR77
YES77
BBTN4D
BBCA4D
VSLOTS88
MPO800
PAHALA4D
KPI4D
JURAGAN77
QQ188
BOLAPELANGI
C200M
QQ998
GWKTOGEL
MEGABANDAR
COLOWIN
VIP579
SEVEN4D
MPO188
DEWATA88
SURAT4D
SINAR123
LAMBO77
GUDANG4D
AWAN4D
PLANETLIGA
GT88
ROYALSPIN88
MAMAJITU
MITO99
PEDIA4D
WIBU69JP
333HOKI
SIDARMA88
NAGAEMAS99
HOLA88
CAKAR76
KINGTOTO
RATUGAMING
SSI168
PILAR168
ACTOTO
EYANGTOGEL
KAISAR328
SLOT628
KAISAR88
DOTA88
MAXWIN369
ALIBABA99
MM168
SQUAD777
NAGABET88
JAYABOLA
SEMPATIGAME
PANDAJAGO
PIKAT4D
SINGA77
YUYU33
MASTERPLAY99
VICTORY39
NASA4D
PERMATA55
SAKAUSLOT
CK303
MPOTOWER
CIPUTRABET
WINJUDI
DEWI5000
IYA777
MAHIRTOTO
GOSLOT88
TIPTOP4D
RAJA787
JBO680
JOKER188
EPICPLAY88
TRIVABET
KAISAR189
JOKER81
JPSPIN88
MAYORA4D
DJARUMPLAY
OVO88
BAKTI78
WINGSLOT77
ICAFE4D
PDTOTO
JETPLAY88
JETPLAY88
STADIUM4D
RAJAVIP777
ISB388
GASSPOL168
JITU33
ISTANA8899
CERI123
VIPPELANGI99
55WEALTH
LIGAJUARA
RAJAPKV
HMTOTO
PERKASA99
DEWIGG
MASTERKIU
DAFTARJP268
BATENGMERAH
YOGATOTO
GRAZYRICH88
RGO365
TIKI4D
GBOSKY
RANS4D
GRAND4D
GARUDABET77
BOLABESAR
KASIR777
WINPALACE88
SAMUDRBET
JAGO89
IBCBET
SUPER126
BIZZ77GAMES
ASET69
GAMESPOLLS
LOGO303
JETHOKI
FERRARITOTO
SULTAN69
BARUNATOTO
MDSBET
HOBBIQQ
SARANG188
HEPI55
NARUTOBET
ASIABET4D
PRAGMABET
OKEBOS138
HAHA55
VOCAL77
GATOT4D
LANANGBET
BONCEL4D
TUKUL777
BOOKIE7
PAJAKBOLA
5DEWA
WAHIDTOTO
CSOWIN
OMG303
WINLIVE4D
ALADDIN666
LUMIO777
GBOPLAY777
GEBER88
BETWIN89
BIBIT88
BIJITOGEL
BIMOIN88
BINGOSLOT88
BINTANG29
BINTANG4D
BISABET
BOJO88
BOLA99
BOLAKAWAN
BOROBUDURBET
BOSDEAL88
BOSKU123
HOKI138
BOSS177
BOSSKLIK
BP77
GARUDA999
ABO777
MAXBET268
BANDARSBO
UGDEWA
ANAKNAGA
BIGSLOT
FYP138
SKYWIN386
KOBOY789
YYPAUS
LUCKY77
ISTANAIMPIAN4
PEDRO4D
SEMAR123
AKSARA88
VIRGO168
JUALTOTO
KAISAR89
CAPSAWINS
SUKI99
SIARIL
BOSSLOT138
PRAGMATIC777
ARWANA89
DUKUN138
KOI77
SBA99
GOWD
ANAKTOTO
JAKJP
EU9
ZONA66
MURAH138
SULE88
PPNUSA
PENCETAJA
RAFI168
MURAH138_LOGIN
PATEN77
ACETOTO888
CUAN368
KENZO123
DEWAWIN365
KUPONTOTO
MPOTOP88
TOKYO188
SLOT88RESMI
CAPTAIN77
PECINTA4D
PANEN33
TANTAN88
OMEGA138
KUDA77
BLURAYUFR
YANDEXEU
K86SPORT
ASIAKLUB
ION55
OTW78
POOLS303
ALL303
MPOBOS
MEGA118
MAMEN123
MEVIUS88
77ROYAL
DRAGON222
337SPORTS
QQ1221
CAFE69
TKO77
GELEK4D
DOMINO76
PPSNUSA
ANDAHOKI
OASIS88
SOHIB4D
HERMES21
NEON4D
GASWIN
HOLA88
ALEXIS17
Y200M
MPLAY5000
MPOLANGIT
SIHOKI
SULTAN33
SAVAYASLOT
MONTE77
BARDI4D
PSTOTO99
SGO777
MACO4D
TAJIR77
UNOSLOT
BABE168
SULTANJP
KINGS128
KADERSLOT
TOTO911
KUATJP
LUNAS168
JOKER888
GIGASLOT88
GMSLOT88
HOBI188
IBET44
IDWIN
IGCWIN
OVOKER
TEXASPOKER
HOKIVEGAS
POKERBOYA
RGOPOKER
INDOWINBET
HKBPOKER
ROYALPOKER
HKBPOKERQQ
ALFA303
INDODINGDONG
RGOBET
EYANGPOKER
BROVEGAS
GITARTOGEL
GITARPOKER
AHABET
KTP303
MABOSWAY
KBO77
GIGASLOT88
GMSLOT88
HOBI188
IBET44
IDWIN
IGCWIN
DEWIJOKER
DRAGON303
FANTASYSLOT
FORWIN77
GBO007
GBOPLAY138
GBOSLOT
GBOWIN
NAGA168
PBOWIN
UANG77
MVP288
MURAHSLOT
MASHOKI
GITAR100
ERAPLAY88
GOLDENCROWNPOKER
HPPOKER
DNDPOKER
SUPER138
RAKSASA123
MOTORSLOT77
KUDASAKTI168
ERA77
526BET
52TOGEL
76SLOT
LEXISPOKER
LVONLINE
KAPAL4D
KAPAL4D2
MOMOPOKER
K7BOLA
NAGABOLA
TOGELHOK
WAZEPOKER
WARKOPPOKER
PORN VIDEO
https://link.space/@Hikaribet
https://bio.site/Hikaribet
https://heylink.me/Hikaribet39

Strategi Ampuh Menang di Slot Zeus: Panduan Pemula hingga Pro

Slot Zeus Online: Game RTP Tinggi yang Wajib Dicoba Pemain Slot!

Slot Gacor Paling Gacor Terbaik

Review Lengkap Slot Zeus Online: Apakah Game Ini Layak Dimainkan?

Rahasia Menang di Slot Zeus Online: Strategi dan Tips Terbaru 2025

Mitos vs Fakta: Apakah Slot Zeus Benar-benar Menguntungkan?

Keunggulan Slot Zeus Dibandingkan Game Slot Lain, Wajib Tahu!

Fakta Menarik Slot Zeus Online: Fitur Bonus dan Jackpot Besar!

Cara Bermain Slot Zeus Online Agar Maksimal dan Menghasilkan Cuan

Slot Zeus Online: Cara Memanfaatkan Free Spin untuk Maksimal Jackpot!

10 Alasan Kenapa Slot Zeus Online Jadi Favorit Para Pemain Slot

CMBET88
Gamelantogel
CMBET88
didascaliasdelteatrocaminito.com
glenellynrent.com
gypsumboardequipment.com
realseller.org
https://harrysphone.com/upin
gyergyoalfalu.ro/tokek
vipokno.by/gokil
winjospg.com
winjos801.com/
www.logansquarerent.com
internationalfintech.com/bamsz
condowizard.ca
jawatoto889.com
hikaribet3.live
hikaribet1.com
heylink.me/hikaribet
www.nomadsumc.org
condowizard.ca/aromatoto
euro2024gol.com
www.imaracorp.com
daftarsekaibos.com
stuffyoucanuse.org/juragan
Toto Macau 4d
Aromatoto
Lippototo
Mbahtoto
Winjos
152.42.229.23
bandarlotre126.com
heylink.me/sekaipro
www.get-coachoutletsonline.com
wholesalejerseyslord.com
Lippototo
Zientoto
Lippototo
Situs Togel Resmi
Fajartoto
Situs Togel
Toto Macau
Winjos
Winlotre
Aromatoto
design-develop-test.com
winlotre.online
winlotre.xyz
winlotre.us
winlotrebandung.com
winlotrepalu.com
winlotresurabaya.shop
winlotrejakarta.com
winlotresemarang.shop
winlotrebali.shop
winlotreaceh.shop
winlotremakmur.com
Dadu Online
Taruhantoto
a Bandarlotre
bursaliga
lakitoto
aromatoto
Rebahin
untungslot.pages.dev
slotpoupler.pages.dev
rtpliveslot88a.pages.dev
tipsgameslot.pages.dev
pilihslot88.pages.dev
fortuertiger.pages.dev
linkp4d.pages.dev
linkslot88a.pages.dev
slotpgs8.pages.dev
markasjudi.pages.dev
saldo69.pages.dev
slotbenua.pages.dev
saingtoto.pages.dev
markastoto77.pages.dev
jowototo88.pages.dev
sungli78.pages.dev
volatilitas78.pages.dev
bonusbuy12.pages.dev
slotoffiline.pages.dev
dihindari77.pages.dev
rtpdislot1.pages.dev
agtslot77.pages.dev
congtoto15.pages.dev
hongkongtoto7.pages.dev
sinarmas177.pages.dev
hours771.pages.dev
sarana771.pages.dev
kananslot7.pages.dev
balitoto17.pages.dev
jowototo17.pages.dev
aromatotoding.com
unyagh.org
fairparkcounseling.com/gap/
impress-newtex.com/ajax/
SULTAN88
SULTANSLOT
RAJA328
JOIN88+
HOKIBET
GFC88
RusiaSlot88
Tahu69
BONANZA99
Pragmabet
mega55
luxury777
luxury333
borju89
qqgaming
KEDAI168
mega777
nagaslot777
TAKSU787
kkslot777
MAS77TOTO
BANDAR55+
BOS303
Login-HOKI99/
NUSA365
YUHUSLOT
ktp168
GALAXY138