A look under the hood of transformers, the engine driving AI model evolution




Today, virtually every cutting-edge AI product and model uses a transformer architecture. Large language models (LLMs) such as GPT-4o, LLaMA, Gemini and Claude are all transformer-based, and other AI applications such as text-to-speech, automatic speech recognition, image generation and text-to-video models have transformers as their underlying technology.  

With the hype around AI not likely to slow down anytime soon, it’s time to give transformers their due, which is why I’d like to explain a little about how they work, why they are so important for the growth of scalable solutions and why they are the backbone of LLMs.  

Transformers are more than meets the eye 

In brief, a transformer is a neural network architecture designed to model sequences of data, making it ideal for tasks such as language translation, sentence completion, automatic speech recognition and more. Transformers have become the dominant architecture for many of these sequence modeling tasks because the underlying attention mechanism can be easily parallelized, allowing for massive scale when training and performing inference.

Originally introduced in the 2017 paper “Attention Is All You Need” by researchers at Google, the transformer was an encoder-decoder architecture designed specifically for language translation. The following year, Google released bidirectional encoder representations from transformers (BERT), which could be considered one of the first LLMs, although it is small by today’s standards.

Since then — and especially accelerated with the advent of GPT models from OpenAI — the trend has been to train bigger and bigger models with more data, more parameters and longer context windows.   

To facilitate this evolution, there have been many innovations, such as more advanced GPU hardware and better software for multi-GPU training; techniques like quantization and mixture of experts (MoE) for reducing memory and compute costs; new optimizers for training, like Shampoo and AdamW; and techniques for efficiently computing attention, like FlashAttention and KV caching. The trend will likely continue for the foreseeable future.

The importance of self-attention in transformers

Depending on the application, a transformer model uses an encoder, a decoder or both. The encoder component learns a vector representation of data that can then be used for downstream tasks like classification and sentiment analysis. The decoder component takes a vector or latent representation of the text or image and uses it to generate new text, making it useful for tasks like sentence completion and summarization. For this reason, many familiar state-of-the-art models, such as the GPT family, are decoder only.

Encoder-decoder models combine both components, making them useful for translation and other sequence-to-sequence tasks. For both encoder and decoder architectures, the core component is the attention layer, as this is what allows a model to retain context from words that appear much earlier in the text.  

Attention comes in two flavors: self-attention and cross-attention. Self-attention is used for capturing relationships between words within the same sequence, whereas cross-attention is used for capturing relationships between words across two different sequences. Cross-attention connects the encoder and decoder components in a model. During translation, for example, it allows the English word “strawberry” to relate to the French word “fraise.” Mathematically, both self-attention and cross-attention come down to matrix multiplications, which can be done extremely efficiently using a GPU.
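To make the matrix-multiplication point concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration under stated assumptions, not how any production model is implemented: the helper names (`matmul`, `softmax`, `attention`) and the toy embeddings are made up for this example, the learned query, key and value projections of a real transformer are omitted, and real systems run these operations on GPUs through tensor libraries.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Turn a list of scores into positive weights that sum to 1."""
    peak = max(row)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention; returns (outputs, weights)."""
    d_k = len(keys[0])
    keys_t = [list(col) for col in zip(*keys)]  # transpose the keys
    scores = matmul(queries, keys_t)            # one score per query/key pair
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, values), weights     # weighted mix of the values

# Toy embeddings (made-up numbers): 3 source tokens, 2 target tokens, dimension 2.
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [[1.0, 0.0], [0.0, 2.0]]

# Self-attention: queries, keys and values all come from the same sequence.
self_out, self_w = attention(source, source, source)

# Cross-attention: queries come from one sequence, keys and values from another.
cross_out, cross_w = attention(target, source, source)
```

The only difference between the two flavors in this sketch is where the queries come from. Each row of `cross_w` sums to 1 and indicates how strongly a target token draws on each source token.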

Because of the attention layer, transformers can better capture relationships between words separated by long stretches of text, whereas previous models such as recurrent neural networks (RNNs) and long short-term memory (LSTM) models tend to lose track of the context of words from earlier in the text.

The future of models 

Currently, transformers are the dominant architecture for many use cases that require LLMs, and they benefit from the most research and development. Although this does not seem likely to change anytime soon, one alternative class of models that has gained interest recently is state-space models (SSMs) such as Mamba. These highly efficient models can handle very long sequences of data, whereas transformers are limited by a context window.
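One way to see why a state-space model is not tied to a context window is to look at the basic recurrence behind it. The sketch below is a simplified, hypothetical illustration and not Mamba itself, which makes its parameters depend on the input and uses a specialized scan. The function name `ssm_scan`, the diagonal state transition `a` and the parameter values are assumptions chosen for brevity.

```python
def ssm_scan(inputs, a, b, c):
    """Run a discrete linear state-space model over a sequence of numbers.

    Recurrence (diagonal transition assumed for simplicity):
        h_t = a * h_{t-1} + b * x_t
        y_t = c . h_t
    """
    state = [0.0] * len(a)  # fixed-size state, independent of sequence length
    outputs = []
    for x in inputs:
        # Update each state component from its previous value and the new input.
        state = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, state, b)]
        # Read out one output value from the current state.
        outputs.append(sum(c_i * h_i for c_i, h_i in zip(c, state)))
    return outputs

# Hypothetical one-dimensional state that halves its memory at every step.
impulse_response = ssm_scan([1.0, 0.0, 0.0], a=[0.5], b=[1.0], c=[1.0])
```

However long the input is, the model carries only `state` forward, so the memory it needs per step stays constant. Attention, by contrast, compares each token against every other token in the window.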

For me, the most exciting applications of transformer models are multimodal models. OpenAI’s GPT-4o, for instance, is capable of handling text, audio and images — and other providers are starting to follow. Multimodal applications are very diverse, ranging from video captioning to voice cloning to image segmentation (and more). They also present an opportunity to make AI more accessible to those with disabilities. For example, a blind person could be greatly served by the ability to interact through voice and audio components of a multimodal application.  

It’s an exciting space with plenty of potential to uncover new use cases. But do remember that, at least for the foreseeable future, these applications are largely underpinned by the transformer architecture.

Terrence Alsup is a senior data scientist at Finastra.

