Meta proposes new scalable memory layers that improve knowledge, reduce hallucinations


As enterprises continue to adopt large language models (LLMs) in various applications, one of the key challenges they face is improving the factual knowledge of models and reducing hallucinations. In a new paper, researchers at Meta AI propose “scalable memory layers,” which could be one of several possible solutions to this problem.

Scalable memory layers add more parameters to LLMs to increase their learning capacity without requiring additional compute resources. The architecture is useful for applications where you can spare extra memory for factual knowledge but also want the inference speed of nimbler models.

Dense and memory layers

Traditional language models use “dense layers” to encode vast amounts of information in their parameters. In dense layers, all parameters are used at their full capacity and are mostly activated at the same time during inference. Dense layers can learn complex functions, but increasing their size requires additional computational and energy resources.

In contrast, for simple factual knowledge, much simpler layers with associative memory architectures would be more efficient and interpretable. This is what memory layers do. They use simple sparse activations and key-value lookup mechanisms to encode and retrieve knowledge. Sparse layers take up more memory than dense layers but only use a small portion of the parameters at once, which makes them much more compute-efficient.
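For intuition, here is a deliberately simplified PyTorch sketch of the idea. It is not the researchers’ implementation, and the class name and sizes are made up: each token scores a table of learnable keys, only a handful of keys “fire,” and the values attached to those keys are combined. The actual design relies on product-key lookup so millions of keys can be searched without scoring every one, which this naive version does not attempt.

# Illustrative sketch of a sparse key-value memory layer (not Meta's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMemoryLayer(nn.Module):
    def __init__(self, dim, num_keys=65_536, top_k=4):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(num_keys, dim) * 0.02)  # learnable keys
        self.values = nn.Embedding(num_keys, dim)                    # learnable values
        self.top_k = top_k

    def forward(self, x):                              # x: (batch, seq, dim)
        scores = x @ self.keys.t()                     # score every key for every token
        top_scores, top_idx = scores.topk(self.top_k, dim=-1)  # only k keys activate
        weights = F.softmax(top_scores, dim=-1)
        picked = self.values(top_idx)                  # (batch, seq, k, dim)
        return (weights.unsqueeze(-1) * picked).sum(dim=-2)    # weighted sum of values

Because only top_k value vectors are touched per token, the compute cost stays small even as the key-value table grows, which is the trade-off the article describes.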

Memory layers have existed for several years but are rarely used in modern deep learning architectures, largely because they are not optimized for current hardware accelerators.

Current frontier LLMs usually use some form of “mixture of experts” (MoE) architecture, which uses a mechanism vaguely similar to memory layers. MoE models are composed of many smaller expert components that specialize in specific tasks. At inference time, a routing mechanism determines which experts are activated based on the input sequence. PEER, an architecture recently developed by Google DeepMind, extends MoE to millions of experts, providing more granular control over which parameters are activated during inference.
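The routing idea can be shown with a toy example. The sketch below is hypothetical code, not PEER or any production MoE layer: a small router scores a handful of expert feed-forward networks for each token and only the top two are run.

# Bare-bones top-k expert routing, for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, dim, num_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)      # one score per expert
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        ])
        self.top_k = top_k

    def forward(self, x):                              # x: (num_tokens, dim)
        gate_scores, gate_idx = self.router(x).topk(self.top_k, dim=-1)
        gate_w = F.softmax(gate_scores, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):                 # combine the chosen experts' outputs
            for e, expert in enumerate(self.experts):
                mask = gate_idx[:, slot] == e          # tokens routed to expert e
                if mask.any():
                    out[mask] += gate_w[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

Memory layers push this sparsity further: instead of a few large experts, each lookup touches only a few rows of a very large key-value table.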

Upgrading memory layers

Memory layers are light on compute but heavy on memory, which presents specific challenges for current hardware and software frameworks. In their paper, the Meta researchers propose several modifications that solve these challenges and make it possible to use them at scale.

Memory layers can store knowledge in parallel across several GPUs without slowing down the model (source: arXiv)

First, the researchers configured the memory layers for parallelization, distributing them across several GPUs to store millions of key-value pairs without changing other layers in the model. They also implemented a special CUDA kernel for handling high memory-bandwidth operations. And they developed a parameter-sharing mechanism that supports a single set of memory parameters across multiple memory layers within a model. This means that the keys and values used for lookups are shared across layers.
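In rough form, the weight-sharing idea looks something like the sketch below. This is an assumption about how it could be wired in PyTorch, not Meta’s code, and it leaves out the GPU sharding and the custom CUDA kernel entirely: a single pool of keys and values is handed to memory layers at several depths, so the added parameters do not multiply with the number of layers.

# Hypothetical sketch of sharing one key-value pool across several memory layers.
import torch
import torch.nn as nn

class SharedMemoryPool(nn.Module):
    """One pool of keys and values that several memory layers reuse."""
    def __init__(self, dim=512, num_keys=65_536):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(num_keys, dim) * 0.02)
        self.values = nn.Embedding(num_keys, dim)

class SharedMemoryLayer(nn.Module):
    def __init__(self, pool, top_k=4):
        super().__init__()
        self.pool = pool               # the same pool object is passed to every layer
        self.top_k = top_k

    def forward(self, x):              # x: (batch, seq, dim)
        scores = x @ self.pool.keys.t()
        w, idx = scores.topk(self.top_k, dim=-1)
        w = w.softmax(dim=-1)
        return (w.unsqueeze(-1) * self.pool.values(idx)).sum(dim=-2)

# One pool reused by memory layers at three different depths:
pool = SharedMemoryPool(dim=512)
layers = [SharedMemoryLayer(pool) for _ in range(3)]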

These modifications make it possible to implement memory layers within LLMs without slowing down the model.

“Memory layers with their sparse activations nicely complement dense networks, providing increased capacity for knowledge acquisition while being light on compute,” the researchers write. “They can be efficiently scaled, and provide practitioners with an attractive new direction to trade-off memory with compute.”

To test memory layers, the researchers modified Llama models by replacing one or more dense layers with a shared memory layer. They compared the memory-enhanced models against the dense LLMs as well as MoE and PEER models on several tasks, including factual question answering, scientific and common-sense world knowledge and coding.
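As a rough illustration of that swap (hypothetical code, not the Llama implementation), a standard transformer block can be given a memory layer in place of its dense feed-forward sublayer, reusing the SimpleMemoryLayer sketch from earlier:

# Hypothetical transformer block with a memory layer standing in for the dense FFN.
import torch.nn as nn

class TransformerBlockWithMemory(nn.Module):
    def __init__(self, dim, num_heads, memory_layer):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = memory_layer        # memory layer replaces the dense feed-forward sublayer

    def forward(self, x):
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)
        x = x + attn_out
        x = x + self.ffn(self.norm2(x))   # sparse key-value lookup instead of a dense MLP
        return x

block = TransformerBlockWithMemory(dim=512, num_heads=8,
                                    memory_layer=SimpleMemoryLayer(dim=512))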

Memory models vs. dense layers: a 1.3B memory model (solid line) trained on 1 trillion tokens approaches the performance of a 7B model (dashed line) on factual question-answering tasks as it is given more memory parameters (source: arXiv)

Their findings show that memory models improve significantly over dense baselines and compete with models that use 2X to 4X more compute. They also match the performance of MoE models that have the same compute budget and parameter count. The memory models’ performance is especially notable on tasks that require factual knowledge. For example, on factual question answering, a memory model with 1.3 billion parameters approaches the performance of Llama-2-7B, which was trained on twice as many tokens and with 10X more compute.

Moreover, the researchers found that the benefits of memory models remain consistent across model sizes as they scaled their experiments from 134 million to 8 billion parameters.

“Given these findings, we strongly advocate that memory layers should be integrated into all next generation AI architectures,” the researchers write, while adding that there is still a lot more room for improvement. “In particular, we hope that new learning methods can be developed to push the effectiveness of these layers even further, enabling less forgetting, fewer hallucinations and continual learning.”


