Sakana AI’s CycleQD outperforms traditional fine-tuning methods for multi-skill language models


Researchers at Sakana AI have developed a resource-efficient framework that can create hundreds of language models specializing in different tasks. Called CycleQD, the technique uses evolutionary algorithms to combine the skills of different models without the need for expensive and slow training processes.

CycleQD can create swarms of task-specific agents that offer a more sustainable alternative to the current paradigm of increasing model size.

Rethinking model training

Large language models (LLMs) have shown remarkable capabilities in various tasks. However, training LLMs to master multiple skills remains a challenge. When fine-tuning models, engineers must balance data from different skills and ensure that one skill doesn’t dominate the others. Current approaches often involve training ever-larger models, which leads to increasing computational demands and resource requirements.

“We believe rather than aiming to develop a single large model to perform well on all tasks, population-based approaches to evolve a diverse swarm of niche models may offer an alternative, more sustainable path to scaling up the development of AI agents with advanced capabilities,” the Sakana researchers write in a blog post.

To create populations of models, the researchers took inspiration from quality diversity (QD), an evolutionary computing paradigm that focuses on discovering a diverse set of solutions from an initial population sample. QD aims to create specimens with various “behavior characteristics” (BCs), which represent different skill domains. It achieves this through evolutionary algorithms (EAs) that select parent examples and use crossover and mutation operations to create new samples.
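
To make the idea concrete, here is a minimal sketch of a QD loop on a toy problem, written in the style of a grid archive that keeps the best solution found for each combination of behavior characteristics. Everything in it is an illustrative assumption rather than Sakana AI's implementation: the solutions are pairs of numbers instead of models, and the `quality`, `behavior_cell`, `mutate` and `crossover` functions are hypothetical stand-ins.

```python
import random

def quality(x):
    # Hypothetical objective: higher is better, with a peak at (0.5, 0.5).
    return -((x[0] - 0.5) ** 2 + (x[1] - 0.5) ** 2)

def behavior_cell(x, bins=5):
    # Hypothetical behavior characteristics: the two coordinates, binned into a grid.
    return tuple(min(int(v * bins), bins - 1) for v in x)

def mutate(x, rng, sigma=0.1):
    # Random Gaussian change to each coordinate, clipped to [0, 1].
    return tuple(min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in x)

def crossover(a, b, rng):
    # Uniform crossover: each coordinate is copied from one of the two parents.
    return tuple(rng.choice(pair) for pair in zip(a, b))

def insert(archive, x):
    # Keep x only if its cell is empty or x beats the current occupant.
    cell, q = behavior_cell(x), quality(x)
    if cell not in archive or q > archive[cell][0]:
        archive[cell] = (q, x)

def run_qd(generations=2000, seed=0):
    rng = random.Random(seed)
    archive = {}  # maps grid cell -> (quality, solution)
    for _ in range(10):  # initial population sample
        insert(archive, (rng.random(), rng.random()))
    for _ in range(generations):
        parents = [solution for _, solution in archive.values()]
        child = mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
        insert(archive, child)
    return archive

archive = run_qd()
print(len(archive), "of 25 cells filled")
```

The result is not one best solution but a collection of good solutions, one for each distinct behavior.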

Quality Diversity (source: Sakana AI)

CycleQD

CycleQD incorporates QD into the post-training pipeline of LLMs to help them learn new, complex skills. CycleQD is useful when you have multiple small models that have been fine-tuned for very specific skills, such as coding or performing database and operating system operations, and you want to create new variants that have different combinations of those skills.

In the CycleQD framework, each of these skills can serve either as a behavior characteristic or as the quality metric that the next generation of models is optimized for. In each generation, the algorithm focuses on one specific skill as its quality metric while using the other skills as BCs, and the role of quality metric rotates from one skill to the next.

“This ensures every skill gets its moment in the spotlight, allowing the LLMs to grow more balanced and capable overall,” the researchers explain.
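
The rotation can be sketched in a few lines. The skill names below come from the article; the `role_schedule` helper and the simple round-robin order are assumptions made for illustration.

```python
from itertools import cycle, islice

SKILLS = ["coding", "database", "os"]

def role_schedule(skills, generations):
    """Yield (generation, quality_skill, bc_skills) for each generation."""
    for gen, quality_skill in enumerate(islice(cycle(skills), generations)):
        # Every skill that is not the current quality metric acts as a BC.
        bc_skills = [s for s in skills if s != quality_skill]
        yield gen, quality_skill, bc_skills

for gen, quality_skill, bc_skills in role_schedule(SKILLS, 4):
    print(gen, quality_skill, bc_skills)
```

With three skills, the fourth generation returns to the first skill as the quality metric.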

CycleQD (source: Sakana AI)

CycleQD starts with a set of expert LLMs, each specialized in a single skill. The algorithm then applies “crossover” and “mutation” operations to add new higher-quality models to the population. Crossover combines the characteristics of two parent models to create a new model while mutation makes random changes to the model to explore new possibilities.

The crossover operation is based on model merging, a technique that combines the parameters of two LLMs to create a new model with combined skills. This is a cost-effective and quick method for developing well-rounded models without the need to fine-tune them.
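
As a rough illustration of merging, the sketch below interpolates two parameter sets that share the same names and shapes. Plain linear interpolation is an assumption chosen for simplicity; the article does not say which merging formula CycleQD uses. The tiny parameter dictionaries and the `merge_linear` helper are hypothetical.

```python
def merge_linear(params_a, params_b, alpha=0.5):
    """Return alpha * A + (1 - alpha) * B for every shared parameter."""
    if params_a.keys() != params_b.keys():
        raise ValueError("parents must have identical parameter names")
    merged = {}
    for name in params_a:
        wa, wb = params_a[name], params_b[name]
        if len(wa) != len(wb):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [alpha * x + (1.0 - alpha) * y for x, y in zip(wa, wb)]
    return merged

# Hypothetical parents standing in for two fine-tuned experts.
coding_expert = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.0]}
database_expert = {"layer0.weight": [3.0, 6.0], "layer0.bias": [1.0]}

child = merge_linear(coding_expert, database_expert, alpha=0.25)
print(child)
```

No gradient steps are involved, which is why merging is so much cheaper than fine-tuning.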

The mutation operation uses singular value decomposition (SVD), a factorization method that breaks down any matrix into simpler components, making it easier to understand and manipulate its elements. CycleQD uses SVD to break down the model’s skills into fundamental components or sub-skills. By tweaking these sub-skills, the mutation process creates models that explore new capabilities beyond those of their parent models. This helps the models avoid getting stuck in predictable patterns and reduces the risk of overfitting.
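
A minimal sketch of an SVD-based mutation, using NumPy, might look like the following. It factors a weight matrix, randomly rescales each singular value and rebuilds the matrix. The choice to perturb singular values, the `scale` parameter and the `svd_mutate` helper are assumptions for illustration; the article does not describe exactly how CycleQD tweaks the components.

```python
import numpy as np

def svd_mutate(weight, rng, scale=0.1):
    """Rescale the singular values of a matrix by random factors near 1."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # One random factor per singular value, i.e. per component.
    factors = 1.0 + scale * rng.standard_normal(s.shape)
    return u @ np.diag(s * factors) @ vt

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 3))  # stand-in for one weight matrix of a model
mutated = svd_mutate(w, rng, scale=0.1)
print(np.linalg.norm(mutated - w))
```

Setting `scale` to zero leaves the matrix unchanged up to floating-point error, so `scale` controls how far a mutated model can drift from its parent.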

Evaluating CycleQD’s performance

The researchers applied CycleQD to a set of Llama 3-8B expert models fine-tuned for coding, database operations and operating system operations. The goal was to see if the evolutionary method could combine the skills of the three models to create a superior model.

The results showed that CycleQD outperformed traditional fine-tuning and model merging methods across the evaluated tasks. Notably, a model fine-tuned on all datasets combined performed only marginally better than the single-skill expert models, despite being trained on more data. Moreover, the traditional training process is much slower and more expensive. CycleQD was also able to create various models with different performance levels on the target tasks.

“These results clearly show that CycleQD outperforms traditional methods, proving its effectiveness in training LLMs to excel across multiple skills,” the researchers write.

CycleQD vs other fine-tuning methods (source: Sakana AI)

The researchers believe that CycleQD has the potential to enable lifelong learning in AI systems, allowing them to continuously grow, adapt and accumulate knowledge over time. This can have direct implications for real-world applications. For example, CycleQD can be used to continuously merge the skills of expert models instead of training a large model from scratch.

Another exciting direction is the development of multi-agent systems, where swarms of specialized agents evolved through CycleQD can collaborate, compete and learn from one another. 

“From scientific discovery to real-world problem-solving, swarms of specialized agents could redefine the limits of AI,” the researchers write.


