Beyond RAG: SEARCH-R1 integrates search engines directly into reasoning models




Large language models (LLMs) have made remarkable advances in reasoning. However, their ability to correctly reference and use external data (information they weren't trained on) in conjunction with that reasoning has largely lagged behind. 

This is an issue especially when using LLMs in dynamic, information-intensive scenarios that demand up-to-date data from search engines.

But an improvement has arrived: SEARCH-R1, a technique introduced in a paper by researchers at the University of Illinois at Urbana-Champaign and the University of Massachusetts Amherst, trains LLMs to generate search queries and seamlessly integrate search engine retrieval into their reasoning. 

With enterprises seeking ways to integrate these new models into their applications, techniques such as SEARCH-R1 promise to unlock new reasoning capabilities that rely on external data sources.

The challenge of integrating search with LLMs

Search engines are crucial for providing LLM applications with up-to-date, external knowledge. The two main methods for integrating search engines with LLMs are Retrieval-Augmented Generation (RAG) and tool use, implemented through prompt engineering or model fine-tuning. 

However, both methods have limitations that make them unsuitable for reasoning models. RAG often struggles with retrieval inaccuracies and lacks the ability to perform multi-turn, multi-query retrieval, which is essential for reasoning tasks. 

Prompting-based tool use often struggles with generalization, while training-based approaches require extensive, annotated datasets of search-and-reasoning interactions, which are difficult to produce at scale.

(In our own experiments with reasoning models, we found that information retrieval remains one of the key challenges.) 

SEARCH-R1

SEARCH-R1 enables LLMs to interact with search engines during their reasoning process as opposed to having a separate retrieval stage.

SEARCH-R1 defines the search engine as part of the LLM’s environment, enabling the model to integrate its token generation with search engine results seamlessly. 

The researchers designed SEARCH-R1 to support iterative reasoning and search. The model is trained to generate separate sets of tokens for thinking, search, information, and answer segments. This means that during its reasoning process (marked by <think></think> tags), if the model determines that it needs external information, it generates a <search></search> sequence that contains the search query. The query is then passed on to a search engine and the results are inserted into the context window in an <information></information> segment. The model then continues to reason with the added context and when ready, generates the results in an <answer></answer> segment.

This structure allows the model to invoke the search engine multiple times as it reasons about the problem and obtains new information (see example below).
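The interleaved reason-search-answer loop described above can be sketched as a simple inference harness. This is a minimal illustration, not the paper's implementation: `generate_until_stop` and `search_engine` are hypothetical stubs standing in for a real LLM and retriever, while the tag protocol (`<think>`, `<search>`, `<information>`, `<answer>`) follows the structure described in the paper.

```python
import re

def generate_until_stop(prompt):
    """Toy stand-in for the model: emits a search query on the first
    pass, then an answer once retrieved context is in the prompt."""
    if "<information>" not in prompt:
        return prompt + "<think>I need the capital.</think><search>capital of France</search>"
    return prompt + "<think>The retrieved passage says Paris.</think><answer>Paris</answer>"

def search_engine(query):
    """Toy retriever returning a fixed passage for the query."""
    return f"Top result for '{query}': Paris is the capital of France."

def search_r1_rollout(question, max_turns=4):
    """Alternate between generation and retrieval until the model
    emits an <answer> segment or the turn budget runs out."""
    context = question
    for _ in range(max_turns):
        context = generate_until_stop(context)
        answer = re.search(r"<answer>(.*?)</answer>", context, re.S)
        if answer:
            return answer.group(1), context
        query = re.search(r"<search>(.*?)</search>", context, re.S)
        if query:
            # Insert retrieved passages back into the context window.
            context += f"<information>{search_engine(query.group(1))}</information>"
    return None, context

answer, trace = search_r1_rollout("What is the capital of France?")
print(answer)  # → Paris
```

In a real system, generation would stop at the closing `</search>` tag, the query would go to an actual search engine, and the loop would continue with the retrieved passages appended to the context.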

Example of LLM reasoning with SEARCH-R1 (source: arXiv)

Reinforcement learning

Training LLMs to interleave search queries with their reasoning chain is challenging. To simplify the process, the researchers designed SEARCH-R1 to train the model through pure reinforcement learning (RL), where the model is left to explore the use of reasoning and search tools without guidance from human-generated data.

SEARCH-R1 uses an “outcome-based reward model,” in which the model is only evaluated based on the correctness of the final response. This eliminates the need for creating complex reward models that verify the model’s reasoning process.

This is the same approach used in DeepSeek-R1-Zero, where the model was given a task and only judged based on the outcome. The use of pure RL obviates the need to create large datasets of manually annotated examples (supervised fine-tuning).
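An outcome-based reward of this kind can be sketched in a few lines. This is an illustrative assumption about the scoring scheme (exact match on the final answer after normalization), not the paper's exact reward function: the whole rollout, searches and all, is scored solely on whether the `<answer>` segment matches the gold answer.

```python
import re

def outcome_reward(rollout, gold_answer):
    """Score a rollout only on its final answer; intermediate
    reasoning and search steps are never inspected."""
    match = re.search(r"<answer>(.*?)</answer>", rollout, re.S)
    if not match:
        return 0.0  # malformed rollout: no final answer emitted
    predicted = match.group(1).strip().lower()
    return 1.0 if predicted == gold_answer.strip().lower() else 0.0

rollout = ("<think>...</think><search>q</search>"
           "<information>...</information><answer>Paris</answer>")
print(outcome_reward(rollout, "Paris"))   # → 1.0
print(outcome_reward(rollout, "London"))  # → 0.0
```

Because the reward depends only on the final outcome, no annotator ever has to label which search queries or reasoning steps were "good," which is what makes training tractable at scale.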

“SEARCH-R1 can be viewed as an extension of DeepSeek-R1, which primarily focuses on parametric reasoning by introducing search-augmented RL training for enhanced retrieval-driven decision-making,” the researchers write in their paper.

SEARCH-R1 in action

The researchers tested SEARCH-R1 by fine-tuning the base and instruct versions of Qwen-2.5 and Llama-3.2 and evaluating them on seven benchmarks encompassing a diverse range of reasoning tasks requiring single-turn and multi-hop search. They compared SEARCH-R1 against different baselines: direct inference with Chain-of-Thought (CoT) reasoning, inference with RAG, and supervised fine-tuning for tool use.

SEARCH-R1 consistently outperforms baseline methods by a fair margin. It also outperforms reasoning models trained on RL but without search retrieval. “This aligns with expectations, as incorporating search into LLM reasoning provides access to relevant external knowledge, improving overall performance,” the researchers write.

SEARCH-R1 is also effective for different model families and both base and instruction-tuned variants, suggesting that RL with outcome-based rewards can be useful beyond pure reasoning scenarios. The researchers have released the code for SEARCH-R1 on GitHub.

SEARCH-R1’s ability to autonomously generate search queries and integrate real-time information into reasoning can have significant implications for enterprise applications. It can enhance the accuracy and reliability of LLM-driven systems in areas such as customer support, knowledge management, and data analysis. By enabling LLMs to dynamically adapt to changing information, SEARCH-R1 can help enterprises build more intelligent and responsive AI solutions. This capability is especially helpful for applications that need access to constantly changing data and require multiple steps to find an answer. 

It also suggests that we have yet to explore the full potential of the new reinforcement learning paradigm that has emerged since the release of DeepSeek-R1.


