DataGemma: Open models connecting LLMs to Google’s Data Commons


Large language models (LLMs) powering today’s AI innovations are becoming increasingly sophisticated. These models can comb through vast amounts of text and generate summaries, suggest new creative directions and even draft code. However, as impressive as these capabilities are, LLMs sometimes confidently present information that is inaccurate. This phenomenon, known as “hallucination,” is a key challenge in generative AI.

Today we’re sharing promising research advancements that tackle this challenge directly, helping reduce hallucination by anchoring LLMs in real-world statistical information. Alongside these research advancements, we are excited to announce DataGemma, the first open models designed to connect LLMs with extensive real-world data drawn from Google’s Data Commons.

Data Commons: A vast repository of publicly available, trustworthy data

Data Commons is a publicly available knowledge graph containing over 240 billion rich data points across hundreds of thousands of statistical variables. It sources this public information from trusted organizations like the United Nations (UN), the World Health Organization (WHO), the Centers for Disease Control and Prevention (CDC) and national census bureaus. Combining these datasets into one unified set of tools and AI models empowers policymakers, researchers and organizations seeking accurate insights.

Think of Data Commons as a vast, constantly expanding database filled with reliable, public information on a wide range of topics, from health and economics to demographics and the environment, which you can interact with in your own words using our AI-powered natural language interface. For example, you can explore which countries in Africa have had the greatest increase in electricity access, how income correlates with diabetes in US counties or your own data-curious query.

How Data Commons can help tackle hallucination

As generative AI adoption increases, we’re aiming to ground those experiences by integrating Data Commons within Gemma, our family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. These DataGemma models are available to researchers and developers starting now.

DataGemma will expand the capabilities of Gemma models by harnessing the knowledge of Data Commons to enhance LLM factuality and reasoning using two distinct approaches:

1. RIG (Retrieval-Interleaved Generation) enhances the capabilities of our language model, Gemma 2, by proactively querying trusted sources and fact-checking against information in Data Commons. When DataGemma is prompted to generate a response, the model is programmed to identify instances of statistical data and retrieve the answer from Data Commons. While the RIG methodology is not new, its specific application within the DataGemma framework is unique.
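To make the RIG idea concrete, here is a minimal sketch of the interleaving step. It is an illustration under stated assumptions, not DataGemma's actual implementation: the `[DC("question") -> "guess"]` marker format, the `ground` and `lookup` functions, and the in-memory `STAND_IN_STORE` (with a made-up value for a fictional place) are all hypothetical stand-ins for the model's annotated output and for a real Data Commons query.

```python
import re

# Hypothetical stand-in for Data Commons: a tiny in-memory store keyed by
# a normalized question. The value is invented for a fictional place.
STAND_IN_STORE = {
    "what is the unemployment rate in exampleland": "4.2%",
}

# Hypothetical marker the model is assumed to emit around a statistic:
#   [DC("natural-language question") -> "model's own guess"]
MARKER = re.compile(r'\[DC\("([^"]+)"\)\s*->\s*"([^"]*)"\]')


def normalize(question):
    """Lowercase the question and drop surrounding whitespace and a trailing '?'."""
    return question.strip().lower().rstrip("?").strip()


def lookup(question):
    """Return the stored value for a question, or None if the store has no answer."""
    return STAND_IN_STORE.get(normalize(question))


def ground(draft):
    """Replace each marked statistic with the retrieved value.

    If the store has no answer, keep the model's guess but flag it.
    """
    def replace(match):
        question, guess = match.group(1), match.group(2)
        value = lookup(question)
        if value is None:
            return f"{guess} (unverified)"
        return value

    return MARKER.sub(replace, draft)


draft = (
    'The unemployment rate in Exampleland is '
    '[DC("What is the unemployment rate in Exampleland?") -> "5%"].'
)
print(ground(draft))
```

In this sketch the model's guess is discarded whenever the store returns a value, and flagged as unverified otherwise, so a reader can tell which figures were actually retrieved.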


