The First Workshop of a Public AI Assistant to World Wide Knowledge (WWK) | Stanford HAI

The First Workshop of a Public AI Assistant to World Wide Knowledge (WWK)

Status
Past
Date
Thursday, February 13, 2025 - Friday, February 14, 2025
Location
Location details provided by invitation only
Event Contact
Patrick Hynes
phynes@stanford.edu
Related Links
  • Open Virtual Assistant Lab
  • Podcast: Monica Lam from Stanford University's Open Virtual Assistant Lab

February 14, 2025

Tutorial: Transforming LLMs into Reliable Knowledge Assistants

Large Language Models (LLMs) offer incredible potential for information retrieval but can sometimes produce inaccurate details. This tutorial will introduce advanced techniques for building virtual assistants that effectively navigate and analyze diverse knowledge sources, including free-text documents, structured databases, knowledge graphs, and even printed text.

You'll see demonstrations of Stanford's Genie technology, showcasing how it writes comprehensive articles by researching the internet for over 300,000 consumers, helps historians study the 19th-century African Times corpus, assists journalists with Federal Election Commission data, and enables users to extract information from Wikidata, the world's largest knowledge graph. 

Discover how to harness LLMs to create trustworthy and efficient knowledge assistants for various informational needs on your own knowledge corpus.


8:30 am – 9:00 am PST

Breakfast


9:00 am - 9:10 am PST

Overview of the Genie knowledge assistant

The Genie assistant is designed to work with any corpus of data, consisting of 

  • Images of printed material

  • Free text

  • Databases

  • Knowledge graphs

Knowledge access functions include: 

  • Browse

  • Semantic search

  • Interactive chat with near-zero hallucination

  • Answer queries over both free-text and structured data

  • Write comprehensive research articles with fine-grained citations

Genie can also:

  • Perform interactive tasks under developer control

  • Extract knowledge graph from a document set

  • Detect inconsistencies in existing corpora


9:10 am - 9:50 am PST

Given a free-text corpus of any size and in any language (supported by LLMs), Genie automatically allows users to browse, search, and chat with the corpus. To minimize hallucinations, Genie employs a hybrid Retrieval-Augmented Generation (RAG) strategy, where LLM-generated text is filtered claim-by-claim against the corpus and integrated into an enhanced retrieval pipeline.
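The claim-by-claim filtering step can be illustrated with a small sketch. This is not Genie's implementation: the sentence splitter and the word-overlap check below are toy stand-ins (assumptions) for the LLM-based claim extraction and verification the paragraph describes, and the names `split_into_claims`, `is_supported`, and `filter_claims` are hypothetical.

```python
import re

def split_into_claims(text):
    """Toy claim splitter: treat each sentence as one claim."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokens(text):
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def is_supported(claim, passages, threshold=0.8):
    """Toy check: a claim is supported if most of its words occur in one passage.
    A real system would ask an LLM whether the passage entails the claim."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return False
    return any(
        len(claim_tokens & tokens(p)) / len(claim_tokens) >= threshold
        for p in passages
    )

def filter_claims(draft, passages):
    """Keep only the claims in an LLM draft that the retrieved passages support."""
    return [c for c in split_into_claims(draft) if is_supported(c, passages)]

# Hypothetical retrieved passage and LLM draft.
corpus = ["The African Times was a newspaper published in the late 19th century."]
draft = ("The African Times was a newspaper published in the late 19th century. "
         "It was printed on the moon.")
kept = filter_claims(draft, corpus)
print(kept)
```

Only the first sentence survives the filter, because the second has no support in the retrieved passage.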

Genie can handle historical newspaper scans with complex page layouts where an article is scattered across different columns and pages.

Case Studies:

  • Wikipedia in 25 languages: Genie achieves 98% factual accuracy in conversations with users about recent topics, compared to 43% with GPT-4. This research won the Wikimedia Foundation Research Award of the Year. (https://wikichat.genie.stanford.edu, https://search.genie.stanford.edu)

  • The African Times: A newspaper published in the late 19th century by the African diaspora (obtained from The British Museum). (https://history.genie.stanford.edu)

  • Chronicling America: A collection of over 200 years of American historical newspapers (obtained from the Library of Congress).

Reference: https://arxiv.org/abs/2305.14292


9:50 am - 10:20 am PST

Genie can take any topic and write a full article from a given corpus, complete with references. Genie simulates users with different perspectives who ask pertinent questions on the topic, retrieve information, review it, and ask further questions. Users can interact with the simulated experts in a roundtable discussion to explore their topics of interest; this new paradigm is shown to be much more effective than simple question answering.
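The ask, retrieve, review, and ask-again loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the Genie pipeline: `toy_ask` stands in for an LLM that writes questions from a given perspective, retrieval is a crude keyword match, and all names and the two-document corpus are hypothetical.

```python
import re

def words(text):
    """Lowercased words longer than four letters, used as crude search keys."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 4}

def research(topic, perspectives, corpus, ask, rounds=2):
    """Let each simulated perspective ask questions in turn and record which
    documents answer them, as (perspective, question, doc_id) notes."""
    notes = []
    for perspective in perspectives:
        history = []  # earlier (question, hits) pairs, visible to follow-up questions
        for _ in range(rounds):
            question = ask(perspective, topic, history)
            hits = sorted(
                doc_id for doc_id, text in corpus.items()
                if words(question) & words(text)
            )
            history.append((question, hits))
            notes.extend((perspective, question, doc_id) for doc_id in hits)
    return notes

def toy_ask(perspective, topic, history):
    """Stand-in for an LLM call: an opening question, then a follow-up."""
    if not history:
        return f"How does {topic} affect {perspective}?"
    return f"What evidence about {perspective} is missing so far?"

corpus = {
    "doc1": "Solar power lowers household energy costs.",
    "doc2": "Panel manufacturing can harm the environment.",
}
notes = research("solar power", ["household costs", "the environment"], corpus, toy_ask)
for note in notes:
    print(note)
```

The collected notes keep the document identifier next to each question, which is what lets a later writing step attach references to the article.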

Case studies: An internet subset (as approved by Wikipedia); Semantic Scholar. Over 300,000 users have requested 500,000 topics, from lifestyles to scientific research. (https://storm.genie.stanford.edu)

References: https://arxiv.org/abs/2402.14207; https://arxiv.org/abs/2408.15232


10:20 am - 10:50 am PST

Break


10:50 am - 11:20 am PST

Answering Queries of Structured and Unstructured Data

Knowledge corpora often consist of a combination of free text and structured data. When given free-text documents and schemas for the structured data, Genie accepts natural language queries and automatically retrieves the data from the hybrid data sources. Genie translates users’ natural language requests into formal queries over tabular, graph, and hybrid databases. It leverages SQL, SPARQL, and the novel SUQL language, which extends SQL to integrate information retrieval for free text with relational database concepts. The Genie agent combines evaluation of intermediate queries with navigation of the knowledge graph to generate the final query.
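The idea of mixing a free-text predicate into a relational query can be sketched with Python's built-in `sqlite3` module. This is not SUQL and does not use its syntax: the `text_matches` function is a hypothetical keyword check standing in for the LLM-backed free-text operators the paragraph describes, and the table and rows are invented for illustration.

```python
import sqlite3

def text_matches(text, keyword):
    """Toy free-text predicate: 1 if the keyword occurs in the text, else 0.
    A real system would answer a natural language question over the text."""
    return int(keyword.lower() in (text or "").lower())

conn = sqlite3.connect(":memory:")
# Register the Python function so it can be called from SQL.
conn.create_function("text_matches", 2, text_matches)

conn.execute("CREATE TABLE restaurants (name TEXT, city TEXT, reviews TEXT)")
conn.executemany(
    "INSERT INTO restaurants VALUES (?, ?, ?)",
    [
        ("Cafe A", "Palo Alto", "Quiet and family-friendly, with a kids menu."),
        ("Bar B", "Palo Alto", "Loud late-night spot."),
        ("Diner C", "San Jose", "Very family-friendly."),
    ],
)

# One query combines a structured filter (city) with a free-text filter (reviews).
rows = conn.execute(
    "SELECT name FROM restaurants "
    "WHERE city = ? AND text_matches(reviews, ?) ORDER BY name",
    ("Palo Alto", "family-friendly"),
).fetchall()
print(rows)
```

Keeping both filters in a single query lets the database engine apply the cheap structured condition alongside the free-text one, rather than post-filtering results in application code.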

Case studies:

  • Deployed at Wikidata (https://spinach.genie.stanford.edu/)

  • Deployed at Knight Election Hub (Federal Election Commission data) (https://datatalk.genie.stanford.edu/)

References: https://arxiv.org/abs/2407.11417; https://arxiv.org/abs/2311.09818


11:20 am - 11:50 am PST

Performing Interactive Tasks under Developer Control

Many interactive tasks, from customer support to business operations, require agents that can provide accurate knowledge and perform specific functions; direct prompting of LLMs tends to fail on uncommon inputs and complex tasks. Genie combines LLMs with a new Genie Worksheet system that gives developers full control over the agent's actions.
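One way to picture developer control is a declarative form that decides what the agent asks next and when an action may run. The sketch below is a hypothetical illustration of that idea only; it does not reproduce the Genie Worksheet API, and the `Worksheet` class, its methods, and the course-enrollment fields are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Worksheet:
    """A developer-defined form: the agent may only act once every field is filled."""
    required: list
    values: dict = field(default_factory=dict)

    def fill(self, name, value):
        """Record a value extracted from the conversation; reject unknown fields."""
        if name not in self.required:
            raise KeyError(f"unknown field: {name}")
        self.values[name] = value

    def next_question(self):
        """Return a prompt for the first missing field, or None when complete."""
        for name in self.required:
            if name not in self.values:
                return f"Please provide: {name}"
        return None

    def is_complete(self):
        return self.next_question() is None

# Hypothetical course-advisor task: enroll a student in a course.
sheet = Worksheet(required=["student_id", "course", "term"])
sheet.fill("course", "CS 224V")
print(sheet.next_question())   # asks for student_id, the first missing field
sheet.fill("student_id", "12345")
sheet.fill("term", "Winter")
print(sheet.is_complete())
```

Because the order of questions and the completion condition live in the form rather than in a prompt, the agent's behavior on unusual inputs stays predictable.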

Case studies: Course advisor; writing research grants (https://ws.genie.stanford.edu/)

Reference: https://arxiv.org/abs/2407.05674

Monica Lam
Professor of Computer Science, and, by courtesy, of Electrical Engineering
Sina Semnani