Wolfgang Lehrach | Code World Models for General Game Playing | Stanford HAI
Seminar

Status
Upcoming
Date
Wednesday, May 13, 2026 12:00 PM - 1:15 PM PDT
Location
353 Jane Stanford Way, Stanford, CA, 94305 | Room 119
Topics
Machine Learning
Natural Language Processing
Attend Virtually

While Large Language Models (LLMs) show promise in many domains, relying on them for direct policy generation in games often results in illegal moves and poor strategic play.

Event Contact
Stanford HAI
stanford-hai@stanford.edu

In this talk, I present an approach that moves away from direct prompting and instead uses LLMs as program synthesizers to bridge the gap between natural-language rules and symbolic world models. The LLM receives a game description and example trajectories, and outputs an executable, symbolic Code World Model (CWM) written in Python. The trajectories also serve to check that the rules have been captured correctly, and they guide refinement of the CWM when they have not. Even trajectories containing only a single player's observations and actions can be used to validate and refine CWMs. Partially observed trajectories additionally allow CWMs to be compared via a bound on the likelihood.
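
To make the idea of checking a world model against trajectories concrete, here is a minimal sketch. It is not the speaker's code: the `NimCWM` class, its method names, and the `validate` helper are all hypothetical, and the model is hand-written as a stand-in for what an LLM might synthesize. It assumes a fully observed single-pile Nim game in which players alternately remove one to three stones and whoever takes the last stone wins.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class State:
    pile: int    # stones remaining
    player: int  # player to move: 0 or 1


class NimCWM:
    """Hypothetical, hand-written stand-in for an LLM-synthesized world model."""

    def __init__(self, start_pile: int = 7, max_take: int = 3):
        self.start_pile = start_pile
        self.max_take = max_take

    def initial_state(self) -> State:
        return State(self.start_pile, 0)

    def legal_actions(self, s: State) -> List[int]:
        # A move removes between 1 and max_take stones, never more than remain.
        return list(range(1, min(self.max_take, s.pile) + 1))

    def apply_action(self, s: State, a: int) -> State:
        if a not in self.legal_actions(s):
            raise ValueError(f"illegal action {a} in state {s}")
        return State(s.pile - a, 1 - s.player)

    def is_terminal(self, s: State) -> bool:
        return s.pile == 0

    def returns(self, s: State) -> Tuple[float, float]:
        # The player who just moved took the last stone and wins.
        winner = 1 - s.player
        return (1.0, -1.0) if winner == 0 else (-1.0, 1.0)

    def observation(self, s: State) -> int:
        return s.pile


def validate(cwm: NimCWM, trajectory: List[Tuple[int, int]]) -> List[str]:
    """Replay (action, observed_pile) pairs and report the first mismatch, if any."""
    errors: List[str] = []
    s = cwm.initial_state()
    for step, (action, observed) in enumerate(trajectory):
        if action not in cwm.legal_actions(s):
            errors.append(f"step {step}: action {action} is illegal in the model")
            break
        s = cwm.apply_action(s, action)
        if cwm.observation(s) != observed:
            errors.append(
                f"step {step}: model predicts {cwm.observation(s)}, trajectory shows {observed}"
            )
            break
    return errors


if __name__ == "__main__":
    trajectory = [(3, 4), (1, 3), (3, 0)]             # (action, pile after the move)
    print(validate(NimCWM(), trajectory))             # model consistent with the data
    print(validate(NimCWM(max_take=2), trajectory))   # model with a wrong rule
```

In a refinement loop of the kind the abstract describes, messages like those returned by `validate` could plausibly be fed back to the LLM as evidence of which rule was misread; that feedback step is an assumption here and is not shown.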


Given a CWM, Monte Carlo Tree Search (MCTS) or Reinforcement Learning (RL) methods can play the game, and gameplay can be further improved by adding LLM-synthesized value functions. Imperfect-information games are handled either by having the LLM synthesize inference functions that impute information sets, or by training RL policies directly on top of the CWM.
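
As a rough illustration of planning on top of an executable world model, the sketch below runs a plain UCT-style MCTS with random rollouts. It is an assumption-laden toy, not the system described in the talk: the four game functions are a hypothetical stand-in for a synthesized CWM (single-pile Nim, take one to three stones, last stone wins), the game is perfect-information, and no synthesized value function is used.

```python
import math
import random

# Hypothetical stand-in for a synthesized CWM. State is (pile, player_to_move).
def legal_actions(state):
    pile, _ = state
    return list(range(1, min(3, pile) + 1))

def apply_action(state, action):
    pile, player = state
    return (pile - action, 1 - player)

def is_terminal(state):
    return state[0] == 0

def winner(state):
    # The player who just moved took the last stone.
    return 1 - state[1]


class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = {}  # action -> Node
        self.visits = 0
        self.wins = 0.0     # wins for the player who moved INTO this node


def uct_child(node, c=1.4):
    # Choose the child with the highest UCB1 score.
    return max(
        node.children.values(),
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def rollout(state, rng):
    # Play uniformly random moves until the game ends.
    while not is_terminal(state):
        state = apply_action(state, rng.choice(legal_actions(state)))
    return winner(state)


def mcts(root_state, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the current node is fully expanded.
        while (not is_terminal(node.state)
               and len(node.children) == len(legal_actions(node.state))):
            node = uct_child(node)
        # 2. Expansion: add one untried action, if any.
        if not is_terminal(node.state):
            untried = [a for a in legal_actions(node.state) if a not in node.children]
            action = rng.choice(untried)
            child = Node(apply_action(node.state, action), parent=node)
            node.children[action] = child
            node = child
        # 3. Simulation: random playout using only the world model.
        won_by = rollout(node.state, rng)
        # 4. Backpropagation: credit each node from its mover's perspective.
        while node is not None:
            node.visits += 1
            if won_by == 1 - node.state[1]:
                node.wins += 1.0
            node = node.parent
    # Recommend the most-visited root action.
    return max(root.children, key=lambda a: root.children[a].visits)


if __name__ == "__main__":
    print(mcts((5, 0)))
```

In this Nim variant the known winning strategy is to leave the opponent a multiple of four stones, so from a pile of five the search should settle on taking one stone. A learned or synthesized value function, as mentioned above, would replace or supplement the `rollout` step.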

Speaker
Wolfgang Lehrach
Staff Research Scientist, DeepMind