What is Reinforcement Learning? | Stanford HAI
© Stanford University.  Stanford, California 94305.
What is Reinforcement Learning?

Reinforcement Learning is a type of machine learning in which an AI agent improves its performance through trial and error, taking actions within an environment. The agent receives positive feedback (rewards) when it does something useful and negative feedback (penalties) when it makes a mistake, helping it discover the best strategy over time. Through this process, it can learn to master complex tasks such as winning games, navigating mazes, or teaching a robot to walk.
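The trial-and-error loop described above can be sketched with tabular Q-learning, one of the simplest reinforcement learning algorithms. The toy environment (a 5-cell corridor with a reward at the right end) and all hyperparameters below are illustrative assumptions, not part of the definition:

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = [-1, +1]    # move left or right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Q-table: estimated future reward for each (state, action) pair
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Apply an action; reward +1 only when the goal cell is reached."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit the best-known action,
            # occasionally explore a random one.
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: Q[(state, a)])
            nxt, reward, done = step(state, action)
            # Q-learning update: nudge the estimate toward
            # reward + discounted value of the best next action.
            best_next = max(Q[(nxt, a)] for a in ACTIONS)
            Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
            state = nxt

train()
# The greedy policy learned from rewards alone: which way to move in each cell.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

After training, the agent has learned from reward signals alone that moving right (+1) from every cell is the best strategy, without ever being told so explicitly.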

Reinforcement Learning mentioned at Stanford HAI

Explore Similar Terms:

Supervised Learning | Unsupervised Learning | Robotics

See Full List of Terms & Definitions

Policy-Shaped Prediction: Avoiding Distractions in Model-Based Reinforcement Learning
Nicholas Haber, Miles Huston, Isaac Kauvar
Dec 13
Research

Model-based reinforcement learning (MBRL) is a promising route to sample-efficient policy optimization. However, a known vulnerability of reconstruction-based MBRL consists of scenarios in which detailed aspects of the world are highly predictable, but irrelevant to learning a good policy. Such scenarios can lead the model to exhaust its capacity on meaningless content, at the cost of neglecting important environment dynamics. While existing approaches attempt to solve this problem, we highlight its continuing impact on leading MBRL methods, including DreamerV3 and DreamerPro, with a novel environment where background distractions are intricate, predictable, and useless for planning future actions. To address this challenge, we develop a method for focusing the capacity of the world model through the synergy of a pretrained segmentation model, a task-aware reconstruction loss, and adversarial learning. Our method outperforms a variety of other approaches designed to reduce the impact of distractors, and is an advance towards robust model-based reinforcement learning.
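The task-aware reconstruction idea in the abstract can be loosely illustrated as down-weighting reconstruction error on pixels a segmentation model marks as task-irrelevant. This toy function is a hedged sketch of that general principle only; the function name, the use of a simple squared error, and the relevance-mask interface are assumptions, not the paper's actual implementation:

```python
import numpy as np

def masked_recon_loss(pred, target, relevance):
    """Reconstruction loss that ignores distracting regions.

    pred, target: predicted and true observations (same shape).
    relevance: per-pixel weights in [0, 1], e.g. derived from a
    segmentation mask; 0 means the pixel is task-irrelevant, so
    errors there contribute nothing to the world model's loss.
    """
    return float(np.mean(relevance * (pred - target) ** 2))

# A 2x2 toy observation: the left column is task-relevant, the right is not.
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.zeros((2, 2))
relevance = np.array([[1.0, 0.0], [1.0, 0.0]])
print(masked_recon_loss(pred, target, relevance))  # errors in the right column are masked out
```

Under this weighting, the world model spends no capacity reproducing the masked (distractor) pixels, which is the intuition behind focusing model capacity on policy-relevant content.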


Machine Learning
Foundation Models
Lifelong Robotic Reinforcement Learning by Retaining Experiences
Annie Xie, Chelsea Finn
Mar 21
Research
Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach
Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
Nov 23
Research
Autonomous Reinforcement Learning via Subgoal Curricula
Archit Sharma, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn
Dec 24
Research