AI Experts Establish the “North Star” for Domestic Robotics Field

Date

November 08, 2021

A Stanford AI team creates benchmarks for 100 everyday household tasks for robot assistants, paving the way for more useful agents.

Robots that do everything from helping people get dressed in the morning to washing (and putting away) the dishes have been a dream for as long as people have uttered the words “artificial intelligence.” But in a field where the state of the art currently rests far short of that level of sophistication, a fundamental challenge has emerged: What will “success” even look like, should the day come when robots are able to perform these key tasks to human standards?

To do these mundane but surprisingly complex tasks, a robot must be able to perceive, reason, and operate with full awareness not only of its own physical dimensions and capabilities, but also of the world and objects around it. In robotics, this combination of situational and physical awareness and capability is known as embodied AI.

Now, a multidisciplinary team of researchers at Stanford University has released the Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments (BEHAVIOR). It is a catalog of the physical and intellectual details of 100 everyday household tasks — washing dishes, picking up toys, cleaning floors, etc. — and an implementation of those tasks in multiple simulated homes. A paper describing BEHAVIOR was recently accepted to the Conference on Robot Learning (CoRL).

BEHAVIOR pairs a set of realistic, varied, and complex activities with a new logical and symbolic language, a fully functional 3-D simulator with a virtual reality interface, and a set of success metrics drawn from the performance of humans doing the same tasks in virtual reality. Taken as a whole, BEHAVIOR delivers a breadth of tasks and a level of detail in the description of each task that were previously unavailable in AI.

“While any one of those tasks is already highly complex in its own right, imagine the challenge of creating a single robot that can do all of these things,” says Jiajun Wu, assistant professor of computer science and a senior author on the paper. “Creating these benchmarks now, before the field has evolved too far, will help to set up potential common goals for the community.”

A Monumental Task

Imagine the multiple problems a robot has to overcome to achieve a simple task like cleaning a countertop. The robot not only has to perceive and understand what a countertop is, where to find it, that it needs cleaning, and the counter’s physical dimensions, but also what tools and products are best used to clean it and how to coordinate its motions to get it clean. The robot would then have to determine the best course of action, step by step, needed to clean the counter. The task even requires a complex understanding of things humans think nothing of, such as what tools or materials are “soakable” and how to detect and declare a countertop “clean.” In BEHAVIOR, this level of complexity is achieved in 100 activities performed in multiple simulated houses.

Each of these steps (navigation, search, grasping, cleaning, evaluating) may require hours or even days of training in simulation to be learned — far beyond the capabilities of current autonomous robots.

“Deciding the best way to achieve a goal based on what the robot perceives and knows about the environment and about its own capabilities is an important aspect in BEHAVIOR,” says Roberto Martin-Martin, a postdoctoral scholar in computer science who worked on the planning aspects of the benchmark. “It requires not only an understanding of the environment and what needs to be done, but in what order they need to be done to achieve a task. All this for 100 tasks in different environments!”
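The countertop example can be made concrete with a small sketch. It is only an illustration of how symbolic properties (such as “soakable”), symbolic states, and a logical goal condition (“clean”) might fit together, and of why the order of steps matters. Every name here (`SimObject`, `soak`, `wipe`, `goal_satisfied`, the "dusty" and "soaked" states) is hypothetical and is not taken from BEHAVIOR's own task language or simulator.

```python
from dataclasses import dataclass, field


@dataclass
class SimObject:
    """A hypothetical simulated object (not a BEHAVIOR class)."""
    name: str
    properties: set = field(default_factory=set)  # fixed traits, e.g. "soakable"
    states: set = field(default_factory=set)      # changeable states, e.g. "dusty"


def soak(tool: SimObject) -> None:
    """Soak a tool; only objects tagged 'soakable' can be soaked."""
    if "soakable" not in tool.properties:
        raise ValueError(f"{tool.name} is not soakable")
    tool.states.add("soaked")


def wipe(surface: SimObject, tool: SimObject) -> None:
    """Wiping removes dust only if the tool has already been soaked."""
    if "soaked" in tool.states:
        surface.states.discard("dusty")


def goal_satisfied(surface: SimObject) -> bool:
    """Assumed symbolic goal: the surface is 'clean' when it is not dusty."""
    return "dusty" not in surface.states


rag = SimObject("rag", properties={"soakable"})
counter = SimObject("countertop", states={"dusty"})

wipe(counter, rag)                        # wrong order: the rag is still dry
clean_before_soaking = goal_satisfied(counter)

soak(rag)
wipe(counter, rag)                        # right order: soak, then wipe
clean_after_soaking = goal_satisfied(counter)

print(clean_before_soaking, clean_after_soaking)  # prints: False True
```

Even in this toy version, the goal is judged by the resulting state of the world rather than by the specific motions used, and the same actions succeed or fail depending on the order in which they are carried out.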

Sim to Real

In creating the BEHAVIOR benchmark, the team, led by Stanford Institute for Human-Centered AI co-director and computer scientist Fei-Fei Li, together with experts from computer science, psychology, and neuroscience, has established a “North Star,” a visual reference point by which to gauge the success of future AI solutions. The benchmark might also be used to develop and train robotic assistants in virtual environments that are then migrated to operate in physical ones, a paradigm known in the field as “sim to real.”

“Making this leap from simulation to the real world is a non-trivial thing, but there have been a lot of promising results in training robots in simulation and then putting that same algorithm into a physical robot,” says co-author Sanjana Srivastava, a doctoral candidate in computer science who specializes in the task definition aspects of the benchmark.

“I got involved specifically to see how far we can push simulation technology,” says co-author Michael Lingelbach, a doctoral candidate in neuroscience. “Sim to real is a big area in robotic research and one we’d like to see develop more fully. Working with a simulator is just a much more accessible way to approach robotics.”

Next up, the BEHAVIOR team hopes to provide initial solutions to the benchmark while extending it with new tasks not currently benchmarked. According to the team, that effort will require contributions from the entire field: robotics, computer vision, computer graphics, and cognitive science. Other researchers are invited to try their own solutions; to that end, the current version of BEHAVIOR is open-source and publicly available at behavior.stanford.edu.

“If you think about these one hundred activities at the level of detail we provide, you begin to comprehend how difficult — and important — benchmarking is,” says co-author Chengshu Li, a doctoral candidate in computer science. “In that regard, BEHAVIOR is not final. We will continue to iterate and add new tasks to our list.”



Contributor(s)

Andrew Myers


