What is AI Alignment? | Stanford HAI
What is AI Alignment?

AI alignment means ensuring that an AI system's goals and behavior match what people actually intend: our values, rules, and preferences. The aim is for the AI to do the "right thing" even in new situations, rather than follow instructions so literally that it causes harm. In practice, alignment work includes preventing unwanted outcomes such as deception, unsafe shortcuts, and optimizing a metric that misses the real objective.
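The last failure mode, optimizing a metric that misses the real objective, can be made concrete with a small sketch. The example below is hypothetical (the action names and scores are invented for illustration): a system that greedily maximizes a proxy metric, such as clicks, can select an action that scores poorly on the value its designers actually care about.

```python
def pick_action(actions, score):
    """Greedily pick the action with the highest score under a given metric."""
    return max(actions, key=score)

# Each candidate action maps to two numbers (illustrative, not real data):
#   "clicks"         - the proxy metric the system is optimized for
#   "reader_benefit" - the outcome the designers actually intended
actions = {
    "clickbait_headline": {"clicks": 9.0, "reader_benefit": 2.0},
    "accurate_headline":  {"clicks": 5.0, "reader_benefit": 8.0},
}

# What the proxy-optimizing system picks:
proxy_choice = pick_action(actions, score=lambda a: actions[a]["clicks"])

# What an aligned system, scoring the intended objective, would pick:
intended_choice = pick_action(actions, score=lambda a: actions[a]["reader_benefit"])

# The two choices diverge: the metric is satisfied, the intent is not.
print(proxy_choice, intended_choice)
```

The point of the sketch is that nothing in the optimizer is "broken": it maximizes exactly the metric it was given. Alignment is the separate problem of making that metric, and the behavior it induces, track what people actually want.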

AI Alignment mentioned at Stanford HAI

Explore Similar Terms

AI Safety | Responsible AI | Ethical AI

See Full List of Terms & Definitions

When AI Imagines a Tree: How Your Chatbot’s Worldview Shapes Your Thinking
Katie Gray Garrison
Jul 28

A new study on generative AI argues that addressing biases requires a deeper exploration of ontological assumptions, challenging the way we define fundamental concepts like humanity and connection.

The Challenge of Aligning AI ChatBots
Andrew Myers
Aug 05

Ever-changing lexicons, multilingualism, and varying cultural value systems compromise the accuracy of large language models.

