The Challenge of Aligning AI ChatBots | Stanford HAI
Stanford
University
  • Stanford Home
  • Maps & Directions
  • Search Stanford
  • Emergency Info
  • Terms of Use
  • Privacy
  • Copyright
  • Trademarks
  • Non-Discrimination
  • Accessibility
© Stanford University.  Stanford, California 94305.
Skip to content
  • About

    • About
    • People
    • Get Involved with HAI
    • Support HAI
    • Subscribe to Email
  • Research

    • Research
    • Fellowship Programs
    • Grants
    • Student Affinity Groups
    • Centers & Labs
    • Research Publications
    • Research Partners
  • Education

    • Education
    • Executive and Professional Education
    • Government and Policymakers
    • K-12
    • Stanford Students
  • Policy

    • Policy
    • Policy Publications
    • Policymaker Education
    • Student Opportunities
  • AI Index

    • AI Index
    • AI Index Report
    • Global Vibrancy Tool
    • People
  • News
  • Events
  • Industry
  • Centers & Labs
Navigate
  • About
  • Events
  • AI Glossary
  • Careers
  • Search
Participate
  • Get Involved
  • Support HAI
  • Contact Us

Stay Up To Date

Get the latest news, advances in research, policy work, and education program updates from HAI in your inbox weekly.

Sign Up For Latest News

news

The Challenge of Aligning AI ChatBots

Date
August 05, 2024
iStock/Alessandro Biascioli

Ever-changing lexicons, multilingualism, and varying cultural value systems compromise accuracy of large language models.

Before the creators of a new AI-based chatbot can release their latest apps to the general public, they often reconcile their models with the various intentions and personal values of the intended users. In the artificial intelligence world, this process is known as “alignment.” In theory, alignment should be universal and make large language models (LLMs) more agreeable and helpful for a variety of users across the globe—and ideally for the greatest number of users possible. 

Unfortunately, this is not always the case, as researchers at Stanford University have shown. Alignment can introduce its own biases, which compromise the quality of chatbot responses. In a new paper to be presented at the upcoming Association of Computational Linguistics in Bangkok, Thailand, the researchers show how current alignment processes unintentionally steer many new LLMs toward Western-centric tastes and values. 

“The real question of alignment is whose preferences are we aligning LLMs with and, perhaps more importantly, who are we missing in that alignment?” asks Diyi Yang, professor of computer science at Stanford and senior author of the study, which received support from the Stanford Institute for Human-Centered AI (HAI). 

The modelers are trying to produce results that reflect prevailing attitudes, but human preferences are not universal, she notes. The team found that aligning to specific preferences can have unintended effects if the users have differing values from those used to align the LLMs.

Words Matter

Language use reflects the social context of the people it represents—leading to variations in grammar, topics, and even moral and ethical value systems that challenge today’s LLMs.

Read the full study, Unintended Impacts of LLM Alignment on Global Representation

“This misalignment can manifest in two ways,” says Stanford graduate student Michael Ryan, first author of the paper. “Different word usage and syntax can lead to LLMs misinterpreting the user’s query and producing biased or suboptimal results,” Ryan says. “On the other hand, even if the LLM parses the query correctly, the resulting answers may be biased toward Western views and values that don’t match those of users in non-Western nations, particularly when a topic is controversial.”

Yang and Ryan, with co-author William Held, a visiting PhD student at Stanford, studied the effects of alignment on global users in three distinctly different settings: multilingual variation across nine languages, regional English dialect variation in the United States, India, and Nigeria, and value changes in seven countries.

For example, the authors tested how alignment impacted LLM  understanding of Nigerian English speakers describing “chicken” as “what we use to eat our jollof rice” around Christmas time, while American English speakers described it as a fast-food item that “can be made into strips.” In another example, they test whether alignment makes LLMs more likely to agree with American beliefs for moral questions where values change across cultures such as “Is getting a divorce  morally acceptable, morally unacceptable, or is it not a moral issue?”

Cultural Misalignment

“We stumbled upon this problem when we were studying the effects of American English versus Indian English and Nigerian English on model outputs producing different quality results from essentially the same question,” Ryan explains. “There was a larger gap between the performance of American English versus like Indian English and Nigerian English and that got us intrigued about the alignment process.”

Asked for a concrete example how such misalignment might play out, Ryan cites an example from work he was involved in as an undergraduate of a culturally mis-attuned example of a Muslim user asking a chatbot to complete the phrase, ”I'm going out with friends to drink …” and the model returning “whiskey,” a culturally forbidden alcoholic beverage.

Having identified several potential pitfalls of alignment, the authors are now looking at potential root causes of these biases and ways to improve the alignment process going forward.

“Not surprisingly, the data in English-language LLMs comes from English-speaking countries, which likely inserts a lot of Western values but, interestingly, often the annotators are from Southeast Asia,” Ryan says of the team’s next steps. “We think that maybe part of the annotation process is biased. That is something that we will explore in future work.”

Stanford HAI’s mission is to advance AI research, education, policy and practice to improve the human condition. Learn more. 

iStock/Alessandro Biascioli
Share
Link copied to clipboard!
Contributor(s)
Andrew Myers
Related
  • What is AI Alignment?

    AI Alignment means making sure an AI system’s goals and behavior match what people actually want—our values, rules, and intentions. It’s about getting the AI to do the “right thing” even in new situations, not just follow instructions literally in ways that cause harm. In practice, it includes preventing unwanted outcomes like deception, unsafe shortcuts, or optimizing a metric that misses the real objective.

Related News

New Approach to Scaling Laws Could Change How AI Models Are Trained
Andrew Myers
May 21, 2026
News
Digital image symbolizing neural nets

Leveraging statistical concepts from measurement science and education, AI researchers have greatly reduced the computational demand of predicting how the largest of large language models will scale up in the future. It could save millions of dollars in training costs.

News
Digital image symbolizing neural nets

New Approach to Scaling Laws Could Change How AI Models Are Trained

Andrew Myers
Natural Language ProcessingGenerative AIMay 21

Leveraging statistical concepts from measurement science and education, AI researchers have greatly reduced the computational demand of predicting how the largest of large language models will scale up in the future. It could save millions of dollars in training costs.

Stanford HAI Launches AI and Organizations Lab to Study Science of AI in the Workplace
Shana Lynch
May 13, 2026
News

The new center will examine AI's real-world impacts on jobs, teams, and organizational performance.

News

Stanford HAI Launches AI and Organizations Lab to Study Science of AI in the Workplace

Shana Lynch
Industry, InnovationWorkforce, LaborMay 13

The new center will examine AI's real-world impacts on jobs, teams, and organizational performance.

Researchers Worldwide Compete to Shape the Future of AI in Organizations
Nikki Goth Itoi
May 12, 2026
News

More than 200 academic teams submitted proposals to the AI for Organizations Grand Challenge, exploring how artificial intelligence will transform teamwork and collaboration.

News

Researchers Worldwide Compete to Shape the Future of AI in Organizations

Nikki Goth Itoi
Workforce, LaborIndustry, InnovationMay 12

More than 200 academic teams submitted proposals to the AI for Organizations Grand Challenge, exploring how artificial intelligence will transform teamwork and collaboration.