Diyi Yang: Human-Centered Natural Language Processing Will Produce More Inclusive Technologies | Stanford HAI

Date: May 09, 2023
Topics: Natural Language Processing, Machine Learning

In her course called Human-Centered NLP, Yang challenges students to think beyond technical performance or accuracy.

Last year, OpenAI’s image generator, DALL·E 2, was released to rave reviews, quickly becoming the darling of the artificial intelligence world for its ability to generate original images from user prompts. But it didn’t take long before the internet discovered that, once again, an AI system was inheriting and reflecting society’s biases from the images in its training set. A prompt featuring the word “builder” produced images of only men, while a prompt with the words “flight attendant” produced images of only women. Anticipating this outcome, OpenAI had published a statement on the system’s Risks and Limitations, acknowledging that “DALL·E 2 additionally inherits various biases from its training data, and its outputs sometimes reinforce societal stereotypes.”

How could such biased outcomes be avoided? Diyi Yang, assistant professor in the Computer Science Department at Stanford and an affiliate of the Stanford Institute for Human-Centered AI, leads a course aimed at exactly these types of concerns in natural language processing systems. The course provides an overview of human-centered techniques and applications for NLP, ranging from human-centered design thinking to human-in-the-loop algorithms, fairness, and accessibility.

“We need to take a long-term view of the systems we’re building and continually improve the designs to ensure they meet the concerns of the people,” she says. “Performance or accuracy should not be the single dimension to optimize for when building NLP systems – how to make them personalized, easy to use, and accessible to different user groups are all important goals.”

In this interview, Yang discusses the risks of building NLP systems that aren’t human-centered, as well as the knowledge students in this course can expect to come away with. 

Your course, Human-Centered NLP, explores human-centered techniques to address concerns about NLP systems (e.g., bias, lack of user input). What does the term human-centered NLP mean?

The key is to prioritize human needs and preferences, rather than solely focusing on technological capabilities. Human-centered NLP involves designing and developing systems by taking into account human factors and the ethical and social implications of these systems. It seeks to create systems that are more user-friendly, accessible, and inclusive, and this view should be present at every stage of the design and development process: task formulation, data collection, data processing, model training, model evaluation, and deployment. Take the recent Reinforcement Learning from Human Feedback (RLHF) behind ChatGPT as one example. A human-centered view will motivate us to ask: What type of feedback would we like models to learn from, who is going to benefit from this type of task, how would such data points be collected and from whom, who will be included in the data processing and systems design, how can the models be trained to take into account diverse perspectives, who is in the right position to help evaluate the resulting systems, and finally, how would domain users use the system?

What are the risks of building NLP systems that are not human-centered? Where have we seen these risks materialize in real-world examples?

There are multiple risks if NLP systems are not built from a human-centered perspective.

First, we might end up with inaccurate results. If a system is trained on a limited dataset without considering possible user factors, such as gender, race, or culture, it might produce unreliable predictions, especially when we apply it to high-stakes tasks, such as medical imaging.

Second, a lack of human-centered development might lead to a poor user experience, for example when the system presents results that are not what users are looking for.

Third, it is very likely to lead to fairness issues. For example, if NLP models are mainly trained on Standard American English, they might not work well for users who speak a different preferred language at home or use an English dialect. Models trained on a biased dataset might propagate and amplify existing biases, leading to unfair predictions and discrimination toward certain groups.

What are examples where we didn’t consider the human in NLP design/development?

There have been a few recent examples in news headlines of a lack of human-centered design and development. For instance, Amazon’s AI recruiting tool, which didn’t take into account how such systems might work for different genders, showed bias against women. Another example is the racial disparity in how speech recognition systems handle dialects like African American English. Similarly, AI models for detecting hate speech are more likely to flag tweets written in African American English as “offensive” or “hateful.”

What are some positive examples of human-centered NLP?

I’ve seen some great examples of developers working with domain users (e.g., adults who are deaf or hard-of-hearing) to develop better text simplification systems, as well as the community-centric organization Masakhane that strengthens NLP research in African languages. 

GPT-3 is an example of a large language model that is designed to predict the next word rather than to answer the user’s question, which leaves it misaligned with user needs. What do you think would be a better design for LLMs, given technical constraints?

One of the components behind GPT-3 is the process of learning from human preferences – this is where misalignment might be more likely to occur. If we take a more human-centric perspective, the most straightforward question is “who is the human?” Whose preferences are we aligning to? Which culture and user groups are we aligning to? Who is the recipient of these results? When there are disagreements and conflicts in user preferences, which ones should we use? In terms of what goes into the system, who has final say? When diverse voices are not included in this process, misalignment, monoculture issues, and constraints may result.

What might the students in your class be surprised by or learn?

They may be surprised to learn we don’t have perfect answers to many of the questions we ask about human-centered NLP. We probably have more questions than solutions. But we cover a wide range of topics, including learning from human preferences, visualization, human-centered design principles, user-centered evaluation, human-AI collaboration, trust, and AI governance. Rather than rushing into model development too quickly, we want students to think about what could go wrong with the conventional approaches and what can be done to make them more human-centered. Teaching is not about filling the vase, but about lighting the fire. Since human-centered NLP is a new and emerging topic, I hope this course can open a window for students to learn about what types of technology would make a difference in the real world.

Stanford HAI’s mission is to advance AI research, education, policy and practice to improve the human condition.

Contributor(s)
Prabha Kannan
