Whose History? AI Uncovers Who Gets Attention in High School Textbooks | Stanford HAI

Date: November 17, 2020
Topics: Education, Skills; Natural Language Processing
Photo: REUTERS/Jim Young

Natural language processing reveals huge differences in how Texas history textbooks treat men, women, and people of color.

Harnessing the power of machine learning, Stanford University researchers have measured just how much more attention some high school history textbooks pay to white men than to Black people, ethnic minorities, and women.

In a new study of American history textbooks used in Texas, the researchers found remarkable disparities.

Hispanic students make up 52 percent of enrollments in Texas schools, for example, but Hispanic people received almost no mention at all in any of the textbooks — less than one-quarter of one percent of people who were mentioned by name.

By contrast, all but five of the 50 most-mentioned individuals were white men. Only one woman made that list — Eleanor Roosevelt — and only four people of color. Former president Barack Obama came in at 29th, Martin Luther King came in 30th, followed by Dred Scott and Frederick Douglass. Andrew Jackson, a slaveowner who contributed mightily to the genocide of Native Americans, got more mentions than anyone else.
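The mention-count rankings above boil down to tallying how often each named figure appears across a corpus. The toy sketch below (invented passages and names, not the study's actual pipeline, which would also need coreference resolution and entity linking to catch pronouns and name variants) shows the basic idea:

```python
from collections import Counter

# Hypothetical textbook passages and a hypothetical name list,
# purely for illustration of the counting step.
PASSAGES = [
    "Andrew Jackson expanded federal power. Jackson also ...",
    "Eleanor Roosevelt championed human rights.",
    "Frederick Douglass wrote about his escape from slavery.",
]
NAMES = ["Jackson", "Eleanor Roosevelt", "Frederick Douglass"]

def mention_counts(passages, names):
    """Tally raw surface-string mentions of each name."""
    counts = Counter()
    for text in passages:
        for name in names:
            counts[name] += text.count(name)
    return counts

print(mention_counts(PASSAGES, NAMES).most_common(1))
# → [('Jackson', 2)]
```

Even this crude version reproduces the shape of the finding: a ranked list of individuals by attention received, from which over- and under-representation can be read off.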

Those are just the top-line numbers. Using the tools of natural language processing, or NLP, the researchers also quantified differences in how various groups were characterized.

White men were more likely to be associated with words denoting power, while women were more likely to be associated with marriage and families. African Americans were most likely to be associated with words of powerlessness and persecution, rather than with political action or government.

“Even for people who grew up with these textbooks, these patterns are surprising,” said Dorottya Demszky, a PhD candidate in linguistics who co-initiated the project. “We hope that this kind of quantification can become a tool for developing textbooks that are more representative.”

Exposing the Patterns, Faster

To be sure, it’s no secret that textbooks are shaped by the priorities and prejudices of the people in power. As recently as the mid-20th century, many southern schools taught that the Civil War was primarily about states’ rights rather than slavery. Indeed, educators have been scouring textbooks for decades to measure prejudice and distortions.

NLP models, the researchers say, can be useful new tools in that effort. Because the AI models read every word and parse every sentence, they can provide more holistic, nuanced, and reliable measures of possible under- and over-representation of different groups.

The Stanford researchers analyzed 18 American history textbooks that Texas school districts used from 2015 through 2017, applying an array of natural language processing techniques. These included neural network models that quantify subtle implicit associations, as well as linguistic databases that help infer the connotations that arise from particular word combinations.
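One classic way to quantify which words disproportionately co-occur with mentions of a group, much simpler than, but in the spirit of, the techniques described above, is a smoothed log-odds ratio. The sentences and group labels below are invented for illustration; this is a sketch of the general measure, not the authors' implementation:

```python
import math
from collections import Counter

# Hypothetical (group, text) pairs: each text is a context in which
# a member of the group was mentioned.
SENTENCES = [
    ("women", "she married and raised a family"),
    ("women", "her household chores"),
    ("men", "he led the army to victory"),
    ("men", "his authority and power"),
]

def log_odds(sentences, group_a, group_b, alpha=0.5):
    """Smoothed log-odds of each word appearing with group_a vs group_b.

    Positive scores skew toward group_a, negative toward group_b;
    alpha is an additive smoothing constant.
    """
    counts = {group_a: Counter(), group_b: Counter()}
    for group, text in sentences:
        counts[group].update(text.split())
    vocab = set(counts[group_a]) | set(counts[group_b])
    totals = {g: sum(c.values()) for g, c in counts.items()}
    scores = {}
    for w in vocab:
        a = counts[group_a][w] + alpha
        b = counts[group_b][w] + alpha
        scores[w] = (math.log(a / (totals[group_a] + alpha * len(vocab) - a))
                     - math.log(b / (totals[group_b] + alpha * len(vocab) - b)))
    return scores

scores = log_odds(SENTENCES, "women", "men")
# Words like "married" score positive (skew toward the first group);
# words like "authority" score negative.
```

On a real corpus this kind of score surfaces exactly the contrasts the study reports: which words attach to which groups, and how strongly.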

Demszky, who works with Dan Jurafsky, a professor of both linguistics and computer science at Stanford and a member of the Stanford Institute for Human-Centered Artificial Intelligence, teamed up with Lucy Li, a PhD student in natural language processing at UC Berkeley, and Patricia Bromley, an associate professor of education at Stanford.

“There’s been a real shift in natural language processing toward more socially aware models,” says Jurafsky. “These can help uncover the implicit ways that people are described in text and do better at capturing the human and social element that’s so important in language.”

The most dramatic finding in the Texas history project was the virtual absence of Hispanic people, who received almost no attention outside of the Mexican-American War. Women fared better, but they too were discussed far less frequently than men.

When the researchers looked at the words associated with different groups, they found that Black people were connected much more than other groups to words of powerlessness like “slave,” “escaped,” “owned,” and “barred.” That’s at odds, the researchers say, with new historical research that emphasizes the power and agency of Black people in fighting for their freedom. Women, meanwhile, were connected with verbs like “marry” and “help” and nouns like “wife” and “mother.” “Household,” “home,” and “chores” were among the other top words associated with women. White men, on the other hand, were most closely associated with words like “leader,” “authority,” “achievement,” and “powerful.”

The researchers found similar disparities when they looked at the topics associated with different groups. Women were associated with two topics — family and social movements. Men were associated with family as well, but also with the military and decision making.

Perhaps not surprisingly, the researchers found that the political leaning of school districts affected the kinds of textbooks used. The textbooks in counties dominated by Democratic voters paid somewhat more attention to the role of women than the textbooks used in Republican counties. None of the textbooks, however, paid any real attention to Hispanic people.

The researchers caution that the study doesn’t draw conclusions about why some textbooks give more attention to white men than to women and minorities, nor about what the appropriate levels are. Though old prejudices are part of the explanation, another possibility is that many textbooks focus on formal political events and leaders rather than on the lives of people.

The key takeaway, says Demszky, is that machine learning and natural language processing offer new opportunities to understand what textbooks are truly teaching and to make them more broadly relevant.

“This has the potential to improve textbooks so that students feel more represented and take a more critical view of history,” she says. “If our paper can be a step toward making that happen, I’d be very, very happy.”

Stanford HAI’s mission is to advance AI research, education, policy and practice to improve the human condition.

Contributor(s): Edmund L. Andrews
