Whose History? AI Uncovers Who Gets Attention in High School Textbooks

Date

November 17, 2020

REUTERS/Jim Young

Natural language processing reveals huge differences in how Texas history textbooks treat men, women, and people of color.

Harnessing the power of machine learning, Stanford University researchers have measured just how much more attention some high school history textbooks pay to white men than to Black people, ethnic minorities, and women.

In a new study of American history textbooks used in Texas, the researchers found remarkable disparities.

Hispanic students make up 52 percent of enrollments in Texas schools, for example, but Hispanic people received almost no mention at all in any of the textbooks — less than one-quarter of one percent of people who were mentioned by name.

By contrast, all but five of the 50 most-mentioned individuals were white men. Only one woman made that list — Eleanor Roosevelt — and only four people of color. Former president Barack Obama came in at 29th, Martin Luther King came in 30th, followed by Dred Scott and Frederick Douglass. Andrew Jackson, a slaveowner who contributed mightily to the genocide of Native Americans, got more mentions than anyone else.

Those are just the top-line numbers. Using the tools of natural language processing, or NLP, the researchers also quantified differences in how various groups were characterized.

White men were more likely to be associated with words denoting power, while women were more likely to be associated with marriage and families. African Americans were most likely to be associated with words of powerlessness and persecution, rather than with political action or government.

“Even for people who grew up with these textbooks, these patterns are surprising,” said Dorottya Demszky, a PhD candidate in linguistics who co-initiated the project. “We hope that this kind of quantification can become a tool for developing textbooks that are more representative.”

Exposing the Patterns, Faster

To be sure, it’s no secret that textbooks are shaped by the priorities and prejudices of the people in power. As recently as the mid-20th century, many southern schools taught that the Civil War was primarily about states’ rights rather than slavery. Indeed, educators have been scouring textbooks for decades to measure prejudice and distortions.

NLP models, the researchers say, can be useful new tools in that effort. Because the AI models read every word and parse every sentence, they can provide more holistic, nuanced, and reliable measures of possible under- and over-representation of different groups.

The Stanford researchers analyzed 18 American history textbooks that Texas school districts used from 2015 through 2017, applying an array of natural language processing techniques. These included neural network models that quantify subtle implicit associations, as well as linguistic databases that help infer the connotations that arise from particular word combinations.
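As a rough illustration of the simplest measurement described here, how often named individuals are mentioned and what share of mentions each group receives, the sketch below counts names in a short made-up passage. It is not the researchers' pipeline: the `NAME_GROUPS` lexicon, the `count_mentions` and `group_shares` helpers, and the sample text are all hypothetical, and a real study would need named-entity recognition and coreference handling rather than exact string matching.

```python
import re
from collections import Counter

# Hypothetical toy lexicon mapping names to group labels (an assumption for
# illustration, not data from the study).
NAME_GROUPS = {
    "Andrew Jackson": "white men",
    "Eleanor Roosevelt": "women",
    "Frederick Douglass": "people of color",
}


def count_mentions(text, names):
    """Count whole-phrase occurrences of each name in the text."""
    counts = Counter()
    for name in names:
        pattern = r"\b" + re.escape(name) + r"\b"
        counts[name] = len(re.findall(pattern, text))
    return counts


def group_shares(counts, name_groups):
    """Return each group's fraction of all named mentions."""
    totals = Counter()
    for name, n in counts.items():
        totals[name_groups[name]] += n
    total = sum(totals.values())
    if total == 0:
        return {}
    return {group: n / total for group, n in totals.items()}


# Made-up sample passage, used only to exercise the functions.
sample = (
    "Andrew Jackson signed the act. Andrew Jackson later left office. "
    "Eleanor Roosevelt spoke. Frederick Douglass wrote."
)
counts = count_mentions(sample, NAME_GROUPS)
shares = group_shares(counts, NAME_GROUPS)
```

Run over full textbooks with a much larger lexicon, the same kind of share calculation would yield figures comparable to the "one-quarter of one percent" statistic quoted above.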

Demszky, who works with Dan Jurafsky, a professor of both linguistics and computer science at Stanford and a member of the Stanford Institute for Human-Centered Artificial Intelligence, teamed up with Lucy Li, a PhD student in natural language processing at UC Berkeley, and Patricia Bromley, an associate professor of education at Stanford.

“There’s been a real shift in natural language processing toward more socially aware models,” says Jurafsky. “These can help uncover the implicit ways that people are described in text and do better at capturing the human and social element that’s so important in language.”

The most dramatic finding in the Texas history project was the virtual absence of Hispanic people, who received almost no attention outside of the Mexican-American War. Women fared better, but they too were discussed far less frequently than men.

When the researchers looked at the words associated with different groups, they found that Black people were connected much more than other groups to words of powerlessness like “slave,” “escaped,” “owned,” and “barred.” That’s at odds, the researchers say, with new historical research that emphasizes the power and agency of Black people in fighting for their freedom. Women, meanwhile, were connected with verbs like “marry” and “help” and nouns like “wife” and “mother.” “Household,” “home,” and “chores” were among the other top words associated with women. White men, on the other hand, were most closely associated with words like “leader,” “authority,” “achievement,” and “powerful.”
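One common way to decide which words are most distinctive of a group is a smoothed log-odds ratio over (group, word) pairs. The sketch below is a minimal version of that idea and is swapped in for illustration only. The article does not say the study used this statistic, and the `log_odds` function and the toy `pairs` list are hypothetical.

```python
import math
from collections import Counter


def log_odds(pairs, group, smoothing=1.0):
    """Score each word by how much more strongly it attaches to `group`.

    `pairs` is a list of (group_label, word) tuples. Positive scores mean the
    word is relatively more frequent with `group` than with all other groups.
    Assumes at least two distinct words, so no probability reaches 1.
    """
    in_group = Counter(w for g, w in pairs if g == group)
    out_group = Counter(w for g, w in pairs if g != group)
    vocab = set(in_group) | set(out_group)
    # Add-k smoothing keeps unseen words from producing log(0).
    n_in = sum(in_group.values()) + smoothing * len(vocab)
    n_out = sum(out_group.values()) + smoothing * len(vocab)
    scores = {}
    for word in vocab:
        p_in = (in_group[word] + smoothing) / n_in
        p_out = (out_group[word] + smoothing) / n_out
        scores[word] = math.log(p_in / (1 - p_in)) - math.log(p_out / (1 - p_out))
    return scores


# Toy data: the words echo examples from the article, but the counts are
# invented purely to exercise the function.
pairs = [
    ("women", "wife"), ("women", "mother"), ("women", "wife"),
    ("men", "leader"), ("men", "leader"), ("men", "authority"),
]
scores = log_odds(pairs, "women")
top_word = max(scores, key=scores.get)
```

Sorting the scores from highest to lowest gives a ranked list of a group's most characteristic words, which is the kind of output the word lists above summarize.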

The researchers found similar disparities when they looked at the topics associated with different groups. Women were associated with two topics — family and social movements. Men were associated with family as well, but also with the military and decision making.

Perhaps not surprisingly, the researchers found that the political leaning of school districts affected the kinds of textbooks used. The textbooks in counties dominated by Democratic voters paid somewhat more attention to the role of women than the textbooks used in Republican counties. None of the textbooks, however, paid any real attention to Hispanic people.

The researchers caution that the study doesn’t draw conclusions about why some textbooks give more attention to white men than to women and minorities, nor about what the appropriate levels are. Though old prejudices are part of the explanation, another possibility is that many textbooks focus on formal political events and leaders rather than on the lives of people.

The key takeaway, says Demszky, is that machine learning and natural language processing offer new opportunities to understand what textbooks are truly teaching and to make them more broadly relevant.

“This has the potential to improve textbooks so that students feel more represented and take a more critical view of history,” she says. “If our paper can be a step toward making that happen, I’d be very, very happy.”
