Through AI and Text Analysis, Social Media Shows Our Community Well-being

Date

April 27, 2020

Linda A. Cicero / Stanford News Service

Stanford HAI junior fellow Johannes Eichstaedt built an algorithm that can provide, in principle, a real-time indication of community health.

Social media can reveal more than just a single person’s mood or frame of mind. It can capture the psychological states of an entire population, according to new research by Stanford scholar Johannes Eichstaedt.

Eichstaedt’s study, published April 27 in the Proceedings of the National Academy of Sciences, found that through machine learning – teaching a computer to identify and analyze patterns in large datasets – researchers can see, in principle, how a society is doing in real time.

“These methods really show how to do psychological measurement in the 21st century in our digital world,” said Eichstaedt, who is an assistant professor of psychology in the School of Humanities and Sciences and a junior fellow at the Stanford Institute for Human-Centered Artificial Intelligence.

For the past decade, Eichstaedt has tested how to use social media, including Twitter, to measure the well-being of a community. He contends that social media provides the largest dataset in human history on behavior, emotions and thoughts.

The researchers acknowledge in the paper that Twitter is not representative of the U.S. population, but it can still provide insight into how people experience their everyday lives.

“What we really care about is how well the population is doing in terms of psychological and physical health, rather than merely that the GDP is growing,” said Eichstaedt. “You might not care about measuring subjective well-being in and of itself, but subjective well-being impacts mortality, including heart disease. It also impacts the economic bottom lines. So, it’s quite an important variable to capture for a population.”

From Survey Research to Social Media

To evaluate the different ways to analyze a region’s well-being, Eichstaedt and a team of researchers compared more than a billion geo-tagged Tweets posted from 2009 to 2015 with 1.7 million responses from the Gallup-Sharecare Well-Being Index, an in-depth survey that measures how people experience everyday life.

Researchers have long relied on surveys like Gallup’s to measure a population’s well-being. While accurate, such surveys can be costly and time-consuming. Sometimes it takes years to gather enough data for rough community estimates, said Eichstaedt.

But data-driven techniques can alleviate some of that burden. Eichstaedt found that an algorithm can first be trained on two things from the same respondents: their answers to a written well-being survey and a sample of their social media posts. It can then be deployed on a much larger scale to predict, based only on their Tweets, how people from an entire region would have responded on a traditional survey.
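The two-step idea described above, training on people who have both a survey score and posts and then applying the model to a region's Tweets, can be illustrated with a small sketch. This is not the researchers' model: the texts, scores, county names and the simple linear model below are all invented for illustration, and the functions `featurize`, `train_linear` and `predict` are hypothetical helpers.

```python
from collections import Counter

def featurize(text, vocab):
    """Relative frequency of each vocabulary word in one text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = max(len(tokens), 1)
    return [counts[w] / total for w in vocab]

def train_linear(X, y, lr=0.5, epochs=2000, l2=0.01):
    """Fit y ~ w.x + b by gradient descent, with an L2 penalty on w only."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    return sum(wj * xj for wj, xj in zip(w, x)) + b

# Step 1: respondents with BOTH posts and a survey score (invented toy data).
respondents = [
    ("so excited for the weekend fun with friends", 8.0),
    ("fun day at the park excited to go again", 7.5),
    ("tired and bored nothing to do", 4.0),
    ("bored at home so tired of this", 3.5),
]
vocab = sorted({tok for text, _ in respondents for tok in text.lower().split()})
X = [featurize(text, vocab) for text, _ in respondents]
y = [score for _, score in respondents]
w, b = train_linear(X, y)

# Step 2: apply the trained model to Tweets alone, then average per region.
region_tweets = {
    "county_a": ["excited for the game", "fun night with friends"],
    "county_b": ["so bored today", "tired of this"],
}
region_scores = {
    region: sum(predict(w, b, featurize(t, vocab)) for t in tweets) / len(tweets)
    for region, tweets in region_tweets.items()
}
print(region_scores)
```

In this toy setup the invented county whose Tweets resemble the higher-scoring respondents receives the higher estimate. A real system would need far more data, held-out validation against survey results, and care about who is and is not on Twitter.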

Understanding Words out of Context

Before machine learning methods were used, researchers either picked words or asked raters to annotate words for how “positive” they are. But it can be very tricky to pick words that measure well-being, said Eichstaedt.

For example, the researchers found that internet slang such as “LOL” – the popular acronym for “laugh out loud” – and the words “good” and “love” were frequently used in areas with lower income and education (and, in general, lower well-being). So even though these might seem like positive words, they may not be, Eichstaedt said.

Similarly, words like “homework” and “taxes” might seem negative out of context, but the researchers found that these words were used more by people with higher education and income – a group that other studies have found to typically have higher well-being.

“When picking words to measure well-being, it’s really important to pay attention to cultural differences in language use across the U.S.,” said Eichstaedt.

But machine learning methods can help determine which words are more important than others. When the algorithm compared a person’s social media posts against their survey responses, it learned that words like “LOL” are not reliable indicators of well-being and instead used words such as “fun” and “excited.”

“Having the computer learn the words may be the best way to find words that measure well-being,” Eichstaedt said. “Differences in language use can be quite complex.”
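One simple way to picture how a computer might judge which words carry signal is to check, word by word, how strongly its use tracks survey scores. The sketch below does this with a plain Pearson correlation on invented data; it is an assumption-laden illustration, not the method or the data from the paper, and the helper names `pearson` and `uses_word` are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def uses_word(text, word):
    """1.0 if the word appears in the text, else 0.0."""
    return 1.0 if word in text.lower().split() else 0.0

# Invented (post, survey score) pairs. "lol" appears among both high and
# low scorers, so it should carry little information in this toy sample.
people = [
    ("lol so excited for tonight", 8.0),
    ("excited about the trip fun times ahead", 7.5),
    ("lol fun weekend", 7.0),
    ("lol nothing to do", 4.0),
    ("lol whatever", 3.5),
    ("long day again", 4.5),
]
scores = [score for _, score in people]

word_corr = {
    word: pearson([uses_word(text, word) for text, _ in people], scores)
    for word in ("lol", "fun", "excited")
}
for word, r in sorted(word_corr.items(), key=lambda kv: kv[1]):
    print(f"{word:8s} r = {r:+.2f}")
```

In this made-up sample, “lol” ends up only weakly related to the scores while “fun” and “excited” track them more closely, which mirrors the kind of pattern described above. A hand-picked word list would have had no way to discover that from the data.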

Future Uses

The researchers note that well-being is also associated with other important factors, including overall health. For example, stress can induce unhealthy behaviors – such as excessive drinking or smoking – that in turn negatively impact people’s health, Eichstaedt said.

“When people are suffering from depression and anxiety, we need to know so that we can ensure they have the resources they need,” said Eichstaedt, who is currently applying this method to study the impact of the novel coronavirus pandemic on the population of cities across the U.S.

“COVID-19 is a natural disaster that interrupts our social norms and routines at an unprecedented scale,” Eichstaedt said. “With this real-time Twitter-based technology, psychologists can monitor if loneliness and anxiety are taking hold in communities, and how our well-being is impacted by social distancing. There is no other data source that can provide such measurement at population scale and give estimates so quickly. Now more than ever, using robust machine learning methods is very important.”

Co-authors on the paper include Kokil Jaidka of the National University of Singapore; Salvatore Giorgi and Lyle H. Ungar of the University of Pennsylvania; H. Andrew Schwartz of Stony Brook University; and Margaret L. Kern of the University of Melbourne. Support for this research was provided by a Nanyang Presidential Postdoctoral Award, an Adobe Research Award, a Robert Wood Johnson Foundation Pioneer Award and a Templeton Religion Trust grant.

