AI-Enabled Depression Prediction Using Social Media

This brief introduces AI-enabled depression prediction through social media and calls for clear policy guidelines to ensure patient privacy.
Key Takeaways
AI-enabled depression prediction is capable of matching the accuracy of traditional screening surveys but can be delivered to whole (consenting) populations.
By examining social media language, our model can help recognize depression, one of the most widespread mental illnesses in the world.
Policymakers and regulators must establish clearer guidelines about access to data, understand the consequences of algorithms that transform social media posts into protected health information, and consider how depression detection can be combined with digital treatments in a modern system of care.
Executive Summary
Natural language processing for mental health monitoring is an emerging use of AI that is poised to disrupt the landscape of the health care industry. As the profusion of social media platforms allows for a wider swathe of the population to share their thoughts and feelings with the world, users’ posts and reactions extend the scope of medical screening methods for psychological disorders such as depression. Users are already being marketed to with sophistication based on these behaviors — why not leverage these technologies for public health?
To give some sense of scale for the unaddressed need, in the United States, between 7 and 26 percent of the population experiences depression each year, but only between 13 and 49 percent of those people receive treatment — this means that in the US, there currently may be 30+ million people in need of but not receiving mental healthcare. (Of note, these numbers are pre-COVID; early studies suggest that the prevalence of mental health conditions may have doubled after the first lockdowns). These high rates of underdiagnosis and undertreatment suggest that new screening methods like AI-enabled prediction are needed to identify and treat patients with depression.
In a recent article I co-authored in the Proceedings of the National Academy of Sciences, “Facebook language predicts depression in medical records,” my team and I specify a set of protocols to identify patients suffering from depression using only language from their Facebook posts. These methods capitalize on significant advances in technology over the last decade and are capable of roughly matching the accuracy of traditional screening surveys. Using a system that relies on machine learning to cluster, count, and score each word, we find that language predictors of depression include emotional, interpersonal, and cognitive processes represented by words such as sadness, a preoccupation with the self or rumination, and expressions of loneliness and hostility.
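To make the word-level approach concrete, the sketch below scores a user's posts by the relative frequency of words in a few hand-picked categories. This is only an illustrative toy, not the published pipeline: the actual model used machine learning over thousands of weighted language features, and the category word lists here are hypothetical examples.

```python
# Illustrative sketch only (not the published model): count words from
# hypothetical depression-associated categories across a user's posts.

from collections import Counter
import re

# Hypothetical word categories; the real model learned its own weighted
# features rather than relying on a short hand-picked list like this.
CATEGORIES = {
    "sadness": {"sad", "cry", "tears", "hurt", "miss"},
    "self_focus": {"i", "me", "my", "myself"},
    "loneliness": {"alone", "lonely", "nobody"},
    "hostility": {"hate", "angry", "annoyed"},
}

def category_scores(posts):
    """Return the relative frequency of each category across the posts."""
    words = [w for p in posts for w in re.findall(r"[a-z']+", p.lower())]
    total = max(len(words), 1)  # avoid division by zero for empty input
    counts = Counter(words)
    return {
        name: sum(counts[w] for w in vocab) / total
        for name, vocab in CATEGORIES.items()
    }

posts = ["I feel so alone tonight", "Nobody gets how sad I am"]
scores = category_scores(posts)
```

In a real system, scores like these would feed a trained classifier (for example, a regularized regression) calibrated against clinical outcomes, rather than being interpreted directly.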
Depression assessment through social media represents a screening method that does not require users to actively complete a survey; it is unobtrusive for individuals who consent to this modality. However, for this method to become feasible as a scalable complement to existing screening and monitoring procedures, policymakers and regulators will need to ensure that patient privacy and confidentiality are kept at the forefront when these technology solutions are developed. To that end, clearer guidelines and regulation are needed about what data may be accessed, who may access it, and for what purpose it is collected. The application of machine learning to quasi-public social media posts can transform such data into protected health information, and it must be understood and treated as such — including with regard to questions of privacy and of the medical autonomy of patients.
If this advance is responsibly developed and introduced in a manner that integrates with existing systems of care and relationships with trusted providers, it has the potential to bring about a major shift in public health.
