Using Algorithms to Track Down Sex Criminals | Stanford HAI

Date
July 28, 2020
iStock/mysondanube, iStock/South_agency

A Stanford scholar has developed a more efficient way to test DNA samples in rape kits. 

Machine learning algorithms appear to do better than forensic experts at plowing through the hundreds of thousands of untested sexual assault kits warehoused in police departments and crime labs across the U.S., a Stanford professor has found.

Forensic examiners, who are often nurses, are responsible for creating the kits, which hold the evidence — collected during an exam of a sexual assault victim’s body, clothes, and other belongings — containing DNA that might be used to identify and convict rapists. Unfortunately, myriad roadblocks, such as inadequate funding and a lack of testing protocols, have created a huge backlog of rape kits. Lawrence M. Wein, a professor at Stanford Graduate School of Business, calls this an affront to the several hundred thousand victims of sexual assault whose kits are in the backlog.

“After this traumatic experience, in hopes of catching the offender, the victim goes through a forensic exam that takes many hours and is very tedious,” Wein says. “For the kits to just sit there and never get tested is unspeakable.”

Wein continues his work looking at how to make testing the kits more efficient and cost-effective in a study recently published in Proceedings of the National Academy of Sciences. He found that when pitted against the recommendations of forensic examiners, a machine learning algorithm better predicts which biological samples from rape kits are more likely to generate DNA evidence.

Math Meets Forensics

Wein’s first round of research a couple of years ago demonstrated a hefty economic benefit to testing sexual assault kits despite the costs involved. It also brought him to Washington, D.C., where he spoke with the Department of Justice and learned of a related problem. Although municipalities were finally starting to crack their logjams of untested kits, many were testing only a few elements from each kit, limiting their efforts to specimens that forensic examiners deemed most likely to yield DNA.

Wein wondered: Could mathematical analysis do a more efficient job of deciding which samples to test in each kit? And how does the cost-effectiveness of the current practice compare to that of testing every component in each kit?

“They’re only testing a couple of samples from each kit but don’t know how it compares to testing all samples in a kit,” says Wein. “We talked to a variety of people in the field and it eventually led me to the San Francisco Police Department Criminalistics Laboratory, which has an amazing dataset that allowed us to answer this question with quite a bit of confidence.”

Because SFPD tests all elements of the rape kits it receives and also records which samples examiners flag as most likely to contain DNA, its records gave Wein a textbook dataset to experiment with. Using data from 868 rape kits tested from 2017 to 2019, Wein’s team constructed a standard machine learning model based on information gleaned from each sexual assault case and an estimate of all fixed and variable costs required for testing each kit.

The idea was to create an algorithm that predicts where in a kit to find a DNA sample of high enough quality to be uploaded into CODIS, the national database of DNA profiles from known offenders in both sexual assaults and nonsexual crimes. That is the payoff: DNA from a sexual assault kit that matches an existing profile in CODIS could give law enforcement a lead in identifying the attacker.
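To make the idea concrete, here is a minimal sketch of how predicted probabilities could drive the choice of samples under a fixed testing budget. It is not the model from the study: the greedy "yield per dollar" rule, the `Sample` fields, the site labels, the probabilities, and the costs are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    kit_id: str
    location: str      # where the swab was collected (hypothetical label)
    p_profile: float   # assumed model output: chance of a CODIS-eligible profile
    cost: float        # assumed variable cost of testing this sample


def select_samples(samples, budget):
    """Greedy pick: highest predicted yield per dollar first, while budget allows."""
    ranked = sorted(samples, key=lambda s: s.p_profile / s.cost, reverse=True)
    chosen, spent = [], 0.0
    for s in ranked:
        if spent + s.cost <= budget:
            chosen.append(s)
            spent += s.cost
    return chosen


# Made-up inputs; none of these numbers come from the study.
samples = [
    Sample("kit-1", "site-A", 0.60, 100.0),
    Sample("kit-1", "site-B", 0.10, 100.0),
    Sample("kit-2", "site-A", 0.45, 100.0),
    Sample("kit-2", "site-C", 0.30, 100.0),
]

chosen = select_samples(samples, budget=200.0)
expected_profiles = sum(s.p_profile for s in chosen)
print([(s.kit_id, s.location) for s in chosen], round(expected_profiles, 2))
```

With a budget that covers only two tests, the sketch spends it on the two samples with the highest predicted chance of a usable profile, regardless of which kit they come from.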

Economies of Scale

Wein’s experiment worked well. For the same cost as the process that tests only samples recommended by forensic examiners, Wein’s algorithm and testing policy showed an increase of 41% in the number of DNA results that could be submitted to CODIS. The team also found that while full testing of all samples is more expensive — the mean cost per kit rises from $397 to $912 — it increases the DNA yield more than twofold, and the added cost is offset by economies of scale.
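The trade-off in those per-kit figures can be checked with a few lines of arithmetic. The two costs below are the ones reported above; the article gives the yield increase only as "more than twofold," so `yield_multiplier` is a hypothetical input rather than a reported number, and the calculation deliberately ignores the economies of scale the study credits.

```python
cost_selective = 397.0  # mean cost per kit, examiner-selected samples (from the article)
cost_full = 912.0       # mean cost per kit, all samples tested (from the article)

# How much more a fully tested kit costs than a selectively tested one.
cost_ratio = cost_full / cost_selective


def relative_cost_per_profile(yield_multiplier):
    """Cost per DNA profile under full testing, relative to selective testing.

    yield_multiplier is a hypothetical input: how many times more profiles
    full testing produces per kit. Values below 1.0 mean full testing is
    cheaper per profile on these per-kit means alone.
    """
    return cost_ratio / yield_multiplier


print(f"cost ratio: {cost_ratio:.2f}")                               # cost ratio: 2.30
print(f"at exactly 2x yield: {relative_cost_per_profile(2.0):.2f}")  # at exactly 2x yield: 1.15
```

On these per-kit means alone, full testing costs about 2.3 times as much, so a yield increase of about 2.3 times would leave the cost per profile unchanged before any economies of scale are counted.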

It’s favorable, then, to test all samples in a sexual assault kit, Wein found. Forensic examiners also stand to gain by taking further cues from the algorithm: the model indicated that the number of DNA hits would increase another 47% if more samples were collected from the three body locations it deemed most likely to harbor DNA.

To see how their model applies to sexual assault backlogs outside of San Francisco, Wein and his team are looking to test against bigger datasets from other cities, where factors like the number of samples obtained and tested in a kit vary.

Once this is done, the research, which so far has been well received by crime lab experts and government representatives, should bolster the argument for more federal funding explicitly earmarked for working through rape kit backlogs. It also makes the case for testing all samples inside kits, which many cities and towns aren’t yet doing.

“I feel like there would have to be a law or bill that says, ‘Here’s the money, now go test every sample in every kit in the backlog,’” says Wein.

This story was first published on Insights by Stanford Business.

Stanford HAI's mission is to advance AI research, education, policy and practice to improve the human condition.

Contributor(s)
Maggie Overfelt