Among the most pressing matters cancer patients and oncologists must discuss is how serious the disease is and how long the patient can expect to live. These are tough conversations that shape not only how the patient is treated but also a host of other sobering decisions. And for most of medical history, life-expectancy predictions have amounted largely to educated guesswork based on the physician’s knowledge and experience, supplemented by simple statistical models built from prior patient histories.
Recently, however, a team of researchers at Stanford University says it has significantly improved the accuracy of these important predictions by turning to artificial intelligence. In a paper published in the Journal of the American Medical Informatics Association (JAMIA), they say their algorithm could lead to greater accuracy in cancer prognosis and, ultimately and most importantly, to better care for patients.
More Data, Surprising Patterns
“A patient’s prognosis is critical information that drives pretty much all health decisions from that point forward,” says first author of the study Michael Gensheimer, a Stanford radiation oncologist and HAI faculty member.
It helps doctors, patients, and families grapple with extremely hard choices: What treatments are appropriate? Should they aggressively treat a cancer in a patient with little time remaining? Does the patient want to try untested experimental approaches? And beyond medical treatment decisions, patients must consider surrogate decision makers: people who can help make medical decisions on their behalf if they are incapacitated.
But in prognosis, there are no easy answers. The statistical models help, but they are not particularly accurate, Gensheimer says. Doctors often do better based upon their personal knowledge and experience.
The new algorithm improves upon both the statistical models and the doctors’ expertise alone. The machine learning tool combed every shred of data available in the medical records of more than 15,000 Stanford patients with metastatic cancers — including doctors’ clinic notes, radiology reports, laboratory test results, vital signs, and much more — using several thousand variables to predict survival time.
The algorithm uses all the data in a patient’s chart to make a prediction in a matter of minutes. During training, the computer also has the advantage of knowing how past patients actually fared, so it can compare its predictions against real-world outcomes and uncover subtle patterns that best predict patient survival.
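The article does not specify the model architecture the Stanford team used. As a minimal toy illustration of the general idea, the sketch below trains a simple logistic-regression classifier, by plain gradient descent, to predict whether a patient survives beyond a landmark date, using synthetic stand-ins for chart-derived features (every name and number here is invented, not study data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for chart-derived features (lab values, vital
# signs, flags extracted from clinic notes). A real chart yields
# thousands of such variables; here we use 20 for 400 mock patients.
n, d = 400, 20
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)

# Training label: did the patient survive beyond a 12-month landmark?
# (Simulated as a noisy linear function of the features.)
y = (X @ true_w + 0.3 * rng.normal(size=n) > 0).astype(float)

# Logistic regression fit by gradient descent on the log-loss.
w = np.zeros(d)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))       # predicted survival probability
    w -= 0.1 * (X.T @ (p - y)) / n           # gradient step

pred = (1.0 / (1.0 + np.exp(-(X @ w))) > 0.5).astype(float)
accuracy = (pred == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The point of the sketch is the workflow, not the model: outcomes of past patients supply the labels, and the fitted weights are what let the system surface which variables carry predictive signal.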
Once it has churned through the data, the model reports which data points drove its predictions, helping physicians learn about critical signals they may be overlooking.
“Often, the doctors may be surprised by these details,” Gensheimer says.
For instance, the model picked a few hundred laboratory tests from a longer list to help make its predictions. It picked labs that doctors would expect to be important, such as those that measure kidney and liver function. But it also picked more obscure tests such as the levels of specific kinds of white blood cells like basophils and eosinophils. While Gensheimer says more research is needed into why these specific tests are predictive of survival time, it’s the sort of revelation that machine learning is good at uncovering.
The current study compared the algorithm’s performance with that of the patients’ own doctors and a traditional statistical model. The researchers hoped the algorithm might perform almost as well as the doctors at predicting which patients would survive longer. They were surprised to find that it was actually slightly more accurate than the physicians and much more accurate than the traditional statistical model.
The researchers measured the accuracy of the survival estimates using a number called the C-index: the probability that, for two randomly selected patients, the one who lived longer was correctly predicted to live longer. A C-index of 1.0 indicates perfect performance, while 0.5 means the estimate is no better than a coin flip. The computer model’s C-index was 0.70, better than the physicians’ 0.67 and the traditional model’s 0.64.
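The C-index can be computed directly from that definition by checking every pair of patients. A minimal sketch, using invented survival times and scores (ties in the predictions receive half credit, one common convention):

```python
def c_index(predicted, actual):
    """Fraction of patient pairs whose ordering the predictions get right.

    predicted: scores where higher means longer predicted survival.
    actual:    observed survival times. Tied predictions get half credit.
    """
    correct, total = 0.0, 0
    for i in range(len(actual)):
        for j in range(i + 1, len(actual)):
            if actual[i] == actual[j]:
                continue  # no ordering to predict for this pair
            total += 1
            # Positive if the pair is ranked in the same order as the outcomes.
            same_order = (predicted[i] - predicted[j]) * (actual[i] - actual[j])
            if same_order > 0:
                correct += 1.0
            elif same_order == 0:
                correct += 0.5
    return correct / total

# Invented survival times (months) and model scores; exactly one of the
# ten pairs (the 14- vs. 25-month patients) is misordered.
survival = [3, 8, 14, 25, 40]
scores = [1, 2, 4, 3, 5]
print(c_index(scores, survival))  # → 0.9
```

A perfect ranking scores 1.0, and scrambled scores hover near 0.5, matching the coin-flip interpretation above.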
When Gensheimer and team combined the physician and machine learning predictions, the results were best of all. That is, physician experience and knowledge informed by the algorithm’s predictive capabilities outperformed either in isolation.
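The article doesn’t describe how the two sets of predictions were combined; one simple way to see why combining can help is to average the two sets of scores. In this invented four-patient example, the physician and the model each misorder a different pair of patients, and the averaged scores rank better than either alone (toy numbers, not study data):

```python
def c_index(predicted, actual):
    """Fraction of correctly ordered patient pairs; ties get half credit."""
    correct, total = 0.0, 0
    for i in range(len(actual)):
        for j in range(i + 1, len(actual)):
            if actual[i] == actual[j]:
                continue
            total += 1
            same_order = (predicted[i] - predicted[j]) * (actual[i] - actual[j])
            correct += 1.0 if same_order > 0 else (0.5 if same_order == 0 else 0.0)
    return correct / total

survival = [5, 12, 24, 36]            # observed survival times (months)
physician = [1.0, 2.0, 4.0, 3.5]      # misorders the two longest survivors
model = [2.0, 1.0, 3.0, 4.0]          # misorders the two shortest survivors
combined = [(p + m) / 2 for p, m in zip(physician, model)]

ci_phys = c_index(physician, survival)
ci_model = c_index(model, survival)
ci_comb = c_index(combined, survival)
print(ci_phys, ci_model, ci_comb)
```

Because the two predictors make different mistakes, averaging cancels some errors out, which is the usual intuition for why an ensemble of a human and a model can beat both.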
“The machine learning model really is adding something here,” he notes.
The team’s success raises another potential concern about the future of AI in medical decision making: What happens if patients are uncomfortable with the use of AI in their medical care? To address that challenge, Gensheimer plans to survey patients on how they feel about the use of artificial intelligence in their care and any pitfalls that might hinder further adoption of such tools.
“We want to address patient concerns from the outset,” Gensheimer says. “The success of these artificial intelligence techniques depends on earning patients’ trust in the algorithms.”