When the COVID-19 pandemic broke out in Wuhan, China, researchers in the lab of Olivier Gevaert, assistant professor of medicine and of biomedical data science at Stanford, wondered if their work with quantitative image analysis (also known as radiomics) and data fusion might be transferable to the COVID context.
Gevaert and his colleagues had previously shown that radiomics features in a lung CT could point toward different prognoses for lung cancer patients. Might a similar approach help predict the prognoses of COVID patients? And by fusing radiomics data with patients’ clinical information and lab test results, might the predictions improve further?
Read the full study: AI-based Analysis of CT Images for Rapid Triage of COVID-19 Patients
The answer to these questions proved to be “yes.” Using a dataset obtained from China, Gevaert, a Stanford HAI faculty member, and his team found that quantitative analysis of COVID patients’ CT scans predicted an intensive care stay, use of a mechanical ventilator, or death better than any other available data type on its own, including a human radiologist’s reading of the CT, clinical information, or laboratory data. And when the radiomics data were fused with the clinical and laboratory data, the predictions were even more accurate. The work was published in npj Digital Medicine in April of 2021.
It’s worth noting, Gevaert says, that this was an observational study using retrospective data. Much more work is needed before the model can be deployed in hospitals, including validation in non-Asian populations and randomized clinical trials to determine whether the predictions should lead to redistribution of medical resources or different treatment decisions.
Although this additional work will take time, Gevaert thinks it’s possible that the project will prove useful to hospitals before the pandemic is over. But if not, the research nevertheless reinforces the value of radiomics and data fusion techniques. And because the methods are generic, they can be applied to other diseases. “Even if this doesn’t affect the pandemic directly, it’s another demonstration that quantitative analysis of CT imaging data can be a complementary source of biomarkers that can be used for prognosis or treatment,” he says.
A Two-Step Approach: Radiomics Plus Data Fusion
At the start of the COVID-19 pandemic, there was a spike of interest in working with COVID data among the graduate students and postdocs who worked in Gevaert’s lab, including visiting radiologist Peiyi (Penny) Xie, Stanford graduate students Xianghao (Sam) Zhan and Yiheng (Terry) Li, and visiting graduate student Qinmei Xu, who was associated with Jinling Hospital in Nanjing, China. Through Xu’s connections, the team obtained a dataset covering more than 3,000 confirmed COVID patients who were admitted to a network of 39 hospitals in China during the first three months of 2020. These data included not only CT scans, clinical records, and lab tests from the time of admission but also follow-up information about how sick the patients became.
The team focused on using all the various types of admissions data to predict three events of increasing severity – ICU admission, ventilator use, and then death – at least 48 hours after admission. “If a patient is in bad shape and is admitted to the ICU almost immediately, that’s a pretty easy prediction,” Gevaert says. “We wanted to have some sort of threshold for predicting the more distant future.”
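The 48-hour threshold can be made concrete with a small labeling sketch. It is an illustration only: the function name `label_outcome`, the timestamps, and the choice to exclude patients whose event occurred before the threshold are assumptions made for this example, not details taken from the study.

```python
from datetime import datetime, timedelta
from typing import Optional

# Threshold described in the article: an event counts as a prediction
# target only if it occurs at least 48 hours after admission.
MIN_DELAY = timedelta(hours=48)


def label_outcome(admitted: datetime, event: Optional[datetime]) -> Optional[int]:
    """Return 1 for a late event, 0 for no event, None to exclude the patient.

    Excluding patients with early events is an assumption of this sketch.
    """
    if event is None:
        return 0  # the event never occurred during follow-up
    if event - admitted >= MIN_DELAY:
        return 1  # the event occurred at or beyond the 48-hour threshold
    return None  # the event came too early to be a useful prediction target


# Made-up timestamps, for illustration only.
admission = datetime(2020, 2, 1, 8, 0)
late_icu = label_outcome(admission, datetime(2020, 2, 4, 8, 0))
early_icu = label_outcome(admission, datetime(2020, 2, 1, 20, 0))
no_icu = label_outcome(admission, None)
```

The same function could be applied once per outcome (ICU admission, ventilator use, death) to build three separate label columns.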
The first step was to analyze the CT images using machine learning. This was done in two stages: first, identifying lung lesions using automated image segmentation (with the results confirmed by radiologists), and second, extracting radiomics features to identify characteristics of those lesions that might be predictive of patient prognosis.
Certain features extracted using radiomics, such as a lesion’s shape, size, or mean intensity, are more easily interpretable than others, such as equations representing different types of textures. But when compared with features that were manually annotated by radiologists, the features selected by the algorithm were superior predictors of prognosis. “We don’t necessarily know what the biology behind it is, but it is clear that there are biomarkers in images that perhaps a radiologist cannot observe,” Gevaert says. The finding echoes his lab’s earlier work on CT images of lung cancer patients.
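To give a sense of what the more interpretable features look like, here is a minimal sketch that computes size and intensity statistics for a segmented lesion. It is not the study's pipeline: the function name `first_order_features`, the nested-list image layout, the voxel volume, and the toy intensity values are all assumptions made for illustration, and texture features are left out entirely.

```python
from statistics import mean, pstdev
from typing import Dict, List

Volume = List[List[List[float]]]  # indexed as [slice][row][column]
Mask = List[List[List[int]]]      # 1 marks a lesion voxel, 0 marks background


def first_order_features(image: Volume, mask: Mask,
                         voxel_volume_mm3: float) -> Dict[str, float]:
    """Compute simple size and intensity features inside a lesion mask."""
    # Collect the intensity of every voxel that the mask marks as lesion.
    values = [
        image[z][y][x]
        for z in range(len(image))
        for y in range(len(image[z]))
        for x in range(len(image[z][y]))
        if mask[z][y][x]
    ]
    if not values:
        raise ValueError("mask contains no lesion voxels")
    return {
        "volume_mm3": len(values) * voxel_volume_mm3,  # lesion size
        "mean_intensity": mean(values),                # average brightness
        "intensity_std": pstdev(values),               # spread of brightness
        "max_intensity": max(values),
    }


# Toy single-slice "scan" with made-up intensity values.
toy_image = [[[-700.0, -650.0, -900.0],
              [-600.0, -850.0, -880.0]]]
toy_mask = [[[1, 1, 0],
             [1, 0, 0]]]
features = first_order_features(toy_image, toy_mask, voxel_volume_mm3=0.5)
```

A real radiomics workflow would add many shape and texture descriptors on top of these, which is where the less interpretable features come from.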
The team then merged the radiomics features with other data that were predictive of prognosis, such as age, gender, lymphocyte markers, and other lab results. “Taking the clinical data, labs, and demographic information, and combining that with the imaging, improves the performance of the model,” Gevaert says.
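One simple way to picture this merging step is to concatenate each patient's features from every source into a single vector before training a model. The sketch below does only that. The article does not describe the fusion method the team actually used, so the concatenation approach, the function name `fuse_features`, and all patient IDs and values are assumptions for illustration.

```python
from typing import Dict, List, Tuple

# patient ID -> feature name -> value
FeatureTable = Dict[str, Dict[str, float]]


def fuse_features(
    sources: Dict[str, FeatureTable],
) -> Tuple[List[str], Dict[str, List[float]]]:
    """Concatenate per-patient features from several sources into one vector.

    Only patients present in every source are kept; missing values are not
    imputed in this sketch, so a missing feature raises KeyError.
    """
    if not sources:
        raise ValueError("at least one feature source is required")
    shared_ids = set.intersection(*(set(table) for table in sources.values()))

    # Fix a deterministic column order: sources, then feature names, sorted.
    columns: List[Tuple[str, str]] = []
    for source_name in sorted(sources):
        table = sources[source_name]
        names = sorted({name for pid in shared_ids for name in table[pid]})
        columns.extend((source_name, name) for name in names)

    fused = {
        pid: [sources[src][pid][name] for src, name in columns]
        for pid in sorted(shared_ids)
    }
    return [f"{src}.{name}" for src, name in columns], fused


# Made-up values for three hypothetical patients.
radiomics = {"p1": {"volume_mm3": 1200.0, "mean_intensity": -650.0},
             "p2": {"volume_mm3": 300.0, "mean_intensity": -720.0}}
clinical = {"p1": {"age": 64.0}, "p2": {"age": 41.0}, "p3": {"age": 70.0}}
labs = {"p1": {"lymphocytes": 0.8}, "p2": {"lymphocytes": 1.6}}

column_names, fused = fuse_features(
    {"radiomics": radiomics, "clinical": clinical, "labs": labs}
)
```

Patient `p3` is dropped here because it has no imaging or lab features; the fused vectors for the remaining patients could then be passed to any standard classifier.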
Next Steps: Validation
Gevaert would like to validate the work in a non-Asian population, but finding a dataset has proven challenging. CT hasn’t been widely used on COVID patients in the United States, and a European dataset that seemed promising didn’t include necessary information about what happened to the patients after admission. “We’re still trying to pursue validation of this model in additional populations,” Gevaert says. “But we have also made the models available so that anyone who has access to the COVID patient CT and follow-up data can test our models.”
Gevaert recognizes that the pace of science has been frustrating for the public. “There’s a craving for new results because people want to go back to their normal lives,” he says. “But going from code to clinic takes time – probably longer than people want.”