Tuberculosis kills 1.4 million people every year, primarily in places where poverty and deprivation conspire to make people uniquely vulnerable and unable to get lifesaving care in time.
Google is now joining a global fight to snuff out the disease, using AI to automate its detection — and expedite treatment — in communities where physicians are in short supply. A new study published Tuesday in Radiology, the journal of the Radiological Society of North America, found that its AI model performed as well as radiologists at detecting tuberculosis on chest X-rays.
Google is not the first to develop an AI system to detect TB, nor is its tool likely to make a dent in death rates anytime soon. But outside experts said its early results are especially promising given their consistency across diverse populations of patients. The model met or exceeded performance standards set by the World Health Organization when tested on historical patient data drawn from China, India, the United States, and Zambia.
“Unlike much of the data published about AI, (Google’s) study was large and used different training sets, which showed their system is robust,” said Edith Marom, the head of thoracic imaging at Chaim Sheba Medical Center in Israel.
Marom, who was not involved in the research, added, however, that Google’s testing did not match real-life circumstances. The datasets contained higher-than-normal rates of disease and were skewed toward patients who were younger and able to stand for upright X-rays, conditions that typically make images easier to interpret. The model’s performance also dipped among sicker populations with more lung abnormalities, such as HIV-positive patients and a group of miners in South Africa.
“To be implementable worldwide, it would have to be tested in populations with a low prevalence of tuberculosis resembling the typical patients” with multiple abnormalities in the chest, Marom said. “It will also have to be tested on an older population, typically encountered in the hospital setting.”
Effective treatments for tuberculosis, which typically attacks the lungs, have long been available. But 90% of cases occur in 30 countries that often lack the resources to effectively screen patients, isolate them, and deliver the care they need.
Google’s model is designed to be used for screening rather than diagnosis. It analyzes X-ray images to determine which patients should receive follow-up molecular testing to confirm the presence of TB. Such testing is expensive, time-consuming, and impractical in large groups, which is why the WHO has concluded that chest radiography — augmented by AI — is an essential tool in fighting the disease.
Google’s tool rates chest X-rays for suspicion of TB on a scale of 0 to 1, with a higher score indicating a greater likelihood that the disease is present. In the study, the company’s researchers calibrated the tool to recommend follow-up testing at a threshold of 0.45, which turned out to be the right choice, as it proved highly sensitive in catching the disease without generating high rates of false positives.
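To make the thresholding step concrete, here is a minimal sketch in Python of how a score cutoff turns into referral decisions, and how sensitivity and specificity follow from it. Only the 0.45 threshold comes from the study as reported here. The `Case` type, the function names, the choice to treat a score exactly at the cutoff as a referral, and the six example cases are hypothetical and are not Google's code or data.

```python
from dataclasses import dataclass

# Operating point reported for the study; everything else here is illustrative.
REFERRAL_THRESHOLD = 0.45


@dataclass
class Case:
    score: float   # model suspicion score between 0 and 1
    has_tb: bool   # reference result, e.g. from molecular testing


def refer(score: float, threshold: float = REFERRAL_THRESHOLD) -> bool:
    """Recommend follow-up testing when the score reaches the threshold.

    Treating a score exactly equal to the threshold as a referral is an
    assumption made for this sketch.
    """
    return score >= threshold


def sensitivity_specificity(cases, threshold: float = REFERRAL_THRESHOLD):
    """Return (sensitivity, specificity) for a list of Case objects."""
    tp = sum(1 for c in cases if c.has_tb and refer(c.score, threshold))
    fn = sum(1 for c in cases if c.has_tb and not refer(c.score, threshold))
    tn = sum(1 for c in cases if not c.has_tb and not refer(c.score, threshold))
    fp = sum(1 for c in cases if not c.has_tb and refer(c.score, threshold))
    sensitivity = tp / (tp + fn)   # share of true TB cases that get referred
    specificity = tn / (tn + fp)   # share of TB-free cases that are not referred
    return sensitivity, specificity


# Made-up cases for illustration only, not study data.
EXAMPLE_CASES = [
    Case(0.91, True),
    Case(0.52, True),
    Case(0.30, True),    # missed case: scored below the threshold
    Case(0.10, False),
    Case(0.47, False),   # false positive: scored above the threshold
    Case(0.05, False),
]

if __name__ == "__main__":
    sens, spec = sensitivity_specificity(EXAMPLE_CASES)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Lowering the threshold would refer more patients, catching more true cases at the cost of more false positives. Raising it would do the reverse. Picking the cutoff is a trade-off between those two error types.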
The researchers said an overarching goal of the work was to expose the AI to a variety of patients and clinical circumstances to ensure that it would not be tripped up by variations that normally occur across different geographies and settings of care.
“We tried to be very comprehensive with the validation of this one to explore different TB presentations, different X-ray manufacturers, and countries,” said Rory Pilgrim, a product manager for Google Health’s AI team and co-author of the study. He added that the model was trained on data where the cases of TB were confirmed by molecular testing, rather than radiological findings whose diagnostic accuracy varies substantially.
The researchers also used a training method known as “noisy student” to help improve the model’s ability to recognize TB despite the different ways it can appear on X-ray images. That technique allowed them to expose the AI to data that was not explicitly labeled as positive or negative for TB. Normally, models are only trained on data explicitly marked based on the outcome of interest, so their trainers can tell a model whether its conclusions were right or wrong.
Under noisy student, the model uses its training on prior data to generate its own labels for new images it encounters. That allows it to iterate and improve without limiting its exposure to only labeled examples.
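The general shape of that loop can be sketched in a few lines of Python. This is a toy illustration of the noisy student idea, not Google's training code: the "model" is a simple one-dimensional nearest-centroid classifier, the noise is random jitter added to the inputs, and every name and number below is hypothetical. The real system trains deep neural networks on X-ray images.

```python
import random


def fit_centroids(xs, ys):
    """Train a toy 1-D classifier: store the mean feature value of each class."""
    neg = [x for x, y in zip(xs, ys) if y == 0]
    pos = [x for x, y in zip(xs, ys) if y == 1]
    return sum(neg) / len(neg), sum(pos) / len(pos)


def predict(model, x):
    """Label x with the class whose centroid is closer (1 = TB suspected)."""
    neg_centroid, pos_centroid = model
    return 1 if abs(x - pos_centroid) < abs(x - neg_centroid) else 0


def noisy_student(labeled_x, labeled_y, unlabeled_x, rounds=3, noise=0.05, seed=0):
    """Toy noisy-student loop: teacher pseudo-labels, student trains with noise."""
    rng = random.Random(seed)
    # Step 1: train a teacher on the labeled examples only.
    teacher = fit_centroids(labeled_x, labeled_y)
    for _ in range(rounds):
        # Step 2: the teacher generates labels for the unlabeled examples.
        pseudo_y = [predict(teacher, x) for x in unlabeled_x]
        # Step 3: a student trains on labeled plus pseudo-labeled data,
        # with noise injected into its inputs (here, Gaussian jitter).
        all_x = labeled_x + unlabeled_x
        all_y = labeled_y + pseudo_y
        noisy_x = [x + rng.gauss(0, noise) for x in all_x]
        student = fit_centroids(noisy_x, all_y)
        # Step 4: the student becomes the teacher for the next round.
        teacher = student
    return teacher


# Made-up feature values for illustration only.
LABELED_X = [0.1, 0.2, 0.8, 0.9]
LABELED_Y = [0, 0, 1, 1]
UNLABELED_X = [0.05, 0.15, 0.3, 0.7, 0.85, 0.95]

if __name__ == "__main__":
    model = noisy_student(LABELED_X, LABELED_Y, UNLABELED_X)
    print("centroids (negative, positive):", model)
    print("prediction for 0.75:", predict(model, 0.75))
```

The point of the sketch is the data flow: four labeled examples are enough to start, and the six unlabeled ones still contribute to the final model once the teacher has assigned them labels.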
“It let us leverage a lot more data from a TB screening population,” said Sahar Kazemzadeh, a Google software engineer who led the development of the model. The population in question was the group of South African miners, whose level of lung disease was greater, presenting the AI model with a higher degree of complexity.
“As the model sees more iterations and permutations, its generalizability goes up,” said Pilgrim. “It becomes less narrow” and able to find TB in a wider range of circumstances.
Google’s AI system, along with many others like it, could significantly improve TB outcomes and lower costs if it clears additional testing hurdles.
The next challenge is to examine its performance in a real-world environment. Google is now pursuing a study in a clinic in Zambia, where the accuracy of the tool’s findings will be measured against the results of molecular testing for each patient. The study is expected to be completed by the end of the year.