Nurse Dina Sarro didn’t know much about artificial intelligence when Duke University Hospital installed machine learning software to raise an alarm when a person was at risk of developing sepsis, a complication of infection that is the number one killer in US hospitals. The software, called Sepsis Watch, passed alerts from an algorithm Duke researchers had tuned with 32 million data points from past patients to the hospital’s team of rapid response nurses, co-led by Sarro.
But when nurses relayed those warnings to doctors, they sometimes encountered indifference or even suspicion. When docs questioned why the AI thought a patient needed extra attention, Sarro found herself in a tough spot. “I wouldn’t have a good answer because it’s based on an algorithm,” she says.
Sepsis Watch is still in use at Duke—in no small part thanks to Sarro and her fellow nurses reinventing themselves as AI diplomats skilled in smoothing over human-machine relations. They developed new workflows that helped make the algorithm’s squawks more acceptable to people.
A new report from think tank Data & Society calls this an example of the “repair work” that often needs to accompany disruptive advances in technology. Coauthor Madeleine Clare Elish says that vital contributions from people on the frontline like Sarro are often overlooked. “These things are going to fail when the only resources are put towards the technology itself,” she says.
The human-machine mediation required at Duke illustrates the challenge of translating a recent surge in AI health research into better patient care. Many studies have created algorithms that perform as well as or better than doctors when tested on medical data, such as X-rays or photos of skin lesions. But how to usefully employ such algorithms in hospitals and clinics is not well understood. Machine learning algorithms are notoriously inflexible, and opaque even to their creators. Good results on a carefully curated research dataset don’t guarantee success in the chaotic environment of a hospital.
A recent study on software for classifying moles found its recommendations sometimes persuaded experienced doctors to switch from a correct diagnosis to a wrong one. When Google put a system capable of detecting eye disease in diabetics with 90 percent accuracy into clinics in Thailand, the system rejected more than 20 percent of patient images due to problems like variable lighting. Elish recently joined the company, and says she hopes to keep researching AI in healthcare.
Duke’s sepsis project started in 2016, early in the recent AI healthcare boom. It was supposed to improve on a simpler system of pop-up sepsis alerts, which workers overwhelmed by notifications had learned to ignore.
Researchers at the Duke Institute for Health Innovation reasoned that more targeted alerts, sent directly to the hospital’s rapid response nurses, who in turn informed doctors, might fare better. They used deep learning, the AI technique favored by the tech industry, to train an algorithm on 50,000 patient records, and built a system that scans patient charts in real time.
Sepsis Watch got an anthropological close-up because the Duke developers knew there would be unknowns in the hospital’s hurly-burly and asked Elish for help. She spent days shadowing and interviewing nurses and emergency department doctors and found the algorithm had a complicated social life.
The system threw up alerts on iPads monitored by the nurses, flagging patients deemed at moderate or high risk of sepsis, or judged to have already developed the deadly condition. Nurses were supposed to call an emergency department doctor immediately for patients flagged as high risk. But when the nurses followed that protocol, they ran into problems.
Some challenges came from disrupting the usual workflow of a busy hospital—many doctors aren’t used to taking direction from nurses. Others were specific to AI, like the times Sarro faced demands to know why the algorithm had raised the alarm. The team behind the software hadn’t built in an explanation function, because as with many machine learning algorithms, it’s not possible to pinpoint why it made a particular call.
Source: Wired