It’s 3am. You wake up hot and clammy. A cactus withered and died in your mouth. Joints you never knew you had scream for WD-40. Breathing hurts. Blinking hurts. You can feel your fingernails growing, and that hurts. Did you get the flu? Zika? Are you patient zero for the latest emerging plague? You could go see a doctor, but you’re a graduate student with rudimentary health insurance whose coverage documentation you barely skimmed because c’mon, you’re young and healthy and you don’t plan on getting sick–no, you don’t plan on having time to get sick. Would a trip to the ED be covered? Does it have to be a certain ED? “Reply hazy; try again” is the best your fevered mind can muster. Can the Internet be your physician? “It is decidedly so.”
You may be shocked to learn that 3am fever brain doesn’t always make good decisions. Nor is the Internet always right, as it happens. In a comparison of diagnostic accuracy, human physicians performed better than a “symptom-checking app.” Now, I don’t think this is a significant indictment of artificial intelligence or anything. Both the humans and the app were dealing with incomplete information. They were only given clinical vignettes–no exams or tests or lab results or anything diagnostic, just whatever the patient reported initially. As a result, even the human physicians were wrong 15% of the time. That does not mean your doctor is only right 85% of the time; your doctor will use all of the diagnostic tools when treating you, but those tools weren’t available in this particular scenario. The doctors needed to be on an equal footing with the app, which is intended for patients to input their symptoms before seeing a doctor. I’ve consulted Dr. Internet myself, so I figured maybe you have too and you’d want to know about the good doctor’s performance.
For comparison, a state-of-the-art AI like IBM’s Watson can do as well as or perhaps even better than humans when given a full range of diagnostic information. Computers are also allegedly getting better than us at speech transcription, among other tasks. Maybe I’m naïve, but I think these claims raise an interesting philosophical question. When a computer program defeats a human at a game like chess or Go, the head-to-head result is straightforward to interpret. But how is a diagnosis or a transcription evaluated? Ultimately, a human has to provide the correct answer, which means that on some level a human is the best at the task. From that perspective, it’s not clear that computers ever can be better than humans at tasks like these. Either the computer exactly matches the human, or it deviates, in which case it is scored as wrong. So should we even care about human vs. computer? Should we aspire to something different, maybe even better, for our computer programs?