I’ve had the chance to watch more baseball this season than usual, specifically New York Mets games. Baseball lends itself to extensive data collection and precise situational adjustments. For example, teams can record where each individual hitter tends to hit the ball when thrown different types of pitches. When a given hitter has strong tendencies in one direction, the other team can shift their defense to that side and possibly adjust their pitching to favor that tendency.
Opposing teams were making those kinds of shifts against Mets hitter Jeff McNeil this season, and on several occasions he was able to get on base by hitting the ball to the other side, where the defense was thin. This happened frequently enough that the Mets broadcasters openly wondered when the other teams would recognize this and change their strategy against McNeil. Meanwhile, it got me wondering about contingency vs capability and the limitations of data-driven inference.
Overfitting is a fundamental concern in machine learning and data analysis. When making inferences, you want to extract general principles and not idiosyncratic quirks of the specific data you happen to have; the latter is overfitting. Bigger samples of data can help, since patterns in smaller samples are more likely to be quirks; that’s probably why opposing teams didn’t reconsider their shift against McNeil after the first couple of hits. Techniques like cross-validation can also help by checking to see if the same patterns hold across different subsets of the data. But even with large samples and cross-validated inferences, the data can only tell you what players have done in the past. What you really want to know is the full range of what players can do. For that, you might need to run experiments to generate the data that doesn’t already exist.
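To make the cross-validation idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `batted_balls` list is made-up data, not McNeil's actual record, and the "model" is the simplest one possible (predict that the hitter will do whatever he did most often in the training data). The hypothetical helper `k_fold_accuracy` holds out each subset in turn and checks how well the tendency learned from the rest predicts it.

```python
from collections import Counter


def k_fold_accuracy(outcomes, k):
    """Return held-out accuracy per fold for a majority-direction predictor."""
    scores = []
    for i in range(k):
        # Every k-th outcome, starting at position i, is held out for this fold.
        held_out = outcomes[i::k]
        training = [o for j, o in enumerate(outcomes) if j % k != i]
        # The "model": predict the most common direction seen in training.
        predicted = Counter(training).most_common(1)[0][0]
        hits = sum(1 for o in held_out if o == predicted)
        scores.append(hits / len(held_out))
    return scores


# Hypothetical batted-ball directions for one hitter (illustrative only).
batted_balls = [
    "pull", "pull", "oppo", "pull", "pull",
    "pull", "oppo", "pull", "pull", "oppo",
] * 2

fold_scores = k_fold_accuracy(batted_balls, k=4)
mean_score = sum(fold_scores) / len(fold_scores)
print(fold_scores, mean_score)
```

If the scores are similar from fold to fold, the tendency is probably not a quirk of one slice of the data. Even so, a stable score only describes what the hitter has done so far; it says nothing about what he could do if he chose to hit the other way.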
Now, I am interested in baseball, but as we’ve discussed before, I’m really interested in public health. And strange as it might sound, the nuances of defensive shifts got me thinking about the current monkeypox situation. We’ve known about monkeypox since the 1950s, and the first detected case in humans was in 1970, over 50 years ago. We know that it can spread by close skin-to-skin contact, by contact with materials like bedding or clothing used by an infected person, and by respiratory droplets. We know that from good models of how virus transmission works in general, from specific research on monkeypox in the lab, and from analysis of data from past outbreaks.
Data on the current monkeypox outbreak has revealed some strong tendencies; many of the cases are in men who have sex with men. In light of these tendencies and the limited resources currently available for testing and vaccination, a shift in focus to concentrate on the populations most at risk–if the past trends continue–makes a certain amount of sense. At the same time, if we only test folks who meet those criteria, we will only ever identify cases among those groups and reinforce the trends in the data regardless of what is actually happening. Given everything we know about the virus and what it can do–and also what we know about human behavior and all the different ways close contact can occur beyond sexual encounters–there is the distinct possibility of missing something.
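The feedback loop in that last point can be sketched with a few lines of arithmetic. This is a toy illustration, not an epidemiological model: the group labels, case counts, and testing fractions are all invented, and the hypothetical function `detected_share` simply assumes that a fixed fraction of true cases in each group gets tested and confirmed.

```python
def detected_share(true_cases, tested_fraction):
    """Share of confirmed cases per group, given who actually gets tested."""
    detected = {g: true_cases[g] * tested_fraction[g] for g in true_cases}
    total = sum(detected.values())
    return {g: (detected[g] / total if total else 0.0) for g in detected}


# Hypothetical true case counts early on, and later after wider spread.
early = {"focus_group": 80, "everyone_else": 20}
later = {"focus_group": 50, "everyone_else": 50}

# Assumed testing policies: only the focus group, or somewhat broader.
narrow = {"focus_group": 0.9, "everyone_else": 0.0}
broader = {"focus_group": 0.9, "everyone_else": 0.3}

narrow_early = detected_share(early, narrow)
narrow_later = detected_share(later, narrow)
broader_early = detected_share(early, broader)
broader_later = detected_share(later, broader)

print(narrow_early, narrow_later)
print(broader_early, broader_later)
```

Under the narrow policy, the confirmed cases look identical in both scenarios even though the underlying situation has changed substantially. Under the broader policy, the share of confirmed cases outside the focus group rises, so the shift at least shows up in the data.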
In a sense, this is a recapitulation of a broader problem around monkeypox and a number of other communicable diseases: thinking of them as diseases that happen to other people, but not Americans or those in the West. Monkeypox has been endemic in several African countries for years; there is nothing intrinsically African about the disease, but we have largely acted as if there is. The same could be said for a variety of diseases that Americans generally only think about when traveling, if at all. For example, malaria and yellow fever were endemic in the United States until they were eradicated with an extensive campaign that eliminated the pathogens from the continent but not the mosquitoes that transmit them. So transmission could occur here again, especially as the habitat of those mosquitoes expands due to climate change. If COVID-19 highlighted how climate change, globalization and human population growth increase the chances for new pathogens to cross over to humans, perhaps monkeypox can remind us of how those same factors can shake up the status quo of existing human diseases.
Working against our ability to adapt to these changes is the stickiness of messaging. Recent public health experience demonstrated the challenge here. Understandably, we tried to fine-tune our response to COVID-19 as we went along and circumstances changed. Perhaps most notably, masking recommendations changed several times as we learned more about presymptomatic spread, as mask availability changed, as vaccination became widely available and as viral evolution changed the effectiveness of vaccination to prevent transmission. Those choices may have been strategic, but they may not have fully reckoned with the psychology of memory and habit formation or with the actual speed of information dissemination. We may need to do a more thorough job up front with monkeypox to communicate the range of possible scenarios and what will trigger them rather than just focusing on what we want people to know and do right now.
So let’s be clear up front: regardless of who is currently most at risk or how initial resources are deployed, no one is intrinsically exempt from the possibility of infection. That hardly means we are all going to wind up getting monkeypox. But we all need to be aware of the possibility of getting infected. Relatedly, in the event of illness there’s no reason to infer anything beyond a need for some compassion and possibly healthcare (treatment is possible; as always, I cannot offer personal medical advice, so speak to a healthcare professional if you have symptoms or concerns). And of course keep in mind the broader reality that pathogens respect neither borders nor most of our other category delineations, so be ready for the next pattern shift even as you handle the current one.
Andy has worn many hats in his life. He knows this is a dreadfully clichéd notion, but since it is also literally true he uses it anyway. Among his current metaphorical hats: husband of one wife, father of two teenagers, reader of science fiction and science fact, enthusiast of contemporary symphonic music, and chief science officer. Previous metaphorical hats include: comp bio postdoc, molecular biology grad student, InterVarsity chapter president (that one came with a literal hat), music store clerk, house painter, and mosquito trapper. Among his more unique literal hats: British bobby, captain’s hats (of varying levels of authenticity) of several specific vessels, a deerstalker from 221B Baker St, and a railroad engineer’s cap. His monthly Science in Review is drawn from his weekly Science Corner posts — Wednesdays, 8am (Eastern) on the Emerging Scholars Network Blog. His book Faith across the Multiverse is available from Hendrickson.