Listening to the recent debate between human debating champion Harish Natarajan and IBM’s Watson-derived Project Debater, I was amused to hear the AI conclude her* remarks with a Benjamin Disraeli quote. His sentiment was consistent with her argument, but to my ears the quote was not exactly a compelling clincher. I found it a little clichéd, frankly, which is why it amused me. Apparently AIs learning how to debate go through the same stages as students. Although once I heard more about how the software was trained, it made more sense that she would replicate the tendencies of her human counterparts.
The debate was the latest in the Intelligence Squared series, but the first with a computer participant. I’ve been listening to Intelligence Squared debates for a number of years now. I’m not entirely sold on the debate format, particularly the idea of turning healthy discussion into a competition with winners and losers. Still, they always get top-notch contributors. I appreciate learning from their expertise, hearing multiple perspectives on important issues, and sometimes having my beliefs and opinions challenged–including my thoughts about the role of debate and rhetoric in society. So I was excited and intrigued to hear what artificial intelligence would bring to the conversation.
The results were markedly different from the usual Intelligence Squared debate, starting with the format. To better suit the AI (and make it clear the debate remarks were not scripted in advance), neither participant knew the topic beforehand and the resolution was more abstract than usual. So instead of experts sharing their depth of knowledge on a specific question–for example, the previous debate was about de-extinction and included leading biologists talking about specific projects like the one to bring back the woolly mammoth–we got debaters talking about subsidizing preschools in very broad terms.
Despite not being able to research the topic, Project Debater came prepared with an extensive database on many issues, allowing her to cite several specific studies. In my opinion, that was her main advantage; Natarajan had no references to back up his claims. Still, he seemed to have some familiarity with the issue, and I think part of his success was the ability to tackle the issue at a slightly deeper level. For all her facts and figures, Project Debater could only deploy them in the broadest framework, presenting evidence that preschool has benefits and arguing for the moral obligation to give aid to those who need it. While still not speaking to any particular policy proposals or contexts, Natarajan could nevertheless take things one step further by pointing out the need to consider budget tradeoffs and expressing skepticism that preschool is an absolute good that will always deliver the promised benefits. These were fairly generic critiques that at best argued for an agnostic stance rather than making a clear case in the negative, but Project Debater didn’t seem to fully understand them or bring additional evidence to rebut them.
Which brings me back to how Project Debater was trained, or at least how it was described by the project leads after the debate. Listening to the debate first and hearing her cite specific studies, I thought perhaps IBM had implemented an AI that could digest publications, assess their topic and conclusions, and provide summaries of them. But afterwards the scientists described her database as a collection of billions of sentences, which she selected from to construct her remarks. So was every statement really a quote, or possibly a close paraphrase with some grammatical adjustments to fit the flow of her argument? Maybe the statements are closer to amalgamations of several quotes expressing similar ideas, if I understand this description of the process correctly. In either case, those are really human summaries of the studies. That would explain her difficulty rebutting her opponent’s critiques. It is plausible that someone has previously made some general summary comments about a given study; it is less likely they have framed such comments in the context of a response to his particular points.
Finally, I also felt Natarajan had a mathematical advantage. The way Intelligence Squared debates are scored is to poll the audience before and after the debate and see which side changed the most minds. Going into the conversation, 79% of the audience agreed with subsidizing preschools. That is exceptionally high for one of these debates; usually the audience is closer to an even split with a larger pool of undecideds. Project Debater practically had nowhere to go but down. It would be interesting to see what the outcome would be with the roles reversed.
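For anyone who wants to see the arithmetic behind that advantage, here is a minimal sketch of the scoring rule as I described it above. Only the 79% figure comes from the debate; the other poll numbers and the function names (`max_gain`, `swing_winner`) are made up for illustration, and this is not Intelligence Squared's actual tallying code.

```python
def max_gain(pre_share: int) -> int:
    """Largest possible gain, in percentage points, for a side starting at pre_share."""
    return 100 - pre_share


def swing_winner(pre: dict[str, int], post: dict[str, int]) -> str:
    """Return the side whose share of the audience grew the most between polls."""
    gain_for = post["for"] - pre["for"]
    gain_against = post["against"] - pre["against"]
    if gain_for == gain_against:
        return "tie"
    return "for" if gain_for > gain_against else "against"


# Only the 79% "for" figure is from the debate; every other number is hypothetical.
pre_poll = {"for": 79, "against": 11, "undecided": 10}
post_poll = {"for": 76, "against": 19, "undecided": 5}

print(max_gain(pre_poll["for"]))          # ceiling on the "for" side's gain
print(swing_winner(pre_poll, post_poll))  # side with the larger swing
```

Starting at 79%, the side in favor can gain at most 21 points even if it persuades everyone else in the room, while the opposing side has far more room to grow. Under this rule, a small drift away from a lopsided majority is enough to lose.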
So, besides the novelty of the whole thing, what do we make of Project Debater? I found it interesting to see AI applied to a domain in the humanities, rather than the more quantitative applications we usually hear about. So, are there other areas in the humanities you’d like to see AI tackle? Could something like this digest theology texts and come up with persuasive essays on theological topics? Could an AI come up with novel theological ideas? Would anyone entertain those ideas if it did?
*I’m following the convention used in the debate and referring to the Project Debater AI as her. I realize there is a whole separate conversation to be had about gender and artificial intelligence; for these purposes I am setting that conversation aside and deferring to the preferences of those most closely involved.
Andy has worn many hats in his life. He knows this is a dreadfully clichéd notion, but since it is also literally true he uses it anyway. Among his current metaphorical hats: husband of one wife, father of two teenagers, reader of science fiction and science fact, enthusiast of contemporary symphonic music, and chief science officer. Previous metaphorical hats include: comp bio postdoc, molecular biology grad student, InterVarsity chapter president (that one came with a literal hat), music store clerk, house painter, and mosquito trapper. Among his more unique literal hats: British bobby, captain’s hats (of varying levels of authenticity) of several specific vessels, a deerstalker from 221B Baker St, and a railroad engineer’s cap. His monthly Science in Review is drawn from his weekly Science Corner posts — Wednesdays, 8am (Eastern) on the Emerging Scholars Network Blog. His book Faith across the Multiverse is available from Hendrickson.