Science Corner: Suitable Help Math - Emerging Scholars Blog

“Can we visit every station exactly once and wind up back here?” (*Image by Liliana Drew at Pexels*)

Everyone has thoughts on artificial intelligence and large language models (LLMs) these days, so it can be tough to cut through the noise. Personally, the novelty applications that make the rounds on social media don’t really grab me. Caricature portrait of myself? I’m sure I’ll like what my wife can draw better. Pop culture mashup? I’ll just wait for Avengers Doomsday. A knowledgeable person has found a practical use case that solves a real problem they had? Now we’re talking. So I perked up when I saw that Donald Knuth was excited about the mathematics abilities of Claude Opus 4.6.

Not familiar with Donald Knuth? He’s probably not a household name, but there’s a decent chance he’s your favorite computer scientist’s favorite computer scientist. (Or maybe that’s Sister Mary Kenneth Keller, the co-first computer science PhD in the US, a pioneer in computer science education and contributor to the development of BASIC, my first programming language and maybe yours too.) Knuth is a somewhat legendary figure in computer science circles. As an undergrad, I remember hearing rumors about his idiosyncratic relationship to email; while probably as factually unreliable as any other bit of hearsay, no other computer scientists were even the subject of urban legends. He created the TeX typesetting system and introduced literate programming with his WEB system, the ancestor of later tools like noweb, all technologies that I use regularly in my day job. He is also the author of the foundational text The Art of Computer Programming, an ongoing project of his that led to his encounter with Claude.

His first-person account is included in the linked write-up. I’d encourage you to read those bits for yourself, even if the math parts are intimidating (more on those shortly). Briefly, he posted an unsolved math problem that he was working on for his book, someone posed the problem to Claude Opus 4.6, which was able to find some solutions, and then further iterations with it and other LLMs produced additional solutions and formal written proofs that those solutions are indeed valid.

What was the unsolved problem? It’s about Hamiltonian cycles, a topic in graph theory. In graph theory, a graph is a set of nodes and edges between them. The nodes could be physical places and the edges transportation links, like subway stations (nodes) and the tracks that connect them (edges). Or they could be people (nodes) and friendships (edges). Or actors (nodes) who costarred in TV or movie projects (edges). A Hamiltonian cycle is a path that starts at one node and follows the available edges to every other node, visiting each one exactly once before winding up back at the start node. Knuth wasn’t planning a transit excursion or a movie marathon; he was working on a more abstract setup. For a directed graph (each edge points in one of the two possible directions) with a specified number of nodes (m³) connected in a specified way, can you split the edges into three equal groups such that each group is a Hamiltonian cycle?
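To make the definition concrete, here is a minimal Python sketch of the subway question from the caption. The station names, the track list, and the function name `is_hamiltonian_cycle` are all made up for illustration; none of them come from Knuth's write-up.

```python
def is_hamiltonian_cycle(edges, nodes, route):
    """Return True if `route` visits every node exactly once and every hop,
    including the hop from the last node back to the first, is a directed edge."""
    # The route must list each node once and only once.
    if len(route) != len(nodes) or set(route) != set(nodes):
        return False
    # Pair each stop with the next one, wrapping around to the start.
    hops = zip(route, route[1:] + route[:1])
    return all((a, b) in edges for a, b in hops)


# Hypothetical one-way subway map: four stations and five tracks.
stations = ["A", "B", "C", "D"]
tracks = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "C")}

print(is_hamiltonian_cycle(tracks, stations, ["A", "B", "C", "D"]))  # True
print(is_hamiltonian_cycle(tracks, stations, ["A", "C", "D", "B"]))  # False: no track from D to B
```

Checking a proposed route is easy, as the sketch shows. The hard part, and the part Knuth cared about, is finding routes that work in the first place.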

Tools to help with math are not new. It is a matter of cost-benefit analysis and who gets to benefit from their help vs who gets replaced. (*Image by Hanna Pad at Pexels*)

Sometimes with a math problem like this you can prove that solutions exist without knowing what that solution or solutions are or look like. Or you can find the actual solution(s), thus proving they exist. In this case, for any given value of m you might be able to come up with a solution, but that doesn’t tell you about whether there is a solution for any other value. So even better is a procedure that constructs solutions given the value of m of interest. Then you need a proof that your procedure produces genuine solutions, without actually generating all (potentially infinite, depending on the range for m that the procedure works for) solutions and checking them individually. And that is what the LLMs were able to produce: several different procedures for generating solutions that worked for different ranges of m, and in at least one case the corresponding formal proof that the procedure worked.

To accomplish this, Claude iterated through a few dozen ideas about what such a procedure might look like. For each one in turn, it generated computer code that would implement the proposed procedure. Then it would run that code for specific values and check whether the procedure worked in those cases. When it found that the procedure didn’t work, it moved on to another idea, until it eventually came up with a procedure that worked for several odd values of m and declared success. For that procedure, Knuth himself then worked out a formal proof manually. Subsequently, other folks iterated with Claude and other LLMs to find different procedures, including some that worked for even numbers, and one of those procedures was then proven to be a general solution in yet another LLM session.
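The "propose a procedure, run it for specific values, check the result" loop can be sketched in Python. Two caveats apply. First, this post only says the nodes are "connected in a specified way"; the sketch assumes each node is a triple (i, j, k) of numbers from 0 to m-1 with three outgoing edges, each adding 1 to one coordinate and wrapping around at m. Second, the names `step`, `is_valid_split`, and `naive_rule` are hypothetical, and the deliberately naive rule shown is not one of Claude's procedures.

```python
from itertools import product


def step(node, axis, m):
    """Follow the edge that adds 1 (mod m) to one coordinate of the node."""
    bumped = list(node)
    bumped[axis] = (bumped[axis] + 1) % m
    return tuple(bumped)


def is_valid_split(rule, m):
    """Check a proposed procedure for one value of m.

    rule(node, m) must return a permutation of (0, 1, 2); entry g is the
    coordinate that group g increments when it leaves that node.
    """
    nodes = list(product(range(m), repeat=3))
    # Every node must hand each of its three edges to a different group.
    if any(sorted(rule(node, m)) != [0, 1, 2] for node in nodes):
        return False
    for group in range(3):
        start = current = nodes[0]
        seen = set()
        # Walk along this group's edges until some node repeats.
        while current not in seen:
            seen.add(current)
            current = step(current, rule(current, m)[group], m)
        # A Hamiltonian cycle returns to the start after touching all m^3 nodes.
        if current != start or len(seen) != m ** 3:
            return False
    return True


def naive_rule(node, m):
    """A first idea that fails: group 0 always bumps i, group 1 j, group 2 k."""
    return (0, 1, 2)


for m in (3, 5, 7):
    print(m, is_valid_split(naive_rule, m))  # False each time: loops close after m steps
```

A checker like this can only say whether a candidate works for the values of m that were actually tried. That is why a proof is still needed to cover every value in the claimed range.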

To be fair, even to generate the first working procedure, Claude did not do all of those steps smoothly in one pass. There were some starts and stops, and in the initial sessions it just plain conked out trying to find solutions for even values of m. At the same time, it did not need to be led step-by-step through the process or given ideas to try; it apparently has some ability to take multiple steps towards a goal on its own.

How it does that is harder to say. Some LLMs, in some modes, do provide running commentary on what they are doing and why, but it is not clear to me whether that is any more accurate than any other text generated by such models. Then again, we don’t know everything that goes on when humans carry out complex tasks like this, and given how much goes on at the unconscious level, we may not be 100% reliable about our own processes either. But to look at the output, it sure seems like Claude has a better understanding of graph theory than I do, or at least a better facility to do graph theory work. Maybe not better than Donald Knuth or the other mathematicians and computer scientists involved, but better than me.

Does that mean we are on the way to superhuman, and perhaps even God-like intelligence? That’s certainly the sci-fi answer, but not for me to say. However, just the idea that we are creating peers for ourselves–not in the traditional, what-parents-have-been-doing-for-years way, although that is remarkable for its own reasons–represents a substantial development. Sure, we’ve been making tools that can replace or exceed human capabilities, both physical and mental. But we’ve never had a tool we could talk to like this before. I don’t think that makes us as gods or anything like that, but I can see how someone could wind up following that train of thought.

Knuth himself might be interested in such questions. His book, Things a Computer Scientist Rarely Talks About, has sections like “Computer programmers as creators of new universes” and “Other concepts of computer science that may give insights about divinity.” The book predates the current AI boom, but his insights on how computer science can speak with theology are likely still relevant.

Andy Walsh

Andy has worn many hats in his life. He knows this is a dreadfully clichéd notion, but since it is also literally true he uses it anyway. Among his current metaphorical hats: husband of one wife, father of two teenagers, reader of science fiction and science fact, enthusiast of contemporary symphonic music, and chief science officer. Previous metaphorical hats include: comp bio postdoc, molecular biology grad student, InterVarsity chapter president (that one came with a literal hat), music store clerk, house painter, and mosquito trapper. Among his more unique literal hats: British bobby, captain’s hats (of varying levels of authenticity) of several specific vessels, a deerstalker from 221B Baker St, and a railroad engineer’s cap. His monthly Science in Review is drawn from his weekly Science Corner posts, which appear Wednesdays at 8am (Eastern) on the Emerging Scholars Network Blog. His book Faith across the Multiverse is available from Hendrickson.
