For the new year, let’s look at where new genes come from. Many genes around today have pedigrees going back billions of years. Humans can find analogues of most of their genes in chimpanzees and other primates, suggesting our common ancestor millions of years ago had comparable genes. At the same time, many species, including humans, have genes found in no other species. As with so many things (to-do lists, spreadsheets, computer code, building plans), there are two main ways to get new ones: copy and modify something existing, or start from scratch. A newly published paper by Ni A. An et al. (covered in this news article, if you prefer) explores in substantial detail the pathway by which some human-specific genes came about via the “from scratch” or de novo method. The twenty co-authors further demonstrated that one of these genes could play a role in how human brains came to be larger than chimpanzee brains.
Before we get too far into the results, I’d like to take a moment to marvel at the sheer volume of work in this paper, the diversity of the methods used and the newness of so many of those techniques. When I was the age my kids are now, a number of these methods did not exist or were not yet widely usable. Here are some highlights of what the team did:
- Search multiple large sequence databases to identify human-specific genes; relies on 21st-century sequencing technology and Internet infrastructure
- Extract RNA for those genes from human and monkey cells; more high-throughput sequencing
- Classify gene features from the sequence data; the convolutional neural networks used for this go back to the 1980s, but widespread use relied on GPU implementation in the 2000s
- Create a library of gene variants using CRISPR/Cas9; the Nobel Prize-winning paper describing the process was published in 2012
- Grow brain organoids which incorporated their edited versions of one of the genes; the first paper on making those was published in 2013
- Engineer mice with modified genes; we’ve been able to do that for longer but CRISPR/Cas9 has made that process more straightforward
While this research group did not create any of these tools, it is still remarkable to me to see so many of them deployed in a single paper, and it makes me wonder what a paper in 2050 will look like.
So now what did they find from all that work? As I said, they were exploring a pathway by which new genes are made “from scratch.” I use scare quotes because of course these new genes aren’t made from literal nothing. Nor are they made by building up a new sequence of DNA from the chemical or molecular components. Really what’s going on when a new gene is made de novo is that a stretch of DNA that does not specify a gene undergoes a series of mutations until it does. When we say a stretch of DNA specifies a gene, what we mean is that that bit of DNA is transcribed to RNA and that RNA carries out some function. Most often, that function is to serve as the template for constructing a protein which has some further function like breaking down a food molecule to get energy. Sometimes the function is something else, like forming the ribosomes that make protein from RNA templates. After it is transcribed, RNA typically needs to leave the nucleus (where DNA is stored) and move to the cytoplasm where it can do its job.
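For readers who think in code, here is a minimal sketch of the DNA-to-RNA-to-protein flow described above. It is a toy model, not anything from the paper: the function names and the example sequence are made up for illustration, and it ignores real-world details such as splicing, RNA export from the nucleus, and which DNA strand is read. The codon table itself is the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the RNA copy of a DNA coding strand (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(rna):
    """Read codons from the first AUG until a stop codon; return the protein.

    Returns an empty string if the RNA has no start codon.
    """
    start = rna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example sequence, chosen only to show the two steps.
dna = "GGCATGGCCTTTAAATGACC"
rna = transcribe(dna)
print(rna)             # GGCAUGGCCUUUAAAUGACC
print(translate(rna))  # MAFK
```

In the library analogy, `transcribe` is the photocopier and `translate` is the student following the photocopied instructions.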
If that seems complicated, think of the DNA as a valuable and out-of-print reference book; in that book is a chapter used to teach a course. Not every student can have their own copy, and you don’t even want them handling the book themselves. So the book is stored in a special reference collection that only librarians can access. When needed, they go and make photocopies of the relevant chapter and bring those out to where the students can get them. The RNA is like those photocopies: it is easily made and replaced as needed, and it keeps the valuable original DNA away from a lot of rough handling.
Like the uncopied chapters of that reference book, most of our DNA does not specify any genes. This is the scratch area from which de novo genes can arise. First, the sequence has to change such that it is transcribed into RNA in significant numbers. Any stretch of DNA might get transcribed randomly every now and again, but it needs to start with a promoter to be transcribed regularly. Many random DNA sequences are just one mutation away from becoming a functional promoter, so this is not an insurmountable barrier. But transcribed RNA also has to get out of the nucleus to serve as a protein template or carry out most other functions, and that requires further changes. What this new paper demonstrated was evidence of that second transition. In comparing human and monkey (specifically, rhesus macaque) genes and human and monkey RNA, they found sequences which specify genes in humans but in monkeys specify RNA that stays in the nucleus. In the human lineage, additional changes accumulated which gave that RNA the markers needed to be taken into the cytoplasm. More intriguingly, one of those genes was put into brain organoids and into mice, and in both cases the result was more human-like features.
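To get a feel for the "one mutation away" idea, here is a rough sketch that asks how often a random stretch of DNA contains a window within one letter of a short target motif. Everything here is an assumption made for illustration: the motif `TATAAA` is only a toy stand-in (real promoters are longer and more varied), the sequence length and sample size are arbitrary, and the function names are hypothetical. It is not the analysis from the paper.

```python
import random

MOTIF = "TATAAA"  # toy stand-in for a promoter element; an assumption, not from the paper


def min_mismatches(seq, motif=MOTIF):
    """Smallest number of mismatches between the motif and any window of seq."""
    k = len(motif)
    if len(seq) < k:
        raise ValueError("sequence is shorter than the motif")
    return min(
        sum(a != b for a, b in zip(seq[i:i + k], motif))
        for i in range(len(seq) - k + 1)
    )


def fraction_within(n_seqs, length, max_mismatches, seed=0):
    """Fraction of random sequences with a window at most max_mismatches from the motif."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same answer
    hits = 0
    for _ in range(n_seqs):
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        if min_mismatches(seq) <= max_mismatches:
            hits += 1
    return hits / n_seqs


if __name__ == "__main__":
    # Arbitrary illustrative settings: 2000 random sequences of 100 letters each.
    print("already contain the motif:", fraction_within(2000, 100, 0))
    print("within one mutation of it:", fraction_within(2000, 100, 1))
```

The second number will always be at least as large as the first, since any sequence that already contains the motif is also within one mutation of it. How much larger depends entirely on the toy settings chosen above.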
The origin of new genes is a frequent topic among those who are skeptical that humans evolved from a common ancestor with apes. One question that gets raised is whether there is enough time for these new genes to arise. Ten million years sure sounds like a lot, but mutations accumulate fairly slowly, especially when they are not specifically selected for. The creation of de novo genes likely involves such neutral mutations; it is unlikely that random DNA is just one step away from a useful function. But as this paper demonstrates, the flip side of that consideration is that if the mutations are neutral, they don’t have to start accumulating after the species divergence point. These genes, which are unique to humans, have intermediate origins in our common ancestor with macaques from 30 million years ago. So some of the changes on the way to gene status are older still. This gives a much longer time horizon over which those mutations can accumulate.
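The effect of a longer time horizon can be put into back-of-the-envelope numbers. The sketch below relies on the standard result that, for neutral changes, the long-run substitution rate per site equals the mutation rate per site. The specific inputs are placeholder assumptions for illustration only and do not come from the paper: a mutation rate of 1.2e-8 per site per generation, 25 years per generation, and a 1,000-letter stretch of DNA. Both the rate and the generation time have surely varied over primate history, so only the ratio between the two time spans should be taken seriously.

```python
def expected_neutral_substitutions(
    region_bp,
    years,
    mu_per_site_per_gen=1.2e-8,  # illustrative assumption, not from the paper
    years_per_generation=25.0,   # illustrative assumption, not from the paper
):
    """Expected neutral substitutions accumulated in one lineage.

    Under neutrality, the substitution rate per site per generation
    equals the mutation rate per site per generation.
    """
    generations = years / years_per_generation
    return mu_per_site_per_gen * region_bp * generations


if __name__ == "__main__":
    region = 1000  # hypothetical stretch of DNA, in letters
    for years in (10e6, 30e6):
        n = expected_neutral_substitutions(region, years)
        print(f"{years / 1e6:.0f} million years: about {n:.1f} expected changes")
```

Whatever rate is plugged in, tripling the available time triples the expected number of neutral changes, which is the point of the paragraph above.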
Personally, there’s something somewhat reassuring about that thought apart from considerations of human evolution. It is easy to focus on changes that have big results, like the final steps in the de novo gene process that actually produced bigger brains. But to get there, a lot of other changes were needed whose impact was undetectable at the time of the change. We may not always see the fruits of our labors, but that doesn’t mean they will be forever fruitless.
Andy has worn many hats in his life. He knows this is a dreadfully clichéd notion, but since it is also literally true he uses it anyway. Among his current metaphorical hats: husband of one wife, father of two teenagers, reader of science fiction and science fact, enthusiast of contemporary symphonic music, and chief science officer. Previous metaphorical hats include: comp bio postdoc, molecular biology grad student, InterVarsity chapter president (that one came with a literal hat), music store clerk, house painter, and mosquito trapper. Among his more unique literal hats: British bobby, captain’s hats (of varying levels of authenticity) of several specific vessels, a deerstalker from 221B Baker St, and a railroad engineer’s cap. His monthly Science in Review is drawn from his weekly Science Corner posts — Wednesdays, 8am (Eastern) on the Emerging Scholars Network Blog. His book Faith across the Multiverse is available from Hendrickson.