Earlier in this chapter, we saw how genetic information is encoded in an organism’s DNA as a sequence of nucleobases. Over the past few decades, increasingly efficient gene-sequencing technologies have enabled geneticists to determine the precise nucleobase sequences of thousands of genes. Even whole genomes have been sequenced for many species, including human beings. [Note: A “whole genome” sequence usually only includes protein-coding genes, not the entire DNA sequence. See the fine-print section, below, for further clarification.] This wealth of data has spawned a new field of research known as comparative genomics, in which the genomes of different organisms are compared using computer algorithms to search for commonalities and differences.
The claim that geneticists have sequenced “whole genomes” for many species is somewhat misleading. So-called “whole-genome sequencing” typically includes only the protein-coding genes, excluding other functional regions that often comprise much larger proportions of the organism’s DNA. For example, less than 2% of human DNA consists of protein-coding genes. A further 20% of our DNA is involved in the gene-regulatory networks that control protein-coding genes, and a total of around 80% of our DNA is active in the sense that it is transcribed to RNA, though most of its functions are unknown. [Note: For further discussion of these points, I recommend Lecture 6 of Sam Kean’s course on “Unlocking the Hidden History of DNA” in The Great Courses lecture series.] The Human Genome Project, completed in 2003, succeeded in sequencing most of our protein-coding genes, but it didn’t even attempt to sequence the remaining 98% of our DNA. The ongoing ENCODE Project aims to “build a comprehensive parts list of functional elements in the human genome,” including not only protein-coding genes but also “regulatory elements that control cells and circumstances in which a gene is active.” [Note: Quotations from the ENCODE Project Overview.]
As we might expect, comparisons of human DNA show that closely-related family members typically have similar genetic sequences, while distant relatives differ more. By comparing their genomes, therefore, it is possible to estimate how closely related any two people are—that is, how long ago they shared a common ancestor. Likewise, analyses comparing the genomes of different species often seem to indicate common ancestry between various kinds of organisms.
For several reasons, however, comparative genomics is not as straightforward as simply counting the number of base pairs that differ. First, the two DNA sequences must be carefully aligned to ensure that the same genes are being compared, and the alignment can be difficult when comparing genomes of different species. Since genomes typically contain large repeated segments, often they can be aligned in multiple different ways, so the comparison will depend on which alignment is chosen. To make matters worse, many genes have the mysterious ability to move around within a genome, and these so-called “jumping genes” make the alignment decisions even more challenging. [Note: Transposable elements (“jumping genes”) and other mobile genetic elements (DNA segments that can move around) constitute nearly 50% of the human genome, according to some estimates. For further discussion, I recommend Lecture 6 of Sam Kean’s course on “Unlocking the Hidden History of DNA” in The Great Courses lecture series. See also this article for discussion of the possible roles mobile genetic elements might play in the human genome.] Second, there isn’t any standard way of deciding whether two sequences are similar enough to count as the “same gene” for comparison purposes. Moreover, a genome typically contains multiple copies of certain genes, and researchers have to make an arbitrary choice whether to count them as a single gene or several genes when evaluating the overall degree of similarity between genomes. Finally, and most importantly, the overall similarity between two genomes doesn’t always correlate with their relatedness. The DNA of a human boy may resemble his male cousin’s DNA more than that of his own sister, for instance, due to the major differences between the X and Y chromosomes which distinguish the sexes.
For another example, the genome of an organism with polyploidy (one or more extra complete sets of chromosomes, typically resulting from a whole-genome duplication) may have a greater number of base pairs in common with another polyploid species than it has with non-polyploid members of its own species. Thus, careful methods of analysis must be employed to determine the true family relationships even between members of the same species, let alone to discern evolutionary relationships between different species.
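The point about alignment can be illustrated with a minimal sketch in Python. The two short sequences below are invented for illustration, and the function names are hypothetical. The second function uses a deliberately simple scoring scheme (match = 1, mismatch = 0, gaps free), which amounts to finding the longest common subsequence; real alignment tools use more elaborate scoring, so this is only a toy model of the idea.

```python
def positional_identity(a: str, b: str) -> float:
    """Fraction of positions that match when the sequences are compared
    index by index, with no attempt at alignment."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))


def aligned_identity(a: str, b: str) -> float:
    """Fraction of bases matched under the best gapped alignment, scored as
    match = 1, mismatch = 0, gap = 0 (the longest common subsequence)."""
    rows, cols = len(a) + 1, len(b) + 1
    best = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                # Extend the best alignment of the two shorter prefixes.
                best[i][j] = best[i - 1][j - 1] + 1
            else:
                # Skip one base in either sequence (a gap).
                best[i][j] = max(best[i - 1][j], best[i][j - 1])
    return best[-1][-1] / max(len(a), len(b))


# Hypothetical sequences: seq_b is seq_a shifted by a single base.
seq_a = "ACGTACGTAC"
seq_b = "CGTACGTACG"

print(positional_identity(seq_a, seq_b))  # 0.0
print(aligned_identity(seq_a, seq_b))     # 0.9
```

A one-base shift makes the unaligned comparison report no similarity at all, while the aligned comparison reports 90%. The measured similarity therefore depends heavily on how the sequences are lined up before any counting begins.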
A detailed examination of the methods of comparative genomics is beyond the scope of this summary, but it will be worthwhile to consider several kinds of clues typically seen as strong evidence of shared ancestry between species: [Note: For a more detailed explanation of various kinds of genetic evidence for evolution, see chapter 6 of Darrel Falk’s book Coming to Peace with Science: Bridging the Worlds Between Faith and Biology (Downers Grove: InterVarsity Press, 2004) 189-192. Falk is a Christian biologist who endorses evolutionary creationism.]
Moreover, introns sometimes contain identifiable DNA sequences known to originate from retroviruses, which are viruses that trick the host cell into inserting their genetic information into the host cell’s own genome. When the viral genes are spliced into an intron, the host cell ignores the viral DNA and the host is unharmed. Nevertheless, if the insertion occurs in a germ cell (sex cell), the viral DNA can be passed to the organism’s descendants. Viral genetic sequences that are inherited in this way are called endogenous retroviruses (ERVs). Similarly, repetitive DNA segments called retroposons sometimes are inserted into introns within germ cells. These insertions, too, can be inherited by an organism’s descendants. Finding a retroposon or an ERV in the same specific location in two different genomes strongly suggests common ancestry, since it is unlikely that the same insertion would occur twice in exactly the same place just by chance. For example, the same retroposon has been found in the same intron in whales and cows, but not in horses, suggesting that cows shared a more recent common ancestor with whales than they did with horses! [Note: A retroposon named SINE-CHR-1, which is 120 nucleobases long, has been found in the same location within an intron inside the same gene in cetaceans (such as whales, dolphins, and porpoises) and even-toed ungulates (such as cows, deer, and sheep). Odd-toed ungulates, such as horses, do not share this retroposon. For further discussion of this example, see Darrel Falk’s book Coming to Peace with Science: Bridging the Worlds Between Faith and Biology (Downers Grove: InterVarsity Press, 2004) 189-192.] Based on this evidence and other similar genetic indicators (along with some evidence from fossils, which we’ll discuss in the next section), evolutionary biologists have drawn the surprising conclusion that cows are indeed more closely related to whales than to horses.
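The intuition that two independent insertions are unlikely to land in the same place can be made concrete with a toy calculation. If each candidate site i is chosen with probability p_i, then the chance that two independent insertions hit the same site is the sum of p_i squared over all sites. The Python sketch below computes that sum. The number of sites and the weights are hypothetical, chosen only for illustration; they are not estimates for any real genome, and the function name is my own.

```python
def same_site_probability(site_weights):
    """Probability that two independent insertions land on the same site,
    given a relative insertion weight for each candidate site."""
    total = sum(site_weights)
    return sum((w / total) ** 2 for w in site_weights)


# Hypothetical scenario 1: one million equally likely candidate sites.
uniform = [1.0] * 1_000_000

# Hypothetical scenario 2: the same number of sites, but ten "hotspots"
# are each 50,000 times more likely to receive an insertion.
hotspots = [50_000.0] * 10 + [1.0] * 999_990

print(same_site_probability(uniform))   # about one in a million
print(same_site_probability(hotspots))  # about 0.011
```

Under the uniform assumption a coincidental match is extremely improbable, which is the reasoning behind treating shared insertions as evidence of ancestry. The second scenario shows how sensitive that conclusion is to the assumption: if insertions strongly favor a few sites, coincidental matches become far more likely.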
Critics have responded to the above lines of evidence in a variety of ways. In rebuttal to the first point, many have noted that similar DNA doesn’t necessarily imply common descent. It may instead point to a common designer. When we find cars with similar engines and body styles, we don’t conclude that the cars descended from each other; we conclude that they were manufactured according to a similar blueprint, perhaps designed by the same team of engineers. Analogously, someone who believes that God designed all varieties of life need not be surprised to learn that He used a similar design plan for many of His creations. The degree of genetic similarity is still less surprising when we consider that DNA contains the instructions for producing many proteins and other cell parts that are common to all living organisms, regardless of how closely related those organisms may be. According to an analysis by the National Human Genome Research Institute, about 60% of human genes have a recognizable counterpart in banana plants, for instance; but no one thinks we’re closely related to bananas. [Note: The corresponding banana genes are, on average, only 40% similar to ours in their precise nucleobase sequences, however. See this article for further explanation.] The fact that similar organisms share many of the same genes is to be expected, whether they descended from a common ancestor or not.
On the other hand, the “common designer” rebuttal cannot easily explain why different species often share identical pseudogenes, ERVs, and retroposons. Nevertheless, there may be other ways of accounting for those similarities too. Many pseudogenes have turned out to serve important functions, including crucial roles in gene-regulatory networks. Ironically, the pseudogene Miller cited in his court testimony (see above) is one such example, as later studies revealed. [Note: The regulatory role of the pseudogene ψβ (HBBP1) is examined in Moleirinho et al., “Evolutionary Constraints in the β-Globin Cluster: The Signature of Purifying Selection at the δ-Globin (HBD) Locus and Its Role in Developmental Gene Regulation,” Genome Biology and Evolution 5 (2013), 559-571; also in Ma et al., “Genome-wide analysis of pseudogenes reveals HBBP1’s human-specific essentiality in erythropoiesis and implication in β-thalassemia,” Developmental Cell 56:4 (2021), 478-493.] New functions for the non-protein-coding segments of DNA are still being discovered, so seemingly non-functional pseudogenes may have been created for a purpose after all. [Note: See this article for further discussion.] Moreover, bacteria and other microbes sometimes take segments of DNA from one organism and transplant them into another. This process, called horizontal gene transfer, might account for some cases in which unrelated organisms share a few identical genes or pseudogenes. According to a recent study, at least 145 genes in the human genome appear to have come from horizontal gene transfer, and there may be others as well. [Note: Crisp, A., Boschetti, C., Perry, M. et al., “Expression of multiple horizontally acquired genes is a hallmark of both vertebrate and invertebrate genomes,” Genome Biology 16, 50 (2015). See this article for an accessible explanation.]
Alternative explanations also have been proposed for the occurrence of identical ERVs and retroposons in different species. For example, some retroviruses seem to target specific locations in a host’s genome, and this might explain why multiple species share an ERV in the same location even if they didn’t inherit it from a common ancestor. In fact, numerous cases have been identified in which identical insertions appear to have occurred independently. [Note: See this article by biologist Jonathan McLatchie for further discussion and numerous citations of mainstream biological literature acknowledging such cases.] As some evolutionary geneticists have acknowledged, such cases show that shared intron sequences don’t always indicate common ancestry. [Note: For example, in this Science Daily article, evolutionary biologist Michael Lynch is quoted as saying: “Remarkably, we have found many cases of parallel intron gains at essentially the same sites in independent genotypes. This strongly argues against the common assumption that when two species share introns at the same site, it is always due to inheritance from a common ancestor.”]
Another objection, frequently raised against comparative genomics as a whole, alleges that the discipline engages in circular reasoning by presupposing universal common ancestry in its methodology: comparisons usually proceed on the assumption that two species shared a common ancestor at some time in the past, and the aim of the investigation is merely to determine when and how the divergence occurred. In my opinion, this allegation doesn’t undercut the three lines of evidence listed above, but a legitimate concern may be raised as follows. Comparative genomics looks for genetic similarities to determine which organisms are closely related to each other. If these inferred relations are then cited to support the foundational assumptions of comparative genomics, there is a danger of circular reasoning here. For example, if geneticists conclude that humans and chimps are closely related because our DNA is similar, and then cite this similarity as evidence that closely-related species have similar DNA, they would be reasoning in a circle. Presumably no individual scientist would reason in such a patently fallacious way, but errors of circular reasoning might occur when scientists working in different fields of study rely on each other’s conclusions without examining the evidence that led to those conclusions. Mistakes like this might give the impression that comparative genomics provides stronger evidence for universal common descent than it really does. One may hope that other scientists would soon discover and correct these errors, but opponents of the theory of evolution often lack such faith in the scientific community.
Perhaps the biggest problem with the genetic evidence for evolution, however, has arisen from within the field of comparative genomics itself. A central goal of comparative genomics, for evolutionary science, is to construct a phylogenetic tree, also called a phylogeny—a model representing the evolutionary relationships between species, typically shown as a branching diagram, like this:
The problem is that comparisons focusing on different genes or DNA segments often yield conflicting phylogenies. For example, some portions of the human genome seem to indicate that we are more closely related to orangutans than to chimpanzees, contradicting the generally-accepted phylogeny in which orangutans are our most distant relatives among the great apes. [Note: Hobolth et al. (2011), “Incomplete lineage sorting patterns among human, chimpanzee, and orangutan suggest recent orangutan speciation and widespread selection,” Genome Research 21(3), 349–356. For other examples and further discussion of this problem, see here and here.]
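A small sketch in Python shows how two DNA segments can point to different “closest relatives.” The species labels X, Y, and Z, the sequences, and the function names are all invented for illustration and do not represent real organisms or real data. The comparison simply counts differing positions in pre-aligned sequences, which is much cruder than the statistical methods actually used to build phylogenies.

```python
from itertools import combinations


def differences(a: str, b: str) -> int:
    """Count positions at which two equal-length, pre-aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def closest_pair(locus: dict) -> tuple:
    """Return the pair of species whose sequences differ least at this locus."""
    pairs = combinations(sorted(locus), 2)
    return min(pairs, key=lambda p: differences(locus[p[0]], locus[p[1]]))


# Hypothetical aligned sequences for three species at two different loci.
locus_1 = {"X": "ACGTACGTAC", "Y": "ACGTACGTAT", "Z": "ACGAACCTGT"}
locus_2 = {"X": "GGCATTCAGA", "Y": "GGTATACCGA", "Z": "GGCATTCAGG"}

print(closest_pair(locus_1))  # ('X', 'Y')
print(closest_pair(locus_2))  # ('X', 'Z')
```

At the first locus, X groups with Y; at the second, X groups with Z. A tree built from either locus alone would therefore disagree with a tree built from the other, which is the kind of conflict described above.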
Standard approaches to comparative genomics look for ways to resolve these conflicts within the framework of mainstream evolutionary theory. However, advocates of Intelligent Design have argued that conflicting phylogenies point to a different explanation for the origin of genetic information. For example, Winston Ewert argues that genetic similarities fit the pattern of a “dependency graph” used by software engineers much better than they accord with the “branching tree” structure predicted by standard evolutionary theory. [Note: Winston Ewert, “The Dependency Graph of Life,” BIO-Complexity, Vol 2018. For an accessible explanation of Ewert’s argument, see this article.] In other words, according to Ewert, the evidence from comparative genomics fits the hypothesis that life was engineered better than it fits standard evolutionary theory. Although some Intelligent Design theorists do accept the doctrine of universal common descent, they argue that genetic evidence reveals an intelligently guided process of evolution. Perhaps genetic information was pre-programmed into the finely tuned laws and initial conditions of the universe, or perhaps natural evolutionary mechanisms were influenced and augmented by supernatural causes. We’ll examine some of these ideas in Chapter 11.