College of Liberal Arts & Sciences


The Yakut samples (n = 371) were characterized for several variable sites identified in the coding region of mitochondrial DNA (mtDNA). mtDNA is a circular genome that is inherited through the maternal line, so it can provide important insights into the maternal ancestry of populations.

The majority of the Yakut samples belong to two mtDNA lineages: haplogroup C (41.2%) and haplogroup D (28.6%). This haplogroup pattern is similar to that of the neighboring Evenks and of south Siberian populations.

Haplogroup frequencies of Asian populations

MJ Network

A portion of the mtDNA, known as the control region, was sequenced for a subset of the Yakut samples (n = 144) to obtain more detailed information on the relationship of Yakut mtDNA to that of other populations. The sequencing revealed 53 different types of mtDNA (called haplotypes) and a high level of sequence diversity. The phylogenetic tree for the haplotypes (the median-joining, or MJ, network) has a fragmented structure and is characterized by several isolated high-frequency nodes, such as subhaplogroups C4a and D5a. This mtDNA structure can be interpreted as the genetic consequence of a founder event associated with the initial Yakut expansion into northeastern Siberia, in which a limited number of lineages played an important role.

Median-joining network
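
As a rough illustration of how haplotype counts translate into a diversity estimate, the sketch below computes Nei's unbiased haplotype diversity, H = n/(n-1) * (1 - sum of squared haplotype frequencies). This is one common summary statistic and is not necessarily the diversity measure used in the study; the function name and the four-sequence toy sample are hypothetical and are not the Yakut data.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2)).

    `haplotypes` is a list with one haplotype label (or sequence string)
    per sampled individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical toy sample (NOT the Yakut data): 4 individuals, 3 haplotypes.
toy_sample = ["h1", "h1", "h2", "h3"]
toy_counts = Counter(toy_sample)          # number of copies of each haplotype
toy_h = haplotype_diversity(toy_sample)   # probability-like value in [0, 1]
print(len(toy_counts), "haplotypes; diversity =", round(toy_h, 3))
```

Values near 1 indicate that two randomly drawn sequences are almost always different, while a sample dominated by a few high-frequency haplotypes yields a lower value.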

MDS Plot

Multidimensional scaling (MDS) analysis indicates that the Yakuts are genetically closest to the Tuva, a Turkic-speaking population from south Siberia. In terms of geographic distribution, subhaplogroup C4a is most common among south Siberians, whereas subhaplogroup D5a is found mainly in East Asian groups such as the Mongols and Han Chinese.

MDS plot of genetic distances
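
To make the MDS step more concrete, here is a minimal sketch of classical (Torgerson) MDS applied to a distance matrix, assuming NumPy is available. The study does not state which MDS variant or software was used, so classical MDS is an assumption made for illustration; the function name and the 4 x 4 toy matrix are hypothetical and do not represent the published genetic distances.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS: embed a distance matrix in low dimensions."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_components]
    vals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(vals)


# Hypothetical toy distances among four populations (NOT real data).
toy_dist = np.array([
    [0.0, 3.0, 4.0, 5.0],
    [3.0, 0.0, 5.0, 4.0],
    [4.0, 5.0, 0.0, 3.0],
    [5.0, 4.0, 3.0, 0.0],
])
coords = classical_mds(toy_dist)  # one (x, y) row per population
print(coords.shape)
```

Populations that plot close together in the resulting two-dimensional map have small pairwise distances in the input matrix, which is how proximity between the Yakuts and the Tuva would be read from such a plot.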


The Yakut distribution for pairwise differences between sequences (known as a mismatch distribution) is unimodal, a feature that is considered to be the hallmark of an expanding population. However, the expansion event was dated to approximately 42,000 years ago using Rogers and Harpending’s (1992) mismatch model, and thus the unimodality likely reflects Paleolithic demography associated with the early peopling of Asia and Siberia. Yakut subhaplogroups C4a and D5a, identified as potential founder lineages based on their prevalence among Asian groups to the south and ancient Yakut specimens, were dated to about 2,300 and 450 years ago, respectively.

Mismatch distribution
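
The mismatch distribution described above can be sketched as follows: count the number of differing sites for every pair of aligned sequences, then tabulate the proportion of pairs at each count. The helper for converting the expansion parameter tau into generations uses the relation tau = 2ut from the Rogers and Harpending model, where u is the mutation rate for the whole sequence per generation. All function names and the toy sequences are hypothetical, and estimating tau itself (model fitting) is not shown.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mismatch_distribution(seqs):
    """Proportion of sequence pairs showing 0, 1, 2, ... differences."""
    counts = Counter(pairwise_differences(a, b)
                     for a, b in combinations(seqs, 2))
    total = sum(counts.values())
    return [counts.get(k, 0) / total for k in range(max(counts) + 1)]


def expansion_time_generations(tau, u):
    """Solve tau = 2*u*t for t (in generations)."""
    return tau / (2.0 * u)


# Hypothetical toy alignment (NOT the Yakut control-region data).
toy_seqs = ["ACGTACGT", "ACGTACGA", "ACGAACGA", "TCGAACGA"]
toy_mismatch = mismatch_distribution(toy_seqs)
print(toy_mismatch)  # index k holds the share of pairs with k differences
```

A single smooth peak in this list corresponds to the unimodal shape discussed above, whereas a ragged, multi-peaked shape is usually read as a sign of long-term stable population size.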


Overall, the results from this study indicate that close phylogenetic relationships between human populations can be accurately identified by characterizing the geographic distribution of shared mtDNA haplotypes and by employing multivariate techniques such as MDS projections of genetic distance matrices. In this instance, the Yakuts exhibit strong genetic ties to Turkic- and Mongolic-speaking groups from south Siberia and East Asia, which is generally consistent with the Yakuts’ southern origins. Reconstructing historical demography proved to be more problematic. Neutrality test statistics, such as Tajima’s D and Fu’s FS, and the modality of mismatch distributions appear to be significantly influenced by regional gene flow and thus can confound attempts to identify population growth or decline. The MJ network and the haplotype frequency distribution (not shown), on the other hand, retain features of the Yakuts’ recent founder event, and the coalescent dates calculated for individual lineages, namely C4a and D5a, are more congruent with the timing of the Yakut expansion into northeastern Siberia.
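
For readers unfamiliar with the neutrality statistics mentioned above, the following is a minimal sketch of Tajima's D computed from an aligned sample, using the standard constants from Tajima (1989). It assumes an infinite-sites reading of the data (every variable column counts as one segregating site) and ignores gaps and missing data; the function name and the toy alignments are hypothetical and are not the study's data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(seqs)
    if n < 4 or any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("need at least 4 aligned sequences of equal length")
    # S: number of segregating (variable) sites.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignments (NOT real data).
star_like = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]  # only singleton variants
d_star = tajimas_d(star_like)
print(round(d_star, 3))
```

An excess of rare variants, as in the star-like toy sample, pushes D below zero, which is the pattern usually associated with population growth. As the paragraph above notes, gene flow can produce similar signals, so the statistic alone does not distinguish between these explanations.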

Future work is planned to study the effects of a spatial expansion on the molecular diversity of central and peripheral demes within an expanding population.