Genetic Variation Versus Recombination Rate in a Structured Population of Mice
     Division of Population Genetics, National Institute of Genetics, Mishima, Japan

    E-mail: nsaitou@genes.nig.ac.jp.

    Abstract

    The correlation between genetic variation and recombination rate was investigated in a structured mouse population. Nucleotide sequence data from 19 autosomal DNA loci from eight inbred strains of mouse (Mus musculus) sampled from three major subspecies were analyzed. The recombination rate was estimated from the comparison of genetic and physical map distances between markers flanking a 10-cM region of each locus. The strains were categorized into four groups (subpopulations) based on geography. By partitioning the genetic diversity into within-group and among-group variation, we detected a positive correlation between the recombination rate and nucleotide diversity within groups. The level of nucleotide differentiation among groups (GST) showed a negative correlation with the rate of recombination. There was no significant correlation between recombination rate and nucleotide diversity when data from different subpopulations were pooled. No correlation was detected between recombination rate and nucleotide divergence of M. musculus and M. spicilegus. These patterns deviate from the strict neutral expectation under the constant nucleotide substitution rate, and they are likely to have been formed either by a hitchhiking effect of positively selected mutants or by background selection of deleterious mutants occurring in a subdivided population. Our series of comparisons show that because a real population always has some structure, incorporation of its information is important in detecting non-neutral evolution.

    Key Words: Mus musculus · recombination rate · population structure · population subdivision · genetic hitchhiking · background selection · FST

    Introduction

    Detecting the effect of natural selection in a structured population has thus far been a challenge. A number of methods have been proposed for detection of natural selection in a "panmictic population." One of the most commonly used approaches is to investigate whether the frequency spectrum of silent polymorphic sites deviates from the neutral expectation (Tajima 1989; Fu and Li 1993). An excess of high-frequency derived variants may be the strongest evidence for positive selection (Fay and Wu 2000). However, this approach may be problematic; a real population usually has some structure, and unequal sampling from a structured population can result in a spuriously significant Fay and Wu test (Przeworski 2002). Thus, ignoring the population structure is likely to obscure any pattern of selection, and this approach requires caution when the neutral model is being tested (e.g., Wall 1999).

    The effect of natural selection in a panmictic population can also be inferred from the presence of a positive correlation between recombination rate and nucleotide diversity. This correlation has been observed in Drosophila (Begun and Aquadro 1992; Aquadro, Begun, and Kindahl 1994; Andolfatto and Przeworski 2001), human (Nachman et al. 1998; Przeworski, Hudson, and Di Rienzo 2000; Nachman 2001; Lercher and Hurst 2002), and mouse (Nachman 1997). The pattern has been explained mainly by the genetic hitchhiking of rapidly fixed advantageous mutations (Maynard Smith and Haigh 1974; Kaplan, Hudson, and Langley 1989) and/or by background selection against deleterious mutations (Charlesworth, Morgan, and Charlesworth 1993; Charlesworth, Charlesworth, and Morgan 1995; Hudson and Kaplan 1995). The hitchhiking hypothesis has been supported well by empirical data in Drosophila (e.g., Aquadro, Begun, and Kindahl 1994; Langley et al. 2000; Andolfatto and Przeworski 2001). However, in a pooled sample of a subdivided population, the correlation is likely to become weak if the pattern of natural selection varies among subpopulations. In this case, population structure has to be taken into account. For example, a subdivided population model of genetic hitchhiking was proposed by Slatkin and Wiehe (1998). They showed that under some conditions, hitchhiking can lead to substantial population differentiation, as measured by Wright's FST. They also suggested that if subpopulations were completely isolated, greater differentiation would be found in regions of a genome with a lower rate of recombination. Thus, in their model, genetic hitchhiking is likely to form a negative correlation between recombination rate and population divergence.

    There are a few observations in Drosophila supporting a negative correlation between FST and the rate of recombination. For example, Stephan and Mitchell (1992) found reduced variation within populations and increased divergence between populations of Drosophila ananassae in India and Burma in regions with reduced recombination on the X chromosome. Begun and Aquadro (1993) found elevated FST in three of seven genomic regions with a reduced recombination rate in populations of Drosophila melanogaster in Zimbabwe and other localities. Both of these studies invoked genetic hitchhiking events as the cause (see also Stephan 1994). Nevertheless, it should be noted that the background selection model also predicts increased FST values in regions of reduced recombination (Charlesworth, Nordborg, and Charlesworth 1997), and it is indistinguishable from the hitchhiking model in the absence of a rigorous quantitative study (e.g., Stephan et al. 1998).

    In this study, we have analyzed the sequences of 21 nuclear genes in nine inbred mouse strains from three major subspecies of Mus musculus (Liu, Takahashi, Kitano, Koide, Shiroishi, Moriwaki, and Saitou, unpublished data). These sequence data showed overall clustering of the strains within subspecies, with traces of genetic exchange between them, suggesting a large ancient population size and fairly vague subspecies-level divergence in this species. Hence, these samples represent a structured population. To define closely related groups as units of a subpopulation in this species, we categorized the strains into four geographically related groups. Here, we investigate (1) whether the phenomenon of increased population divergence in regions of reduced recombination exists in mouse and (2) whether the effect of natural selection on linked neutral variation can explain the relationships between genetic variation and rate of recombination in this structured population.

    Materials and Methods

    Mouse Strains and Nucleotide Sequence Data

    DNA sequences of 19 autosomal genes from eight inbred strains of Mus musculus and one strain of Mus spicilegus were analyzed. The mouse strains used in this study are listed in table 1. All the sequence data used in this study are available from the DDBJ/EMBL/GenBank Database. The accession numbers of these sequences are AB039045 through AB039223. The strains of Mus musculus were categorized into four closely related groups based on subpopulations found in this species (table 1). We categorized Pgn2 and BFM/2 as the M. m. domesticus group, BLG2 and NJL as the M. m. musculus group, CAST/Ei and HMI as the M. m. castaneus group, and MSM and SWN as the M. m. molossinus group. ZBN is an inbred strain of M. spicilegus used as an outgroup. All of these inbred strains are derived from wild-caught mice and have been maintained at the National Institute of Genetics (details described in Koide et al. 2000). Two genes on the sex chromosomes, which are also in the database, were excluded from the analyses because they are assumed to have smaller effective population sizes (Ne) than the rest of the loci.

    Table 1 Mouse Strains Used in This Study.

    Recombination Rate Estimates

    The rates of recombination across the mouse genome were estimated by comparing the genetic map and the high-density radiation hybrid (RH) map (physical map; Van Etten et al. 1999). The map distance data were taken from Whitehead Institute/MIT Center for Genome Research, Mouse EST RH Mapping Project, Public Data Release 10 (December 2001). The genetic map distance (cM) and the RH map distance (cR = centiRay) between the two markers flanking the ± 5 cM (10 cM) region of the locus were used to calculate the recombination rate (cM/cR). For each of the two loci, Fau and Fut4, whose genetic map positions are both 3.0 cM from the distal end of their chromosomes, a marker at the distal end of the chromosome and another marker flanking the proximal 10 cM region from it were used. The markers and their map positions used to calculate recombination rates are listed in table 2.
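    The calculation described above reduces to a ratio of two map intervals. The following is a minimal sketch of that ratio, not the authors' actual procedure; the function name and the marker positions are hypothetical and are not taken from table 2.

```python
def recombination_rate(cm_left, cm_right, cr_left, cr_right):
    """Local recombination rate in cM/cR between two flanking markers.

    cm_* are genetic map positions (cM); cr_* are radiation hybrid
    map positions (cR) of the same two markers.
    """
    genetic_distance = abs(cm_right - cm_left)
    rh_distance = abs(cr_right - cr_left)
    if rh_distance == 0:
        raise ValueError("flanking markers share the same RH map position")
    return genetic_distance / rh_distance


# Hypothetical markers spanning a 10-cM window around a locus.
rate = recombination_rate(cm_left=20.0, cm_right=30.0,
                          cr_left=400.0, cr_right=650.0)
print(f"{rate:.3f} cM/cR")
```

    For a locus near a chromosome end, such as Fau or Fut4, the same function would apply with the distal-end marker as one of the two flanking positions.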

    Table 2 Recombination Rate Estimates of the 19 Mouse Genes Analyzed in This Study.

    Estimation of Genetic Variation

    The level of genetic variation is estimated based on pairwise sequence differences of synonymous sites and introns. The sequence data of the Fut4 locus has suggested an introgression from other distant taxa in the BLG2 strain (Liu, Takahashi, Kitano, Koide, Shiroishi, Moriwaki, and Saitou, unpublished data); thus this strain is excluded from the analyses of this locus. Genetic variation within subpopulations (π within subpopulation) was calculated as the average number of pairwise nucleotide differences per site (nucleotide diversity; Nei 1987) between the sequences of the four (three for Fut4) pairs of strains from the same subpopulations (pairs of Pgn2-BFM/2 for subpopulation domesticus; BLG2-NJL, for musculus; CAST/Ei-HMI, for castaneus; and MSM-SWN, for molossinus). Genetic variation between subpopulations (d) is defined as the average number of pairwise nucleotide differences per site between the rest of the 24 (18 for Fut4) combinations of strains from different subpopulations. The relative level of population divergence was calculated as

    $$G_{ST} = \frac{\pi_{\text{total}} - \pi_{\text{within subpopulation}}}{\pi_{\text{total}}}$$

    following Nei (1973). π total is the average number of pairwise nucleotide differences per site calculated from all 28 (21 for Fut4) combinations of the strains. Because π total of Hoxa2 was 0, GST of this locus was not calculated.
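    The partition of pairwise differences into within-group, between-group, and total components can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: it assumes GST takes the usual form of Nei's (1973) statistic, (total diversity minus within-group diversity) divided by total diversity, it ignores alignment gaps and the restriction to silent sites, and the sequences are invented toy data attached to the strain names of table 1.

```python
from itertools import combinations


def pairwise_diff_per_site(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def partition_diversity(seqs, groups):
    """Split all pairwise comparisons into within- and between-group sets.

    seqs maps strain name -> aligned sequence; groups maps strain name ->
    subpopulation label. Returns the within-group diversity, the
    between-group difference d, the total diversity, and GST
    (None when the total diversity is 0).
    """
    within, between = [], []
    for s1, s2 in combinations(sorted(seqs), 2):
        diff = pairwise_diff_per_site(seqs[s1], seqs[s2])
        (within if groups[s1] == groups[s2] else between).append(diff)
    pi_within = sum(within) / len(within)
    d_between = sum(between) / len(between)
    pi_total = sum(within + between) / (len(within) + len(between))
    g_st = (pi_total - pi_within) / pi_total if pi_total > 0 else None
    return {"n_within": len(within), "n_between": len(between),
            "pi_within": pi_within, "d": d_between,
            "pi_total": pi_total, "g_st": g_st}


# Invented 10-site sequences; only the strain and group names are from table 1.
seqs = {"Pgn2": "AAAAAAAAAA", "BFM/2": "AAAAAAAAAT",
        "BLG2": "CAAAAAAAAA", "NJL": "CAAAAAAAAA",
        "CAST/Ei": "ACAAAAAAAA", "HMI": "ACAAAAAAAA",
        "MSM": "CCAAAAAAAA", "SWN": "CCAAAAAAAA"}
groups = {"Pgn2": "domesticus", "BFM/2": "domesticus",
          "BLG2": "musculus", "NJL": "musculus",
          "CAST/Ei": "castaneus", "HMI": "castaneus",
          "MSM": "molossinus", "SWN": "molossinus"}

result = partition_diversity(seqs, groups)
print(result)
```

    With eight strains in four groups of two, the loop produces 4 within-group and 24 between-group comparisons, 28 in total, matching the counts given in the text. Returning None for GST when the total diversity is 0 mirrors the treatment of Hoxa2.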

    Results and Discussion

    The mouse strains used in our study are wild-caught inbred lines from different subspecies of Mus musculus. This species is normally classified into four authentic subspecies, M. m. domesticus, M. m. musculus, M. m. castaneus, and M. m. bactrianus (Bonhomme and Guenet 1989; Sage, Atchley, and Ernesto 1993). We sampled two strains each from the first three subspecies and two strains from M. m. molossinus, which is a local population (or a subspecies) known to have originated from hybrids of M. m. musculus and M. m. castaneus (Yonekawa et al. 1988; table 1). Our definition of these four groups as subpopulations reflects the structure of the whole species; however, migration rates cannot easily be determined. For example, the discovery of a narrow hybrid zone between M. m. domesticus and M. m. musculus suggests the existence of a reproductive barrier (reviewed in Sage, Atchley, and Ernesto 1993), although there is evidence of past hybridization between M. m. musculus and M. m. castaneus (Yonekawa et al. 1988). Nevertheless, a subdivided population model with a low migration rate probably fits the data best, and thus provides a good opportunity to study the effects of population subdivision.

    The estimated local recombination rate (cM/cR) of each locus is shown in table 2. There is another independent estimate of rates of recombination by Nachman and Churchill (1996) calculated from the density of markers on the genetic map. They are converted approximately to an equivalent scale of cM/cR (1 cR = 100 kb; Van Etten et al. 1999), and are also listed in table 2. These two estimates are highly correlated (r = 0.72, P < 0.001, Spearman's P < 0.001), but because our estimate in cM/cR uses more recent information on the mouse genome, we decided to use it for the following analyses.
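    The rescaling mentioned above follows directly from the stated equivalence of 1 cR to about 100 kb. The sketch below assumes that the independent estimates are expressed in cM/Mb, which the text does not state explicitly; the function name and the input value are hypothetical.

```python
KB_PER_CR = 100.0    # approximate size of 1 cR (Van Etten et al. 1999)
KB_PER_MB = 1000.0


def cm_per_mb_to_cm_per_cr(rate_cm_per_mb):
    """Rescale a recombination rate from cM/Mb to cM/cR."""
    return rate_cm_per_mb * KB_PER_CR / KB_PER_MB


# Hypothetical input rate, for illustration only.
converted = cm_per_mb_to_cm_per_cr(0.5)
print(f"{converted:.3f} cM/cR")
```

    Because the conversion is a constant rescaling, it changes neither the rank order of loci nor the correlation between the two sets of estimates.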

    The number of silent sites used for the analyses, nucleotide diversities, GST, and divergence of each locus are listed in table 3. The nucleotide diversity of the pooled sample (π total) of each gene is plotted against the recombination rate of its region in fig. 1A. There was no correlation detected between these two variables (fig. 1A; r = 0.06, P = 0.83; Spearman's P = 0.47). A possible reason for the lack of correlation is that the regional variation in recombination rate for mouse seems to be much lower than that for Drosophila or for human (data from Nachman and Churchill 1996; Payseur and Nachman 2000). Alternatively, the population structure of mice could have prevented advantageous alleles from spreading throughout the range of all subdivided populations, assuming genetic hitchhiking as the primary cause of the correlation. In this study, we focused on investigating the latter possibility. We first calculated π within subpopulations and d, and plotted them against the recombination rate (fig. 1B and 1C, respectively). A positive correlation was detected between recombination rate and π within subpopulations (fig. 1B; r = 0.46, P = 0.045; Spearman's P = 0.019), but not between recombination rate and d (fig. 1C; r = 0.01, P = 0.97; Spearman's P = 0.54). We then examined the correlation between the recombination rate and the level of nucleotide differentiation among subpopulations (GST). There was a significant negative correlation, as shown in figure 1D (r = –0.64, P = 0.0034; Spearman's P = 0.0061). This was the clearest pattern in terms of the significance level in our analyses. Hitchhiking or background selection in a subdivided population is expected to increase the genetic differentiation between subpopulations in regions of low recombination (Charlesworth, Nordborg, and Charlesworth 1997; Slatkin and Wiehe 1998). Hence, our analyses suggest that these forces are acting in this structured population.
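    Each comparison above pairs a per-locus recombination rate with a per-locus diversity statistic and reports both a product-moment coefficient (r) and a rank-based Spearman test. The following standard-library sketch shows how the two coefficients are obtained; it does not compute P values, the function names are illustrative, and the data are invented rather than taken from tables 2 and 3.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Invented values: recombination rate (cM/cR) and GST for five loci.
rec_rate = [0.01, 0.02, 0.03, 0.05, 0.08]
g_st = [0.90, 0.85, 0.60, 0.70, 0.30]

print(pearson_r(rec_rate, g_st), spearman_rho(rec_rate, g_st))
```

    Reporting a rank-based coefficient alongside r is a reasonable safeguard with only 18 or 19 loci, since a single extreme locus can dominate the product-moment coefficient.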

    Table 3 Nucleotide Diversity of the 19 Loci Analyzed in This Study.

    FIG. 1. Scatterplots of recombination rate versus nucleotide variation calculated from the sequences of eight inbred strains of Mus musculus (see table 1) and one M. spicilegus strain for each of the 19 autosomal loci (see table 2). The sequence data of the Fut4 locus suggested an introgression from other distant taxa in BLG2 strain (Liu, Takahashi, Kitano, Koide, Shiroishi, Moriwaki, and Saitou, unpublished data); thus this strain was excluded from the analyses of this locus. A. Recombination rate versus π total, which is the average number of pairwise nucleotide differences per site among all the 28 (21 for Fut4) combinations of strains. B. Recombination rate versus π within subpopulations, which is the average number of pairwise nucleotide differences per site between the sequences of the four (three for Fut4) pairs of strains within the same subpopulation. C. Recombination rate versus d, which is the average number of pairwise nucleotide differences per site between the sequences from the 24 (18 for Fut4) combinations of strains from different subpopulations. D. Recombination rate versus level of nucleotide differentiation among subpopulations (GST; Nei 1973). Because π total of Hoxa2 was 0, this locus was excluded from the calculation of GST. E. Recombination rate versus nucleotide divergence between M. musculus and M. spicilegus. The numbers beside the plots indicate gene symbols as follows: 1: B3galt1, 2: B3galt2, 3: B3galt3, 4: B3galt4, 5: Camp, 6: Cd14, 7: Ctgf, 8: Dfy, 9: Fau, 10: Fut1, 11: Fut2, 12: Fut4, 13: Hoxa2, 14: Nppb, 15: Psmb10, 16: Sec1-pending, 17: Sox15, 18: Tnf, 19: Wnt1.

    Recently, it has been suggested that an additional factor unrelated to natural selection contributes to the correlation between recombination rate and nucleotide variation. Lercher and Hurst (2002) claimed that the correlation observed in the human genome was due, at least in part, to a higher mutation rate in regions of high recombination because it holds for single-nucleotide polymorphisms (SNPs) across the entire human genome, the great majority of which are not near exons or control elements. If this increased mutation rate is a major factor, a positive correlation is expected between recombination rate and divergence among closely related species. Hellman et al. (2003) showed a positive correlation between recombination rate and human-chimp as well as human-baboon divergence. Their findings support the neutral explanation for the phenomenon in humans. In contrast, our data showed no correlation between recombination rate and divergence between M. musculus and M. spicilegus (fig. 1E; r = 0.11, P = 0.65; Spearman's P = 0.38), which diverged within several million years (She et al. 1990; Moriwaki, Shiroishi, and Yonekawa 1994). Excluding the outlier (Fut2; 11) in figure 1E, which has an extremely high level of divergence, did not reveal any significant pattern either (r = 0.32, P = 0.20; Spearman's P = 0.28). Genetic variation between subpopulations (d), which must have diverged much later than M. musculus and M. spicilegus did, also had no correlation with the recombination rate (fig. 1C). Thus, our data indicate that any determinants of mutation and/or substitution rates in mice do not have a comparable effect on diversity and divergence.

    However, we should be cautious about interpreting our results. First of all, the recombination rate estimates used in our analyses are only from M. m. domesticus. There is not yet sufficient information on the map distances of other subspecies or closely related species to know whether they are consistent. Second, the sequence data we used are located in close proximity to functional genes, and the whole non-functional region of the genome has not been analyzed. Finally, the sample size in terms of base pairs and number of individuals is not comparable in scale to the study of Hellman et al. (2003), which found a weak correlation between divergence and recombination that can account for the relationship between nucleotide diversity and recombination in human. In contrast, even a large data set of 255 Drosophila melanogaster and D. simulans loci revealed no detectable relationship between divergence and recombination rate (Betancourt and Presgraves 2002). A larger-scale analysis is needed to determine which of these cases applies to the mouse data. However, none of the above concerns would weaken the evidence for the correlations actually detected in this study; at most, they might veil a weak existing relationship. In conclusion, our analyses of a structured population of mice showed that the effect of genetic hitchhiking or background selection may still play a dominant role in shaping the positive correlation between recombination rate and genetic variation in many genomic regions of this species.

    Acknowledgements

    We thank Michael Nachman for providing us with the exact numerals of recombination rates for all the gene regions. We also thank Toshiyuki Takano-Shimizu, Jeff Wall, and Hiroshi Akashi for valuable discussions. We are grateful to Justin Fay for his suggestions to improve the manuscript. This study was supported by grants-in-aid for scientific studies from the Ministry of Education, Science, Sports, and Culture, Japan, to N.S., and a Japan Society for the Promotion of Science Research Fellowship for Young Scientists to A.T.

    Literature Cited

    Andolfatto, P., and M. Przeworski. 2001. Regions of lower crossing over harbor more rare variants in African populations of Drosophila melanogaster. Genetics 158:657-665.

    Aquadro, C. F., D. J. Begun, and E. C. Kindahl. 1994. Selection, recombination, and DNA polymorphism in Drosophila. Pp. 46–56 in B. Golding, ed. Non-neutral evolution: theories and molecular data. Chapman and Hall, New York.

    Begun, D. J., and C. F. Aquadro. 1992. Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356:519-520.

    Begun, D. J., and C. F. Aquadro. 1993. African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365:548-550.

    Betancourt, A. J., and D. C. Presgraves. 2002. Linkage limits the power of natural selection in Drosophila. Proc. Natl. Acad. Sci. USA 99:13616-13620.

    Bonhomme, F., and J.-L. Guenet. 1989. The wild house mouse and its relatives. Pp. 649–662 in M. F. Lyon and A. G. Searle, eds. Genetic variants and strains of the laboratory mouse. Gustav Fischer Verlag, Stuttgart, Germany.

    Charlesworth, B., M. T. Morgan, and D. Charlesworth. 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303.

    Charlesworth, B., M. Nordborg, and D. Charlesworth. 1997. The effects of local selection, balanced polymorphism and background selection on equilibrium patterns of genetic diversity in subdivided populations. Genet. Res. 70:155-174.

    Charlesworth, D., B. Charlesworth, and M. T. Morgan. 1995. The pattern of neutral molecular variation under the background selection model. Genetics 141:1619-1632.

    Fay, J., and C.-I. Wu. 2000. Hitchhiking under positive Darwinian selection. Genetics 155:1405-1413.

    Fu, Y.-X., and W.-H. Li. 1993. Statistical tests of neutrality of mutations. Genetics 133:693-709.

    Hellman, I., I. Erbersberger, S. E. Ptak, S. Paabo, and M. Przeworski. 2003. A neutral explanation for the correlation of diversity with recombination rates in humans. Am. J. Hum. Genet. 72:1527-1535.

    Hudson, R. R., and N. L. Kaplan. 1995. The coalescent process and background selection. Philos. Trans. R. Soc. Lond. Ser. B Biol. Sci. 349:19-23.

    Kaplan, N. L., R. R. Hudson, and C. H. Langley. 1989. The "hitchhiking effect" revisited. Genetics 123:887-899.

    Koide, T., K. Moriwaki, K. Ikeda, H. Niki, and T. Shiroishi. 2000. Multi-phenotype behavioral characterization of inbred strains derived from wild stocks of Mus musculus. Mamm. Genome 11:664-670.

    Langley, C. H., B. P. Lazzaro, W. Phillips, E. Heikkinen, and J. M. Braverman. 2000. Linkage disequilibria and the site frequency spectra in the su (s) and su (wa) regions of the Drosophila melanogaster X chromosome. Genetics 156:1837-1852.

    Lercher, M. J., and L. D. Hurst. 2002. Human SNP variability and mutation rate are higher in regions of high recombination. Trends Genet. 18:337-340.

    Maynard Smith, J., and J. Haigh. 1974. The hitch-hiking effect of a favorable gene. Genet. Res. 23:23-35.

    Moriwaki, K., T. Shiroishi, and H. Yonekawa. 1994. Genetics in wild mice. Japan Sci. Soc Press, Tokyo/S Karger, Basel.

    Nachman, M. W. 1997. Patterns of DNA variability at X-linked loci in Mus domesticus. Genetics 147:1303-1316.

    Nachman, M. W. 2001. Single nucleotide polymorphisms and recombination rate in humans. Trends Genet. 17:481-485.

    Nachman, M. W., V. L. Bauer, S. L. Crowell, and C. F. Aquadro. 1998. DNA variability and recombination rates at X-linked loci in humans. Genetics 150:1133-1141.

    Nachman, M. W., and G. A. Churchill. 1996. Heterogeneity in rates of recombination across the mouse genome. Genetics 142:537-548.

    Nei, M. 1973. Analysis of gene diversity in subdivided populations. Proc. Natl. Acad. Sci. USA 70:3321-3323.

    Nei, M. 1987. Molecular evolutionary genetics. Columbia University Press, New York.

    Payseur, B. A., and M. W. Nachman. 2000. Microsatellite variation and recombination rate in the human genome. Genetics 156:1285-1298.

    Przeworski, M. 2002. The signature of positive selection at randomly chosen loci. Genetics 160:1179-1189.

    Przeworski, M., R. R. Hudson, and A. Di Rienzo. 2000. Adjusting the focus on human variation. Trends Genet. 16:296-302.

    Sage, R. D., W. R. Atchley, and C. Ernesto. 1993. House mice as models in systematic biology. Syst. Biol. 42:523-561.

    She, J. X., F. Bonhomme, P. Boursot, L. Thaler, and F. Catzeflis. 1990. Molecular phylogenies in the genus Mus: Comparative analysis of electrophoretic, scnDNA hybridization, and mtDNA RFLP data. Biol. J. Linn. Soc. 41:83-103.

    Slatkin, M., and T. Wiehe. 1998. Genetic hitch-hiking in a subdivided population. Genet. Res. 71:155-160.

    Stephan, W. 1994. Effects of genetic recombination and population subdivision on nucleotide sequence variation in Drosophila ananassae. Pp. 57–66 in B. Golding, ed. Non-neutral evolution: theories and molecular data. Chapman and Hall, New York.

    Stephan, W., and S. J. Mitchell. 1992. Reduced levels of DNA polymorphism and fixed between-population differences in the centromeric region of Drosophila ananassae. Genetics 132:1039-1045.

    Stephan, W., L. Xing, D. A. Kirby, and J. M. Braverman. 1998. A test of the background selection hypothesis based on nucleotide data from Drosophila ananassae. Proc. Natl. Acad. Sci. USA 95:5649-5654.

    Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.

    Van Etten, W. J., R. G. Steen, H. Nguyen, A. B. Castle, D. K. Slonim, B. Ge, C. Nusbaum, G. D. Schuler, E. S. Lander, and T. J. Hudson. 1999. Radiation hybrid map of the mouse genome. Nat. Genet. 22:384-387.

    Wall, J. D. 1999. Recombination and the power of statistical tests of neutrality. Genet. Res. 74:65-79.

    Yonekawa, H., K. Moriwaki, O. Gotoh, N. Miyashita, Y. Matsushima, L. M. Shi, W. S. Cho, X. L. Zhen, and Y. Tagashira. 1988. Hybrid origin of Japanese mice "Mus musculus molossinus": evidence from restriction analysis of mitochondrial DNA. Mol. Biol. Evol. 5:63-78.

    (Aya Takahashi, Yu-Hua Liu)