Source: New England Journal of Medicine, 2005, issue 17
Genomic Cartography — Presenting the HapMap
    Many studies are based on the premise that a sample is representative of the larger body from which it was drawn. An article being published in Nature today¹ provides a guide for sampling the human genome in a way that will facilitate the quest for genes that influence susceptibility to disease.

    The article describes a map of haplotypes, colloquially called the HapMap. (A haplotype is a set of closely linked markers on a single chromosome that tend to be inherited as a group.) It is a logical follow-on from the human genome project and fulfills the need for a new approach to ferreting out genes that participate in complex multigenic disorders such as diabetes mellitus. Although molecular genetic techniques have revolutionized our understanding of the genetic causes of single-gene disorders, they have yielded little insight into the causes of genetically complex disorders.

    One of the motivating factors behind the HapMap was the realization that parts of the human genome are "chunky": long stretches of unperturbed DNA are passed from parent to offspring through many generations. These stable segments, which can contain genes, are flanked by "hot spots" of recombination, a process in which homologous regions of chromosomes are exchanged. A genetic variant in a specific segment is likely to be coinherited with another genetic variant in the same segment because, being physically linked, the two are unlikely to be separated by chromosomal recombination. It was reasoned that, if the entire genome is made up of segments of physically linked variants, an accurate map of genomic variation among the members of a population would show the chromosomal positions at which the segments begin and end. Such a map would save researchers time and money in conducting studies of genes associated with disease, because a variant base could serve as a marker for an entire block of linked variant bases. If one of these linked variant bases increases susceptibility to a specific disease, its association with the disease would be detected through the "marker" base.

    An analogy may help to explain this phenomenon. In the mid-1960s, George Harrison, Paul McCartney, John Lennon, and Ringo Starr were often found together. If you looked for Harrison, there was a high likelihood, but not complete certainty, that you would find the other members of the Beatles. In genetics, markers that travel together are said to be in linkage disequilibrium. The idea is that if you can find a DNA variant that often travels with other variants, you can use these links to reduce the complexity of the genetic maps needed to identify the variants that are likely to be associated with disease. The HapMap project focuses on variants known as single-nucleotide polymorphisms, or SNPs (pronounced "snips").
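
    How strongly two markers "travel together" can be put into numbers. The sketch below is not taken from the Nature article; it computes two standard population-genetics measures, the disequilibrium coefficient D and the squared correlation r², for a pair of two-allele markers. The function name `linkage_disequilibrium` and the example counts are hypothetical and chosen only for illustration.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic markers.

    `haplotypes` is a list of 2-character strings, one per sampled
    chromosome: the allele at marker A followed by the allele at marker B.
    """
    n = len(haplotypes)
    alleles_a = sorted({h[0] for h in haplotypes})
    alleles_b = sorted({h[1] for h in haplotypes})
    if len(alleles_a) != 2 or len(alleles_b) != 2:
        raise ValueError("each marker must show exactly two alleles")

    # Pick one reference allele per marker; the choice does not change r^2.
    a, b = alleles_a[0], alleles_b[0]
    p_a = sum(h[0] == a for h in haplotypes) / n    # frequency of allele a
    p_b = sum(h[1] == b for h in haplotypes) / n    # frequency of allele b
    p_ab = sum(h == a + b for h in haplotypes) / n  # frequency of haplotype ab

    # D is the excess of the ab haplotype over what independence predicts.
    d = p_ab - p_a * p_b
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared


# Invented sample of 100 chromosomes: A usually travels with G, C with T.
sample = ["AG"] * 45 + ["AT"] * 5 + ["CG"] * 5 + ["CT"] * 45
d, r2 = linkage_disequilibrium(sample)
print(round(d, 3), round(r2, 3))
```

    In this invented sample D is 0.2 and r² is 0.64. An r² of 1 would mean that one marker predicts the other perfectly, and an r² of 0 would mean that the two are inherited independently.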

    SNPs arise through a mutation that affects a single base. These SNPs accrue very slowly (at a rate of about 10⁻⁸ per base pair per generation) and may affect protein function or regulation, in which case the prevalence of a SNP in a given population may decrease or increase over time, depending on whether it confers a survival advantage. If the variant causes disease, it is generally called a "mutation" rather than a polymorphism.

    A SNP is generated on a stretch of DNA that contains other SNPs; with successive generations, new SNPs may arise in the same segment as the original SNP, creating new combinations of SNPs (SNP haplotypes). Comparison of the haplotypes for a given sequence of DNA yields a "family tree" that reveals not only the relatedness of the different haplotypes, but also the specific bases that can distinguish the different haplotypes most efficiently (see diagram).

    Genetic Variation Disclosed by the HapMap

    A stretch of DNA on chromosome 2, as represented in the genomes of the 120 chromosomes in a sample from Utah, contains 36 single-nucleotide polymorphisms (SNPs) and no evidence of recombination events. The SNPs are represented by colored circles at SNP positions. This genomic region has seven haplotypes (left side of diagram), and five of these haplotypes account for 118 of the chromosomes, demonstrating limited haplotype diversity that reflects shared ancestry. In this sample, therefore, the entire genetic variation of this region can be captured by typing seven SNPs. Groups of SNPs (i.e., SNP haplotypes) that can be captured by a single marker SNP are represented by the same color. The SNPs can be used to derive a genealogic tree (right side of diagram). Reproduced with the permission of the Nature Publishing Group.¹
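
    The idea that a group of SNPs can be captured by a single marker SNP can be illustrated with a toy example. The sketch below is an assumption-laden simplification, not the consortium's procedure: it groups SNPs that split the observed haplotypes in exactly the same way and keeps the first SNP of each group as its marker. The function name `pick_tag_snps`, the 0/1 allele coding, and the four toy haplotypes are all hypothetical.

```python
def pick_tag_snps(haplotypes):
    """Group SNPs that carry identical information; pick one tag per group.

    `haplotypes` is a list of equal-length strings of '0' and '1', one
    string per distinct haplotype; position i in every string is SNP i.
    Returns a list of (tag_index, [indices of SNPs captured by that tag]).
    """
    if not haplotypes:
        return []
    flip = str.maketrans("01", "10")
    groups = {}
    for i in range(len(haplotypes[0])):
        # The pattern of SNP i across all haplotypes, read top to bottom.
        column = "".join(h[i] for h in haplotypes)
        # A pattern and its complement separate the haplotypes identically,
        # so normalise every pattern to begin with '0'.
        if column[0] == "1":
            column = column.translate(flip)
        groups.setdefault(column, []).append(i)
    return [(members[0], members) for members in groups.values()]


# Toy data (invented): four haplotypes typed at six SNPs.
toy = ["000000", "110100", "110111", "001000"]
for tag, captured in pick_tag_snps(toy):
    print("tag SNP", tag, "captures SNPs", captured)
```

    For these toy haplotypes, three marker SNPs (positions 0, 2, and 4) stand in for all six: SNPs 0, 1, and 3 always vary together, as do SNPs 4 and 5.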

    In today's Nature article, the HapMap Consortium reports the genome-wide SNP haplotypes of 269 persons: one third are Yoruba persons from Ibadan, Nigeria; one third are persons from Utah believed to have recent ancestry from Northern and Western Europe; one sixth are Japanese persons in Tokyo; and one sixth are Han Chinese persons in Beijing. The selection of these populations was based on previous research indicating genetic diversity. The researchers typed 1,007,329 common SNPs in each of the 269 samples; the SNPs were selected so as to ensure reasonably even coverage of the whole genome, at a density of about one SNP for every 5 kb.

    The three main findings were that chunkiness is present throughout the genome, with long blocks of tightly linked SNPs; that the extent of chunkiness varies among populations; and that there is low haplotype diversity — the average number of haplotypes for each block ranged from 4.0 (in the Japanese and Chinese populations analyzed) to 5.6 (in the Yoruba population analyzed). Thus, it is feasible that the entire genome of persons in these populations can be sampled with a limited number of SNPs — far fewer than if the populations had an average of, say, 20 haplotypes for each block.
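
    Haplotype diversity within a block is, at its simplest, a tally. The sketch below assumes that each sampled chromosome has already been reduced to a string of alleles for one block; it counts the distinct haplotypes and how many chromosomes the most common ones account for. The function name `haplotype_summary` and the ten toy chromosomes are hypothetical.

```python
from collections import Counter


def haplotype_summary(chromosomes, top=5):
    """Return (number of distinct haplotypes, chromosomes covered by the
    `top` most common haplotypes) for one block."""
    counts = Counter(chromosomes)
    covered = sum(count for _, count in counts.most_common(top))
    return len(counts), covered


# Toy block (invented): ten chromosomes typed at four SNPs.
block = ["ACGT"] * 6 + ["ACGA"] * 3 + ["TCGA"] * 1
distinct, covered = haplotype_summary(block, top=2)
print(distinct, "haplotypes;", covered, "of", len(block), "chromosomes in the top 2")
```

    Here three haplotypes are present, and the two most common account for 9 of the 10 chromosomes. The fewer the haplotypes in a block, the fewer SNPs are needed to tell them apart.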

    How did the researchers know that they had captured the lion's share of common variation through their selection of SNPs? To assess the efficiency of their capture, they analyzed 10 separate 500-kb regions in detail. They sequenced the entirety of each region in 48 persons from the populations they studied, identified all common SNPs, and then typed each of these SNPs in each of the 269 samples. By comparing the haplotypes they obtained in this manner with those they had obtained using the standard HapMap SNPs, the researchers were able to determine the extent to which the HapMap SNPs captured the genetic variation in each population sample.
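
    One way such a comparison could be scored is sketched below; it is an illustration under stated assumptions, not the consortium's analysis. For every SNP found by complete sequencing but not included in the typed set, the code asks whether at least one typed SNP is correlated with it at or above a cutoff. The cutoff of r² = 0.8, the function names `r_squared` and `fraction_captured`, the SNP labels, and the eight toy chromosomes are all assumptions made for this example.

```python
def r_squared(x, y):
    """r^2 between two biallelic SNPs given as equal-length lists of 0/1
    alleles, one entry per chromosome."""
    n = len(x)
    p_x = sum(x) / n
    p_y = sum(y) / n
    p_xy = sum(a * b for a, b in zip(x, y)) / n
    denom = p_x * (1 - p_x) * p_y * (1 - p_y)
    if denom == 0:
        return 0.0  # a SNP that does not vary in the sample predicts nothing
    d = p_xy - p_x * p_y
    return d * d / denom


def fraction_captured(all_snps, typed_ids, threshold=0.8):
    """Share of untyped SNPs that have r^2 >= threshold with a typed SNP.

    `all_snps` maps a SNP label to its list of 0/1 alleles; `typed_ids`
    is the set of labels that were actually genotyped.
    """
    typed = [all_snps[i] for i in typed_ids]
    untyped = [v for i, v in all_snps.items() if i not in typed_ids]
    if not untyped:
        return 1.0
    hits = sum(any(r_squared(v, t) >= threshold for t in typed) for v in untyped)
    return hits / len(untyped)


# Toy region (invented): four SNPs observed on eight chromosomes.
region = {
    "s1": [0, 0, 1, 1, 0, 1, 0, 1],
    "s2": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to s1
    "s3": [0, 1, 0, 1, 0, 1, 0, 1],  # only weakly correlated with s1
    "s4": [1, 1, 0, 0, 1, 0, 1, 0],  # mirror image of s1
}
print(round(fraction_captured(region, {"s1"}), 3))
```

    With only s1 typed, two of the three remaining SNPs (s2 and s4) are captured and s3 is missed, so the captured fraction is about 0.667.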

    Their findings bode well for the effective use of the HapMap as a tool for seeking susceptibility genes. The consortium concludes that the map provides "excellent power" for association studies in the populations they analyzed from Utah, Tokyo, and Beijing and substantial power for the Yoruba population analyzed. They believe that the second phase of the project will result in a sufficient catalogue of SNPs to provide excellent power for association studies in the Yoruba people.

    The HapMap is a welcome tool that affords a fascinating glimpse of the human genome, partly because it provides a sense of evolution and history. A critical question is how useful the HapMap will be in analyzing populations other than those from which it was derived. This question can be answered only by experiment, but given current knowledge about genetic variability and its relationship to reported ancestry and geography, the HapMap should be helpful in chasing down loci and genes that are associated with disease. Precisely how helpful will depend on the extent to which common variants turn out to be responsible for disease.

    Source Information

    Dr. Phimister is a deputy editor of the Journal.

    References

    1. The International HapMap Consortium. A haplotype map of the human genome. Nature 2005;437:1299-1320.

    (Elizabeth G. Phimister, P)