Circulation, 2005, Issue 9
Comprehensive Survey of Common Genetic Variation at the Plasminogen Activator Inhibitor-1 Locus and Relations to Circulating Plasminogen Act
    From the National Heart, Lung, and Blood Institute’s Framingham Heart Study (S.K., Q.Y., M.G.L., D.L., C.J.O.), Framingham, Mass

    Broad Institute of Harvard University and Massachusetts Institute of Technology (S.K., S.B.G., A.L.L., J.N.H., C.J.O.), Cambridge, Mass

    Department of Biostatistics (Q.Y.), Boston University School of Public Health

    Department of Neurology (Q.Y.), Boston University School of Medicine

    Department of Mathematics and Statistics (M.G.L.), Boston University, Boston, Mass

    Royal North Shore Hospital (G.H.T.), Sydney, Australia

    Divisions of Genetics and Endocrinology (J.N.H.), Children’s Hospital and Department of Genetics, Harvard Medical School, Boston, Mass

    Cardiology Division (S.K., C.J.O.), Massachusetts General Hospital, Harvard Medical School, Boston, Mass.

    Abstract

    Background— Using a linkage disequilibrium (LD)–based approach, we sought to comprehensively define common genetic variation at the plasminogen activator inhibitor-1 (PAI-1) locus and relate common single nucleotide polymorphisms (SNPs) and haplotypes to plasma PAI-1 levels.

    Methods and Results— In reference pedigrees, we defined LD structure across a 50-kb genomic segment spanning the PAI-1 locus via a dense SNP map (1 SNP every 2 kb). Eighteen sequence variants that capture underlying common genetic variation were genotyped in 1328 unrelated Framingham Heart Study participants who had plasma PAI-1 antigen levels measured. Regression analyses were used to examine associations of individual SNPs and of inferred haplotypes with multivariable-adjusted PAI-1 levels. Two genetic variants, SNP rs2227631 and the 4G/5G polymorphism, were strongly associated (P<0.0001) with PAI-1 levels. SNP rs2227631 is in tight LD (D'=0.97, r2=0.78) with the 4G/5G polymorphism, which makes it difficult to distinguish which of these 2 polymorphisms is responsible for the association with PAI-1 levels. In stepwise analysis considering all polymorphisms tested, 3 SNPs, rs2227631 (or the correlated 4G/5G polymorphism), rs6465787, and rs2227674, each explained 2.5%, 1%, and 1%, respectively, of the residual variance in multivariable-adjusted PAI-1 levels (stepwise P<0.0001, P=0.04, and P=0.03, respectively). A single common haplotype, at 50% frequency among Framingham Heart Study participants, was strongly associated with higher PAI-1 levels (haplotype-specific P=0.00001). The susceptibility haplotype harbors the minor alleles of SNP rs2227631 and the 4G/5G polymorphism.

    Conclusions— Three sequence variants at the PAI-1 locus, in sum, explain 5% of the residual variance in multivariable-adjusted PAI-1 levels. For quantitative cardiovascular traits such as circulating biomarkers, defining LD structure in a candidate gene followed by association analyses with both SNPs and haplotypes is an effective approach to localize common susceptibility alleles.

    Key Words: plasminogen; epidemiology; genetics; genomics; fibrinolysis

    Introduction

    Higher plasma levels of plasminogen activator inhibitor-1 (PAI-1), the principal circulating inhibitor of fibrinolysis, have been associated with increased risk of coronary heart disease events.1,2 In addition, increased PAI-1 is a common feature of the metabolic syndrome3 and predicts incident diabetes.4 Significant heritability of this phenotype in family and twin studies suggests that plasma PAI-1 is determined in part by genetic influences.5,6 In vitro and genetic studies have suggested that the 4G allele of the 4G/5G insertion-deletion polymorphism is associated with higher plasma PAI-1 concentrations.7–9 The 4G allele has been associated with an increased risk of myocardial infarction in a meta-analysis.10

    It is likely that single polymorphisms at a candidate gene locus represent only part of the overall genetic variation that may contribute to phenotypic variation. It is unknown whether the 4G allele is simply correlated with other possibly causative alleles and whether additional variation at the PAI-1 locus plays a role in determining PAI-1 levels. One emerging approach is to take advantage of the complete human genome sequence and the increasingly rich collection of single nucleotide polymorphisms (SNPs) in public databases to study the role of the estimated 11 million SNP variants (minor allele frequency >1%) in explaining phenotypic variation.11

    SNP alleles at a locus are often correlated (known as linkage disequilibrium [LD]) and coinherited as haplotypes.12 Most of the genome exists in regions of strong LD, called haplotype blocks, within which most individuals carry 1 or 2 of a few common haplotypes.13 Within these blocks, a relatively small number of SNPs, termed tag SNPs, can mark common haplotypes and capture most of the genetic diversity in a sample.14–16 The utilization of tag SNPs and common haplotypes formed by tag SNPs in genetic association studies has been posited as an efficient and effective method to narrow association signal and localize susceptibility variants.17

    Our investigation sought to comprehensively examine the role of common SNPs and haplotypes at the PAI-1 locus in determining PAI-1 levels measured in the Framingham Heart Study (FHS), a well-characterized, community-based sample. Thus, our objectives were (1) to define the LD patterns for common genetic variants at the PAI-1 locus, (2) to determine associations of PAI-1 single genetic variants and multimarker haplotypes with plasma PAI-1 levels, and (3) to determine the relation between the 4G/5G polymorphism and other common variation at the PAI-1 locus.

    Methods

    Study Participants

    The FHS offspring cohort began in 1971 with enrollment of 5124 men and women.18 Approximately every 4 years, participants have undergone a routine medical history and physical examination, as well as laboratory assessment of cardiovascular disease risk factors. The Institutional Review Board at Boston Medical Center approved the study, and all participants gave written informed consent.

    Of the 3799 participants who attended the fifth offspring examination (1991–1995), blood testing for PAI-1 antigen was completed in 3389 participants. We excluded 552 participants who were receiving anticoagulant medications or taking aspirin, which left 2837 eligible participants. DNA was available in a panel of 1811 unrelated individuals randomly selected from 2933 individuals who provided blood samples for DNA extraction during the sixth examination cycle (1995–1998). The overlap of the 2837 eligible participants with PAI-1 antigen levels and the 1811 unrelated participants with DNA available formed the sample for the present study (n=1328).

    Determination of PAI-1 Antigen Levels and Other Participant Characteristics

    Fasting blood samples were collected between 8 and 9 AM in tubes with 3.8% sodium citrate (9:1 vol/vol). ELISA methods were used to measure levels of PAI-1 antigen according to the description given by Declerck et al19 (TintElize PAI-1, Biopool AB). The intra-assay coefficient of variation was 9.6% for PAI-1 antigen.20

    Risk factor information was collected from physical examination and blood testing as described previously.18 Cardiovascular disease was defined by the presence of coronary heart disease (recognized or unrecognized myocardial infarction, coronary insufficiency, and angina pectoris), stroke or transient ischemic attack, or intermittent claudication. Criteria for cardiovascular disease diagnoses have been described previously.21

    Genotyping Methods

    SNPs and the 4G/5G insertion-deletion polymorphism were genotyped with matrix-assisted laser desorption ionization–time of flight mass spectrometry with 5 ng of DNA per genotyping reaction (Sequenom, San Diego, Calif). The detailed multiplex polymerase chain reaction protocol has been described previously.13 As previously reported, our error rate using the Sequenom MassARRAY platform is 0.4%, and this was confirmed in the present study by comparison of duplicates and rates of apparent mendelian inheritance errors.13

    SNP Selection and Genotyping in Reference Pedigrees

    From the public dbSNP database (http://www.ncbi.nlm.nih.gov/SNP), we selected 85 evenly spaced markers within a 50-kb region spanning the PAI-1 gene (the coding region, 10 kb upstream and 5 kb downstream) on chromosome 7 (gene symbol SERPINE1, accession number NM_000602). SNPs were genotyped in a panel of 12 multigenerational family pedigrees from the Utah/Centre d’Etude du Polymorphisme Humain (CEPH) panel (Coriell Institute for Medical Research, Camden, NJ).13 These reference pedigrees included 93 individuals representing 96 independent chromosomes of European ancestry. Assays were considered successful if they met the following criteria: (1) at least 75% success for genotyping calls, (2) Hardy-Weinberg equilibrium (HWE) P>0.01, and (3) mendelian transmission errors ≤1. In addition, we imposed a minor allele frequency threshold and defined "common" for the present study as a minor allele frequency >2%. Overall, for 24 SNPs (22 noncoding, 2 nonsynonymous coding: rs6090 and rs6092) and the 4G/5G insertion-deletion polymorphism, we developed successful assays and observed a minor allele frequency >2%.

    LD Structure in Reference Pedigrees: Identification of Haplotype Blocks and Tag SNPs

    Haplotype blocks were defined and tag SNPs were selected with the "spine of LD" setting in the publicly available Haploview software package version 2.03 (Jeffrey C. Barrett and Mark J. Daly, http://www.broad.mit.edu/personal/jcbarret/haploview).22 Briefly, for each pair of markers, the absolute value of D' (an estimate of the strength of LD23) and a logarithm of the odds score (LOD; an estimate of the significance of LD24) were calculated. On the basis of these 2 measures, each pairwise marker comparison was categorized into 1 of 3 groups: (1) no or minimal evidence of historical recombination (D'=1/LOD >2 or 0.5<D'<1/LOD >2), (2) strong evidence of historical recombination (D'<1/LOD <2 or D'<0.5/any LOD), and (3) uninformative (D'=1.0/LOD <2).
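
    As an illustration of the pairwise measures used here, the sketch below computes D, D', and r2 for 2 biallelic markers from haplotype and allele frequencies, using the standard textbook definitions. It is not the Haploview implementation; the function name and the example frequencies are hypothetical, and the LOD score is not computed.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A (marker 1) and allele B (marker 2)
    p_a, p_b: frequencies of allele A and allele B
    All inputs are illustrative; they are not values from the study.
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    # D' scales |D| by the largest value it could take given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r2 is the squared correlation between the two allele indicators
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical example: allele A (30%) always travels with allele B (50%),
# so D' is 1 even though r2 is well below 1.
d, d_prime, r2 = pairwise_ld(p_ab=0.30, p_a=0.30, p_b=0.50)
print(round(d_prime, 3), round(r2, 3))
```

    The example shows why the 2 measures are reported together: D' near 1 indicates little evidence of historical recombination, whereas r2 indicates how well one marker predicts the other.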

    In the spine of LD setting, haplotype blocks are assigned based on each end marker of a block having a D'>0.8 with all intervening pairwise marker comparisons excepting 1 comparison. Tag SNPs were selected by ranking SNPs within a block on the basis of successful genotyping percentage and then selecting SNPs one at a time from this ranked list until all haplotypes >2% frequency within the block were uniquely tagged. With this procedure, 11 tag SNPs were required to mark common haplotypes. For the 20 SNPs that fell into 2 blocks, the 11 tag SNPs captured the 9 unmeasured SNPs with a mean pairwise r2 of 0.78. Eight of the 9 unmeasured SNPs were captured with a pairwise r2 >0.70. Thus, in reference pedigrees, our tag SNPs captured well the unmeasured SNPs.
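
    The ranked, one-at-a-time tag SNP selection described above can be sketched as follows. This is a minimal illustration rather than the Haploview procedure: the haplotype strings and the ranking are hypothetical, and skipping SNPs that do not separate any additional haplotypes is an assumption of this sketch.

```python
def select_tag_snps(haplotypes, ranked_snps):
    """Greedy tag SNP selection within one block.

    haplotypes: equal-length allele strings, one per common haplotype
    ranked_snps: SNP column indexes, highest genotyping success first
    Returns the chosen column indexes in the order they were picked.
    """
    def n_distinct(cols):
        # number of haplotypes that can be told apart using only these columns
        return len({tuple(h[c] for c in cols) for h in haplotypes})

    target = len(set(haplotypes))
    chosen = []
    for snp in ranked_snps:
        if n_distinct(chosen) == target:
            break  # every common haplotype is already uniquely tagged
        # assumption: keep a SNP only if it splits haplotypes that were still identical
        if n_distinct(chosen + [snp]) > n_distinct(chosen):
            chosen.append(snp)
    return chosen


# Hypothetical block with 4 common haplotypes over 4 SNPs (columns 0 to 3)
block = ["ACGT", "ACGA", "TCGA", "TGCA"]
print(select_tag_snps(block, ranked_snps=[1, 2, 0, 3]))  # [1, 0, 3]
```

    In this toy block, column 2 carries the same information as column 1 and is therefore skipped, which mirrors the idea of a redundant SNP.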

    Genotyping in the FHS

    In the FHS sample, we genotyped 11 tag SNPs, the 4G/5G polymorphism, and 7 SNPs that were redundant in CEPH pedigrees. Redundant SNPs were typed to help assess LD block structure similarity between CEPH and FHS samples (data not shown). Thus, 19 SNPs were genotyped in FHS. The χ2 test was used to compare observed genotype frequencies with their estimates under HWE. One tag SNP, rs7242, was out of HWE (P<0.01) in the FHS sample and thus was excluded from further analysis. The remaining 18 sequence variants were in HWE (P>0.05).
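
    A minimal sketch of a Hardy-Weinberg check of this kind is given below, assuming a 1-df goodness-of-fit test on the 3 genotype counts of a biallelic marker. The genotype counts are hypothetical, and the exact test variant used in the study is not specified beyond the text above.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit test of genotype counts against Hardy-Weinberg proportions.

    Returns (chi-square statistic, P value on 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # upper tail of a chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: a marker in HWE and one with a clear heterozygote deficit
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(40, 20, 40))
```

    Under the threshold stated above, a marker such as the second example (P<0.01) would be excluded from further analysis.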

    Statistical Analysis

    Because of a skewed distribution, serum PAI-1 antigen levels were logarithmically transformed (natural log). Sex-specific standardized residuals from multivariable-adjusted PAI-1 levels were calculated with SAS25 and served as the phenotype. On the basis of reported determinants of PAI-1 levels in the literature, covariates included in the multivariable models were age, body mass index, current cigarette smoking, systolic blood pressure, diastolic blood pressure, hypertension treatment, alcohol consumption, total cholesterol, HDL cholesterol, triglycerides, diabetes, prevalent cardiovascular disease, and menopause status and estrogen replacement therapy for women.

    Regression analysis was performed with each of the SNPs to test the null hypothesis that the phenotype means did not differ by marker genotype. We assumed a general model of inheritance and used a 2–degrees-of-freedom (df) test for each SNP. Analyses were performed on standardized PAI-1 residuals. To further identify a subset of SNPs that significantly explained variance in PAI-1 levels when adjusted for the effects of other SNPs, we conducted a stepwise selection of the SNPs in multivariable linear regression models.25
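
    For a single SNP, regressing the phenotype on 2 genotype indicator variables is equivalent to a one-way analysis of variance across the 3 genotype groups. The sketch below computes the corresponding F statistic and the proportion of variance explained; the data are hypothetical toy values, and the P value (which requires the F distribution) is left to a statistics package.

```python
def genotype_anova(values_by_genotype):
    """One-way ANOVA across genotype groups (general 2-df model for 3 genotypes).

    values_by_genotype: dict mapping genotype label -> list of standardized residuals
    Returns (F statistic, numerator df, denominator df, R squared).
    Assumes at least 2 non-empty groups and some within-group variation.
    """
    groups = [v for v in values_by_genotype.values() if v]
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)
    ss_within = ss_total - ss_between
    df_between = len(groups) - 1
    df_within = n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within, ss_between / ss_total


# Hypothetical standardized residuals by 4G/5G genotype (not study data)
toy = {
    "5G/5G": [-1.0, 0.0, 1.0],
    "4G/5G": [0.0, 1.0, 2.0],
    "4G/4G": [1.0, 2.0, 3.0],
}
print(genotype_anova(toy))  # (3.0, 2, 6, 0.5)
```

    The R squared returned here corresponds to the "percent of residual variance explained" reported for individual variants, although the toy value is far larger than anything observed in the study.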

    Association analyses of haplotypes from a single block were conducted with a weighted-regression approach as implemented in the haplo.score program.26,27 All compatible haplotype configurations of a multimarker genotype were used in the regression, with weights being the corresponding posterior likelihood of such a configuration estimated with the expectation-maximization (EM) algorithm.28 An n-1 df score statistic, with n being the total number of haplotypes, tested all haplotypes simultaneously to detect any departure from the null hypothesis of no association. A 1-df haplotype-specific score statistic tested whether trait differences exist between a single haplotype and all other haplotypes combined. Haplotype-specific effect was also estimated to measure the mean difference in phenotype between carriers of 1 or 2 copies of a haplotype compared with those without the haplotype.
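
    The EM step can be illustrated with the simplest case of 2 biallelic SNPs, in which only double heterozygotes have ambiguous phase. The sketch below is a textbook version under that simplification, not the haplo.score implementation; the genotype coding (count of the "1" allele) and the example data are assumptions made for illustration.

```python
def em_haplotype_freqs(genotypes, n_iter=100):
    """Estimate 2-SNP haplotype frequencies from unphased genotypes by EM.

    genotypes: list of (g1, g2), each the count (0, 1, or 2) of the '1' allele
    Returns a dict of frequencies for haplotypes '00', '01', '10', '11'.
    """
    haps = ["00", "01", "10", "11"]
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
    freq = {h: 0.25 for h in haps}  # start from equal frequencies
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = {h: 0.0 for h in haps}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E step: double heterozygote is either 11/00 (cis) or 10/01 (trans);
                # weight each configuration by its posterior probability
                w_cis = freq["11"] * freq["00"]
                w_trans = freq["10"] * freq["01"]
                total = w_cis + w_trans
                post_cis = w_cis / total if total > 0 else 0.5
                counts["11"] += post_cis
                counts["00"] += post_cis
                counts["10"] += 1 - post_cis
                counts["01"] += 1 - post_cis
            else:
                # phase is unambiguous when at most one SNP is heterozygous
                a, b = alleles[g1], alleles[g2]
                counts[f"{a[0]}{b[0]}"] += 1
                counts[f"{a[1]}{b[1]}"] += 1
        # M step: update frequencies from the expected haplotype counts
        freq = {h: counts[h] / n_chrom for h in haps}
    return freq


# Hypothetical sample: 3 homozygotes of each type and 4 double heterozygotes
sample = [(2, 2)] * 3 + [(0, 0)] * 3 + [(1, 1)] * 4
print(em_haplotype_freqs(sample))
```

    The posterior probability computed for each ambiguous configuration plays the role of the regression weight described above; with more than 2 markers the same idea applies over all compatible haplotype pairs.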

    In joint analyses of haplotypes from 2 haplotype blocks, a weighted regression model with 2 predictors that represented the haplotypes from each block was performed with SAS.25 The weights in the regression were the posterior likelihood of the joint haplotype configurations of the 2 blocks. The haplotype configurations and the posterior likelihood of each configuration were estimated with a similar EM algorithm implemented in SNPHAP (http://www-gene.cimr.cam.ac.uk/clayton/software/snphap.txt).

    Multivariable logistic regression analysis was performed to test the null hypothesis that coronary heart disease status did not differ by marker genotype. Covariates in the analyses included age, sex, and traditional cardiovascular risk factors including hypertension (defined by systolic blood pressure ≥140 mm Hg, diastolic blood pressure ≥90 mm Hg, or hypertension treatment), diabetes (defined by fasting blood glucose >125 mg/dL or current medication use for diabetes), total cholesterol, HDL cholesterol, lipid-lowering therapy, cigarette smoking (coded as yes if current smoker or quit within 1 year), body mass index, and triglycerides. For all analyses, a nominal P<0.05 was considered significant.

    Results

    LD Structure at the PAI-1 Locus in Reference Pedigrees

    The LD structure at the PAI-1 locus in reference CEPH pedigrees is displayed in the Figure. Two extended segments of LD were evident in the PAI-1 gene region, labeled block 1 and block 2. Within these segments, a small number of common, ancestral haplotypes were observed. Eight haplotypes in block 1 (16-kb genomic segment) accounted for 95% of the chromosomes in reference pedigrees, and likewise, 6 haplotypes in block 2 (13-kb genomic segment) represented 95% of the chromosomes. Six tag SNPs predict common haplotypes in block 1, and an additional 5 tag SNPs predict common haplotypes in block 2.

    Figure. LD structure at the PAI-1 gene locus in reference pedigrees. A, PAI-1 locus position on chromosome 7 (chr 7) in the human genome July 2003 assembly (hg16). Genes encoded in this genomic region include SERPINE1 and AP1S1 on the plus strand and VGF on the minus strand. Rs identification numbers for SNPs 1 to 25 are listed in sequence order underneath the black bar, which lists the genomic coordinates across the gene. B, Haplotype block structure panel shows that 25 SNPs fell into 2 haplotype blocks. SNPs 1 to 9 fell into block 1, and SNPs 10 to 20 fell into block 2. Each haplotype is shown within a rectangle containing the alleles, a frequency percentage, and a bar proportional to haplotype frequency. For the 4G/5G insertion/deletion polymorphism (ss3172207), "A" refers to the 4G allele and "C" refers to the 5G allele. Within block 1, 8 common haplotypes were present, and within block 2, there were 6 common haplotypes. This limited haplotypic diversity within each block illustrates the strength of LD across this region. C, LD structure panel displays the LD relations between pairs of markers in the region, with each square representing the pairwise strength and significance of LD. Figure prepared with LocusView 2.0 (T. Petryshen, et al, Broad Institute; available at http://www.broad.mit.edu/mpg/locusview/).

    FHS Participants

    Characteristics of the study sample (mean age 55 years; 54% women) are shown in Table 1. Unadjusted mean PAI-1 levels were higher in men than women.

    Single Allelic Variants and PAI-1 Levels in Framingham Participants

    Results of association analyses of each of 18 genetic variants with multivariable-adjusted plasma PAI-1 levels in FHS participants are shown in Table 2. SNP rs2227631 was significantly associated with PAI-1 levels (P<0.0001). The 4G/5G polymorphism, located 172 bases downstream of rs2227631, was associated as strongly as rs2227631 (P<0.0001). SNP rs2227631 is in tight LD with the 4G/5G polymorphism (D'=0.97, r2=0.78). Additionally, 9 other variants were also associated with PAI-1 levels (each P<0.05). Similar results were seen with more basic regression models, including a model that used only age-adjusted, sex-specific plasma PAI-1 levels as the phenotype.

    Overall Contribution of Individual Genetic Variants in Stepwise Models

    To distinguish between genetic variants that are strongly correlated with each other and those that independently contribute to the variation in PAI-1 levels, we conducted stepwise analysis that included all 18 genetic variants. In stepwise selection, the 4G/5G polymorphism explained 2.5% of the residual variance in multivariable-adjusted PAI-1 levels (stepwise P<0.0001), with the 4G allele associated with higher PAI-1 levels. An additional 1% was explained by rs6465787 (stepwise P=0.02) and a further 1% by rs2227674 (stepwise P=0.03). When we repeated stepwise selection forcing in rs2227631, the variant in tight LD with the 4G/5G polymorphism, 3 variants were significantly associated with PAI-1 levels—rs2227631, rs6465787, and rs2227674—and the 4G/5G polymorphism was no longer significant.

    Mean unadjusted PAI-1 levels by genotype for 4 variants associated in stepwise models are presented in Table 3. Association analyses were conducted with multivariable-adjusted log-transformed PAI-1 level, as noted in Methods.

    Associations of Haplotypes With PAI-1 Levels

    We inferred haplotype frequencies in the FHS sample via an EM algorithm using 7 SNPs (6 tag SNPs and the 4G/5G polymorphism) in block 1 and 4 tag SNPs in block 2 (the fifth tag SNP in block 2—rs7242—was eliminated owing to failure to achieve HWE). The resulting haplotypes in the larger FHS sample were very similar to those initially predicted in reference CEPH pedigrees (Tables 4 and 5; Figure).

    There were significant overall differences among the 8 common haplotypes in block 1 with respect to mean PAI-1 levels (global P=0.0001). When we compared individual haplotypes with all the other haplotypes combined, Hap 1A and Hap 1I were positively associated with PAI-1 levels (haplotype-specific P=0.00001 and P=0.05, respectively; Table 4). Hap 1A harbors the 4G allele, and Hap 1I harbors the minor T allele of rs6465787; thus, the single SNP findings from stepwise selection (Table 3) and the haplotype findings represent the same result.

    Association analyses for haplotypes in block 2 are displayed in Table 5. The 5 common haplotypes differed with respect to mean PAI-1 (global P=0.04). A single common haplotype in block 2, Hap 2A, was positively associated with plasma PAI-1 (haplotype-specific P=0.006). Hap 2E was associated with lower PAI-1 levels (Hap P=0.05). Hap 2E harbors the minor allele of rs2227674; thus, the single SNP rs2227674 finding from stepwise selection (Table 3) and the Hap 2E finding represent the same result.

    Hap 1A was associated with higher PAI-1 levels, as was Hap 2A. The haplotypes in block 1 and block 2 were correlated (multiallelic D' of 0.74); most individuals with Hap 2A also carry Hap 1A. To address whether the association signal could be narrowed to Hap 1A, we considered both Hap 1A and Hap 2A jointly in a model. In this analysis, Hap 1A remained significantly associated with plasma PAI-1 level (P<0.0001), whereas Hap 2A was no longer significant (P=0.32).

    Joint Consideration of SNPs and Hap 1A

    We considered the effects of SNPs and Hap 1A jointly by constructing models with the dependent variable being multivariable-adjusted plasma PAI-1 level and the 2 independent variables being Hap 1A and an individual SNP. Eighteen separate models were constructed, one for each of the 18 SNPs tested at the locus. When both Hap 1A and SNPs were considered, Hap 1A was significantly related to plasma PAI-1 in all of the models except when combined with SNP rs2227631 or 4G/5G (data not shown). In a model with 4G/5G and Hap 1A, 4G/5G was significantly associated with PAI-1 level, whereas Hap 1A was not. Similarly, in a model with rs2227631 and Hap 1A, rs2227631 was associated with PAI-1 level, whereas Hap 1A was not. These results imply that these 2 SNP effects are stronger than the Hap 1A effect.

    Genetic Variants and Coronary Heart Disease

    We evaluated the relations between prevalent coronary heart disease (at the time of the FHS Offspring Study examination cycle 5) and the 2 variants most significantly associated with PAI-1 level. SNP rs2227631 was not associated with prevalent coronary heart disease (n=114 for coronary heart disease; P=0.41 in an age- and sex-adjusted model, and P=0.59 in a multivariable-adjusted model). The 4G/5G polymorphism also was not associated with prevalent coronary heart disease (n=111 for coronary heart disease; P=0.71 in an age- and sex-adjusted model, and P=0.88 in a multivariable-adjusted model). Compared with the 5G/5G genotype as the referent, the 4G/4G genotype OR was 1.16 (95% CI 0.61 to 2.21).

    Discussion

    Principal Findings

    In reference pedigrees, we characterized LD structure across a 50-kb genomic segment spanning the PAI-1 locus using a dense SNP map and found 2 blocks of sequence variants in strong LD. In association analyses of single genetic variants with PAI-1 levels, we identified 2 genetic variants, rs2227631 and the 4G/5G polymorphism, which are in tight LD and each strongly associated with plasma PAI-1 levels. After we accounted for either rs2227631 or the 4G/5G polymorphism, 2 additional variants further explained the variation in PAI-1 levels. After we accounted for clinical covariates, 3 sequence variants, in sum, explained 5% of the residual variance in circulating PAI-1 levels. In haplotype-phenotype association, a single common haplotype, Hap 1A (50% frequency), was strongly associated with increased PAI-1 level. Hap 1A harbors the minor alleles of both rs2227631 and the 4G/5G polymorphism.

    These findings confirm prior reports that the 4G allele is associated with increased plasma PAI-1 levels.29 The percent of residual variance in the multivariable-adjusted PAI-1 level explained by the 4G/5G polymorphism, which was 2.5% in the present study, is consistent with prior reports, in which it ranged from 0.63%30 to 1.6%.31

    With the genotyping of 25 SNPs in reference pedigrees and 18 PAI-1 SNPs in our association study, the present findings extend prior work by comprehensively studying the association of common genetic variants with PAI-1 levels in a large community-based sample. Our work directly addresses 2 questions: (1) Can the contribution of the 4G variant to PAI-1 levels be distinguished from LD with other possibly causative alleles? and (2) Are there additional allelic contributors at the PAI-1 locus to variance in PAI-1 levels?

    Common Genetic Variation at the PAI-1 Locus and PAI-1 Level

    The LD structure defined in the present study with publicly available common SNPs is in good agreement with the haplotypes defined by resequencing at the PAI-1 locus.32 In the Seattle SNPs Program for Genomics Applications project, 23 individuals of European ancestry were resequenced for 13 kb spanning the PAI-1 gene.32 This resequencing effort and the FHS sample yielded similar Hap 1A frequencies, 46% and 50%, respectively. In addition, the percent of individuals discordant for the minor alleles of 4G/5G and rs2227631 was similar at 2% in the Seattle SNPs European-descent sample and 5% in the FHS cohort.

    Using an LD-based approach, we narrowed the association signal at the PAI-1 locus to a specific region of limited haplotypic diversity in the 5' end of the gene. The 2 strongest known causal candidates present on Hap 1A are rs2227631 and the 4G/5G polymorphism. We are unable to distinguish the possible effect of rs2227631 and the 4G/5G polymorphism. In general, functional studies may help differentiate the effects of genetic variants. However, whereas previous functional studies have evaluated the 4G/5G polymorphism,8 an evaluation of the function of rs2227631 is lacking.

    Besides rs2227631 and the 4G/5G polymorphism, 2 additional variants may contribute to the variance in PAI-1 level. SNP rs6465787 is located 3 kb upstream of the 4G/5G polymorphism, and rs2227674 is located in intron 4. Both rs6465787 (minor allele frequency 0.02) and rs2227674 (minor allele frequency 0.21) are less common than the 4G/5G polymorphism. To the best of our knowledge, the present report is the first to describe the contribution of either rs6465787 or rs2227674 to variation in PAI-1 levels, and these findings will need to be replicated.

    The variants identified in sum explain 5% of the residual variability in multivariable-adjusted plasma PAI-1 level. The discovery of genetic variants that explain interindividual variation in biomarker phenotypes is important for several reasons. First, PAI-1 biomarker levels may predict incident clinical disease, and thus, genetic variants related to biomarker levels are strong candidate alleles to test for association with disease. Second, alleles that have been shown to be causally related to PAI-1 levels may be appropriate targets for drugs to alter gene expression. Third, a constellation of "risk" alleles may aid in predicting incident clinical disease. With complex traits such as biomarker phenotypes, the effect of any specific allele is expected to be modest, and our data are consistent with this expectation.

    Implications of a Comprehensive Approach for Candidate Gene Association Studies

    The limitations of single SNP association studies have recently been highlighted, because there is substantial difficulty in interpreting both a negative and positive result.14 Thus, it has been proposed that candidate gene association studies move to a staged approach that involves 2 steps: (1) to comprehensively define common patterns of SNP variation at a locus through a study of local LD and (2) to screen for association signal with a subset of nonredundant tag SNPs at the locus.17 With the present study, we have demonstrated that such an approach can effectively localize common susceptibility alleles. In fact, using an unbiased approach with publicly available SNPs, the implicated susceptibility haplotype would have been defined without knowledge of the 4G/5G polymorphism.

    Study Limitations and Strengths

    Our study has several potential limitations. First, in our definition of LD at the PAI-1 locus and the subsequent selection of tag SNPs, we restricted our focus mainly to common genetic variants. Multiple rare variants may influence a trait, and these variants may be identified by resequencing.33 However, the frequency spectrum of susceptibility variants for complex phenotypes is likely to include common variants, and our approach is most appropriate for the discovery of such variants.17

    Second, nonexonic conserved regions, which show conservation comparable to exons, have been postulated to be important in the regulation of gene expression.34 Variation in these nonexonic conserved regions around the PAI-1 locus may influence PAI-1 level. We did not explicitly examine such regions in the present study, although common variation in these regions may have been captured by LD.

    Third, in genomic regions of high LD, the optimal methodology to select a subset of nonredundant markers has yet to be defined. We are currently comparing various methods of tag SNP selection, and alternative methods may prove to be more efficient than the one used in the present report.35–37

    Fourth, as noted previously, the extent of LD in block 1 may extend upstream of the first SNP, rs757722. The International HapMap project has been designed to comprehensively catalog patterns of LD across the human genome, and we reviewed this SNP catalog for the 100-kb region encompassing the PAI-1 locus.38 We found that LD breaks down 700 bases further upstream of rs757722, although the present HapMap SNP density in this genomic region is insufficient to be certain about this conclusion. Thus, there remains the possibility that yet unidentified variants upstream of rs757722 are the true causal variants on Hap 1A.

    Fifth, our sample was predominantly white, which limits the generalizability of our results to other ethnic groups. Sixth, given that a large systematic review found that the 4G/4G genotype was associated with a modest 1.2-fold increased risk of myocardial infarction, we were underpowered to detect a small effect of this magnitude.10

    Finally, type I error may occur when testing for association between a phenotype and multiple genetic variants. We have not accounted for multiple testing in the present analyses. In genetic association studies, the Bonferroni correction may be overly conservative owing to a high degree of correlation among the tests performed.39 However, in the present study, the associations of Hap 1A, the 4G/5G polymorphism, and rs2227631 with PAI-1 levels would have survived even a Bonferroni correction.
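
    The Bonferroni argument can be checked with simple arithmetic. The sketch below assumes that the relevant number of tests is the 18 single-variant tests (an assumption; the haplotype tests would add to this count) and uses the reported upper bound of P<0.0001 for the strongest single-variant associations.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def survives_bonferroni(p_value, alpha=0.05, n_tests=18):
    """True if a P value stays below the corrected threshold.

    n_tests=18 is an assumption: one test per genotyped variant.
    """
    return p_value < bonferroni_threshold(alpha, n_tests)


print(round(bonferroni_threshold(0.05, 18), 5))  # 0.00278
print(survives_bonferroni(0.0001))  # True: bound reported for rs2227631 and 4G/5G
print(survives_bonferroni(0.03))    # False: stepwise P reported for rs2227674
```

    Under this assumption, the associations reported at P<0.0001 remain significant, whereas the weaker stepwise associations would not.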

    Strengths of the present investigation include the comprehensive assessment of common genetic variation at the PAI-1 locus, the use of single allelic variants and haplotypes in association analyses, the large sample size, and the use of multivariable analyses.

    Conclusions

    In summary, using a powerful approach that is now possible from knowledge of the human genome sequence, we have comprehensively defined the role of common genetic variation at the PAI-1 locus in determining plasma PAI-1 levels. Our results suggest that approaches to define LD structure in candidate gene regions followed by conduct of association analyses with SNPs and multimarker haplotypes will be effective for localizing common susceptibility alleles.

    Acknowledgments

    This work was supported through National Institutes of Health/National Heart, Lung, and Blood Institute contract N01-HC-25195 and the Cardiogenomics Programs for Genomic Applications (HL66582). Dr Kathiresan is supported by the American College of Cardiology Foundation/Merck Adult Cardiology Research Fellowship Award and a Young Investigator Award from the GlaxoSmithKline Research & Education Foundation for Cardiovascular Disease. The authors thank Dr Christopher Newton-Cheh for his critical review of the manuscript.

    References

    Juhan-Vague I, Pyke SD, Alessi MC, Jespersen J, Haverkate F, Thompson SG. Fibrinolytic factors and the risk of myocardial infarction or sudden death in patients with angina pectoris: ECAT Study Group: European Concerted Action on Thrombosis and Disabilities. Circulation. 1996; 94: 2057–2063.

    Thogersen AM, Jansson JH, Boman K, Nilsson TK, Weinehall L, Huhtasaari F, Hallmans G. High plasminogen activator inhibitor and tissue plasminogen activator levels in plasma precede a first acute myocardial infarction in both men and women: evidence for the fibrinolytic system as an independent primary risk factor. Circulation. 1998; 98: 2241–2247.

    Vague P, Juhan-Vague I, Aillaud MF, Badier C, Viard R, Alessi MC, Collen D. Correlation between blood fibrinolytic activity, plasminogen activator inhibitor level, plasma insulin level, and relative body weight in normal and obese subjects. Metabolism. 1986; 35: 250–253.

    Festa A, D’Agostino R Jr, Tracy RP, Haffner SM. Elevated levels of acute-phase proteins and plasminogen activator inhibitor-1 predict the development of type 2 diabetes: the Insulin Resistance Atherosclerosis Study. Diabetes. 2002; 51: 1131–1137.

    Pankow JS, Folsom AR, Province MA, Rao DC, Williams RR, Eckfeldt J, Sellers TA. Segregation analysis of plasminogen activator inhibitor-1 and fibrinogen levels in the NHLBI Family Heart Study. Arterioscler Thromb Vasc Biol. 1998; 18: 1559–1567.

    de Lange M, Snieder H, Ariens RA, Spector TD, Grant PJ. The genetics of haemostasis: a twin study. Lancet. 2001; 357: 101–105.

    Dawson SJ, Wiman B, Hamsten A, Green F, Humphries S, Henney AM. The two allele sequences of a common polymorphism in the promoter of the plasminogen activator inhibitor-1 (PAI-1) gene respond differently to interleukin-1 in HepG2 cells. J Biol Chem. 1993; 268: 10739–10745.

    Eriksson P, Kallin B, van’t Hooft FM, Bavenholm P, Hamsten A. Allele-specific increase in basal transcription of the plasminogen-activator inhibitor 1 gene is associated with myocardial infarction. Proc Natl Acad Sci U S A. 1995; 92: 1851–1855.

    Panahloo A, Mohamed-Ali V, Lane A, Green F, Humphries SE, Yudkin JS. Determinants of plasminogen activator inhibitor 1 activity in treated NIDDM and its relation to a polymorphism in the plasminogen activator inhibitor 1 gene. Diabetes. 1995; 44: 37–42.

    Boekholdt SM, Bijsterveld NR, Moons AH, Levi M, Buller HR, Peters RJ. Genetic variation in coagulation and fibrinolytic proteins and their relation with acute myocardial infarction: a systematic review. Circulation. 2001; 104: 3063–3068.

    Kruglyak L, Nickerson DA. Variation is the spice of life. Nat Genet. 2001; 27: 234–236.

    Reich DE, Cargill M, Bolk S, Ireland J, Sabeti PC, Richter DJ, Lavery T, Kouyoumjian R, Farhadian SF, Ward R, Lander ES. Linkage disequilibrium in the human genome. Nature. 2001; 411: 199–204.

    Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B, Higgins J, DeFelice M, Lochner A, Faggart M, Liu-Cordero SN, Rotimi C, Adeyemo A, Cooper R, Ward R, Lander ES, Daly MJ, Altshuler D. The structure of haplotype blocks in the human genome. Science. 2002; 296: 2225–2229.

    Daly MJ, Rioux JD, Schaffner SF, Hudson TJ, Lander ES. High-resolution haplotype structure in the human genome. Nat Genet. 2001; 29: 229–232.

    Patil N, Berno AJ, Hinds DA, Barrett WA, Doshi JM, Hacker CR, Kautzer CR, Lee DH, Marjoribanks C, McDonough DP, Nguyen BT, Norris MC, Sheehan JB, Shen N, Stern D, Stokowski RP, Thomas DJ, Trulson MO, Vyas KR, Frazer KA, Fodor SP, Cox DR. Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21. Science. 2001; 294: 1719–1723.

    Johnson GC, Esposito L, Barratt BJ, Smith AN, Heward J, Di Genova G, Ueda H, Cordell HJ, Eaves IA, Dudbridge F, Twells RC, Payne F, Hughes W, Nutland S, Stevens H, Carr P, Tuomilehto-Wolf E, Tuomilehto J, Gough SC, Clayton DG, Todd JA. Haplotype tagging for the identification of common disease genes. Nat Genet. 2001; 29: 233–237.

    The International HapMap Project. Nature. 2003; 426: 789–796.

    Kannel WB, Feinleib M, McNamara PM, Garrison RJ, Castelli WP. An investigation of coronary heart disease in families: the Framingham offspring study. Am J Epidemiol. 1979; 110: 281–290.

    Declerck PJ, Alessi MC, Verstreken M, Kruithof EK, Juhan-Vague I, Collen D. Measurement of plasminogen activator inhibitor 1 in biologic fluids with a murine monoclonal antibody-based enzyme-linked immunosorbent assay. Blood. 1988; 71: 220–225.

    Poli KA, Tofler GH, Larson MG, Evans JC, Sutherland PA, Lipinska I, Mittleman MA, Muller JE, D’Agostino RB, Wilson PW, Levy D. Association of blood pressure with fibrinolytic potential in the Framingham offspring population. Circulation. 2000; 101: 264–269.

    Kannel WB, Wolf P, Garrison RJ, eds. The Framingham Study: an epidemiological investigation of cardiovascular disease. Section 34. Some risk factors related to the annual incidence of cardiovascular disease and death in pooled repeated biennial measurements: Framingham Heart Study, 30-year followup. Bethesda, Md: National Heart, Lung, and Blood Institute; February 1987. NIH publication No. 87-2703.

    Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005; 21: 263–265.

    Lewontin RC. The interaction of selection and linkage, II: optimum models. Genetics. 1964; 50: 757–782.

    Morton NE. Sequential tests for the detection of linkage. Am J Hum Genet. 1955; 7: 277–318.

    SAS/STAT User’s Guide, Version 8.1. Cary, NC: SAS Institute; 2000.

    Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland GA. Score tests for association between traits and haplotypes when linkage phase is ambiguous. Am J Hum Genet. 2002; 70: 425–434.

    Zaykin DV, Westfall PH, Young SS, Karnoub MA, Wagner MJ, Ehm MG. Testing association of statistically inferred haplotypes with discrete and continuous traits in samples of unrelated individuals. Hum Hered. 2002; 53: 79–91.

    Excoffier L, Slatkin M. Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Mol Biol Evol. 1995; 12: 921–927.

    Hoekstra T, Geleijnse JM, Schouten EG, Kluft C. Plasminogen activator inhibitor-type I: its plasma determinants and relation with cardiovascular risk. Thromb Haemost. 2004; 91: 861–872.

    Festa A, D’Agostino R Jr, Rich SS, Jenny NS, Tracy RP, Haffner SM. Promoter (4G/5G) plasminogen activator inhibitor-1 genotype and plasminogen activator inhibitor-1 levels in blacks, Hispanics, and non-Hispanic whites: the Insulin Resistance Atherosclerosis Study. Circulation. 2003; 107: 2422–2427.

    Henry M, Tregouet DA, Alessi MC, Aillaud MF, Visvikis S, Siest G, Tiret L, Juhan-Vague I. Metabolic determinants are much more important than genetic polymorphisms in determining the PAI-1 activity and antigen plasma concentrations: a family study with part of the Stanislas Cohort. Arterioscler Thromb Vasc Biol. 1998; 18: 84–91.

    SeattleSNPs. NHLBI Program for Genomics Applications, UW-FHCRC. Available at: http://pga.gs.washington.edu. Accessed June 28, 2004.

    Cohen JC, Kiss RS, Pertsemlidis A, Marcel YL, McPherson R, Hobbs HH. Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science. 2004; 305: 869–872.

    Loots GG, Locksley RM, Blankespoor CM, Wang ZE, Miller W, Rubin EM, Frazer KA. Identification of a coordinate regulator of interleukins 4, 13, and 5 by cross-species sequence comparisons. Science. 2000; 288: 136–140.

    Chapman JM, Cooper JD, Todd JA, Clayton DG. Detecting disease associations due to linkage disequilibrium using haplotype tags: a class of tests and the determinants of statistical power. Hum Hered. 2003; 56: 18–31.

    Stram DO, Haiman CA, Hirschhorn JN, Altshuler D, Kolonel LN, Henderson BE, Pike MC. Choosing haplotype-tagging SNPS based on unphased genotype data using a preliminary sample of unrelated subjects with an example from the Multiethnic Cohort Study. Hum Hered. 2003; 55: 27–36.

    Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, Nickerson DA. Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am J Hum Genet. 2004; 74: 106–120.

    International HapMap Project. Available at: http://www.hapmap.org/downloads/index.html.en. Accessed June 28, 2004.

    Cardon LR, Bell JI. Association study designs for complex diseases. Nat Rev Genet. 2001; 2: 91–99.

    (Sekar Kathiresan, MD; Sta)