Evolution of the APETALA3 and PISTILLATA Lineages of MADS-Box–Containing Genes in the Basal Angiosperms
     Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts

    E-mail: ekramer@oeb.harvard.edu.

    Abstract

    The B class genes, including homologs of the Arabidopsis loci APETALA3 (AP3) and PISTILLATA (PI), appear to play a conserved role in the determination of petal and stamen identity across core eudicot angiosperms. Understanding how and when these functions evolved is a critical component of elucidating the evolution of flowers, particularly the appearance of petaloid perianth organs. Before comparisons of gene expression patterns or functions can be made, however, it is necessary to establish the orthology of AP3 and PI homologs from basal angiosperms. Here, we report the identification and analysis of 29 new representatives of the B gene lineage from basal ANITA and magnoliid dicot angiosperms. These studies indicate that gene duplications have occurred at every phylogenetic level, both before and after the duplication that produced the separate AP3 and PI lineages. Comparison of genomic structure among PI homologs indicates that a 12-nucleotide deletion that had been considered synapomorphic for the whole PI lineage actually arose within the ANITA grade, after the split of the Nymphaeales but before the separation of the Austrobaileyales. Evidence for alternative splicing of the Nymphaea AP3 homolog is also presented. The implications of these findings for angiosperm systematics, the conservation of AP3 and PI gene function, and the evolution of the ABC program are discussed.

    Key Words: APETALA3 • PISTILLATA • MADS-box gene • gene duplication

    Introduction

    The pan-eukaryotic family of MADS-box–containing transcription factors is known to have a complex evolutionary history (Alvarez-Buylla et al. 2000). In plants, MADS-box genes are of particular interest because of the large size of the family and the critical developmental roles the members are known to play (Theissen et al. 2000). Understanding the radiation of the gene family within the context of the evolution of the major land plant lineages has, therefore, become a special priority (Theissen et al. 2002). This line of research has largely focused on a plant-specific subfamily of type II MADS-box genes (Alvarez-Buylla et al. 2000), which are known as the MIKC-type because of the presence of four distinct domains (fig. 1) (Theissen et al. 2000). The N-terminal MADS (M) domain contains approximately 60 aa and is highly conserved across the entire superfamily. It is essential for DNA binding at sites known as CArG elements and appears to play a role in protein dimerization (Shore and Sharrocks 1995; Riechmann, Wang, and Meyerowitz 1996). The Intervening (I) and keratin-like (K) domains are critical mediators of protein dimerization and influence the specificity of interactions between MIKC-type proteins (Riechmann, Krizek, and Meyerowitz 1996). The C-terminal (C) region is more enigmatic from a functional standpoint, showing much lower sequence conservation overall but typically containing short, highly conserved motifs (Kramer, Dorit, and Irish 1998; Johansen et al. 2002). This domain has recently been shown to contribute to the formation of higher order protein complexes between dimers of MIKC-type proteins (Egea-Cortines, Saedler, and Sommer 1999; Honma and Goto 2001).

    FIG. 1. Schematic of a MIKC-type MADS-box protein. Approximate lengths of each domain in amino acids (aa) are shown below the domain. The K domain of PI orthologs is typically 61 aa, whereas that of AP3 orthologs is usually 65 aa. Vertical arrows indicate the conserved positions of the six introns (I1 to I6) generally present in B class genes. The horizontal arrows indicate the positions of the primers used in primary and secondary RT-PCR reactions (see Materials and Methods)

    Much of the focus in the study of the MIKC-type MADS-box genes has fallen on the so-called A, B, and C class gene lineages. These names are derived from designations made during analyses of floral developmental mutants in Arabidopsis and Antirrhinum, which led to the description of the ABC model of floral organ identity determination (Coen and Meyerowitz 1991). According to this model, floral organ identity is established by the overlapping functions of three classes of gene activity: A alone determines sepal identity; A+B determines petal identity; B+C determines stamen identity; and C alone determines carpel identity. All but one of the genes corresponding to these activities are members of the MIKC-type family of MADS-box genes, with gene function being generally conserved across genetic orthologs in various angiosperm species (reviewed in Irish and Kramer [1998] and Theissen et al. [2000]). Further investigations of Arabidopsis MIKC-type genes led to the identification of an additional lineage known as the E class genes, which are critical to the functions of the A, B, and C classes (Pelaz et al. 2000) and are evolutionarily closely related to the A class lineage (Hasebe and Banks 1997). Our current understanding of the ABC program holds that different heterodimers of A, B, C, and E class proteins interact to form functional "ternary" or "quartet" protein complexes that are responsible for establishing the various floral organ identities (Egea-Cortines, Saedler, and Sommer 1999; Honma and Goto 2001; Theissen 2001).
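
    The combinatorial logic of the ABC model can be illustrated with a minimal Python sketch. The function name organ_identity and the "undefined" fallback are illustrative assumptions; only the four class-to-organ rules come from the model as stated above, and the sketch does not represent E class activity or protein complex formation.

```python
def organ_identity(active_classes):
    """Return the organ predicted by the ABC model for a set of active gene classes."""
    rules = {
        frozenset("A"): "sepal",    # A alone
        frozenset("AB"): "petal",   # A + B
        frozenset("BC"): "stamen",  # B + C
        frozenset("C"): "carpel",   # C alone
    }
    # Combinations the model does not define fall back to a placeholder label.
    return rules.get(frozenset(active_classes), "undefined")


for classes in ("A", "AB", "BC", "C"):
    print(classes, "->", organ_identity(classes))
```

    Under this encoding, removing B activity from the A+B and B+C combinations leaves A alone and C alone, so the predicted organs change from petal and stamen to sepal and carpel.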

    One of the best-understood lineages of MIKC-type genes in terms of evolution is the clade of B class genes, including the closely related paralogous lineages represented in Arabidopsis by APETALA3 (AP3) and PISTILLATA (PI) (Bowman, Smyth, and Meyerowitz 1989). The products of AP3 and PI function as obligate heterodimers (Riechmann, Krizek, and Meyerowitz 1996) to establish the identity of petals and stamens in the developing floral meristem (Bowman, Smyth, and Meyerowitz 1989). Both the developmental and the biochemical aspects of AP3 and PI function appear to be conserved across orthologs analyzed in many core eudicots (reviewed in Irish and Kramer [1998] and Theissen et al. [2000]). Outside the core eudicots, however, greater variability has been observed (Kramer and Irish 2000), and at least one PI homolog has been shown to have the capacity to bind DNA as a homodimer (Winter et al. 2002). Along with this flexibility, many ancient and recent gene duplications have been characterized in the B gene lineage (Kramer, Dorit, and Irish 1998; Kramer, Di Stilio, and Schluter 2003). These comparative approaches have facilitated the identification of highly conserved C-terminal sequences known as the PI, paleoAP3, and euAP3 motifs (Kramer, Dorit, and Irish 1998). The paleoAP3 motif represents the ancestral C-terminal sequence within the angiosperm AP3 lineage but was replaced by the euAP3 motif in one paralogous AP3 lineage found only in the core eudicots (Kramer, Dorit, and Irish 1998). Our understanding of the evolution of the B gene lineages has been further elucidated by gymnosperm studies that have identified putative homologs of AP3/PI (Mouradov et al. 1999; Sundstrom et al. 1999). These genes appear to represent an ancestral lineage that predates the AP3/PI duplication (Winter, Saedler, and Theissen 2002) but retains C-terminal sequences similar to the PI and paleoAP3 motifs and is exclusively expressed in male reproductive organs (Mouradov et al. 1999; Sundstrom et al. 1999). An additional lineage, known as the Bsister (Bs) genes, has recently been identified as a closely related paralogous lineage to the angiosperm and gymnosperm B class genes (Becker et al. 2002). Although the members of this lineage possess PI and paleoAP3 motifs, they appear to be involved in aspects of carpel or ovule development (Becker et al. 2002; Nesi et al. 2002).

    The emerging picture of B lineage evolution has been somewhat obscured by our lack of data from basal angiosperm lineages, particularly the primitive ANITA grade families (Qiu et al. 1999). Obtaining a clearer understanding of the early patterns of gene duplication and sequence evolution during the initial angiosperm radiations is particularly dependent on sampling from these groups. Because of the critical roles that the A, B, and C class genes play in flower development, the evolution of the corresponding lineages has been suggested to be connected to the evolution of the flower itself (Theissen et al. 2000; Theissen et al. 2002), underscoring the importance of obtaining information from basal angiosperms. To these ends, we have identified B and Bs lineage representatives from 10 angiosperm taxa primarily drawn from the ANITA grade and magnoliid dicots. Phylogenetic analyses of these genes in combination with earlier characterized homologs indicate that the evolution of the AP3 and PI lineages has been complex and dynamic from the earliest stages of angiosperm diversification.

    Materials and Methods

    Plant Materials

    Floral buds were collected over a broad range of developmental stages from Aristolochia eranthia (Aristolochiaceae), Asimina triloba (Annonaceae), Drimys winterii (Winteraceae), Houttuynia cordata (Saururaceae), Illicium henryii (Illiciaceae), Lindera erythrocarpa (Lauraceae), Meliosma dilleniifolia (Sabiaceae), Nymphaea sp. (Nymphaeaceae), Saruma henryii (Aristolochiaceae) and Thottea siloquosa (Aristolochiaceae) (see table S1 in Supplementary Material online). Floral tissue was immediately frozen in liquid nitrogen and stored at -80°C.

    Cloning and Characterization of AP3 and PI Homologs

    Isolation of AP3 and PI homologs was performed using RT-PCR in a manner similar to that described in Kramer, Dorit, and Irish (1998). Initial amplification of first-strand cDNA used one of two degenerate forward primers: primer 1, 5'-GGIMGIGGIAARATIGARATIAARMGIAT-3', or primer 2, 5'-ATGGSIMGIGGIAARATISARAT-3', with a poly-T reverse primer, 5'-CCGGATCTCTAGACGGCCGC(T)17. The products of the primary PCR reaction were cleaned with the QIAquick PCR purification kit (Qiagen, Valencia, Calif.), diluted 1:10, and used as template in a second PCR reaction using one of two degenerate primers: primer 3, 5'-AAYMGMCAMGTIACITWYTCIAARMGRMG-3', or primer 4, 5'-WCIAAYMGRCARGTIACITWWTC-3', with the same anchored poly-T reverse primer. All PCR amplifications were performed in 100 µl of PCR buffer (200 mM Tris-HCl, pH 8.4; 500 mM KCl; 50 mM MgCl2) containing 50 pmol and 10 pmol of 5' and 3' primer, respectively, 200 mmol of each dNTP, and 2 units of PlatinumTaq Polymerase (Invitrogen, Carlsbad, Calif.). The amplification program began with a 12 min activation step at 95°C, followed by a 1 min incubation step at 95°C, a 30 s annealing step at temperatures ranging from 50°C to 65°C, and a 1 min extension at 72°C. The program was repeated for 37 cycles and was terminated by a 10 min incubation step at 72°C. The amplified PCR products were cloned using the TOPO TA Cloning Kit (Invitrogen, Carlsbad, Calif.) as per manufacturer's instructions. For each taxon, 100 to 400 clones of more than 650 bp were characterized by sequencing (BigDye Terminator version 3.0, ABI Prism 3100, Applied Biosystems, Foster City, Calif.) and/or restriction analysis. At least five independent clones were sequenced for every putative locus. Criteria used to distinguish putative loci include the degree of nucleotide identity and presence of unique indels. Sequence variants that contained no indels and differed by less than 5% identity were treated as alleles. 
All cDNA sequences have been deposited in GenBank (accession numbers AY436707 to AY436746). The Aquilegia alpina gene AqaBS was identified in the context of a separate screen (Kramer, Di Stilio, and Schluter 2003) but is being reported here for the first time.
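
    As an aid to reading the degenerate primer sequences above, the following Python sketch counts how many distinct oligonucleotides each primer represents under the standard IUPAC ambiguity codes. It assumes that "I" denotes inosine, a single synthesized base that pairs broadly and therefore contributes a factor of one; this reading and the function name degeneracy are assumptions and are not stated in the methods.

```python
from math import prod

# Number of alternative bases encoded by each IUPAC symbol.
# "I" (assumed here to be inosine) is a single base, so it counts as 1.
IUPAC_OPTIONS = {
    "A": 1, "C": 1, "G": 1, "T": 1, "I": 1,
    "R": 2, "Y": 2, "M": 2, "K": 2, "S": 2, "W": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer):
    """Return the number of distinct sequences represented by a degenerate primer."""
    return prod(IUPAC_OPTIONS[base] for base in primer.upper())


primers = {
    "primer 1": "GGIMGIGGIAARATIGARATIAARMGIAT",
    "primer 2": "ATGGSIMGIGGIAARATISARAT",
    "primer 3": "AAYMGMCAMGTIACITWYTCIAARMGRMG",
    "primer 4": "WCIAAYMGRCARGTIACITWWTC",
}
for name, sequence in primers.items():
    print(name, len(sequence), "nt, degeneracy", degeneracy(sequence))
```

    Placing inosine at the most variable codon positions keeps the formal degeneracy low, which is presumably why it appears so often in these primers.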

    Phylogenetic Analysis

    Alignments were initially compiled using ClustalW and then refined by hand, taking into consideration both nucleotide and amino acid sequences (see Supplementary Material online [www.mbe.oupjournals.org] for NEXUS files and accession numbers). Five different alignments were created from the amino acid and nucleotide data sets. The first amino acid alignment contained all new AP3, PI, and BS homologs, previously identified Magnoliid dicot B and Bs lineage representatives, gymnosperm B and Bs homologs, and sequences from the AGL15 and AGL17/ANR1 MADS-box gene lineages, which have been identified as closely related to the B genes (Hasebe and Banks 1997; Shindo et al. 1999). This amino acid alignment excluded the C-terminal region of the predicted protein sequences because of difficulty aligning this region between angiosperm and gymnosperm B class genes and between the outgroup sequences. We delimited the K domain as originally defined in Ma, Yanofsky, and Meyerowitz (1991), resulting in an MIK data set of 171 characters. Separate full-length nucleotide and predicted protein sequence alignments were created for the angiosperm AP3 and PI data sets. Based on the position of NymPI in the MIK analysis (see below), this PI homolog was chosen as the outgroup for both the AP3 and the PI data sets.

    All amino acid alignments were analyzed using PAUP* version 4.03b (Swofford 2001). Maximum-parsimony trees were generated through heuristic searches with 1,000 random stepwise additions, with tree bisection-reconnection (TBR) branch swapping and saving multiple parsimonious trees (MULTREES on). Gaps were encoded as missing data, and all characters were weighted equally. Bootstrap support for the full-length AP3 and PI data sets was estimated by performing 1,000 heuristic searches with 10 addition sequence replicates per bootstrap, using the same criteria as in the original search. Bootstrap support for the MIK data set was estimated by performing 1,000 nonparametric bootstrap replicates with random taxon addition, TBR branch-swapping, and MULTREES turned off. Wilcoxon sign-rank (known as the Templeton test [Templeton 1983]) and Kishino-Hasegawa (Kishino and Hasegawa 1989) tests were conducted on the MP trees to determine whether the data could reject topologies that were found in the Bayesian trees. Tests were also performed to explore topologies that would suggest alternative patterns of gene duplication.
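
    The nonparametric bootstrap used above resamples alignment columns with replacement before each tree search. The Python sketch below shows only that resampling step on a toy alignment; the taxon names and sequences are hypothetical, and the heuristic parsimony search performed by PAUP* on each replicate is not reproduced.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping every taxon aligned."""
    length = len(next(iter(alignment.values())))
    # Draw as many column indices as there are columns, with replacement.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment, not taken from the data set described here.
toy = {
    "taxonA": "MGRGKIEIKR",
    "taxonB": "MGRGKVEIKR",
    "taxonC": "MARGKIQIKK",
}
replicate = bootstrap_alignment(toy, random.Random(0))
for name, seq in replicate.items():
    print(name, seq)
```

    Bootstrap support for a node is then the fraction of replicates whose inferred tree contains that node.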

    Bayesian phylogenetic analyses were conducted on the nucleotide alignments using the program MrBayes version 3.0 (Huelsenbeck and Ronquist 2001). The best model of evolution was determined using Modeltest version 3.06 (Posada and Crandall 1998). The model of DNA substitution selected for both AP3 and PI was GTR + I + Γ, which assumes general time reversibility (GTR), a certain proportion of invariable sites (I), and a gamma approximation of the rate variation among sites (Γ). The option "codon" was used for the nucleotide substitution model, following the probabilistic model of codon evolution by Muse and Gaut (1994). We ran four chains of the Markov chain Monte Carlo method, sampling one tree every 100 generations for 1,000,000 generations starting with a random tree. Both searches reached stationarity after about 63,000 generations. The first 63,000 generations were considered the "burn-in" period and were not included in generating the consensus phylogenies.
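
    The sampling scheme implies roughly how many trees contributed to each consensus. The short Python calculation below is a sketch that ignores whether the starting tree itself is recorded as a sample; the variable names are illustrative.

```python
GENERATIONS = 1_000_000       # total MCMC generations
SAMPLE_EVERY = 100            # one tree sampled every 100 generations
BURN_IN_GENERATIONS = 63_000  # generations discarded as burn-in

total_samples = GENERATIONS // SAMPLE_EVERY
burn_in_samples = BURN_IN_GENERATIONS // SAMPLE_EVERY
retained_samples = total_samples - burn_in_samples

print(total_samples, burn_in_samples, retained_samples)
```

    On these figures, about 630 of approximately 10,000 sampled trees fall within the burn-in, leaving about 9,370 trees for the majority-rule consensus.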

    Cloning and Characterization of the NymAP3 and NymPI Genomic Loci

    Nymphaea sp. genomic DNA was prepared from leaf tissue using the DNeasy Plant Mini Kit (Qiagen, Valencia, Calif.). To obtain fragments of the NymAP3 genomic locus, the DNA was amplified using a specific forward primer, NymAP3F 5'-CATTCTGAGCTGTGCGGTCTTGAGCAA-3', in combination with one of two specific reverse primers: NymAP3R1, 5'-CTTTGTTTCTAGGGTCATCGGCTAACCT-3' or NymAP3R2, 5'-GGATTCATAATTATCTTCACTTCCATCGAA-3'. The primers were designed to regions of the NymAP3 cDNA predicted to fall within exon 3 for NymAP3F and within exons 5 and 6 for NymAP3R1 and NymAP3R2, respectively. PCR amplification was performed using BD Advantage Genomic PCR Kit (BD Biosciences Clontech, Palo Alto, Calif.) as per manufacturer's instructions. The amplification program began with a 1 min activation step at 94°C, followed by a 30 s denaturing step at 94°C, a 30 s annealing step at 50°C to 60°C, and a 3 min extension step at 68°C, repeated for 30 cycles. The resulting genomic fragments were cloned using the TOPO TA Cloning Kit (Invitrogen, Carlsbad, Calif.). Approximately 60 clones were screened for size, and 24 clones of either 350 bp (generated with NymAP3R1) or 470 bp (generated with NymAP3R2) were sequenced as described above. The resulting consensus genomic sequence (GenBank accession number AY436747) was aligned to the NymAP3 cDNAs (GenBank accession numbers AY436740 to AY436743) using ClustalW and then refined by hand (see figure S2 in Supplementary Material online [www.mbe.oupjournals.org]).

    A region of the NymPI genomic locus was similarly obtained using the following primers: NymPIF, 5'-GACCTGAGCTCGTTGTCTGTTGTCGAACTTCGAA-3' and NymPIR, 5'-CCAATGTCGATGTCTCCCAGCTCGCGCATT-3'. These primers were predicted to fall within exons 4 and 7, respectively. PCR was performed on Nymphaea genomic DNA as described above. Intron position was assessed by aligning the sequence of the genomic fragment (GenBank accession number AY436748) to the NymPI cDNA sequence (GenBank accession number AY436744).

    Results

    Thirty-five new representatives of the B and Bs lineages were isolated from 11 species, including two members of the basal ANITA grade and seven taxa from the magnoliid dicots (table 1 and see also table S1 in Supplementary Material online). All of these genes contained the diagnostic sequence characteristics of the B subfamily, particularly the C-terminal PI and/or paleoAP3 motifs (fig. 2), as well as other synapomorphies in the M, I, and K domains (Kramer, Dorit, and Irish 1998). Multiple paralogous AP3 loci were identified in Drimys, Illicium, and Lindera. For both of the Illicium loci IhAP3-2 and IhAP3-3, two highly similar (98.7% to 99.3% identity) but distinct variants were recovered, which may be alleles or very recently duplicated paralogs. For the purposes of this analysis, they have been treated as alleles. None of the PI homologs identified appeared to have allelic variants, although two distinct PI loci were recovered from Drimys, Illicium, and Lindera. Representatives of the Bs lineage were identified in Drimys (DrwBS) and Aquilegia (AqaBS).

    Table 1 APETALA3 and PISTILLATA Lineage Members Included in This Analysis.

    FIG. 2. Alignment of the C-termini of predicted proteins from B and Bs lineage homologs included in the MIK dataset. The PI and paleoAP3 motifs are indicated with boxes. Residues in each region that show chemical conservation with the PI or paleoAP3 motif consensus sequences (Kramer, Dorit, and Irish 1998) are shaded

    Analysis of B Subfamily

    Maximum-parsimony analysis of the MIK amino acid alignment yielded 37 equally parsimonious trees of 2,241 steps (strict consensus in figure 3). There is strong bootstrap support for the AP3 and PI lineages and moderate support for a monophyletic clade including both AP3 and PI, suggesting that the AP3/PI duplication occurred after the last common ancestor of angiosperms and gymnosperms. Within the AP3 and PI clades, internal nodes are not well supported. NymPI is positioned as the first branch of the PI clade with a marginal bootstrap value of 63%, but other evidence strongly supports the basal position of this gene (see below). The Bs lineage has marginal bootstrap support, and there is no resolution of the node leading to this clade or those associated with the various gymnosperm B gene lineages. However, given that the Bs clade contains both gymnosperm and angiosperm representatives, and because our current understanding of seed plant phylogeny suggests that gymnosperms are monophyletic (Bowe, Coat, and dePamphilis 2000; Chaw et al. 2000), we can infer that the B/Bs duplication occurred before the last common ancestor of all extant seed plants, confirming earlier analyses (Becker et al. 2002). The phylogeny indicates that truncation events that removed the paleoAP3 motif have occurred independently in paralogous gymnosperm B lineages and the angiosperm PI lineage. This conclusion is also supported by the alignment shown in figure 2, which suggests that the truncations have occurred at different positions.

    FIG. 3. Strict consensus tree of 37 equally most-parsimonious trees of 2,241 steps. The numbers next to the nodes give bootstrap values from 1,000 replicates. Genes cloned in this study are shown in bold. The name of the associated genus is listed in parentheses after the gene name. Dark-gray lines indicate the AP3 clade; light-gray lines indicate the PI clade; and black lines indicate the gymnosperm B homologs (Gymno B), Bsister representatives, and outgroup sequences. Order or family relationships of the taxa are indicated to the right of the gene names

    Analysis of the PISTILLATA Lineage

    In-depth analysis of the PI lineage was continued by performing a maximum-parsimony analysis on an alignment of the full-length predicted protein sequences (fig. 4A) and a Bayesian search on an alignment of the full-length nucleotide data set (fig. 4B). Based on its supported basal position in the MIK tree (fig. 3), NymPI was used as the outgroup for both data sets. Overall, the MP analysis shows low bootstrap support for most nodes, whereas the Bayesian analysis has relatively high posterior probability values for a majority of nodes. However, posterior probabilities are known to be considerably less stringent than bootstrap values (Suzuki, Glazko, and Nei 2002; Alfaro, Zoller, and Lutzoni 2003; Douady et al. 2003) and should be considered upper boundaries of confidence for the relationships depicted at these nodes.

    FIG. 4. Phylogenies derived from maximum-parsimony (A) and Bayesian (B) analyses of the full-length amino acid and nucleotide PI sequences, respectively. Genes cloned in this study are shown in bold. Order or family relationships of the taxa of origin are indicated to the right: Eud = Eudicots, Mag = Magnoliales, Pip = Piperales, Wint = Winterales, Laur = Laurales, Chl = Chloranthaceae, Mon = Monocots, and AN = ANITA grade. (A) One of two equally most-parsimonious trees of 997 steps. The numbers next to the nodes give bootstrap values from 1,000 replicates. The stippled branch collapses in the strict consensus. (B) A 50% majority-rule tree derived from those trees sampled after "burn-in." The numbers next to the nodes indicate the posterior probabilities for those branches. The name of the associated genus is listed in parentheses after the gene name

    Both trees are in agreement that the PI paralogs from Illicium, Houttuynia, and Drimys, which share 75% to 90% nucleotide identity, are derived from independent, and relatively recent, duplication events. Consistent with this finding, all of these taxa have been reported to have some degree of polyploidy (Bennett, Smith, and Heslop-Harrison 1982; Sun, Stuessy, and Crawford 1990; Bennett and Leitch 1995). In contrast, each Lindera PI homolog emerges as orthologous to a PI from Calycanthus, suggesting an ancient duplication event in the PI lineage that at least predated the split between the Lauraceae and Calycanthaceae. An MP analysis where LnePI-1/-2 and CfPI-1/-2 were each constrained together yielded trees 18 steps longer than the MP trees. These constrained trees were found to be significantly less parsimonious than the original MP trees using the KH and Templeton tests (table 2), thus supporting the conclusion that LnePI-1/CfPI-1 and LnePI-2/CfPI-2 each define orthologous lineages. Both phylogenies place the paralogous Lauralean PI lineages in separate clades. However, in the MP analysis, this arrangement has no support, and it has only marginal support in the Bayesian analysis. If these topologies were taken to represent the true phylogenetic history of the PI lineage, it would suggest an ancient duplication event predating the diversification of the magnoliid dicots and monocots, followed by independent losses of each paralogous lineage. Although this hypothesis is suggested by both topologies, constraining all Lauralean PI homologs together yields trees only four steps longer than the MP tree. These trees are not significantly different using KH or Templeton tests (table 2). The Bayesian tree also suggests an orthologous relationship between the Thottea locus TtsPI-2 and LnePI-2/CfPI-2, but MP trees in which the two Thottea PI homologs are constrained together are not significantly different from the MP tree as judged by KH and Templeton tests (table 2). 
Therefore, although there appear to be two paralogous PI lineages in the Laurales, the duplication event that produced them may not have been as ancient as indicated by the MP and Bayesian analyses.

    Table 2 Results of MP Analyses Using Various Constraints and Subsequent Comparison Tests of Constrained Versus Unconstrained MP Trees.

    In both the MP and Bayesian analyses, the Piperaceae and Saururaceae PI homologs form a single clade but appear to have very long branches compared with other PI sequences (fig. 4A). The two Piper paralogs are separated from each other with high support, indicating that they were produced by a duplication that occurred before the last common ancestor of Piper, Peperomia, and Houttuynia. MP trees in which the Piper paralogs are constrained together are 28 steps longer and significantly less parsimonious by the KH and Templeton tests than the most parsimonious reconstruction (table 2). In the MP analysis, the Piperaceae and Saururaceae PI homologs are placed in an unsupported clade with the monocot PI loci, but in the Bayesian phylogeny, these genes are associated with Aristolochiaceae PI homologs. To determine whether the topology recovered by the Bayesian search is possible under a parsimony model, an analysis was performed where all of the Piperales PI homologs, both including and excluding TtsPI-2, were constrained to form a clade. MP analysis using these constraints yielded five trees seven steps longer than the MP tree. When analyzed using the KH and Templeton tests, it was found that these trees are not significantly different from the ones obtained using the unconstrained maximum-parsimony search, thereby allowing the possible monophyly of the Piperalean PI homologs (table 2).

    Close examination of the Nymphaea PI homolog revealed a notable difference in the structure of this gene. All Bs homologs, gymnosperm B lineage representatives, and AP3 homologs sequenced to date have 42 nucleotides in exon 5, whereas all previously examined PI representatives have only 30 nucleotides in exon 5 (Johansen et al. 2002). This difference is caused by a 12-nucleotide deletion that appears to have occurred in the center of the exon (Purugganan et al. 1995). As can be seen in figure 5, NymPI lacks this 12-nucleotide deletion, which has been confirmed by sequencing this region from genomic DNA (see figure S1 in Supplementary Material online). In contrast, both PI homologs from Illicium have the deletion, as do all other PI homologs recovered in this analysis. The absence of this deletion indicates that NymPI is indeed ancestral to the other PI orthologs included in this analysis.
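
    The exon 5 lengths given above can be checked for their effect on the reading frame with a short Python calculation. This is a sketch using only the two lengths reported in the text; the variable names are illustrative.

```python
EXON5_ANCESTRAL_NT = 42  # Bs, gymnosperm B, AP3, and NymPI
EXON5_DERIVED_PI_NT = 30  # all previously examined PI representatives

deleted_nt = EXON5_ANCESTRAL_NT - EXON5_DERIVED_PI_NT
# A deletion preserves the reading frame only if it removes whole codons.
codons_lost, leftover_nt = divmod(deleted_nt, 3)

print(deleted_nt, codons_lost, leftover_nt)
```

    The deletion removes exactly four codons with no leftover nucleotides, so the downstream reading frame is preserved. A four-residue loss would also be consistent with the 61 aa versus 65 aa K domain lengths noted for PI and AP3 orthologs in figure 1, although that correspondence is an inference rather than a result reported here.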

    FIG. 5. Amino acid alignment showing the end of the K domain through the beginning of the C domain for select representatives of the AP3, PI, Gymnosperm B (Gymno B) and Bsister (Bs) lineages. Vertical lines show the conserved positions of introns 4 and 5 across PI, GLO, NymPI, AP3, DEF, NymAP3, DAL11, DAL12 and DAL13 (Jack, Brockman, and Meyerowitz 1992; Schwarz-Sommer et al. 1992; Trobner et al. 1992; Goto and Meyerowitz 1994; Sundstrom et al. 1999)

    Analysis of the APETALA3 Lineage

    Full-length amino acid and nucleotide sequence alignments of the AP3 data set were analyzed using maximum-parsimony (fig. 6A) and Bayesian inference (fig. 6B), respectively. NymPI was chosen as outgroup to the AP3 data set based on its basal position in the PI clade and the sister relationship between the AP3 and PI clades (fig. 3).

    FIG. 6. Phylogenies derived from parsimony (A) and Bayesian (B) analyses of full-length amino acid and nucleotide AP3 sequences, respectively. Genes cloned in this study are shown in bold. Order or family relationships of the taxa of origin are indicated to the right as in figure 4. (A) One of two equally parsimonious trees of 1,241 steps. The numbers next to each node give the bootstrap support from 1,000 replicates. The branch labeled with an arrow collapses in the strict consensus. (B) A 50% majority-rule tree derived from those trees sampled after "burn-in." The numbers next to each node indicate the posterior probabilities for those branches. The name of the associated genus is listed in parentheses after the gene name

    All of the AP3 homologs possess a paleoAP3 motif, except the gene from Chloranthus, CsAP3. This unusual condition is the result of a mutation in the otherwise highly conserved aspartic acid codon that introduces a premature stop (fig. 2). The premature stop appears to have formed recently because the remainder of the motif is still intact downstream of it in the cDNA sequence (data not shown). For this reason, considerable care was taken to confirm that the difference was not a PCR or sequencing artifact (Kramer 2000). Another interesting sequence distinction is that almost all the AP3 homologs from the Magnoliid dicots and ANITA taxa have acidic residues at position 72 of the AP3 amino acid alignment (113 in the MIK) rather than the neutral residues seen throughout the eudicot AP3 homologs (see Discussion).

    The multiple AP3 homologs found in Drimys and Lindera appear to be derived from relatively recent duplications, but the Illicium AP3 homologs have a more complex evolutionary history. IhAP3-1 does not form a clade with IhAP3-2 and IhAP3-3, being positioned as a separate branch in both analyses. The MP topology would suggest that a duplication occurred in the AP3 lineage before the diversification of all the angiosperms sampled in this study, with one paralogous lineage represented by NymAP3, IhAP3-2, and IhAP3-3 and the other represented by IhAP3-1 and the balance of the AP3 homologs. The topology from the Bayesian analysis indicates that AP3 was duplicated somewhat later, along the branch between Nymphaea and Illicium, but similarly suggests that orthologs of IhAP3-2/-3 are absent in angiosperm lineages above Illicium. Maximum-parsimony analyses in which all Illicium paralogs were constrained together yielded two trees, each 10 steps longer than the maximum-parsimony tree. These trees were found to be significantly less parsimonious than the original MP tree by both the KH and Templeton tests (table 2).

    Similar to what was seen in the PI nucleotide trees, the Piperaceae and Saururaceae AP3 homologs are placed together with high support but have long branch lengths. Although the Piperales AP3 homologs do not form a clade in the MP phylogeny, they do in the Bayesian analysis. Constraining all Piperales AP3 homologs into a monophyletic clade was tested with MP analysis, yielding four trees each only two steps longer than the maximum-parsimony tree. These trees were not found to be significantly different by the KH and Templeton tests (table 2).

    As shown in figure 2, AP3 homologs from the Magnoliales and CfAP3-1 from Calycanthus appear to have similar deletions in the PI motif-derived region. However, neither phylogenetic analysis groups CfAP3-1 with the AP3 representatives from the Magnoliales containing the similar deletion. MP analysis where the Magnoliales AP3s and CfAP3-1 were constrained to form a single clade yielded 12 trees, each three steps longer than the unconstrained tree. When analyzed using the KH and Templeton tests, it was found that these trees are not significantly different from unconstrained MP trees (table 2). If the deletion in the PI motif region does represent a synapomorphy uniting CfAP3-1 and the Magnoliales AP3 homologs, it would suggest that a duplication occurred in the AP3 lineage before the split of the Magnoliales and the Laurales, followed by the deletion of the PI motif region in one of the paralogous lineages. Such an event may have been followed by different patterns of paralog loss in the separate Magnoliales and Laurales lineages. Alternatively, the PI motif deletion could have occurred independently in the Magnoliales and the CfAP3-1 lineages, with the CfAP3-1/-2 duplication occurring more recently in the Laurales. The MP and Bayesian phylogenies both suggest that the CfAP3-1/-2 duplication predated the last common ancestor of Lindera and Calycanthus, although a CfAP3-1 ortholog was not detected in Lindera. If the two Calycanthus AP3 homologs are constrained together, however, MP analysis produces two trees only four steps longer than the original MP tree, a difference that cannot be rejected with the current data set (table 2). Therefore, the timing of the CfAP3-1/-2 duplication and the homology of the PI motif deletion remain unclear.

    In the course of characterizing the putative AP3 homolog from Nymphaea, four different classes of polyadenylated NymAP3 cDNAs were obtained. These classes were identical in sequence, except for four distinct patterns of indels observed in the K domain (fig. 7B). There is also slight variation in the 3' UTR polyadenylation position, but this does not correspond with the four indel classes (data not shown). Given the high frequency with which three of the classes were recovered (fig. 7B), it seemed possible that the different transcript types were produced through an alternative splicing mechanism. To investigate this possibility, a genomic fragment of NymAP3 corresponding to the region showing the indels was amplified and sequenced. All genomic clones (a total of 24 generated using two different primer pairs) were identical (data not shown), indicating that the different transcripts are all derived from one locus. Alignment of the genomic DNA and cDNA sequences (see figure S2 in Supplementary Material online) revealed a complex pattern of alternative splicing. Comparison of the class I cDNAs, which appear to contain the complete reading frame, with the characterized genomic DNA indicates that three introns designated I3, I4, and I5 (fig. 7A) are present in this region. These introns correspond to the same positions as introns 3, 4, and 5 in the Arabidopsis AP3 genomic sequence (Jack, Brockman, and Meyerowitz 1992). Based on the NymAP3 genomic and cDNA alignment (figure S2 in Supplementary Material online) and the consensus donor/acceptor site sequences from Arabidopsis (Hebsgaard et al. 1996; Brown and Simpson 1998; Lorkovic et al. 2000), the putative acceptor and donor sites for each NymAP3 intron/exon boundary were determined (fig. 7C). All of these sites appear to be correctly utilized in the class I clones, yielding a complete reading frame that encodes a predicted protein of 207 aa, including a C-terminal region with a paleoAP3 motif (fig. 2). 
In the class II clones, the I4 region has not been removed before polyadenylation of the transcript. Class III clones also retain I4 but appear to have utilized an alternative donor splice site located 5 bp within the E5 exon to splice to the E6 acceptor site. Because of the inclusion of I4 in the mature transcript, the translation products of both class II and class III cDNAs would be truncated by an in-frame stop codon that occurs in the center of I4. The resulting proteins would have 15 novel amino acids after those encoded by E4 and would lack almost all of the C-terminal domain. The class IV cDNA exhibits direct splicing of the E4 donor site to the E6 acceptor site, causing the omission of the entire E5 exon (fig. 7B). The protein product of the class IV transcript would lack the amino acids encoded by E5 but would remain in frame in E6. The resulting protein would lack most of the N-terminal end of the C domain.
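
    The reading-frame consequences of these splicing patterns can be illustrated with a minimal sketch. The exon and intron sequences below are invented toy sequences, not the NymAP3 sequence, and the function and variable names are hypothetical. The sketch only shows why a retained intron that carries an in-frame stop codon truncates the product, whereas skipping an exon whose length is a multiple of three leaves the downstream codons in frame. The class III pattern (alternative donor site within E5) is not modeled.

```python
# Toy illustration only: all sequences are invented, not NymAP3.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def codons_before_stop(mrna):
    """Return the codons read from position 0 up to (not including) the first stop."""
    codons = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons

# Hypothetical exons; each length is a multiple of 3, so every exon starts in frame.
E4 = "ATGGCTGAA"       # 3 codons
E5 = "CTGAAGGAT"       # 3 codons
E6 = "GGTCCTAAATGA"    # 3 codons followed by a stop codon
# Hypothetical intron that contains an in-frame stop (TAA) when read after E4.
I4 = "GTAAGTTAACTTGCAG"

transcripts = {
    "fully spliced (class I-like)": E4 + E5 + E6,
    "intron retained (class II-like)": E4 + I4 + E5 + E6,
    "exon skipped (class IV-like)": E4 + E6,
}

# Number of codons translated before the first stop in each toy transcript.
lengths = {name: len(codons_before_stop(seq)) for name, seq in transcripts.items()}

for name, n in lengths.items():
    print(f"{name}: {n} codons before the first stop")
```

    In this toy case the fully spliced transcript yields 9 codons, the intron-retaining transcript stops after 5 (the 3 codons of E4 plus 2 novel intron-encoded codons), and the exon-skipping transcript yields 6, with E6 still read in its original frame.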

    FIG. 7. Evidence of alternative splicing of the NymAP3 transcript. (A) Intron/exon structure of sequenced region of the NymAP3 genomic locus. Horizontal arrowheads indicate the positions of the NymAP3F1 (F1) and NymAP3R2 (R2) primers. Vertical arrows indicate the acceptor and donor splice sites. Designations of the exons as E3 to E6 and introns as I3 to I5 are based on the apparent conservation of the NymAP3 genomic structure relative to Arabidopsis AP3 (Jack, Brockman, and Meyerowitz 1992). Length of each exon or intron in base pairs (bp) is indicated above the boxes (exons) or lines (introns). Lengths of exons E3 and E6 reflect only the sequenced region of these exons. (B) Inferred structure of cDNA splicing variants in the region corresponding to the sequenced NymAP3 genomic fragment. The number of clones obtained for each class is listed in parentheses. (C) Acceptor and donor sites from Arabidopsis thaliana (At) (Reddy 2001) and each apparent site as inferred from comparisons of NymAP3 genomic and cDNA sequences (designations as in [A]). The vertical lines indicate the splice position. Shading indicates sequence conservation of Nymphaea sites with consensus At sites.

    Discussion

    Dynamic Evolution of the B Subfamily

    The current study presents the cloning and characterization of 27 new AP3 and PI homologs from taxa belonging to the basal angiosperms, particularly members of the ANITA grade and the magnoliid dicots. It also reports the discovery of two new representatives of the Bs lineage in Drimys and Aquilegia. Phylogenetic analyses of these genes in combination with other B lineage representatives have reinforced our understanding of the evolution of the B lineage. A common ancestor of the subfamily appears to have been present in the lineage leading to seed plants. This ancestor possessed C-terminal motifs similar to the PI and paleoAP3 motifs defined in the angiosperm representatives of the B lineage (Kramer, Dorit, and Irish 1998). At some time before the split between angiosperms and gymnosperms, this ancestor was duplicated, giving rise to the B lineage sensu stricto and the Bs lineage (Becker et al. 2002). In the gymnosperms, the B lineage appears to have undergone multiple gene duplications, which in some paralogs were followed by independent truncation events eliminating the C-terminal paleoAP3 motif (Winter, Saedler, and Theissen 2002).

    Along the lineage leading to the angiosperms, the B lineage was duplicated to give rise to what are referred to as the AP3 and PI gene lineages. These two paralogous lineages acquired many clear synapomorphies before the diversification of extant angiosperms, including a truncation in the PI lineage that eliminated the paleoAP3 motif (Kramer, Dorit, and Irish 1998). One of the characters previously considered to be diagnostic for the PI lineage, a 12-nucleotide difference in the length of exon 5 (Johansen et al. 2002; Winter, Saedler, and Theissen 2002), has now been shown to have evolved within the ANITA grade. This finding provides further confirmation of phylogenetic analyses that have placed the Nymphaeales at or close to the base of the angiosperms (Parkinson, Adams, and Palmer 1999; Qiu et al. 2001; Zanis et al. 2002) and may also have implications for the evolution of PI function. The four amino acids, which are present in Nymphaea but absent in Illicium, are located at the beginning of the recently proposed third α-helix of the K domain (Yang, Fanning, and Jack 2003). Although several studies have shown that the first two K region helices are critical for the heterodimerization of AP3 and PI (Krizek and Meyerowitz 1996; Riechmann, Krizek, and Meyerowitz 1996; Yang, Fanning, and Jack 2003), evidence now suggests that the putative third α-helix may mediate higher-order protein interactions between AP3/PI heterodimers and the E class SEPALLATA proteins (Yang and Jack, personal communication). The observed 4-aa deletion would shift the pattern of hydrophobic residues, which appear to be organized in heptad repeats in this region (Yang, Fanning, and Jack 2003), possibly changing the capacity of the domain to mediate certain protein interactions.
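
    The register argument can be made concrete with a short sketch. It assumes the usual coiled-coil convention in which the seven positions of a heptad are labeled a to g and hydrophobic residues typically occupy positions a and d; it also assumes that the helix continues uninterrupted and that the register upstream of the deletion is unchanged. The function name is hypothetical. Because 4 is not a multiple of 7, every residue downstream of a 4-residue deletion lands on a different heptad position.

```python
HEPTAD = "abcdefg"

def register_after_deletion(old_label, deleted):
    """Heptad position occupied by a downstream residue once `deleted`
    residues have been removed upstream of it (upstream register fixed)."""
    return HEPTAD[(HEPTAD.index(old_label) - deleted) % 7]

# Fate of residues that originally sat at the hydrophobic a and d positions.
shift_4 = {pos: register_after_deletion(pos, 4) for pos in "ad"}
shift_7 = {pos: register_after_deletion(pos, 7) for pos in "ad"}

print("4-residue deletion:", shift_4)
print("7-residue deletion:", shift_7)
```

    Under these assumptions, a 4-residue deletion moves a residue from position a to d and from d to g, whereas a deletion of a full heptad would leave the register intact.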

    Further evidence of potential changes in the specificity of protein interactions is found in the first α-helix of the K domain. Residues corresponding to positions 113 and 118 in our MIK amino acid alignment (Supplementary Material online at www.mbe.oupjournals.org) have been identified as critical to the specific formation of heterodimers between Arabidopsis AP3 and PI (Yang, Fanning, and Jack 2003). In eudicot AP3 homologs, position 113 is typically uncharged, whereas position 118 is basic. Eudicot PI homologs generally have an acidic residue at 113 but an uncharged amino acid at 118. The potential ionic interaction between amino acids at position 118 in AP3 and 113 in PI has been proposed to be similar to the i + 5 ionic interactions that promote the formation of specific dimer pairs in leucine zipper proteins (Yang, Fanning, and Jack 2003). Moreover, the ability of the protein product of the gymnosperm B gene GGM2 to bind DNA as a homodimer (Winter et al. 2002) is correlated with the presence of oppositely charged residues at these positions. It is interesting to note, therefore, that the majority of magnoliid dicot and ANITA grade AP3 homologs encode acidic residues at 113 and basic residues at 118, which under the theory of Yang, Fanning, and Jack (2003) could indicate that they have the capacity to function as homodimers. Similarly, several PI homologs encode basic residues at position 118, along with the highly conserved acidic residue at 113. Several studies have now shown that changes in coding sequence can have a significant impact on gene function (Galant and Carroll 2002; Ronshaugen, McGinnis, and McGinnis 2002), including evidence of functional divergence between the euAP3 and paleoAP3 motifs (Lamb and Irish 2003). It remains to be determined whether the deletions and character state changes that occurred during the course of AP3/PI evolution have had a similar impact on biochemical aspects of gene function.
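
    The charge-pairing logic described above can be written out as a small sketch. The residues assigned to each protein are illustrative placeholders chosen to match the charge classes named in the text, not residues read from the alignment, and the function names are hypothetical. Histidine is treated as uncharged here, which is a simplifying assumption. The sketch checks only whether position 118 of one partner and position 113 of the other carry opposite charges; it is not a predictor of dimerization.

```python
ACIDIC = set("DE")
BASIC = set("KR")  # assumption: histidine is counted as uncharged

def charge_class(residue):
    """Classify a one-letter amino acid code as acidic, basic, or uncharged."""
    if residue in ACIDIC:
        return "acidic"
    if residue in BASIC:
        return "basic"
    return "uncharged"

def can_form_ionic_pair(res118, res113):
    """True if residue 118 of one partner and 113 of the other are oppositely charged."""
    return {charge_class(res118), charge_class(res113)} == {"acidic", "basic"}

# Placeholder residues matching the charge classes described in the text.
proteins = {
    "eudicot-like AP3": {113: "N", 118: "K"},  # uncharged / basic
    "eudicot-like PI": {113: "E", 118: "Q"},   # acidic / uncharged
    "basal-like AP3": {113: "E", 118: "K"},    # acidic / basic
}

def pair_possible(x, y):
    """Check the 118(x)-113(y) contact between two named proteins."""
    return can_form_ionic_pair(proteins[x][118], proteins[y][113])

for x, y in [("eudicot-like AP3", "eudicot-like PI"),
             ("eudicot-like AP3", "eudicot-like AP3"),
             ("basal-like AP3", "basal-like AP3")]:
    print(f"118 of {x} with 113 of {y}: {pair_possible(x, y)}")
```

    With these placeholder residues, the contact is possible between the eudicot-like AP3 and PI and within a basal-like AP3 homodimer, but not within a eudicot-like AP3 homodimer, which mirrors the pattern the hypothesis of Yang, Fanning, and Jack (2003) would predict.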

    In addition to these trends of sequence evolution, we see dynamic patterns of gene lineage evolution in both the AP3 and PI lineages throughout the evolution of the basal angiosperms. For example, in the PI lineage, there is clear evidence for a duplication predating the last common ancestor of Lindera of the Lauraceae, a derived family within the Laurales, and Calycanthus of the Calycanthaceae, which is basal in the order and has a fossil record going back 100 Myr (Renner 1999). Similarly, paralogs derived from a duplication predating the last common ancestor of Piperaceae and Saururaceae have been retained in Piper. The IhAP3-1 and IhAP3-2/-3 lineages from Illicium may represent the most ancient example of this phenomenon. There are also cases of multiple recent duplications, most notably the two PI and four AP3 homologs in Drimys, which most likely reflect its polyploid genome (Sun, Stuessy, and Crawford 1990). These findings highlight the complicated nature of gene lineage evolution and exemplify the important caution raised by Theissen (2002) that simple genetic orthology is unlikely to exist between distantly related taxa. It is currently unclear what the functional implications of the long-term retention of AP3 or PI paralogs might be. None of the duplications detected in this study appear to be followed by dramatic patterns of sequence change, such as what has been observed in the case of the euAP3 and TM6 gene lineages (Kramer, Dorit, and Irish 1998). However, the fact that, for instance, two PI paralogs have been retained in the Laurales for approximately 100 Myr does suggest that they are being selectively maintained (Otto and Yong 2002). This does not necessarily mean that novel functions have evolved, because some type of subfunctionalization (Force et al. 1999; Lynch and Conery 2000) may have occurred. It is also unknown whether patterns of neofunctionalization or subfunctionalization are highly conserved within orthologous lineages. 
One possible indication of functional plasticity is the long branch lengths observed for both AP3 and PI homologs of the Piperaceae and Saururaceae. Although this trend could be a genome-wide phenomenon, it may be specific to these genes and related to the highly derived floral morphology found in these families (Tucker, Douglas, and Liang 1993).

    The evidence for alternative splicing of the NymAP3 transcript is unique among loci identified in this study or any other survey of AP3 and PI homologs to date (Irish and Kramer 1998; Kramer, Dorit, and Irish 1998; Kramer and Irish 2000; Kramer, Di Stilio, and Schluter 2003). Instances of alternative RNA processing are not especially uncommon in plants (Reddy 2001), however, and examples have been identified in other MADS-box-containing genes (Krogen and Ashton 2000; Cheng et al. 2003), including a minor splicing difference in transcripts of ABS (Nesi et al. 2002), the Arabidopsis Bs lineage representative. The critical question is whether the phenomenon observed for NymAP3 is regulatory in nature. The high frequency with which alternatively spliced NymAP3 transcripts were recovered, especially compared with the perfect splicing of all isolated NymPI cDNAs, suggests that nonspecific splicing errors are not a suitable explanation. There are at least two potential mechanisms by which the production of alternative transcripts could regulate NymAP3. If the class II and III mRNAs are not translated, their production could serve to reduce the concentration of functional NymAP3 transcript, thereby attenuating gene function. Alternatively, if the transcripts are translated, the truncated products may be capable of functioning as dominant negative factors, similar to what has been found for truncated AP3 in Arabidopsis (Krizek, Riechmann, and Meyerowitz 1999). Whether this would be in the context of homodimer or heterodimer formation is currently not known. Class IV transcripts would also produce an altered protein, but it is unclear what effect deleting the exon 5 region would have on protein function, although the whole C-terminal domain has been implicated in the formation of higher-order protein complexes (Egea-Cortines, Saedler, and Sommer 1999). 
It has previously been found that the autoregulatory and cross-regulatory interactions that characterize AP3 and PI homologs in the core eudicots (reviewed in Irish and Kramer [1998]) are not universally conserved (Kramer and Irish 2000). The NymAP3 findings provide further evidence that diverse gene regulatory mechanisms are acting on AP3 or PI homologs across divergent taxa.

    Thus, the emerging picture of AP3 and PI lineage evolution is one of dynamic and stochastic change. Even if the developmental function of these genes in determining the identity of petaloid organs and stamens is broadly conserved, the exact parsing of ancestral functions among paralogs, along with possible instances of neofunctionalization, is likely to vary. Furthermore, this study demonstrates that these processes have been acting on the AP3 and PI lineages at every phylogenetic level. Our findings underscore the potential role of gene duplication in the process of developmental system drift (True and Haag 2001) and suggest that different degrees of conservation may be observed across biochemical, regulatory, and developmental aspects of gene function.

    Acknowledgements

    We would especially like to thank Dr. Christoph Neinhuis and his associates at the Bonn Botanical Garden for providing us with floral material for Aristolochia eriantha and Thottea siliquosa. The authors would also like to thank the Kramer lab and two anonymous reviewers for comments on the manuscript. This work was supported by an Arnold Arboretum Mercer Fellowship to M.A.J. and a grant from the Harvard College Research Program to G.M.S.

    Literature Cited

    Alfaro, M. E., S. Zoller, and F. Lutzoni. 2003. Bayes or bootstrap? A simulation study comparing the performance of Bayesian Markov chain Monte Carlo sampling and bootstrapping in assessing phylogenetic confidence. Mol. Biol. Evol. 20:255-266.

    Alvarez-Buylla, E. R., S. Pelaz, S. J. Liljegren, S. E. Gold, C. Burgeff, G. S. Ditta, L. Ribas de Pouplana, L. Martinez-Castilla, and M. F. Yanofsky. 2000. An ancestral MADS-box gene duplication occurred before the divergence of plants and animals. Proc. Natl. Acad. Sci. USA 97:5328-5333.

    Angiosperm Phylogeny Group. 2003. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APGII. Bot. J. Linn. Soc. 141:339-436.

    Becker, A., K. Kaufmann, A. Freialdenhoven, C. Vincent, M. A. Li, H. Saedler, and G. Theissen. 2002. A novel MADS-box gene subfamily with a sister-group relationship to class B floral homeotic genes. Mol. Genet. Genomics 266:942-950.

    Bennett, M. D., and I. J. Leitch. 1995. Nuclear DNA amounts in angiosperms. Ann. Bot. 76:113-176.

    Bennett, M. D., J. B. Smith, and J. S. Heslop-Harrison. 1982. Nuclear-DNA amounts in angiosperms. Proc. R. Soc. Lond. B Biol. Sci. 216:179-199.

    Bowe, L. M., G. Coat, and C. W. dePamphilis. 2000. Phylogeny of seed plants based on all three genomic compartments: extant gymnosperms are monophyletic and Gnetales' closest relatives are conifers. Proc. Natl. Acad. Sci. USA 97:4092-4097.

    Bowman, J. L., D. R. Smyth, and E. M. Meyerowitz. 1989. Genes directing flower development in Arabidopsis. Plant Cell 1:37-52.

    Brown, J. W. S., and C. G. Simpson. 1998. Splice site selection in plant pre-mRNA splicing. Ann. Rev. Plant Phys. Plant Mol. Biol. 49:77-95.

    Chaw, S. M., C. L. Parkinson, Y. Cheng, T. M. Vincent, and J. D. Palmer. 2000. Seed plant phylogeny inferred from all three plant genomes: monophyly of extant gymnosperms and origin of Gnetales from conifers. Proc. Natl. Acad. Sci. USA 97:4086-4091.

    Cheng, Y., N. Kato, W. Wang, J. Li, and X. Chen. 2003. Two RNA binding proteins, HEN4 and HUA1, act in the processing of AGAMOUS pre-mRNA in Arabidopsis thaliana. Dev. Cell 4:53-66.

    Coen, E. S., and E. M. Meyerowitz. 1991. The war of the whorls: genetic interactions controlling flower development. Nature 353:31-37.

    Douady, C. J., F. Delsuc, Y. Boucher, W. F. Doolittle, and E. J. P. Douzery. 2003. Comparison of Bayesian and maximum likelihood bootstrap measures of phylogenetic reliability. Mol. Biol. Evol. 20:248-254.

    Egea-Cortines, M., H. Saedler, and H. Sommer. 1999. Ternary complex formation between the MADS-box proteins SQUAMOSA, DEFICIENS and GLOBOSA is involved in the control of floral architecture in Antirrhinum majus. EMBO J. 18:5370-5379.

    Force, A., M. Lynch, F. B. Pickett, A. Amores, Y.-L. Yan, and J. Postlethwait. 1999. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531-1545.

    Galant, R., and S. B. Carroll. 2002. Evolution of a transcriptional repression domain in an insect Hox protein. Nature 415:910-913.

    Goto, K., and E. M. Meyerowitz. 1994. Function and regulation of the Arabidopsis floral homeotic gene PISTILLATA. Genes Dev. 8:1548-1560.

    Hasebe, M., and J. A. Banks. 1997. Evolution of MADS gene family in plants. Pp. 179-197 in K. Iwatsuki and P. H. Raven, eds. Evolution and diversification of land plants. Springer, Tokyo.

    Hebsgaard, S. M., P. G. Korning, N. Tolstrup, J. Engelbrecht, P. Rouze, and S. Brunak. 1996. Splice site prediction in Arabidopsis thaliana DNA by combining local and global sequence information. Nucleic Acids Res. 24:3439-3452.

    Honma, T., and K. Goto. 2001. Complexes of MADS-box proteins are sufficient to convert leaves into floral organs. Nature 409:525-529.

    Huelsenbeck, J. P., and F. Ronquist. 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17:754-755.

    Irish, V. F., and E. M. Kramer. 1998. Genetic and molecular analysis of angiosperm flower development. Adv. Bot. Res. 28:197-230.

    Jack, T., L. L. Brockman, and E. M. Meyerowitz. 1992. The homeotic gene APETALA3 of Arabidopsis thaliana encodes a MADS box and is expressed in petals and stamens. Cell 68:683-697.

    Johansen, B., L. B. Pedersen, M. Skipper, and S. Frederiksen. 2002. MADS-box gene evolution-structure and transcription patterns. Mol. Phylogenet. Evol. 23:458-480.

    Kishino, H., and M. Hasegawa. 1989. Evaluation of the maximum-likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29:170-179.

    Kramer, E. M. 2000. Evolution of genetic mechanisms controlling petal and stamen development. Ph.D. dissertation, Yale University, New Haven, Conn.

    Kramer, E. M., V. S. Di Stilio, and P. Schluter. 2003. Complex patterns of gene duplication in the APETALA3 and PISTILLATA lineages of the Ranunculaceae. Intl. J. Plant Sci. 164:1-11.

    Kramer, E. M., R. L. Dorit, and V. F. Irish. 1998. Molecular evolution of genes controlling petal and stamen development: duplication and divergence within the APETALA3 and PISTILLATA MADS-box gene lineages. Genetics 149:765-783.

    Kramer, E. M., and V. F. Irish. 2000. Evolution of the petal and stamen developmental programs: evidence from comparative studies of the lower eudicots and basal angiosperms. Intl. J. Plant Sci. 161:S29-S40.

    Krizek, B. A., and E. M. Meyerowitz. 1996. Mapping the protein regions responsible for the functional specificities of the Arabidopsis MADS domain organ-identity proteins. Proc. Natl. Acad. Sci. USA 93:4063-4070.

    Krizek, B. A., J. L. Riechmann, and E. M. Meyerowitz. 1999. Use of the APETALA1 promoter to assay the in vivo function of chimeric MADS box genes. Sex. Plant Reprod. 12:14-26.

    Krogen, N. T., and N. W. Ashton. 2000. Ancestry of plant MADS-box genes revealed by bryophyte (Physcomitrella patens) homologues. New Phytol. 147:505-517.

    Lamb, R. S., and V. F. Irish. 2003. Functional divergence within the APETALA3/PISTILLATA floral homeotic gene lineages. Proc. Natl. Acad. Sci. USA 100:6558-6563.

    Lorkovic, Z. J., D. A. W. Kirk, M. H. L. Lambermon, and W. Filipowicz. 2000. Pre-mRNA splicing in higher plants. Trends Plant Sci. 5:160-167.

    Lynch, M., and J. S. Conery. 2000. The evolutionary fate and consequences of duplicate genes. Science 290:1151-1155.

    Ma, H., M. F. Yanofsky, and E. M. Meyerowitz. 1991. AGL1-AGL6, an Arabidopsis gene family with similarity to floral homeotic and transcription factor genes. Genes Dev. 5:484-495.

    Mouradov, A., B. Hamdorf, R. D. Teasdale, J. T. Kim, K.-U. Winter, and G. Theissen. 1999. A DEF/GLO-like MADS-box gene from a gymnosperm: Pinus radiata contains an ortholog of angiosperm B class floral homeotic genes. Dev. Genet. 25:245-252.

    Muse, S. V., and B. S. Gaut. 1994. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Mol. Biol. Evol. 11:715-724.

    Nesi, N., I. Debeaujon, C. Jond, A. J. Stewart, G. I. Jenkins, M. Caboche, and L. Lepiniec. 2002. The TRANSPARENT TESTA16 locus encodes the ARABIDOPSIS BSISTER MADS domain protein and is required for proper development and pigmentation of the seed coat. Plant Cell 14:2463-2479.

    Otto, S. P., and P. Yong. 2002. The evolution of gene duplicates. Adv. Gen. 46:451-483.

    Parkinson, C. L., K. L. Adams, and J. D. Palmer. 1999. Multigene analyses identify the three earliest lineages of extant flowering plants. Curr. Biol. 9:1485-1488.

    Pelaz, S., G. S. Ditta, E. Baumann, E. Wisman, and M. Yanofsky. 2000. B and C floral organ identity functions require SEPALLATA MADS-box genes. Nature 405:200-203.

    Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817-818.

    Purugganan, M. D., S. D. Rounsley, R. J. Schmidt, and M. F. Yanofsky. 1995. Molecular evolution of flower development: diversification of the plant MADS-box regulatory gene family. Genetics 140:345-356.

    Qiu, Y.-L., J. Lee, F. Bernasconi-Quadroni, D. E. Soltis, P. S. Soltis, M. Zanis, E. A. Zimmer, Z. Chen, V. Savolainen, and M. W. Chase. 1999. The earliest angiosperms: evidence from mitochondrial, plastid and nuclear genomes. Nature 402:404-407.

    Qiu, Y.-L., J. Lee, B. A. Whitlock, F. Bernasconi-Quadroni, and O. Dombrovska. 2001. Was the ANITA rooting of the angiosperm phylogeny affected by long-branch attraction? Mol. Biol. Evol. 18:1745-1753.

    Reddy, A. S. N. 2001. Nuclear pre-mRNA splicing in plants. Crit. Rev. Plant Sci. 20:523-571.

    Renner, S. S. 1999. Circumscription and phylogeny of the Laurales: evidence from molecular and morphological data. Am. J. Bot. 86:1301-1315.

    Riechmann, J. L., B. A. Krizek, and E. M. Meyerowitz. 1996. Dimerization specificity of Arabidopsis MADS domain homeotic proteins APETALA1, APETALA3, PISTILLATA, and AGAMOUS. Proc. Natl. Acad. Sci. USA 93:4793-4798.

    Riechmann, J. L., M. Wang, and E. M. Meyerowitz. 1996. DNA-binding properties of Arabidopsis MADS domain homeotic proteins APETALA1, APETALA3, PISTILLATA and AGAMOUS. Nucleic Acids Res. 24:3134-3141.

    Ronshaugen, M., N. McGinnis, and W. McGinnis. 2002. Hox protein mutation and macroevolution of the insect body plan. Nature 415:914-917.

    Schwarz-Sommer, Z., I. Hue, P. Huijser, P. J. Flor, R. Hansen, F. Tetens, W.-E. Lonnig, H. Saedler, and H. Sommer. 1992. Characterization of the Antirrhinum floral homeotic MADS-box gene deficiens: evidence for DNA binding and autoregulation of its persistent expression throughout flower development. EMBO J. 11:251-263.

    Shindo, S., M. Ito, K. Ueda, M. Kato, and M. Hasebe. 1999. Characterization of MADS genes in the gymnosperm Gnetum parvifolium and its implication on the evolution of reproductive organs in seed plants. Evol. Dev. 1:180-190.

    Shore, P., and A. D. Sharrocks. 1995. The MADS-box family of transcription factors. Eur. J. Biochem. 229:1-13.

    Sun, B. Y., T. F. Stuessy, and D. J. Crawford. 1990. Chromosome counts from the flora of the Juan Fernandez Islands, Chile. III. Pacific Sci. 44:258-264.

    Sundstrom, J., A. Carlsbecker, M. Svenson, M. E. Svensson, and P. Engstrom. 1999. MADS-box genes active in developing pollen cones of Norway spruce are homologous to the B-class floral homeotic genes in angiosperms. Dev. Genet. 25:253-266.

    Suzuki, Y., G. V. Glazko, and M. Nei. 2002. Overcredibility of molecular phylogenies obtained by Bayesian phylogenetics. Proc. Natl. Acad. Sci. USA 99:16138-16143.

    Swofford, D. 2001. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4.01b. Sinauer Associates, Sunderland, Mass.

    Templeton, A. R. 1983. Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the evolution of humans and the apes. Evolution 37:221-244.

    Theissen, G. 2001. Development of floral organ identity: stories from the MADS house. Curr. Opin. Plant Biol. 4:75-85.

    Theissen, G. 2002. Secret life of genes. Nature 415:741.

    Theissen, G., A. Becker, A. Di Rosa, A. Kanno, J. T. Kim, T. Munster, K.-U. Winter, and H. Saedler. 2000. A short history of MADS-box genes in plants. Plant Mol. Biol. 42:115-149.

    Theissen, G., A. Becker, K. U. Winter, T. Munster, C. Kirchner, and H. Saedler. 2002. How the land plants learned their floral ABCs: the role of MADS-box genes in the evolutionary origin of flowers. Pp. 173-205 in Q. C. B. Cronk, R. M. Bateson and J. A. Hawkins, eds. Developmental genetics and plant evolution. Taylor & Francis, London.

    Trobner, W., L. Ramirez, P. Motte, I. Hue, P. Huijser, W. E. Lonnig, H. Saedler, H. Sommer, and Z. Schwarz-Sommer. 1992. Globosa: a homeotic gene which interacts with deficiens in the control of Antirrhinum floral organogenesis. EMBO J. 11:4693-4704.

    True, J. R., and E. S. Haag. 2001. Developmental system drift and flexibility in evolutionary trajectories. Evol. Dev. 3:109-119.

    Tucker, S. C., A. W. Douglas, and H.-X. Liang. 1993. Utility of ontogenetic and conventional characters in determining phylogenetic relationships of Saururaceae and Piperaceae (Piperales). Syst. Bot. 18:614-641.

    Winter, K. U., H. Saedler, and G. Theissen. 2002. On the origin of class B floral homeotic genes: functional substitution and dominant inhibition in Arabidopsis by expression of an orthologue from the gymnosperm Gnetum. Plant J. 31:457-475.

    Winter, K. U., C. Weiser, K. Kaufmann, A. Bohne, C. Kirchner, A. Kanno, H. Saedler, and G. Theissen. 2002. Evolution of class B floral homeotic proteins: obligate heterodimerization originated from homodimerization. Mol. Biol. Evol. 19:587-596.

    Yang, Y., L. Fanning, and T. Jack. 2003. The K domain mediates heterodimerization of the Arabidopsis floral organ identity proteins, APETALA3 and PISTILLATA. Plant J. 33:47-59.

    Zanis, M. J., D. E. Soltis, P. S. Soltis, S. Mathews, and M. J. Donoghue. 2002. The root of the angiosperms revisited. Proc. Natl. Acad. Sci. USA 99:6848-6853.

    (Giulia M. Stellari, M. Al)