A Potentially Functional Mariner Transposable Element in the Protist Trichomonas vaginalis
《分子生物学进展》, 2005, Issue 1
     * The Institute for Genomic Research, Rockville, Maryland; Department of Microbiology, Immunology & Molecular Genetics, University of California, Los Angeles

    Correspondence: E-mail: jsilva@tigr.org.

    Abstract

    Mariner transposable elements encoding a D,D34D motif–bearing transposase are characterized by their pervasiveness among, and exclusivity to, animal phyla. To date, several hundred sequences have been obtained from taxa ranging from cnidarians to humans, only two of which are known to be functional. Related transposons have been identified in plants and fungi, but their absence among protists is noticeable. Here, we identify and characterize Tvmar1, the first representative of the mariner family to be found in a species of protist, the human parasite Trichomonas vaginalis. This is the first D,D34D element to be found outside the animal kingdom, and its inclusion in the mariner family is supported by both structural and phylogenetic analyses. Remarkably, Tvmar1 has all the hallmarks of a functional element and has recently expanded to several hundred copies in the genome of T. vaginalis. Our results show that a new potentially active mariner has been found that belongs to a distinct mariner lineage and has successfully invaded a nonanimal, single-celled organism. The considerable genetic distance between Tvmar1 and other mariners may have valuable implications for the design of new, high-efficiency vectors to be used in transfection studies in protists.

    Key Words: transposon • mariner • protist • parabasalid • Trichomonas vaginalis

    Introduction

    In the past 3 decades, our view of transposable genetic elements (TEs) has changed from esoteric features of a few genomes to highly pervasive components of the genetic material of most species. Mariner transposable elements are class II transposons that move predominantly via a DNA-mediated, cut-and-paste mechanism. They are widespread among animals and have been found in taxa as diverse as cnidarians (Robertson 1997), flatworms (Garcia-Fernandez et al. 1995), arthropods (Robertson and MacLeod 1993), and humans (Oosumi, Belknap, and Garlick 1995), and hundreds of sequences have been obtained from dozens of species (Robertson et al. 2002). Close relatives to the animal mariners are also known from fungi (Langin, Capy, and Daboussi 1995) and plants (Jarvik and Lark 1998; Feschotte and Wessler 2002). The apparent simplicity of this element's transposition mechanism, which does not require host-specific factors, explains its wide taxonomic distribution, as well as its remarkable success as a tool for insertional mutagenesis studies (Lampe, Churchill, and Robertson 1996; Vos, De Baere, and Plasterk 1996). Perhaps surprisingly, mariner elements have not been detected in single-celled organisms such as prokaryotes or protists, which led to the suggestion that "mariner lineages cannot readily establish themselves there" (Robertson 2002).

    The first mariner to be isolated, Mos1, from Drosophila mauritiana (Maruyama, Schoor, and Hartl 1991; Medhora, Maruyama, and Hartl 1991), was identified as an approximately 1.3-kb element, delimited by 28-nt inverted terminal repeats (ITRs) and found to encode a 345–amino acid transposase (figs. 1 and 2). The transposase has an N-terminal DNA-binding domain with a helix-turn-helix (HTH) motif and a C-terminal catalytic domain with a D,D34D motif (three aspartic acid residues in conserved positions, the latter two separated by 34 residues). A bipartite nuclear localization signal (NLS) is found between the two domains, with composition RYKRK,RRK (Hartl, Lohe, and Lozovskaya 1997; Lohe, De Aguiar, and Hartl 1997; Zhang, Dawson, and Finnegan 2001). The second full-length active mariner, Himar1, was later reconstructed from the genome of the horn fly, Haematobia irritans (Lampe, Churchill, and Robertson 1996), and shares most of the characteristics of Mos1 (fig. 1). Transposition of mariner elements occurs into a TA dinucleotide target site, in a reaction catalyzed by the transposase. The composition and spacing of the residues of the catalytic motif have proved the most reliable identifiers for elements in the IS630/Tc1/mariner superfamily, to which the mariner family belongs (Shao and Tu 2001). All derived mariner subfamilies have the D,D34D motif, and the clades "mori" and "rosa," previously considered basal mariner subfamilies but now viewed as separate families, have D,D37D and D,D41D motifs, respectively (fig. 3). The closely related transposon families Tc1 and ItmD37E have D,D34E and D,D37E motifs, respectively. Mariner-like elements have been identified in plants (Jarvik and Lark 1998; Feschotte and Wessler 2002), although their distinct structure, sequence, and catalytic motif (D,D39D) suggest that they be classified as a separate family (Shao and Tu 2001).

    FIG. 1.— Tvmar1 structure and sequence. (a) Schematic comparison of Mos1, Himar1, and Tvmar1 elements. ITRs (arrow heads), ORF (open box), helix-turn-helix motif (hatched box), aspartic acid residues (D) that form the D,D34D catalytic motif, putative NLS (dotted boxes), and first seven methionine residues in the Tvmar1-encoded ORF (vertical lines) are shown. (b) Sequence of the Tvmar1 element isolated from strain G3, with translation of the longest ORF. The Tvmar1 ORF is aligned to the two active mariner elements, Mos1 from Drosophila mauritiana (gi 84871) and Himar1 from Haematobia irritans (Lampe, Churchill, and Robertson 1996), and to two mariner elements from Homo sapiens, the host of Trichomonas vaginalis, Hsmar1 (gi 1263080) and Hsmar2 (gi 1698454). Inverted terminal repeats (ITRs) are underlined. Positions at which the Tvmar1 ORF sequence is identical to a consensus of the five transposases are shown in bold and highlighted. Positions of the aspartic acid residues in the D,D34D active site of mariner transposases are marked with a plus sign (+) below the alignment.

    FIG. 2.— Alignment of 5' and 3' inverted terminal repeats (ITRs) of mariner elements. The internal segment of the transposable elements is abbreviated with two forward slashes (//). Completely conserved positions (black box) and those conserved in all but one sequence (gray box) are highlighted. The elements represented are Tvmar1 from T. vaginalis, Mos1 from D. mauritiana, Himar1 from H. irritans, Hsmar1 and Hsmar2 from H. sapiens, AhMLE from the moth Adoxophyes honmai (gi 15982565), Cpmar1 from the green lacewing Chrysoperla plorabunda (gi 156617), Ammar1 from the bee Apis mellifera (gi 27465074), and Cemar1 from Caenorhabditis elegans (Robertson and Asplund 1996).

    FIG. 3.— Phylogenetic position of Tvmar1 (GenBank accession number AY282463) in the IS630/Tc1/mariner superfamily. Mariner subfamilies and related transposons (Tc1, ItmD37E, and plant mariner-like elements) are shown. Elements are identified by host name and GenInfo Identifier (gi). One of two most parsimonious trees is presented, which differ only in the resolution of the D. mauritiana–D. simulans–D. teissieri triad. Percent bootstrap support for transposon families, mariner subfamilies and deeper branches is shown (parsimony and neighbor-joining values above and below the branches, respectively). Resolution of deep branches agrees with previous analyses (Shao and Tu 2001; Robertson 2002).

    Trichomonas vaginalis, an asexual flagellated protist, is a human extracellular obligate parasite of the urogenital tract (Vanacova et al. 2003). This parasite is the model organism for studies on one of the deep-branching eukaryotic lineages, the parabasalids (Keeling and Palmer 2000). Although its exact genome size and chromosome number have yet to be determined, preliminary sequence data generated by the T. vaginalis Genome Sequencing Project (http://www.tigr.org/tdb/e2k1/tvg/) suggest a highly repetitive genome with an approximate size of 170 Mb. However, no transposable elements from this species have so far been characterized. Only three TEs belonging to the large IS630/Tc1/mariner superfamily have been identified in two species of protist: TBE1, which is found in Oxytricha fallax (Doak et al. 1994), and Tec1 and Tec2, from Euplotes crassus (Jahn et al. 1993). All three elements belong to the Tec1/Ant1 family, which has a D,D34E catalytic motif, and which is only distantly related to mariner and Tc1 elements. However, the apparent lack of naturally occurring mariner elements in protists has not precluded their use in preliminary transfection studies in this taxon, as constructs containing derivatives of Drosophila's Mos1 mariner have been introduced into species of parasitic protists and shown to transpose (Gueiros-Filho and Beverley 1997; Mamoun et al. 2000).

    Here, we describe Tvmar1, a TE found in the genome of T. vaginalis, the first naturally occurring mariner to be described from a species of protist and the first mariner element containing the D,D34D motif found outside the animal kingdom. Inclusion of this new TE in the mariner family is corroborated by sequence and phylogenetic analyses, which show Tvmar1 to be a representative of a new, but quite distinct, mariner lineage. Despite the large number of copies of Tvmar1 in the genome of T. vaginalis, the degree of polymorphism among them is extremely small, suggesting a very recent amplification of the element and the possibility that Tvmar1 is an active mariner. As such a finding could have profound implications for the engineering of new transfection vectors for protists, we searched for evidence of Tvmar1's potential functionality at the molecular level and by investigating the element's taxonomic distribution among parabasalids.

    Materials and Methods

    The sequence of the 1,304-nt mariner element from Trichomonas vaginalis described here, Tvmar1, has been submitted to GenBank and has accession number AY282463.

    Molecular Analyses

    Genomic DNA was extracted from T. vaginalis strains B7RC2 (American Type Culture Collection [ATCC], North Carolina), T1 (J.-H. Tai, Institute of Biomedical Sciences, Taipei, Taiwan), C1 (ATCC30001, Maryland), and G3 (ATCC PRA-98, Kent, UK), and from Trichomonas tenax strain Hs-4:NIH and Tritrichomonas foetus strain KV-1. Human DNA (Promega) and DNA from the malaria parasite Plasmodium falciparum, for which the complete genome sequence is available and which does not contain TEs, were used as controls. An approximately 480-bp fragment between positions 605 and 1090 of Tvmar1, which encompasses the catalytic motif of the transposase, was labeled with 32P by oligonucleotide-primed synthesis using random hexamers and used as a probe for both Southern and Northern blots. For Southern blots of Trichomonas strains, genomic DNA was digested with DraI, which cuts once within the element (nucleotide position approximately 1250), and DNA fragments were resolved on a 0.8% agarose gel and blotted onto Hybond-XL membrane. Blots were prehybridized for 4 h in 3× SSC, 1% Denhardt's solution, 1% SDS, 0.1% NaPPi, and 0.1 μg/ml salmon sperm DNA and hybridized overnight with the probe. Blots were washed extensively to a final stringency of 0.1× SSC, 0.1% SDS, and exposed to autoradiographic film. For Southern blots of multiple trichomonad species, the genomic DNA of T. tenax and P. hominis was additionally digested with BamHI. The blot was prehybridized and hybridized overnight at 65 °C with labeled probe in 3× SSC and 1% Denhardt's solution. Blots were washed extensively at 65 °C to a final stringency of 3× SSC, 0.1% SDS, and exposed to autoradiographic film. The T. vaginalis gDNA lane was cut from the blot after film exposure for 1 h at room temperature to allow for further exposure of the remaining lanes for 10 days at –80 °C.

    PolyA+ RNA was isolated from total RNA of T. vaginalis strain G3 using PolyATract mRNA Isolation System IV (Promega). Approximately 3.5 μg polyA+ RNA was electrophoresed on a 1% denaturing agarose gel containing 2% formaldehyde, and the fragments were transferred onto Hybond-XL membrane. Prehybridization, hybridization, and washing of the Northern blots were performed as for Southern blots.

    Amplification of mariner elements followed standard PCR protocols, with an annealing temperature of 50°C. Two primer sets were used: (1) a single primer that anneals to both ITRs of Tvmar1 (5'-TATAGGGTGTCCAAAGGTG-3') was used to amplify the complete T. vaginalis mariner element and (2) a set of internal degenerate primers that map to conserved motifs in the transposase (5'-GKATYGTRACNGGNGANGAR-3' and 5'-CRGATGGNGCNANRTCNGG-3'). PCR products of the expected size were excised from 1% agarose gels, purified using QIAquick PCR Purification Kit (Qiagen), and cloned using TOPO TA Cloning Kit (Invitrogen). Clones were sequenced using the BigDye Terminator mix version 3.1 (Applied Biosystems) and run on an ABI 3100 sequencer (Applied Biosystems).
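
    As an illustration of what the degenerate primers above represent, the following minimal Python sketch expands the standard IUPAC ambiguity codes and counts how many distinct oligonucleotides each primer pool contains. The function names (degeneracy, expand) are hypothetical helpers introduced here and are not part of the original protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "CG", "W": "AT", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n

def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer))]

# Primer sequences as given in the text.
ITR_PRIMER = "TATAGGGTGTCCAAAGGTG"
FORWARD = "GKATYGTRACNGGNGANGAR"
REVERSE = "CRGATGGNGCNANRTCNGG"

if __name__ == "__main__":
    for name, primer in [("ITR", ITR_PRIMER), ("forward", FORWARD), ("reverse", REVERSE)]:
        print(name, len(primer), "nt, degeneracy", degeneracy(primer))
```

    Under these codes each internal primer pool encodes 1,024 sequences, whereas the ITR primer is nondegenerate.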

    Computational Analyses

    Preliminary sequence data from the ongoing T. vaginalis strain G3 Genome Sequencing Project (http://www.tigr.org/tdb/e2k1/tvg/) were used in this study. At 3.4-fold coverage, these consist of 58,205 contigs representing 115 Mb of the roughly 170 Mb genome. Sequence similarity searches between the T. vaginalis data set and publicly available nucleic acid and protein databases were performed using the Blast series of algorithms (Altschul et al. 1990). Complete and partial copies of the Tvmar1 element identified from the contigs were aligned with ClustalW (Thompson, Higgins, and Gibson 1994) with default parameters, and the average number of pairwise differences per site between copies, π (Nei and Li 1979), was calculated using DnaSP (Rozas and Rozas 1999). The location of putative HTH motifs was determined with the program "helixturnhelix," from the EMBOSS package (http://www.hgmp.mrc.ac.uk/Software/EMBOSS), and the location of NLSs was determined with the program NUCDISC, implemented in PSORT II (http://psort.nibb.ac.jp/). The regions flanking each Tvmar1 insertion, when available, were extracted for additional analyses: sequence logos were built from the first 25 nt upstream and downstream of Tvmar1 using WebLogo (Crooks et al. 2004), and percent GC was calculated from the first 100 nt on each side of the insertion site.
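
    The two summary statistics mentioned above can be sketched in a few lines of Python. This is a simplified illustration, not the DnaSP implementation: it assumes pre-aligned sequences of equal length and skips, pair by pair, any site where either sequence has a gap or ambiguous base, which may differ from how DnaSP treats such sites. The function names are hypothetical.

```python
from itertools import combinations

def pairwise_diff_per_site(aligned_seqs):
    """Average number of pairwise differences per site among aligned sequences.

    Sites where either sequence of a pair is not A, C, G, or T are skipped
    for that pair (an assumption of this sketch).
    """
    total = 0.0
    pairs = 0
    for a, b in combinations(aligned_seqs, 2):
        sites = diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in "ACGT" and y in "ACGT":
                sites += 1
                diffs += x != y
        if sites:
            total += diffs / sites
            pairs += 1
    return total / pairs if pairs else 0.0

def gc_percent(seq):
    """Percent G+C among unambiguous bases of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt

if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT"]  # toy data, not Tvmar1 copies
    print(pairwise_diff_per_site(toy_alignment))
    print(gc_percent("GGCCAATT"))
```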

    T. vaginalis genes are intronless, with a few exceptions, and, thus, all open reading frames (ORFs) corresponding to protein-coding genes start with a methionine (Met) amino acid residue. The location of all ORFs that start with a Met residue and are at least 100 amino acids in length was determined for the 1,174 contigs containing Tvmar1, using the program "getorf" from the EMBOSS package, and the location of each Tvmar1 copy relative to the closest upstream and downstream ORFs was recorded. In the case of ORFs immediately adjacent to Tvmar1 insertions, TE insertions that had occurred inside the ORF were identified, and those ORFs were searched against publicly available protein databases using BlastP, to determine their putative function.
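
    The ORF criterion described above (Met start, at least 100 amino acids, both strands) can be illustrated with a minimal Python sketch. It is not a reimplementation of getorf: it assumes the standard stop codons, reports only ORFs closed by a stop codon, and ignores ORFs that run off the end of a contig. The function names and the returned tuple layout are choices made for this example.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def met_orfs(seq, min_aa=100):
    """Find ORFs that start at ATG and end at the first in-frame stop.

    Returns tuples (strand, start, end, length_aa), where start/end are
    0-based, half-open coordinates on the forward strand and length_aa
    excludes the stop codon.
    """
    seq = seq.upper()
    n = len(seq)
    found = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if codon == "ATG" and start is None:
                    start = i
                elif codon in STOPS:
                    if start is not None and (i - start) // 3 >= min_aa:
                        if strand == "+":
                            lo, hi = start, i + 3
                        else:
                            lo, hi = n - (i + 3), n - start
                        found.append((strand, lo, hi, (i - start) // 3))
                    start = None
    return found

if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"  # toy contig, 6-codon ORF
    print(met_orfs(toy, min_aa=3))
```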

    Phylogenetic Analyses

    The complete protein sequences of transposase genes from several elements of the IS630/Tc1/mariner superfamily were downloaded from GenBank and aligned using ClustalW with default parameters. Phylogenetic analyses were performed using PAUP* (Swofford 1998). Parsimony searches consisted of 20 heuristic searches, each consisting of 10 replicates, with trees obtained by random addition of taxa. Bootstrap analyses using both maximum parsimony (set as above) and neighbor-joining consisted of 100 replicates each. The plant mariner-like elements were used as a monophyletic out-group, a position suggested by previous analyses (Robertson 2002).

    Molecular Evolution Analyses

    Estimates of the number of synonymous substitutions per synonymous site (dS) and of nonsynonymous substitutions per nonsynonymous site (dN), as well as the ratio ω = dN/dS, were obtained using the methods of Yang et al. (2000) and Nielsen and Yang (1998), implemented in PAML (Yang 1997). The data set was composed of five elements (fig. 1b): the two human mariner sequences (Hsmar1 and Hsmar2), representatives of the two functional lineages (Mos1 and Himar1), and the new mariner element from T. vaginalis, Tvmar1. Estimates of ω were obtained under the models of evolution M0, M2, M3, M7, and M8 described in Yang et al. (2000), and the results of nested models were compared using a likelihood ratio test, LRT (Yang et al. 2000).
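
    For reference, the quantities used in this section can be written in standard notation as follows. The symbol ω for the dN/dS ratio follows the usual PAML convention; L_0 and L_1 denote the maximized likelihoods of the simpler (nested) and the more general model, and ν is the difference in their number of free parameters. This is the generic form of the test, given here as a reminder rather than as a statement of the exact comparisons performed.

```latex
\omega = \frac{d_N}{d_S},
\qquad
2\Delta\ell = 2\left(\ln L_1 - \ln L_0\right) \;\sim\; \chi^2_{\nu}
```

    Values of ω well below 1 indicate purifying selection, ω close to 1 is expected under neutral evolution, and ω above 1 suggests positive selection.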

    Results

    Identification and Characterization of Tvmar1

    We searched 34,681 contigs from preliminary sequence data of the T. vaginalis genome sequencing project against GenBank's nonredundant database of proteins using Blast. An initial screening of the results identified several contigs with low sequence similarity to transposase genes of mariner elements from several eukaryotic species. A complete mariner element was obtained by searching the transposase-containing contigs for the presence of ITRs flanking the transposase region. The complete sequence of this element was then compared back with all 58,205 contigs (115 Mb) in the current assembly of the T. vaginalis genome using BlastN, to obtain the locations of all complete and partial copies of the element in the assembly. A total of 1,179 hits in 1,174 contigs revealed strong similarity to Tvmar1 (identity > 80%, expected value < 1 × 10^-30). A 1,304-nt consensus, Tvmar1, was generated from the alignment of these sequences and found to encode a single ORF of 375 amino acids, containing the D,D34D motif and flanked by 28-bp ITRs (fig. 1a, b). Several methionines within the first 38 residues of the ORF suggest that the protein is between 337 and 375 amino acids in length. We generated a putative structure of Tvmar1 using several structural prediction programs. A putative HTH motif was identified in the element at the same position as in other functional mariner elements (fig. 1a). Two putative NLSs were also identified in Tvmar1, PDQRKIK starting at residue position 99 and PKKKIDT at residue position 214, reminiscent of those found in Himar1.

    The number of copies in the current assembly was estimated in three ways: (1) through identification of the highest frequency for any nucleotide position in an alignment of all 1,179 contig segments with significant similarity to Tvmar1, determined to be 649 times; (2) by dividing the cumulative length of all contig segments with significant similarity to Tvmar1 (802,785 nt) by the length of the Tvmar1 element (1,304 nt), which results in approximately 616; (3) by counting the number of 5' and of 3' ITRs in the assembly, which resulted in 598 and 611, respectively. These estimates are very similar and suggest between 600 and 650 copies of Tvmar1 in the current genome assembly. This is probably an underestimate of the genome copy number, because the remaining approximately 55 Mb of sequence not incorporated in the genome assembly consist mostly of repetitive sequences that could not be unambiguously placed and that may contain further Tvmar1 elements.
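
    The first two estimators described above reduce to simple arithmetic, sketched here in Python. The interval representation (0-based, half-open coordinates of each hit on the consensus) and the function names are assumptions made for this example; the toy intervals are not the actual BlastN hits.

```python
def max_column_depth(intervals, element_len):
    """Highest number of hits covering any single position of the element.

    intervals: (start, end) pairs, 0-based and half-open, giving the part
    of the consensus that each hit covers.
    """
    delta = [0] * (element_len + 1)
    for start, end in intervals:
        delta[start] += 1
        delta[end] -= 1
    best = running = 0
    for d in delta[:element_len]:
        running += d
        best = max(best, running)
    return best

def length_based_estimate(cumulative_hit_len, element_len):
    """Copy number as total matched length divided by element length."""
    return cumulative_hit_len / element_len

if __name__ == "__main__":
    # Values reported in the text: 802,785 nt of hits, 1,304-nt element.
    print(round(length_based_estimate(802_785, 1_304)))
    # Toy intervals only, to show the depth calculation.
    print(max_column_depth([(0, 10), (5, 15), (8, 20)], 20))
```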

    All Tvmar1 copies were found to be nearly identical in sequence, with invariant ITR sequences and a polymorphism level of approximately 0.8% estimated by the average number of pairwise differences between copies. The nearly identical ITRs of Tvmar1 have sequence 5'-TAGGGTGTCCAAAGGTGYGTTTAGCACA-3', where position 18 (represented as Y) is C or T in all 5' or 3' ITRs, respectively (fig. 1b). An alignment of Tvmar1 ITRs with those of other mariner sequences shows that five positions are conserved across all elements in both ITRs and that four in the 5' ITR and three in the 3' ITR are nearly perfectly conserved (fig. 2). These are all part of the consensus ITR of mariner elements (Robertson and Asplund 1996), except for the nearly perfectly conserved "G" and "C" proximal to the body of the element, in the 5' and 3' ITRs, respectively.

    Insertion Site Preferences of Tvmar1

    All sequences adjacent to either the 5' or 3' ITR were found to exhibit a TA dinucleotide (fig. 4), the typical target site for mariner elements, which is duplicated upon insertion. The three nucleotides preceding the TA dinucleotide target site exhibit a moderate degree of conservation, with almost 50% of the sequences being either TAA or AAA. The sequence logo of the first five nucleotide positions in the 3' flanking region of Tvmar1 insertions is the reverse complement of that in the 5' region, suggesting that Tvmar1 insertions occur independently of strand orientation. The 200-nt regions flanking Tvmar1 insertions have an average GC content of approximately 29.0% (±6.5%), similar to the overall GC content of the current assembly (33.7%), which suggests that the insertion site of Tvmar1 is also independent of nucleotide composition, with the exception of the 5 nt preceding the element.

    FIG. 4.— Sequence logo of the regions flanking Tvmar1 insertions. The 25 nt that precede (a) or follow (b) each Tvmar1 copy are represented in each logo. The vertical axis is a measure of sequence information, has a maximum value of 2, and is proportional to the level of sequence conservation at each position.
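
    The vertical axis of such a logo can be reproduced, to a first approximation, as 2 minus the Shannon entropy of each alignment column, in bits. The Python sketch below computes this quantity; it deliberately omits the small-sample correction that logo software may apply, and the function names are hypothetical.

```python
from collections import Counter
from math import log2

def information_content(column):
    """Information (bits) at one position: 2 minus the Shannon entropy.

    Only A, C, G, and T are counted; no small-sample correction is applied
    (a simplification of this sketch).
    """
    counts = Counter(b for b in column.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = -sum((c / total) * log2(c / total) for c in counts.values())
    return 2.0 - entropy

def logo_heights(aligned_flanks):
    """Information content for every column of equal-length flanking sequences."""
    return [information_content("".join(col)) for col in zip(*aligned_flanks)]

if __name__ == "__main__":
    toy_flanks = ["TAATA", "AAATA", "CGATA", "GTTTA"]  # toy data only
    print(logo_heights(toy_flanks))
```

    A fully conserved position, such as the TA target site, reaches the maximum of 2 bits, and a position with all four bases at equal frequency scores 0.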

    Among all sequences with similarity to Tvmar1 (1,179 in 1,174 contigs), 11% (or 135 sequences) correspond to complete copies (both ITRs present), whereas 39% (463 sequences) and 40% (476 sequences) of the copies contained only the 5' or the 3'end of the element, respectively, reflecting the partial nature of the current genome assembly. The remaining 8% of the Tvmar1 sequences contained only an internal portion of the element.

    To determine if there was evidence for Tvmar1 copies being mobilized as part of a larger structure, we aligned the regions flanking the Tvmar1 insertions. For the 135 complete Tvmar1 copies, all regions comprising 150 nucleotides upstream and downstream of the insertions were unique. However, in cases for which only the 5' end of the element is present, there were 18 pairs of elements and three triplets with matching flanking regions that extended at least 150 nucleotides upstream of the insertion. Likewise, there were 13 such cases (one of them a triplet) for regions flanking the 3' end of Tvmar1 insertions. It is unclear at this point whether these 34 cases result from assembly error caused by low sequence coverage of the genome, or whether they are truly the result of Tvmar1 being amplified as part of a larger structure.

    Preferential insertion of Tvmar1 relative to the orientation of T. vaginalis genes was determined by comparing the orientation of those insertions to that of their flanking ORFs. Of the 598 Tvmar1 copies for which the 5' ITR was present, an ORF of at least 100 amino acids in length was found upstream of the element in 376 (63%) cases. In 56% of these cases, the Tvmar1 insertion was downstream from the ORF (Tvmar1 copy and upstream ORF in the same orientation). Likewise, of the 611 Tvmar1 copies for which the 3' end was present, ORFs were found downstream of the element in 400 (65%) cases. Again, in 58% of these cases, the Tvmar1 insertion was downstream from the ORF (Tvmar1 copy and downstream ORF in opposite orientations). These results suggest a small but highly significant tendency of Tvmar1 to insert downstream of ORFs. Of all 776 ORFs of at least 100 amino acids in length found upstream or downstream of Tvmar1 insertions, 68% were located within 250 nt of the element. These results are preliminary, because there is no certainty that the T. vaginalis ORFs considered encode functional proteins. In addition, there may be genes with ORFs smaller than 100 amino acids that could be located closer to the Tvmar1 insertion than the ORF that was considered. A more detailed analysis will be possible once the annotation of the T. vaginalis genome is complete.
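
    One way to gauge an orientation bias of this kind is an exact two-sided binomial test against the null expectation of 50%. The text does not state which test was used, so the Python sketch below is an assumption, and the counts in the example are approximations back-calculated from the rounded percentages (56% of 376 and 58% of 400), not figures taken from the study.

```python
from math import comb

def binom_two_sided(k, n):
    """Exact two-sided binomial p-value for k successes in n trials, p = 0.5.

    Because the null distribution is symmetric, the p-value is twice the
    upper tail beyond the more extreme of k and n - k, capped at 1.
    """
    extreme = max(k, n - k)
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

if __name__ == "__main__":
    # Approximate counts reconstructed from rounded percentages (assumption).
    upstream = (round(0.56 * 376), 376)
    downstream = (round(0.58 * 400), 400)
    pooled = (upstream[0] + downstream[0], upstream[1] + downstream[1])
    for label, (k, n) in [("5' side", upstream), ("3' side", downstream), ("pooled", pooled)]:
        print(label, k, n, binom_two_sided(k, n))
```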

    A detailed study of the 80 Tvmar1 insertions located in contigs greater than 10 kb identified nine cases (10%) in which the insertion was located within a T. vaginalis ORF that had significant similarity to a known gene. In three of these cases, the strongest similarity of the disrupted ORF is to the surface antigen BspA of Tannerella forsythensis (gi 3005673); all these insertions are independent, as they are located in different regions of the disrupted ORF. In the remaining six cases, the strongest similarity is to a T. vaginalis ORF of unknown function (gi 1177872), an ankyrin-type peptide; again, all these insertions are independent.

    Phylogenetic Position of Tvmar1

    We determined the phylogenetic position of Tvmar1 in relation to known mariner subfamilies and related transposons using maximum-parsimony and neighbor-joining methods. From these analyses, Tvmar1 was identified as the sister taxon to the clade containing the derived mariner subfamilies (fig. 3). Its close association with mariner elements and the presence in Tvmar1 of the D,D34D catalytic motif characteristic of derived mariners suggest that Tvmar1 belongs to the mariner family. Amino acid identity and similarity between Tvmar1 and its sister clade range only between 21% and 30% and between 40% and 50%, respectively. Therefore, Tvmar1 is clearly distinct from its sister group and possibly is the first representative of a new branch of the mariner family's evolutionary history.

    Activity Potential of Tvmar1

    To determine if the Tvmar1 element is active in the genome of T. vaginalis, polyA+ RNA extracted from T. vaginalis strain G3 was probed with a fragment of Tvmar1. A single transcript was identified, which is approximately 1 kb in length, the expected size of the mRNA transcript of the Tvmar1 transposase (fig. 5a). The presence of an abundant, discrete Tvmar1 mRNA in the cell argues that Tvmar1 may be actively translated in T. vaginalis and that transposition may occur. Together with the extremely low level of polymorphism observed and the invariant ITR sequences, these findings argue that the consensus sequence of all Tvmar1 copies does represent the ancestral, functional Tvmar1 element and that the sequence components required for transposition of Tvmar1 are present in genomic copies of the element. Analyses of the evolution of the five complete D,D34D mariner elements shown in figure 1b indicate that despite the high divergence among these sequences, they evolve in a highly conserved fashion, with most models suggesting a rate of evolution for nonsynonymous sites approximately 50 times lower than that observed for synonymous sites (average ω = 0.02 [table 1]). Horizontal transfer seems to be a common occurrence in the evolution of the mariner family, and because only functional elements will be able to successfully spread in a new host upon invasion, it is not unexpected to find evidence of selective constraints when mariner elements of different species are compared (Lampe et al. 2003). However, if most of the evolution of the lineage leading to Tvmar1 had taken place in a trichomonad host, without recent horizontal transfer events, a pattern of neutral evolution should be reflected in the evolution of Tvmar1.

    To determine whether selective constraints could be absent specifically in the lineage leading to Tvmar1 (as indicated by an ω of approximately 1), while being present in the remaining lineages, ω was allowed to differ in the branch of the tree leading to that element, using model M3. The average ω obtained for the branch leading to Tvmar1 was 0.22, and the log-likelihood was lnL = –6691.79. This scenario provides a slightly worse fit to the data than that in which M3 is used with a single ω for all branches, but the difference is not significant. Regardless of the model of evolution used, the estimated ω that best fits the data was always <<1. These data suggest that the Tvmar1 lineage has evolved under selective pressure until at least the recent past, providing additional support for the assertion that the element must be close (if not identical) in sequence to its functional ancestor.

    FIG. 5.— Tvmar1 activity and distribution. (a) Northern blot autoradiograph showing a single approximately 1-kb transcript identified after hybridization of Tvmar1 to polyA mRNA of T. vaginalis strain G3. (b) Autoradiograph of a Southern blot of total genomic DNA extracted from T. vaginalis strains B7RC2, T1, and C1, with a control lane of G3 and hybridized with Tvmar1 from G3.

    Table 1 Log-Likelihood (ln-L) Values and Parameter Estimates Under Five Models of Evolution for the Mariner Elements

    A very recent expansion of the mariner transposon in the genome of T. vaginalis is also suggested by the low level of sequence polymorphism among Tvmar1 copies. To establish whether this event occurred before or after the global expansion of T. vaginalis, several T. vaginalis isolates obtained from different geographical regions of the world were analyzed for the presence of Tvmar1 homologs. Stringent Southern blot hybridizations using a fragment of the Tvmar1 element isolated from the G3 strain showed that Tvmar1 is present in all T. vaginalis isolates tested (fig. 5b). This high degree of sequence similarity was confirmed by sequencing 10 individual elements cloned from a PCR amplicon of each of the isolates, obtained with the ITR primer. The elements from each isolate were found to have the same consensus sequence and the same level of polymorphism as those present in the G3 isolate (data not shown). In contrast, low-stringency hybridizations of the Tvmar1 element to Southern blots of Trichomonas tenax, a human parasite of the oral cavity and the sister taxon to T. vaginalis, and to Tritrichomonas foetus, a cattle trichomonad parasite of the urogenital tract and a distant relative of T. vaginalis and T. tenax, identified only diffuse signals, indicating that if relatives of Tvmar1 are present in those species, they are very distant homologs (fig. 6). The absence of a mariner element with a high degree of similarity to Tvmar1 in T. tenax and T. foetus was further supported by PCR studies involving degenerate primers and a variety of low-stringency amplification conditions. Primer combinations that amplify a product of the expected size in T. vaginalis and Drosophila simulans were unable to produce amplification from either T. tenax or T. foetus (results not shown).

    FIG. 6.— Distribution of Tvmar1 among species of Trichomonad, as detected by Southern blot hybridization. TV: Trichomonas vaginalis G3 strain (ATCC PRA-98, Kent, UK); TT: Trichomonas tenax (ATCC30207); PH: Pentatrichomonas hominis (Hs-3:NIH, ATCC30000); TF: Tritrichomonas foetus (ATCC30924); DH: Ditrichomonas honibergii (ATCC50322); PK: Pseudotrichomonas keilini Bishop (ATCC50321). Genomic DNA was digested with DraI; TT and PH were additionally digested with BamHI. (a) The absence of signal after 1-hour exposure in all lanes, except T. vaginalis, suggest that close relatives of Tvmar1 are absent from the other Trichomonad species surveyed. (b) The results of a prolonged exposure at low stringency levels suggest that if relatives of Tvmar1 exist in other Trichomonads, they are only very distant homologs to the T. vaginalis element. The TV lane was cut from the blot after film exposure for 1 hour.

    Discussion

    Understanding the evolution of transposable elements is an essential step in the study of the evolution of the genomes of which they are an integral part. Mariner elements in general, and those containing the D,D34D catalytic motif in particular, are among the most thoroughly studied eukaryotic TEs. Until now, they were thought to be present exclusively in animals; our study has uncovered a novel, distinct member of this family in a protist, thus providing an initial glimpse into a new branch of the mariner family's evolutionary history. Furthermore, our results indicate that the establishment of mariner lineages in the genome of unicellular organisms, even if a rare event, does occur, as attested to by the presence of several hundred copies of Tvmar1 in multiple isolates of T. vaginalis.

    The genomic copy number of mariners can vary considerably among species, from fewer than 10 to a few thousand copies (Lidholm, Gudmundsson, and Boman 1991; Garcia-Fernandez et al. 1995; Torti et al. 1997). The several hundred copies of Tvmar1 in the T. vaginalis genome place it toward the upper end of the mariner copy number distribution. This is a surprising result because T. vaginalis is apparently a haploid, asexual organism and, as such, the presence and transposition of multiple TEs within the genome could be especially deleterious. The presence of mariner elements has been reported in an old asexual lineage (Arkhipova and Meselson 2000), and the authors suggest that their maintenance in the genome may be the result of domestication of the elements, possibly facilitated by the milder effects of mariner elements relative to the potentially more deleterious retroelements, which have apparently been lost from the genome. However, Tvmar1 appears to be only one among many dozens of repeat sequences and TEs (including retroelements) found in high copy number in the genome of T. vaginalis (Silva and Carlton, unpublished data). This suggests that T. vaginalis is able to avoid the potentially devastating effect caused by the presence and transposition of TEs in asexual lineages (Nuzhdin and Petrov 2003), as well as the putative selective disadvantage of the slow growth usually associated with large genome sizes in protists (Shuter et al. 1983; Wickham and Lynn 1990). Alternatively, T. vaginalis may represent a doomed evolutionary lineage, which may be driven to extinction by TE lineages run amok. The evolution of the repeat component of the genome of T. vaginalis is currently being investigated.

    In addition to its novel protist host, Tvmar1 is remarkable in that it appears to be functional. Only two other mariner elements, Mos1 from a species of Drosophila and Himar1 from the horn fly Haematobia irritans, have been shown to be functional. It is likely that functional mariner elements are present in many of the species in which they have been detected, in particular those in which the copy number per genome is high and the divergence among copies is small. A new screening method to identify functional mariner copies has recently been proposed (Barry, Witherspoon, and Lampe 2004), which may bring many of those to light.

    The absence of Tvmar1 from other trichomonads, and its presence in widely distributed T. vaginalis isolates, strongly suggest that Tvmar1 has been recently acquired by T. vaginalis but that the invasion predates the diversification of the species. Tvmar1 is only distantly related to human mariners, and a search for Tvmar1 homologs in the genomes of the other human urogenital parasites sequenced to date resulted in no significant hits. Therefore, the taxonomic source of this new, and quite distinct, active mariner element remains to be identified. Our finding invites speculation that variants representing this and possibly additional functional lineages of the mariner family might still be found in other protists, as well as in species of prokaryotes and fungi.

    Because of their widespread distribution and the apparent lack of requirement for host-specific factors for transposition, transformation vectors based on mariner elements have become a useful tool for insertional mutagenesis studies in a variety of organisms, including archaea, eubacteria, and eukaryotic taxa (Lidholm, Lohe, and Hartl 1993; Gueiros-Filho and Beverley 1997; Sherman et al. 1998; Rubin et al. 1999; Judson and Mekalanos 2000; Mamoun et al. 2000; Zhang et al. 2000; Bessereau et al. 2001; Adelman, Jasinskiene, and James 2002). Tvmar1 significantly expands the degree of sequence variation seen among mariners, and may reveal sequence requirements for mariner transposition in Trichomonas and other protists that would prove valuable for enhancement of current protozoan transfection technology. Our preliminary analyses of Tvmar1 insertion site preferences show that the element often inserts in the vicinity of or within ORFs, which is encouraging in the context of insertional mutagenesis studies.

    Acknowledgements

    We thank Margaret Kidwell and Jonathan Eisen for critical reading of the manuscript, and three anonymous reviewers for many helpful comments. This project was supported with funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Co-Operative Agreement U01 AI050913-02.

    References

    Adelman, Z. N., N. Jasinskiene, and A. A. James. 2002. Development and applications of transgenesis in the yellow fever mosquito, Aedes aegypti. Mol. Biochem. Parasitol. 121:1–10.

    Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410.

    Arkhipova, I., and M. Meselson. 2000. Transposable elements in sexual and ancient asexual taxa. Proc. Natl. Acad. Sci. USA 97:14473–14477.

    Barry, E. G., D. J. Witherspoon, and D. J. Lampe. 2004. A bacterial genetic screen identifies functional coding sequences of the insect mariner transposable element Famar1 amplified from the genome of the earwig, Forficula auricularia. Genetics 166:823–833.

    Bessereau, J. L., A. Wright, D. C. Williams, K. Schuske, M. W. Davis, and E. M. Jorgensen. 2001. Mobilization of a Drosophila transposon in the Caenorhabditis elegans germ line. Nature 413:70–74.

    Crooks, G. E., G. Hon, J. M. Chandonia, and S. E. Brenner. 2004. WebLogo: a sequence logo generator. Genome Res. 14:1188–1190.

    Doak, T. G., F. P. Doerder, C. L. Jahn, and G. Herrick. 1994. A proposed superfamily of transposase genes: transposon-like elements in ciliated protozoa and a common "D35E" motif. Proc. Natl. Acad. Sci. USA 91:942–946.

    Feschotte, C., and S. R. Wessler. 2002. Mariner-like transposases are widespread and diverse in flowering plants. Proc. Natl. Acad. Sci. USA 99:280–285.

    Garcia-Fernandez, J., J. R. Bayascas-Ramirez, G. Marfany, A. M. Munoz-Marmol, A. Casali, J. Baguna, and E. Salo. 1995. High copy number of highly similar mariner-like transposons in planarian (Platyhelminthe): evidence for a trans-phyla horizontal transfer. Mol. Biol. Evol. 12:421–431.

    Gueiros-Filho, F. J., and S. M. Beverley. 1997. Trans-kingdom transposition of the Drosophila element mariner within the protozoan Leishmania. Science 276:1716–1719.

    Hartl, D. L., A. R. Lohe, and E. R. Lozovskaya. 1997. Regulation of the transposable element mariner. Genetica 100:177–184.

    Jahn, C. L., S. Z. Doktor, J. S. Frels, J. W. Jaraczewski, and M. F. Krikau. 1993. Structures of the Euplotes crassus Tec1 and Tec2 elements: identification of putative transposase coding regions. Gene 133:71–78.

    Jarvik, T., and K. G. Lark. 1998. Characterization of Soymar1, a mariner element in soybean. Genetics 149:1569–1574.

    Judson, N., and J. J. Mekalanos. 2000. TnAraOut, a transposon-based approach to identify and characterize essential bacterial genes. Nat. Biotech. 18:740–745.

    Keeling, P. J., and J. D. Palmer. 2000. Parabasalian flagellates are ancient eukaryotes. Nature 405:635–637.

    Lampe, D. J., M. E. Churchill, and H. M. Robertson. 1996. A purified mariner transposase is sufficient to mediate transposition in vitro. EMBO J. 15:5470–5479.

    Lampe, D. J., D. J. Witherspoon, F. N. Soto-Adames, and H. M. Robertson. 2003. Recent horizontal transfer of mellifera subfamily mariner transposons into insect lineages representing four different orders shows that selection acts only during horizontal transfer. Mol. Biol. Evol. 20:554–562.

    Langin, T., P. Capy, and M. J. Daboussi. 1995. The transposable element impala, a fungal member of the Tc1-mariner superfamily. Mol. Gen. Genet. 246:19–28.

    Lidholm, D. A., G. H. Gudmundsson, and H. G. Boman. 1991. A highly repetitive, mariner-like element in the genome of Hyalophora cecropia. J. Biol. Chem. 266:11518–11521.

    Lidholm, D. A., A. R. Lohe, and D. L. Hartl. 1993. The transposable element mariner mediates germline transformation in Drosophila melanogaster. Genetics 134:859–868.

    Lohe, A. R., D. De Aguiar, and D. L. Hartl. 1997. Mutations in the mariner transposase: the D,D(35)E consensus sequence is nonfunctional. Proc. Natl. Acad. Sci. USA 94:1293–1297.

    Mamoun, C. B., I. Y. Gluzman, S. M. Beverley, and D. E. Goldberg. 2000. Transposition of the Drosophila element mariner within the human malaria parasite Plasmodium falciparum. Mol. Biochem. Parasitol. 110:405–407.

    Maruyama, K., K. D. Schoor, and D. L. Hartl. 1991. Identification of nucleotide substitutions necessary for trans-activation of mariner transposable elements in Drosophila: analysis of naturally occurring elements. Genetics 128:777–784.

    Medhora, M., K. Maruyama, and D. L. Hartl. 1991. Molecular and functional analysis of the mariner mutator element Mos1 in Drosophila. Genetics 128:311–318.

    Nei, M., and W. H. Li. 1979. Mathematical models for studying genetic variation in terms of restriction endonucleases. Proc. Natl. Acad. Sci. USA 76:5269–5273.

    Nielsen, R., and Z. Yang. 1998. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148:929–936.

    Nuzhdin, S. V., and D. A. Petrov. 2003. Transposable elements in clonal lineages: lethal hangover from sex. Biol. J. Linn. Soc. 79:33–41.

    Oosumi, T., W. R. Belknap, and B. Garlick. 1995. Mariner transposons in humans. Nature 378:672.

    Robertson, H. M. 1997. Multiple Mariner transposons in flatworms and hydras are related to those of insects. J. Hered. 88:195–201.

    ———. 2002. Evolution of DNA transposons in eukaryotes. Pp. 1093–1110 in A. M. Lambowitz, ed. Mobile DNA II. ASM Press, Washington, DC.

    Robertson, H. M., and M. L. Asplund. 1996. Bmmar1: a basal lineage of the mariner family of transposable elements in the silkworm moth, Bombyx mori. Insect Biochem. Mol. Biol. 26:945–954.

    Robertson, H. M., and E. G. MacLeod. 1993. Five major subfamilies of mariner transposable elements in insects, including the Mediterranean fruit fly, and related arthropods. Insect Mol. Biol. 2:125–139.

    Robertson, H. M., F. N. Soto-Adames, K. K. Walden, R. M. Avancini, and D. J. Lampe. 2002. The mariner transposons of animals: horizontally jumping genes. Pp. 173–185 in C. I. Kado, ed. Horizontal gene transfer. Academic Press, San Diego, Calif.

    Rozas, J., and R. Rozas. 1999. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174–175.

    Rubin, E. J., B. J. Akerley, V. N. Novik, D. J. Lampe, R. N. Husson, and J. J. Mekalanos. 1999. In vivo transposition of mariner-based elements in enteric bacteria and mycobacteria. Proc. Natl. Acad. Sci. USA 96:1645–1650.

    Shao, H., and Z. Tu. 2001. Expanding the diversity of the IS630-Tc1-mariner superfamily: discovery of a unique DD37E transposon and reclassification of the DD37D and DD39D transposons. Genetics 159:1103–1115.

    Sherman, A., A. Dawson, C. Mather, H. Gilhooley, Y. Li, R. Mitchell, D. Finnegan, and H. Sang. 1998. Transposition of the Drosophila element mariner into the chicken germ line. Nat. Biotech. 16:1050–1053.

    Shuter, B. J., J. E. Thomas, W. D. Taylor, and A. M. Zimmerman. 1983. Phenotypic correlates of genomic DNA content in unicellular eukaryotes and other cells. Am. Nat. 122:26–44.

    Swofford, D. L. 1998. PAUP*: phylogenetic analysis using parsimony (*and other methods). Sinauer Associates, Sunderland, Mass.

    Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.

    Torti, C., L. M. Gomulski, A. R. Malacrida, P. Capy, and G. Gasperi. 1997. Genetic and molecular investigations on the endogenous mobile elements of non-drosophilid fruitflies. Genetica 100:119–129.

    Vanacova, S., D. R. Liston, J. Tachezy, and P. J. Johnson. 2003. Molecular biology of the amitochondriate parasites, Giardia intestinalis, Entamoeba histolytica and Trichomonas vaginalis. Int. J. Parasitol. 33:235–255.

    Vos, J. C., I. De Baere, and R. H. Plasterk. 1996. Transposase is the only nematode protein required for in vitro transposition of Tc1. Genes Devel. 10:755–761.

    Wickham, S. A., and D. H. Lynn. 1990. Relation between growth rate, cell size, and DNA content in colpodean ciliates (Ciliophora:Colpodea). European J. Protistol. 25:345–352.

    Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13:555–556.

    Yang, Z., R. Nielsen, N. Goldman, and A. M. Pedersen. 2000. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449.

    Zhang, J. K., M. A. Pritchett, D. J. Lampe, H. M. Robertson, and W. W. Metcalf. 2000. In vivo transposon mutagenesis of the methanogenic archaeon Methanosarcina acetivorans C2A using a modified version of the insect mariner-family transposable element Himar1. Proc. Natl. Acad. Sci. USA 97:9665–9670.

    Zhang, L., A. Dawson, and D. J. Finnegan. 2001. DNA-binding activity and subunit interaction of the mariner transposase. Nucleic Acids Res. 29:3566–3575.

    (Joana C. Silva*, Felix Ba)