Recurrent Exon Shuffling Between Distant P-Element Families

Molecular Biology and Evolution, 2003, No. 2 (http://www.100md.com)

     Institut Jacques Monod, Dynamique du Génome et Evolution, CNRS–Universités PM Curie et D. Diderot, Paris, France

    Abstract

    Two independent stationary P-related neogenes have been previously described in the Drosophila obscura species group and in the Drosophila montium species subgroup. In Drosophila melanogaster, P-transposable elements can encode an 87 kDa transposase and a 66 kDa repressor, but the P-neogenes have only conserved the capacity to encode a 66 kDa repressor-like protein specified by the first three exons. We have previously analyzed the genomic modifications associated with the transition of a P-element into the montium P-neogene, the coding capacity of which has been conserved for around 20 Myr (Mol. Biol. Evol. 14:1132–1144). Here we show that the P-neogene of some species of the montium subgroup presents a new structure involving the capture of an additional exon from a very distant P-element subfamily. This additional exon is inserted either upstream or downstream of the first exon of the P-neogene. As a result of alternative splicing, these modified neogenes can produce, in addition to the repressor-like protein, a new protein which differs only by the NH2-terminal region. We hypothesize that this protein diversity within an organism results in a functional diversification due to the selective advantage associated with the domestication of the P-neogene in these species. Moreover, the autonomous P-element which provides the additional exons is still present in the genome. Its nucleotide sequence is more than 45% distant from the previously defined P-type elements (M-type, O-type, T-type) and defines a new P-type element subfamily referred to as the K-type.

    Key Words: P-element • transposon • exon shuffling • molecular domestication • Drosophila montium subgroup

    Introduction

    The increase of transposons within a species is due primarily to their ability to replicate and not to a selective advantage for the host. These selfish and complex pieces of DNA harbor a powerful potential repertoire of new functional genetic abilities; consequently, if the host genome succeeds in domesticating such transposable elements (TEs) by taming their "anarchic behavior," these repetitive DNAs could be responsible for important evolutionary innovations. Transposable elements can be a factor in gene evolution not only by supplying cis regulatory domains to host genes but also by coding novel cellular functions. The transition from a genomic parasite to a stable integrated gene that is useful to the host has been described as "molecular domestication". Recently, a scan of the draft human genome sequence has identified at least 47 genes probably derived from transposable elements. It is notable that 43 of these 47 TE-derived genes derive from DNA transposons, despite the fact that this class represents only a small proportion of the interspersed repeats in the human genome.

    The first two examples of a molecular transition of a DNA transposon coding sequence into a stable integrated host gene were provided by studies on the Drosophila P-transposable element family. Indeed, stationary P-element–related neogenes have been discovered, one in a species belonging to the obscura species group, and the other in a member of the Drosophila montium species subgroup. Both belong to the same P-subfamily, the T-type. Although the functional properties of the P-element-derived neogenes in their respective hosts are still unknown, this system provides the first example of multiple independent acquisition of the same type of TE-derived coding section in Drosophila evolution.

    Autonomous P-transposable elements were initially discovered in Drosophila melanogaster, then in several other distant Drosophila lineages. The molecular structure of the canonical P-element includes four exons (numbered 0 to 3) encoding two proteins produced by alternative splicing of the primary transcript: an 87-kDa transposase (exons 0 to 3), and a 66-kDa protein (exons 0 to 2) which acts as a repressor of transposition. The third intron is spliced exclusively in the germline and thus limits transposase synthesis to this tissue.

    FIG. 1. Schematic representations of the organization of the P-autonomous elements and of the montium P-neogenes. The gene structures are represented with their products where these have been identified. (A) Autonomous P-element as described in D. melanogaster. (B) Standard P-neogene as described in D. tsacasi. (C) Rearranged montium P-neogene as described in D. bocqueti. On the right, Northern blot on adult transcripts hybridized with a riboprobe spanning exons 1 and 2. (D) Rearranged montium P-neogene as described in D. vulcana. (E) New type of autonomous P-element as cloned from D. bocqueti. Boxes correspond to exons and their corresponding regions of the proteins (open boxes specific to the canonical P-element; gray boxes specific to standard montium P-neogene; angled bar boxes specific to the K-type P-element). The additional intron present in exon 3 of the K-type P-element is shown as a triangular insertion.

    In the stationary P-element–related neogenes, the terminal inverted repeats are missing or restricted to a skeleton and the coding region lacks the transposase-specific exon 3. In the obscura species group, the neogenes are tandemly repeated in clusters (10 to 50 copies), but in the species of the montium subgroup they are present in a single copy. These P-neogenes probably derive from previously mobile P-elements which have undergone, in the course of the transposition at the neogene genomic location, structural modifications causing their immobilization and changing their cis-regulatory section. Although both the obscura and the montium P-derived neogenes are transcribed in adult flies into polyadenylated RNAs that encode P-repressor-like protein, the 5' regulatory sections that drive the expression have different origins. In the case of the obscura P-neogene, a novel promoter region evolved from the insertion of unrelated transposons. By contrast, the 5' regulatory section of the montium P-neogene might have evolved from an intergenic sequence flanking the P-element. Surprisingly, the montium P-neogene contains a new exon (exon –1) and a new intron (intron –1) upstream of the original P-sequence insertion site. Thus, the two P-repressor-like neogenes have recruited flanking genomic sequences as new regulatory regions that may result in different expression patterns leading to distinct novel functions of the proteins.

    In the present article we describe two independent events of exon shuffling that have taken place within the montium P-neogene. They have resulted in the capture of an additional exon from a very distant P-element subfamily described here for the first time. Each novel structure of the P-neogenes encodes two putative proteins sharing the same COOH-terminal region but strongly divergent in their NH2-terminal part. We hypothesize that this protein diversity within an organism results in functional diversification.

    Materials and Methods

    Fly Stock Sources

    Stock flies were obtained from the CNRS Laboratoire Populations, Génétique et Evolution, Gif-sur-Yvette, France.

    DNA Hybridization Analysis and Cloning

    Genomic DNA was digested with restriction enzymes according to the manufacturers' instructions. Restriction fragments were separated by electrophoresis in agarose gels, and then transferred onto a nitrocellulose membrane (Schleicher and Schuell) according to standard protocols. The probes used were synthesized by polymerase chain reaction (PCR), either from the cloned neogene P-boc for the probe specific to exon 0' or from the cloned K-boc-P (present work) as a template for the probe specific to exon 3. The primers 1359 (5'-TGTGGGAAAAATCCTTAGAATGC-3') and 1632 (5'-CTAGATGATAGTTGTTGCA-3') yield an amplified fragment of 293 bp specific to exon 0' of the P-boc neogene (see Results and fig. 1C). The primer 1938 (5'-CATTCACATTTTTCGCAGCC-3') and the Reverse primer, which belongs to the polylinker of the TA-cloning vector on the 3' side of K-boc-P, yield an amplified fragment of 1.1 kb specific to the region of exon 3. Probes were labeled with ³²P using the random primed kit (Amersham). Prehybridization and hybridization conditions were 6x SSC, 5x Denhardt's solution, 0.5% SDS, and 150 µg/ml of salmon sperm DNA at 65°C, and washing was done twice at 65°C in 2x SSC and 0.1% SDS.

    RNA Isolation and Northern Analysis

    Total RNA was isolated from adults using RNAzol reagent (Bioprobe system). Poly(A)⁺ RNA was purified through an oligo(dT) column, separated by electrophoresis in a 1.3% agarose formaldehyde gel, and transferred onto a nitrocellulose membrane.

    RT-PCR Experiments

    Reverse transcription (RT) of total RNA and subsequent PCR were carried out with the OneStep RT-PCR kit (Qiagen) according to the supplier's recommendations. The primers used to detect transcripts of the P-boc neogene are shown in figure 1C. From the mRNAs, the primers boc1 (5'-GCATTTTGATGCGTCCCAGTGG-3') and boc2 (5'-GTCTTGGCAGGGCGTTTGGC-3') were expected to amplify a product of 437 bp; the primers boc3 (5'-GACACACATTTCAAAGCATCGG-3') and boc4 (5'-ACTGCTCGAGCTGCTGACGC-3') were expected to amplify a product of 248 bp; and the primers boc1 and boc4 gave a product of 261 bp. The amplified products were cloned into the pCR2 vector from the Topo TA-cloning kit (Invitrogen) and introduced into Escherichia coli INVαF' competent cells. Plasmid DNA was prepared for sequencing with the QIAprep kit (Qiagen). Automatic sequencing was done with the ABI Prism BigDye Terminator Cycle Sequencing Ready Reaction kit (Applied Biosystems).

    Sequence Analysis

    Sequences used in this study are listed, together with their accession numbers, in table 1. Nucleotide and amino acid alignments of autonomous P-elements were made using the PILEUP program with the default options and then optimized by hand. Pairwise distance matrices were inferred using the Kimura correction methods. Phylogenetic analysis was performed by the Neighbor-Joining method following the procedure indicated in the text.
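
    The distance correction mentioned above can be illustrated with a minimal sketch. The text does not state which Kimura correction was applied to the nucleotide alignments, so the code below assumes the two-parameter model, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences. The function name and the handling of gaps (columns with a gap or ambiguous base are skipped) are illustrative choices, not the authors' procedure.

```python
import math

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def kimura_2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

    Applied to every pair of aligned elements, such a function would fill a pairwise distance matrix of the kind used as input for a Neighbor-Joining analysis.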

    Table 1 P-Sequences Used in This Study.

    Results

    Exon Insertion Events in the montium Stationary P-Neogene

    In a previous study we cloned and totally or partially sequenced 12 of 18 montium P-neogenes. In seven species (D. bicornuta, D. davidi, D. jambulina, D. nikananu, D. seguyi, D. serrata, D. tsacasi), the size of the P-neogene is consistent with the size expected from a P-neogene similar to that described in D. tsacasi. In the five other species (D. bakoue, D. bocqueti, D. burlai, D. malagassya, D. vulcana), the size of the P-neogenes is greater than expected, suggesting the presence of DNA insertions. The P-neogenes of D. bocqueti (P-boc) and D. vulcana (P-vul) have been entirely sequenced (accession numbers AF169142 and AY116625).

    Insertion of a New Coding Exon Downstream of Exon 0 of the P-Neogene of Drosophila bocqueti

    A comparison of the structures of the D. tsacasi and D. bocqueti P-neogenes shows that an immobilized and internally deleted P-element is inserted inside the intron (0, 1) separating exon 0 and exon 1 in the D. bocqueti P-neogene. This P-sequence insertion is 556 bp long (accession number AF169142 from nucleotides 1049 to 1604). It is flanked by a direct 8 bp duplication corresponding to the duplication of the target site, with one mismatch. The 31 bp of the 3' terminal inverted repeat (TIR) are 87% identical to the sequence of the D. melanogaster P-mobile element TIR. The first 13 bp of the 5' TIR are missing. This internal insertion retains an intact open reading frame (ORF) corresponding to exon 0 of the canonical P-element. Hereafter, this insertion will be called InsPboc and its exon, exon 0'. The identity between exon 0' and the first coding exon (exon 0) of the P-boc neogene is 54.4% and 43.3% at the nucleotide and amino-acid levels, respectively. Northern blot analysis was performed on adult poly(A)⁺ RNA with a riboprobe obtained from the subcloned region of exons 1 and 2 of the P-tsa neogene. The probe was synthesized using T7 RNA polymerase and labeled with [³²P]UTP. As shown in figure 1C, a 2.5-kb transcript and a 2.1-kb transcript were detected. The difference between the sizes of the two transcripts corresponds to that expected if alternative splicing occurs, joining either exon 0 to exon 0' and exon 0' to exon 1, or exon 0 to exon 1. The complete RNA processing results in two mRNAs: one including exons –1, 0, 0', 1 and 2 (2.5 kb) and the second including exons –1, 0, 1 and 2 (2.1 kb). As the probe used for the Northern blot covers the same part of the two transcripts, the difference in intensity between them probably reflects a difference in their abundance in adults. This alternative splicing was confirmed by RT-PCR. Transcripts were extracted from adults and the cDNA was synthesized as described in Materials and Methods. The primers designed for the cDNA amplification are shown in figure 1C. The sequences of the amplified products confirm that the alternative splice uses the donor and acceptor splicing sites corresponding to those in the canonical P-transposable element.

    The sequence of the 2.1-kb transcript has the coding capacity for a protein 574 amino acids long. Hereafter this protein will be called repressor-like 1 (RL1). The 2.5-kb transcript could be translated from the conventional start of translation present either in exon 0 or in exon 0'. Translation initiated from exon 0 ceases at the beginning of exon 0' because of the presence of a stop codon (the splicing between exon 0 and exon 0' does not conserve the phase in exon 0'). In contrast, translation initiated from the conventional AUG of exon 0' leads to a protein of 570 amino acids, which will hereafter be called repressor-like 2 (RL2).

    A similar structure is found in D. burlai (accession number AY116626), a sibling species of the bocqueti complex of species. In this species, the P-neogene contains an insertion of 501 bp, inserted at the same site as in D. bocqueti, indicating that the primary insertion event took place in a common ancestor of the two species. This insertion, hereafter called InsPbur, presents TIRs which have the same characteristics as those of InsPboc, except for a 7-bp insertion inside the 3' TIR. Thus, it cannot be trans-mobilized. InsPbur presents an ORF of 93 amino acids showing 92.5% identity with exon 0' of InsPboc. The identities between exon 0' of InsPbur and exon 0 of the P-bur neogene are 51.5% and 42.2% at the nucleotide and amino-acid levels, respectively. Moreover, the sequence analysis shows the conservation of the same splice sites experimentally determined in the P-boc neogene. Consequently, the P-bur neogene would provide two proteins with 96.5% and 95.3% identity with the corresponding RL1 and RL2 proteins, respectively, of the P-boc neogene.
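
    The identity values quoted in this section are percentages of matching positions between two aligned sequences. As a minimal sketch of how such a figure can be computed, the function below counts identical residues over the aligned columns. The function name and the choice to exclude gapped columns from the denominator are assumptions made for illustration; the text does not say how gaps were treated.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences (nucleotide or protein)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

    The same function works at the nucleotide and at the amino-acid level, which is how the two identity values reported for each pair of exons would be obtained from the corresponding alignments.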

    Another Example of Exon Shuffling: Insertion of a New Exon Upstream of Exon 0 of the D. vulcana P-Neogene

    A comparison of the structure of the D. tsacasi P-neogene with that of D. vulcana shows that an internally deleted P-element is inserted inside exon –1 of the D. vulcana P-neogene. This insertion, hereafter called InsPvul, is 350 bp long and has conserved an intact ORF corresponding to exon 0' described above. A skeletal P-element 5' TIR can still be identified in the sequence upstream of this ORF, but no significant identity with a 3' TIR is detectable in the downstream region. The nucleotide comparison between the InsPvul coding sequence and exon 0 of the P-vul neogene shows an identity of 51.1%. The structural similarity between InsPboc and InsPvul and their high nucleotide sequence identity (83.9%) make it possible to deduce the putative transcripts of the P-vul neogene from the splicing sites experimentally identified for the P-boc neogene (see Discussion).

    The P-neogenes of D. bakoue and D. malagassya have been partially sequenced; upstream of exon 0, they present the same insertion as the P-vul neogene, located at the same target site (data not shown). These two species belong to the same complex of species as D. vulcana (the bakoue complex of species). This indicates that this insertion event occurred in their common ancestor. The additions of exons into the P-neogenes described above are not accompanied by any other structural modifications. It is remarkable that, as shown in figure 2, the sequence upstream of exon –1 is highly conserved when compared to the promoter region in the P-neogene of D. tsacasi.

    FIG. 2. Conservation of the promoter region: nucleotide sequence alignments of the 5' region of the P-neogenes with or without exon duplication. Gray box: promoter region of the P-neogene in D. tsacasi; the arrow indicates the start of transcription. Names of sequences are defined in table 1. The identical nucleotides are represented by dashes. The nonconserved nucleotides are in lower case.

    Identification of the Exon 0' Master Copy

    The nucleotide divergences between the insertions InsPboc or InsPvul and the numerous P-sequences registered in the data banks are all greater than 35%, implying that they do not belong to a previously described P-element subfamily. Moreover, each of them could result from the insertion of a complete P-element, followed by large deletions that left only a region including the full coding region of the first exon. Because of their identity (83.9%), these insertions should derive from the same P-element subfamily. These results support the hypothesis that the genomes of D. bocqueti, D. vulcana, and their related species harbor an active P-element family which is at the origin of the exons 0' identified in several montium P-neogenes.

    Southern blot experiments were performed with genomic DNA from six species belonging to the montium subgroup (D. bocqueti, D. burlai, D. kikkawai, D. nikananu, D. tsacasi, and D. vulcana). DNA samples were digested with Pst I endonuclease, and after electrophoresis the restriction fragments were bi-transferred onto nitrocellulose membranes. One filter was hybridized with the exon 0'–specific fragment amplified with the primers 1359 and 1632 from the clone containing the P-boc neogene as a template (see Materials and Methods). A number of hybridization signals are present in D. bocqueti, as well as in other species (fig. 3A), showing that the inserts InsPboc and InsPvul belong to a repeated dispersed P-element family. In an attempt to isolate P-elements at the origin of exon 0', a long-range PCR amplification was performed on D. bocqueti DNA as a template with a primer (5'-CATAATGGAATAACTATAAGGTGG-3') corresponding to the first 24 bp of the 3' TIR sequence of InsPboc. Full-length and deleted P-elements have been cloned by the TA-cloning method (Invitrogen) from PCR products. Some have been sequenced. The sequence of a complete P-element (accession number AY116624), described in figure 4, has the coding capacity of an autonomous P-element. This element is called the K-boc-P-element (Kenya-bocqueti P-element, for the D. bocqueti strain originating from Kenya). Six other K-boc sequences are partially sequenced. The divergence between them is less than 5%. They are available on request. The K-boc-P-element is 3300 bp long and its termini are formed by 31 bp inverted repeats. The difference in length between K-boc-P and the canonical P-element results from two features: (1) the intron between exon 0 and exon 1 is unusually long in K-boc-P (264 bp as opposed to only about 50 bp in the other P-elements), and (2) exon 3 is interrupted by an additional 172-bp intron. However, the K-boc-P-element shares a number of structural features with the autonomous P-elements from other species (D. melanogaster, D. bifasciata, S. pallida). Subterminal inverted repeats (SIRs) of 10 bp (positions 33–42 and 3259–3268) and 11 bp with one mismatch (positions 127–137 and 3161–3171) are found in the 5' and 3' noncoding regions. These locations correspond to those of SIRs in the P-elements of the other species, thus implying a functional equivalence. Moreover, exon 1, as in the D. melanogaster and Scaptomyza pallida P-elements, presents inverted repeats of 17 bp separated by 29 bp (positions 942–958 and 988–1004). The consensus 5'- and 3'-splice sites of the exons are conserved, and the additional intron inside exon 3 preserves the coding capacity of the K-boc-P-element. The putative protein is 721 amino acids long and has a molecular weight of 83 kDa. It is remarkable that Cys, His, Arg, Lys, and Trp are over-represented in the first 70 amino acids of the N-terminal section (35.7% compared to 17.5% in the rest of the protein). Moreover, the CCHC putative metal-binding site present in the canonical P-element can be recognized at the same position in the K-boc-P protein. These results suggest that the features of DNA-binding domains are present in the N-terminal section of the putative transposase of the K-boc-P-element. Furthermore, by comparison with the D. melanogaster P-element, other functionally important sections are also conserved: the three leucine-zipper motifs are found at the same locations, as is the helix-turn-helix motif, which shows only four mismatches out of 19 residues.
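
    The over-representation of Cys, His, Arg, Lys, and Trp in the N-terminal section is a simple composition comparison between a window and the remainder of the protein. The sketch below shows one way to compute it. The function names, the one-letter residue set "CHRKW", and the toy sequence in the usage note are illustrative assumptions; the K-boc-P protein sequence itself is not reproduced here.

```python
def residue_fraction(protein, residues, start=0, end=None):
    """Fraction of positions in protein[start:end] occupied by the given residues."""
    segment = protein[start:end].upper()
    if not segment:
        raise ValueError("empty segment")
    wanted = set(residues.upper())
    return sum(1 for aa in segment if aa in wanted) / len(segment)


def compare_n_terminus(protein, residues="CHRKW", window=70):
    """Return (fraction in the first `window` residues, fraction in the rest)."""
    n_terminal = residue_fraction(protein, residues, 0, window)
    remainder = residue_fraction(protein, residues, window)
    return n_terminal, remainder
```

    For example, `compare_n_terminus("CHRKW" * 2 + "A" * 10, window=10)` returns `(1.0, 0.0)` for this artificial sequence. With a 70-residue window, the reported 35.7% corresponds to 25 of the 70 residues.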

    FIG. 3. Southern blot analysis of genomic DNA from six species of the montium subgroup. All DNA samples were digested by PstI and probed (A) with the 293-bp fragment PCR-amplified with primers 1359 and 1632 from the P-boc clone (see fig. 1C) and (B) with the fragment PCR-amplified with primer 1938 and the reverse primer (see Materials and Methods) from the K-boc-P clone. Each lane contained 10 µg of DNA and the gel was bi-transferred onto the nitrocellulose filters A and B. 1: D. tsacasi, 2: D. bocqueti, 3: D. burlai, 4: D. vulcana, 5: D. kikkawai, 6: D. nikananu.

    FIG. 4. Nucleotide and derived amino acid sequences for the K-boc-P-element of D. bocqueti (accession number AY116624). The terminal inverted repeats are identified with solid arrows. The 10-bp transposase binding sites that have been defined at both ends of the D. melanogaster P-element are boxed. Internal inverted repeats are underlined with broken and shaded arrows. The intronic sequences are indicated by lower case. The extra intron within exon 3 is underlined. The amino acid residues constituting the three leucine zippers are circled. The helix-turn-helix motif is boxed.

    The second filter from the bi-transferred DNA samples described above was hybridized with a PCR product synthesized from exon 3, specific to the transposase, of the cloned K-boc-P-element. As shown in figure 3B, a number of hybridization signals are detected in D. bocqueti, D. burlai, D. nikananu, D. tsacasi, and D. vulcana (but not in D. kikkawai), indicating the presence of numerous P-elements containing the exon 3 specific to the transposase coding sequence.

    To define the relation between the K-boc-P-element and the major P-element subfamilies as they have been previously characterized in D. ambigua (T-type), D. bifasciata (M-type and O-type), D. helvetica (M-type), D. melanogaster (M-type), and Scaptomyza pallida (M-type), the nucleotide and amino acid alignments of these elements together with the K-boc-P-element were performed using the Pileup program of the GCG package (Madison, Wis.) and improved manually. The pairwise distances are shown in table 2. The K-boc-P-element is very distant from all other P-elements (>0.45): this new full-length P-element belongs to a so far unidentified P-subfamily. We define this subfamily as the K-type.

    Table 2 Nucleotide and Amino Acid Distances Between Seven Autonomous P-Elements.

    A Neighbor-Joining analysis performed on the putative proteins of these P-sequences and two additional P-sequences from more distant species, Lucilia cuprina (Calliphoridae) and Musca domestica (Muscidae), produces a dendrogram in which the K-boc-P-element groups with the elements from the Drosophilidae (fig. 5). Clark and Kidwell have performed an extensive phylogenetic analysis of P-sequences from 40 species in the Drosophilidae using a partial P-sequence (449 bp from exon 2). This analysis provided a cladogram in which 16 clades are well supported. To define the position of the K-boc element relative to these P-element subfamilies, a Neighbor-Joining analysis was performed using this partial internal sequence. Only one or two P-sequences representative of each clade defined by Clark and Kidwell's work were included in the analysis. In the new cladogram (fig. 6), the K-boc-P-element does not group inside any previously identified clade, confirming that the K-boc-P-element does not belong to one of the subfamilies already described.

    FIG. 5. Relationship between the K-boc-P-element and other P-elements based on their amino acid sequences. Multiple alignments were performed by ClustalW. The dendrogram was created by Neighbor-Joining analysis of a matrix for 497 informative sites using the Phylo_win program. The scale bar indicates genetic distances in units of residue substitution per site. The bootstrap values are given for each node (they correspond to the percent of 1000 replications). P-sequences are noted by code name (see table 1). ^aAmino acid sequence deduced from the degenerate P-sequence.

    FIG. 6. Relationship between the K-boc-P-element and P-nucleotide sequences from the family Drosophilidae. Each P-sequence is named by the species from which it derives (note that some genomes retain more than one P-subfamily). Multiple alignments were performed by ClustalW. The dendrogram was created by Neighbor-Joining analysis of a matrix for 385 informative sites using the Phylo_win program. The scale bar indicates genetic distances in units of residue substitution per site. Numbers show the bootstrap percentages (1000 replications). Letters refer to the clades as defined by Clark and Kidwell.

    The position and coding capacity of exons 0' suggest that the rearranged P-neogenes are under host-level selection. Direct evidence is provided by a test for selection at the sequence level. The pairwise comparisons of the substitution rates between exon 0 of the K-boc full-length P-element and exon 0' of the P-neogenes in D. bocqueti, D. burlai, and D. vulcana are presented in table 3 (not enough sequence data were available for the neogenes of D. malagassya and D. bakoue). All significant results (P < 0.05) are due to d_N/d_S < 1; that is, they show evidence of conservative selection. These results are in accordance with those previously obtained using partial sequences of the P-neogenes of D. davidi, D. tsacasi, and D. kikkawai. As very few changes occur between exon 0' of RL2bur and exon 0' of RL2boc, the test has less power than in the other comparisons, giving a nonsignificant statistic.
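
    The distinction between synonymous and nonsynonymous changes that underlies d_N/d_S can be illustrated with a deliberately crude sketch. It only classifies each differing codon pair as synonymous or nonsynonymous under the standard genetic code; it does not normalize by the numbers of synonymous and nonsynonymous sites, does not correct for multiple substitutions, and is not the statistical test used in the text (whose method is not specified here). All names are illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def count_codon_changes(cds_a, cds_b):
    """Count differing codons as (synonymous, nonsynonymous).

    Both inputs must be codon-aligned coding sequences of equal length.
    Codons containing gaps or ambiguous bases are ignored.
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be codon-aligned and of equal length")

    a_up, b_up = cds_a.upper(), cds_b.upper()
    synonymous = nonsynonymous = 0
    for pos in range(0, len(a_up), 3):
        codon_a, codon_b = a_up[pos:pos + 3], b_up[pos:pos + 3]
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base in one of the codons
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous
```

    A proper estimate of d_N and d_S divides such counts by the numbers of nonsynonymous and synonymous sites and applies a distance correction, which is why a ratio below 1 is interpreted as a deficit of amino-acid-changing substitutions.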

    Table 3 Evidence of Selection in Comparison of the RL2 Products of the P-Neogene and the K-boc-P-Element (first coding exon).

    Discussion

    Exon Shuffling Increases the Diversity of Proteins Encoded by P-Sequences

    The stationary P-neogene of the montium subgroup has been detected in the 18 species of the montium subgroup in which it has been sought, but not in species belonging to any other related subgroup. This result strongly suggests that the domestication event took place in the ancestor of the montium subgroup more than 20 MYA. The primordial P-element insertion event at the origin of the P-neogene was accompanied by a large 3' terminal deletion covering the last exon and by the capture of a new promoter from the 5' flanking region, associated with a short noncoding exon and a new intron. Activation by the captured promoter might have resulted in a different expression pattern, thus initiating a novel function for the protein encoded by exons 0–2. This protein, as discussed earlier, has been named repressor-like protein, or RL, since in the canonical P-element the corresponding protein acts as a repressor of P-transposition. Moreover, through successive speciation events, the orthologous P-neogenes have undergone several independent modifications, such as the exon acquisition reported in the present work. The P-boc and P-vul neogenes have retained a new exon 0 (exon 0'), respectively downstream and upstream of the exon 0 of their neogene. In both cases, the new exon originates from the new K-type P-element subfamily.

    Northern blot and RT-PCR analyses show the presence of two transcripts for the P-boc neogene. Nucleotide comparisons between the P-boc and the P-vul neogenes reveal a high conservation of the acceptor and donor sites at the boundaries of exon 0, exon 0', and exon 1. On the basis of the sequence of the P-vul neogene, we can deduce the existence of transcripts encoding two putative proteins analogous to the RL1 and RL2 proteins of the P-boc neogene, initiated from the canonical P start codon present in exon 0 and exon 0', respectively. It must be emphasized that three out of four splice sites used to join exons 0, 0', and 1 are similar to the functional splice sites of the P-element. Conversely, the last splice site is a cryptic acceptor site upstream of exon 0' (P-boc) and exon 0 (P-vul). In both cases, the splicing of exons 0' and 1 specific to the RL2 protein has probably been functional from the outset. In each species, the two proteins differ only in their NH2-terminal region.

    Interspecific comparison shows that the similarity is 87.1% between the two RL1 proteins and 85.8% between the two RL2 proteins. These values strongly suggest that the exon shuffling has been associated with a selective advantage for the host. Evidence that RL2 proteins are under host-level selection is given by an excess of synonymous versus nonsynonymous substitutions in their coding sequence. These d_N/d_S ratios, being lower than 1, suggest that the protein region encoded by exon 0' of the neogenes has conserved functional characteristics similar to those of functional K-boc-P-elements. Moreover, the pairwise comparisons between the exons 0' of the P-neogenes are in accordance with a host selective pressure.

    Preliminary experiments showed that the RL1 and RL2 proteins from P-boc are produced in vivo in D. melanogaster transgenic flies (unpublished data), but the functions of these two proteins are still unknown. Their common region (exons 1 and 2) contains the same three leucine zipper motifs and the coiled-coil domains characteristic of the P protein. However, pairwise comparisons between RL1 and RL2 restricted to the region corresponding to the first exon show similarities of 57.6%, 53.8%, and 53.7% in D. bocqueti, D. burlai, and D. vulcana, respectively. Thus, in each species, the neogene products differ strongly in this region. Given that the DNA-binding domain is conserved in the N-terminal region of the two proteins, this amino-acid divergence could indicate a diversification of the DNA-binding specificity of the proteins, which in turn could correspond to a functional differentiation.

    Recurrent Exonic Insertion Inside the montium P-neogenes

    Surprisingly, the first exon of the K-boc-P-elements has been captured twice by the P-neogene in the montium subgroup, once downstream of exon 0 (in the common ancestor of D. bocqueti and D. burlai), and once upstream of exon 0 (in the ancestor of D. vulcana, D. bakoue, and D. malagassya). These two independent events could be due to the tendency of the K-boc-P-element to insert inside the P-neogene and to the selective advantage associated with the production of the chimeric protein RL2. As observed in D. melanogaster, the canonical P-element tends to insert in the 5' end regions of genes. If the K-boc-P-element has the same property, two insertions inside the 5' region of the P-neogene might have occurred independently in the montium subgroup, and the elements could subsequently have undergone internal deletions leaving an intact exon 0. However, a more parsimonious scenario would be that of an insertion of a K-boc-P-element into the intron separating exon 0 and exon 1 of the P-neogene in the common ancestor of these five species, followed by an internal deletion event. This in turn would have been followed by a local transposition just upstream of exon 0 in the ancestral species at the origin of the clade including D. vulcana, D. bakoue, and D. malagassya. Another scenario can be proposed, mutatis mutandis, but with the primary insertion upstream of exon 0. These scenarios are supported by two properties of the D. melanogaster P-element: the homing phenomenon and local transpositions. P-element transposition occurs by a nonreplicative "cut-and-paste" mechanism beginning with an excision of the element and followed at the donor site by a double-strand gap repair according to a process similar to gene conversion. The appearance of double P-elements has been explained by a homing phenomenon: the P-element transposase, which has an affinity for P-elements, may remain attached to the excised element and sometimes helps to target it to another P-element elsewhere in the genome. The insertion target will thus frequently be the copy of the excised element present on the sister chromatid or on the homologous chromosome. This could explain why a significant fraction of P-element transpositions are local and often lead to P-element insertions within or near a second P-element. These local transpositions can represent up to 80% of transposition events, depending on the insertion site. This process has been proposed to be at the origin of nested rearranged double P-elements. In the case studied here, a first event would have inserted a K-boc-P-element directly into the P-neogene and would then have been followed by a local transposition event.

    So far, the K-boc-P-element has been detected only in species belonging to the montium subgroup. It occurs in species in which the P-neogene presents exonic duplications (e.g., D. bocqueti), as well as in species in which the neogene does not have such a duplication (i.e., D. tsacasi). It should be noted that the K-boc-P-element has not been found in the obscura group, in which another type of P-element domestication took place. As shown by the Neighbor-Joining analysis, the K-boc-P-family is very distant from all the other P-families, and it is not possible to speculate on the origin of this new P-subfamily. It is present in D. tsacasi, D. bocqueti, D. burlai, D. vulcana, and D. nikananu, but it has not been detected in D. kikkawai, D. davidi, or D. serrata (data not shown). For the moment, we also do not know whether the patchy distribution inside the montium subgroup results from horizontal transfer events.

    The molecular domestication of P-coding sequences described here, and the two similar events previously described in the montium subgroup and in the obscura group, demonstrate the creative force of a transposable element as an evolutionary motor that can restructure the genome and lead to the acquisition of novel proteins.

    Acknowledgements

    We thank S. Ronsseray and W. Miller for insightful discussions and comments on the manuscript. This work was supported by the Centre National de la Recherche Scientifique (CNRS), the Universities P. and M. Curie and D. Diderot (Institut Jacques Monod, UMR 7592, Dynamique du Génome et Evolution), and the GDR-CNRS 2157 "Evolution des éléments transposables du génome aux populations."

    Literature Cited

    Bellen, H. J., C. H. O'Kane, C. Wilson, U. Grossniklaus, R. K. Pearson, and W. J. Gehring. 1989. P-element-mediated enhancer detection: a versatile method to study development in Drosophila. Genes Dev. 3:1288-1300.

    Clark, J. B., and M. G. Kidwell. 1997. Phylogenetic perspective on P transposable element evolution. Proc. Natl. Acad. Sci. USA 94:11428-11433.

    Daniels, S. B., and A. Chovnick. 1993. P element transposition in Drosophila melanogaster: an analysis of sister-chromatid pairs and the formation of intragenic secondary insertions during meiosis. Genetics 133:623-636.

    Delattre, H., D. Anxolabéhère, and D. Coen. 1995. Prevalence of localized rearrangements vs. transpositions among events induced by Drosophila P element transposase on a P transgene. Genetics 141:1407-1424.

    Dorer, D. R., and S. Henikoff. 1994. Expansions of transgene repeats cause heterochromatin formation and gene silencing in Drosophila. Cell 77:993-1002.

    Eggleston, W. B. 1990. P element transposition and excision in Drosophila: interactions between elements. Ph.D. Thesis, University of Wisconsin, Madison.

    Engels, W. R., D. M. Johnson-Schlitz, W. B. Eggleston, and J. Sved. 1990. High-frequency P element loss in Drosophila is homologue dependent. Cell 62:515-525.

    Galtier, N., M. Gouy, and C. Gautier. 1996. SeaView and Phylo_win, two graphic tools for sequence alignment and molecular phylogeny. Comput. Applic. Biosci. 12:543-548.

    Genetics Computer Group. 1991. Wisconsin sequence analysis package. Version X. Genetics Computer Group, Madison, Wis.

    Golic, K. G. 1994. Local transposition of P element in Drosophila melanogaster and recombination between duplicated elements using a site-specific recombinase. Genetics 137:551-563.

    Hagemann, S., E. Haring, and W. Pinsker. 1996. A new P element subfamily from Drosophila tristis, D. ambigua and D. obscura. Genome 39:978-985.

    Hagemann, S., W. J. Miller, and W. Pinsker. 1992. Identification of a complete P element in the genome of D. bifasciata. Nucleic Acids Res. 20:409-413.

    Hagemann, S., W. J. Miller, and W. Pinsker. 1994. Two distinct P element subfamilies in the genome of D. bifasciata. Mol. Gen. Genet. 244:168-175.

    Haring, E., S. Hagemann, and W. Pinsker. 2000. Ancient and recent horizontal invasions of drosophilids by P elements. J. Mol. Evol. 51:577-586.

    International Human Genome Sequencing Consortium. 2001. Initial sequencing and analysis of the human genome. Nature 409:860-921.

    Kaufman, P. D., and D. C. Rio. 1992. P element transposition in vitro proceeds by a cut-and-paste mechanism and uses GTP as a cofactor. Cell 69:27-39.

    Kaufman, P. D., R. F. Doll, and D. C. Rio. 1989. Drosophila P element transposase recognizes internal P element DNA sequences. Cell 59:359-371.

    Kidwell, M. G., and D. R. Lisch. 2001. Perspective: transposable elements, parasitic DNA, and genome evolution. Evolution 55:1-24.

    Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitution through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.

    Kimura, M. 1983. The neutral theory of molecular evolution. Cambridge University Press, Cambridge.

    Laski, F. A., D. C. Rio, and G. M. Rubin. 1986. Tissue specificity of Drosophila P element transposition is regulated at the level of mRNA splicing. Cell 44:7-19.

    Lee, C. C., Y. M. Mul, and D. C. Rio. 1996. The Drosophila P-element KP repressor protein dimerizes and interacts with multiple sites on the P-element DNA. Mol. Cell. Biol. 16:5616-5622.

    Lee, S. H., J. B. Clark, and M. G. Kidwell. 1999. A P element-homologous sequence in the house fly Musca domestica. Insect Mol. Biol. 8:491-500.

    Lemeunier, F., J. R. David, L. Tsacas, and M. Ashburner. 1986. The melanogaster species group. Pp. 147–256 in M. Ashburner, H. L. Carson, and J. M. Thompson, Jr., eds. The genetics and biology of Drosophila. Academic Press, New York.

    Maniatis, T., E. F. Fritsch, and J. Sambrook. 1982. Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

    Miller, W. J., S. Hagemann, E. Reiter, and W. Pinsker. 1992. P element homologous sequences are tandemly repeated in the genome of D. guanche. Proc. Natl. Acad. Sci. USA 89:4018-4022.

    Miller, W. J., J. F. McDonald, D. Nouaud, and D. Anxolabéhère. 1999. Molecular domestication: more than a sporadic episode in evolution. Genetica 107:197-207.

    Miller, W. J., A. Nagel, J. Bachmann, and L. Bachmann. 2000. Evolutionary dynamics of the SGM transposon family in the Drosophila obscura species group. Mol. Biol. Evol. 17:1597-1609.

    Miller, W. J., N. Paricio, S. Hagemann, M. J. Martinez-Sebastian, W. Pinsker, and R. DeFrutos. 1995. Structure and expression of the clustered P element homologues in Drosophila subobscura and D. guanche. Gene 156:167-174.

    Nouaud, D., and D. Anxolabéhère. 1997. P element domestication: a stationary truncated P element may encode a 66-kDa repressor-like protein in the Drosophila montium species subgroup. Mol. Biol. Evol. 14:1132-1144.

    Nouaud, D., B. Boëda, L. Levy, and D. Anxolabéhère. 1999. A P element has induced intron formation in Drosophila. Mol. Biol. Evol. 16:1503-1510.

    O'Hare, K., and G. M. Rubin. 1983. Structure of transposable elements and the site of insertion and excision in the Drosophila melanogaster genome. Cell 34:25-35.

    Paricio, N. M., M. Perez-Alonso, M. J. Martinez-Sebastian, and R. De Frutos. 1991. P sequences of D. subobscura lack exon 3 and may encode a 66kd repressor-like protein. Nucleic Acids Res. 19:6713-6718.

    Perkins, H. D., and A. J. Howells. 1992. Genomic sequences with homology to the P element of Drosophila melanogaster occur in the blowfly Lucilia cuprina. Proc. Natl. Acad. Sci. USA 89:10753-10757.

    Pinsker, W., E. Haring, S. Hagemann, and W. J. Miller. 2001. The evolutionary life history of P transposons: from horizontal invaders to domesticated neogenes. Chromosoma 110:148-158.

    Rio, D. C., F. A. Laski, and G. M. Rubin. 1986. Identification and immunochemical analysis of biologically active Drosophila P element transposase. Cell 44:21-32.

    Robertson, H. M., and W. R. Engels. 1989. Modified P elements that mimic the P cytotype in Drosophila melanogaster. Genetics 123:815-824.

    Rubin, G. M., M. G. Kidwell, and P. M. Bingham. 1982. The molecular basis of P-M dysgenesis: the nature of induced mutations. Cell 29:987-994.

    Simonelig, M., and D. Anxolabéhère. 1991. A P element of Scaptomyza pallida is active in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 88:6102-6106.

    Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. ClustalW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.

    Tower, J., G. H. Karpen, N. Craig, and A. C. Spradling. 1993. Preferential transposition of Drosophila P-elements to nearby chromosomal sites. Genetics 133:347-349.

    Witherspoon, D. J. 1999. Selective constraints on P-element evolution. Mol. Biol. Evol. 16:472-478.

    Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13:555-556.

    Zhang, P., and A. C. Spradling. 1993. Efficient and dispersed local P element transposition from Drosophila females. Genetics 133:361-375.

    Accepted for publication October 1, 2002. (Danielle Nouaud, Hadi Quesneville, and Dominique Anxolabéhère)
