Long-Term Conservation of Six Duplicated Structural Genes in Cephalopod Mitochondrial Genomes
     Laboratory for Cellular Biochemistry, Department of Molecular Biology, School of Life Science, Tokyo University of Pharmacy and Life Science, Tokyo, Japan

    E-mail: yokobori@ls.toyaku.ac.jp.

    Abstract

    The complete nucleotide sequences of the mitochondrial (mt) genomes of three cephalopods, Octopus vulgaris (Octopodiformes, Octopoda, Incirrata), Todarodes pacificus (Decapodiformes, Oegopsida, Ommastrephidae), and Watasenia scintillans (Decapodiformes, Oegopsida, Enoploteuthidae), were determined. These three mt genomes encode the standard set of metazoan mt genes. However, W. scintillans and T. pacificus mt genomes share duplications of the longest noncoding region, three cytochrome oxidase subunit genes and two ATP synthase subunit genes, and the tRNAAsp gene. Southern hybridization analysis of the W. scintillans mt genome shows that this single genome carries both duplicated regions. The near-identical sequence of the duplicates suggests that there are certain concerted evolutionary mechanisms, at least in cephalopod mitochondria. Molecular phylogenetic analyses of mt protein genes are suggestive, although not statistically significantly so, of a monophyletic relationship between W. scintillans and T. pacificus.

    Key Words: cephalopods; mitochondrial genome; gene duplication; gene rearrangement

    Introduction

    The types and number of genes encoded by metazoan mitochondrial (mt) genomes are well conserved among the metazoan species for which these data are currently available (see Boore [1999]). Most metazoan mt genomes are circular, carrying single copies of 13 protein genes (cox1 to 3 [cytochrome oxidase subunits I to III], nad1 to 6 and 4L [NADH dehydrogenase subunits 1 to 6 and 4L], atp6 and 8 [ATP synthase subunits 6 and 8], and cob [apocytochrome b]), 2 rRNA genes (rrnL and rrnS [large and small subunit ribosomal RNAs]), and 22 tRNA genes (trnA, etc.). Few or no intergenic nucleotides are found, with the exception of long noncoding regions (NCRs) that contain control elements for replication and transcription.

    However, gene organization varies among metazoan species. For example, most molluscan mt genomes reported so far have different gene organizations, and there are large differences in mt gene organization within each class (Hoffmann, Boore, and Brown 1992; Boore and Brown 1994; Hatzoglou, Rodakis, and Lecanidou 1995; Terrett, Miles, and Thomas 1996; Yamazaki et al. 1997; Wilding, Mill, and Grahame 1999; Kurabayashi and Ueshima 2000; Grande et al. 2002; Tomita et al. 2002). In pulmonate land snails, variation in gene organization is found at the level of the superfamily (Yamazaki et al. 1997).

    Multiplication of NCRs has been observed in various metazoan mt genomes (e.g., Kumazawa et al. 1998). In some cases, the duplication of coding regions has also been reported. For example, the oyster Crassostrea gigas mt genome (GenBank/EMBL/DDBJ accession number AF177226) carries two copies of rrnS. The nematode Romanomermis culicivorax has also been reported to carry multiple copies of protein genes (Azevedo and Hyman 1993; Hyman and Azevedo 1996; Hyman, Beck, and Weiss 1988). In these cases, a high sequence similarity between duplicated genes has been observed. On the other hand, gene duplication in which one of the duplicates has become a pseudogene has also been reported. For example, a partial duplication has been reported in the mt genome of the gecko Heteronotia binoei, and one of the duplicates appears to be a pseudogene (Zevering et al. 1991). Sasuga et al. (1999) identified a pseudogene of trnH between nad4 and nad5 in the Loligo bleekeri mt genome, the position at which trnH is located in the Katharina tunicata mt genome (Boore and Brown 1994); the functional trnH of L. bleekeri was found at a different position (Sasuga et al. 1999; Tomita et al. 2002). It has, thus, been proposed that the metazoan mt genome is under strong selective pressure for genome minimization.

    Recently, Tomita et al. (2002) reported the first complete cephalopod mt nucleotide sequence for the L. bleekeri mt genome. The gene content of the L. bleekeri mt genome is the same as that of the typical metazoan mt genome. However, the arrangement of genes within the mt genome is different from that of any other metazoan reported to date. One of the most notable characteristics of the L. bleekeri mt genome is that it contains three near-identical, 500-bp NCRs. These NCRs are not placed on the genome in tandem; instead, their placement seems to be closely related to the gene rearrangement that has taken place in the L. bleekeri mt genome (Tomita et al. 2002). In addition, concerted evolution could be considered as the mechanism underlying the maintenance of high similarity within these NCR sequences.

    To further our understanding of the evolution of mt genome structures in cephalopods, we determined the complete mt genome nucleotide sequences of three cephalopods, Octopus vulgaris (Coleoidea, Neocoleoidea, Octopodiformes, Octopoda, Incirrata, Octopodidae), Todarodes pacificus (Coleoidea, Neocoleoidea, Decapodiformes, Oegopsida, Ommastrephidae), and Watasenia scintillans (Coleoidea, Neocoleoidea, Decapodiformes, Oegopsida, Enoploteuthidae). The genome structure of the O. vulgaris mt genome is rather similar to that of the K. tunicata mt genome. However, the T. pacificus and W. scintillans mt genomes contain long and complicated duplications of long NCRs and six structural genes. We conclude our report with a discussion of the evolution of cephalopod mt genomes.

    Materials and Methods

    Samples

    O. vulgaris, T. pacificus, and W. scintillans were bought at the Tsukiji Fishery Market, Tokyo, Japan. All individuals were caught in the seas around Japan. The identification of T. pacificus and W. scintillans was carried out by Dr. K. Tsuchiya at the Tokyo University of Marine Science and Technology. The identification of O. vulgaris was confirmed by comparison of partial cox1 and cox3 sequences with published octopod sequences, including that of O. vulgaris (Bonnaud, Boucher-Rodoni, and Monnerot 1997; Carlini, Young, and Vecchione 2001).

    DNA Isolation, PCR, Cloning, and Sequencing

    The DNA sequence determination strategies for W. scintillans, T. pacificus, and O. vulgaris mt genomes were the same as those used for the Ciona savignyi mt genome (Yokobori, Watanabe, and Oshima 2003). As an example, a brief description of the methods used to determine the sequence of the W. scintillans mt genome follows. First, parts of cox1, cox3, and cob were amplified by nested PCR, using total DNA as the template with EX-Taq DNA polymerase (TAKARA). The primers used for the amplification of partial sequences of cox1, cox3, and cob are listed in table 1. Amplified fragments were cloned with the TOPO TA cloning kit (Invitrogen) and sequenced using a PRISM 3100 DNA autosequencer (Applied Biosystems). For sequence reactions, BigDye Terminator version 3.1 (Applied Biosystems) was used. Using the sequence information, the remaining parts of the mt genomes were amplified by nested long PCR (the primers used are listed in table 1) with LA Taq DNA polymerase (TAKARA). Among all combinations of PCR primers, four combinations, (A) cox3-3' and cox1-5', (B) cox1-3' and cob-3', (C) cob-5' and cox3-5', and (D) cox1-3' and cox3-5', gave amplified fragments. The lengths of the PCR fragments are approximately 1 kbp and 2 kbp for (A) (fragments AS and AL, respectively), approximately 6 kbp for (B) (fragment B), and approximately 4.5 kbp for (C) and (D) (fragments C and D). The amplified fragments were cloned and sequenced. The two fragments amplified with the primers specific for cox3-3' and cox1-5' have the cox3 gene at one end and the cox1 gene at the other end. However, their gene organizations differ: cox3(3')–trnA–trnN–trnI–nad3–cox1(5') for fragment AS and cox3(3')–trnK–trnR–trnS(gcu)–nad2–cox1(5') for fragment AL.
The gene orders in fragments B, C, and D are cox1(3')–cox2–trnD–atp8–atp6–nad5–trnH–nad4–nad4L–trnT–trnS(uga)–cob(3'), cob(5')–nad6–trnP–nad1–trnL(uaa)–trnL(uag)–rrnL–trnY–trnW–trnG–trnE–NCR–cox3(5'), and cox1(3')–cox2–trnD–atp8–atp6–trnF–trnV–rrnS–trnM–trnC–trnQ–NCR–cox3(5'), respectively. The underlined genes are encoded by the opposite strand. Taking the five fragments together, we identified all known genes, but we also found duplicated long NCRs and six duplicated structural genes. Fragments B and C seem to be neighbors at the ends of the cob gene because there is no additional cob gene in fragments AS, AL, or D. Therefore, the order of these five fragments might be B–C–AS–D–AL–(B) or B–C–AL–D–AS–(B). The gene arrangement of the duplicated regions was confirmed by additional PCR analyses. PCR primers specific for rrnS (on fragment D) and nad3 (on fragment AS), listed in table 1, were synthesized. Fragments were amplified with the nad3-3' primer and the rrnS-3' primer by PCR, but no other combination of PCR primers gave amplified fragments. Therefore, the order of the fragments is thought to be B–C–AS–D–AL–(B) (fig. 1). Furthermore, PCR products (0.5 to 5 kbp) covering the entire W. scintillans mtDNA were amplified (fragments 1 to 9), cloned, and sequenced (fig. 1). The PCR primers were designed from the first version of the complete nucleotide sequence of W. scintillans mtDNA, and their sequences are listed in table 1. For the amplification of long fragments (longer than 3 kbp; fragments 1 to 4), nested long PCR was used. The PCR primers used for the nested PCR are listed in table 1. PCR conditions were as described above. Amplified PCR products (fragments 1 to 8) were cloned as described above. The sequences of both strands of these fragments were determined by primer walking. Fragment 9, once amplified, was purified with the QIAquick PCR purification kit (QIAgen) according to the manufacturer's protocol.
The resulting purified PCR product was then directly sequenced. Similar sequencing strategies were applied for the remaining T. pacificus and O. vulgaris mt genomes (details are not shown).
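
    The reasoning used above to order the five long-PCR fragments can be illustrated with a short sketch. It assumes only what the text states: each fragment is used once, the genome is circular, and neighboring fragments must share the gene found at their adjoining ends. The function name and the data layout are illustrative and are not part of the original analysis.

```python
from itertools import permutations

# End genes of each long-PCR fragment as described in the text:
# (gene at the start of the fragment, gene at the end of the fragment).
FRAGMENT_ENDS = {
    "B": ("cox1", "cob"),
    "C": ("cob", "cox3"),
    "AS": ("cox3", "cox1"),
    "AL": ("cox3", "cox1"),
    "D": ("cox1", "cox3"),
}


def circular_orders(ends, first="B"):
    """Return circular fragment orders in which adjacent fragments share an end gene."""
    others = sorted(name for name in ends if name != first)
    valid = []
    for perm in permutations(others):
        order = (first,) + perm
        # Each fragment must end in the gene with which the next one begins,
        # including the wrap-around from the last fragment back to the first.
        if all(
            ends[order[i]][1] == ends[order[(i + 1) % len(order)]][0]
            for i in range(len(order))
        ):
            valid.append(order)
    return valid


if __name__ == "__main__":
    for order in circular_orders(FRAGMENT_ENDS):
        print("-".join(order) + "-(B)")
```

    Under these assumptions only two circular orders remain, the same two candidates named in the text; the additional nad3/rrnS PCR is what distinguishes between them.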

    Table 1 PCR Primer Sequences Used for the Analysis of the W. scintillans mt Genome

    FIG. 1.— PCR amplification strategy for the W. scintillans mt genome. In the top row, the inferred gene organization of W. scintillans mt genome is shown. Circular genomes are presented linearly to ease comparison of gene organization. Thick lines are the first PCR fragments. Lines with black arrowheads at both ends are long PCR products. The names of the fragments (AL, AS, B, C, and D) are defined in the text. The PCR fragments (1 to 9), indicated by lines with white arrowheads at both ends, are PCR fragments for sequence confirmation as noted in the text. Each tRNA gene is indicated by the letter corresponding to the appropriate amino acid. L1, L2, S1, and S2 indicate trnL(uaa), trnL(uag), trnS(uga), and trnS(gcu), respectively. The protein and rRNA genes encoded by the opposite strand are shown by gray boxes. The tRNA genes encoded by the opposite strand are shown below the column. Long NCRs are indicated by black boxes.

    Identification of protein and rRNA genes was carried out by comparing them with counterparts from L. bleekeri (Tomita et al. 2002) and K. tunicata (Boore and Brown 1994) mt genomes. Identification of tRNA genes was carried out manually, by making visual searches of the cloverleaf structures. The complete nucleotide sequences of W. scintillans, O. vulgaris, and T. pacificus mt genomes were entered into the DDBJ/EMBL/GenBank DNA databases under the accession numbers AB086202, AB158363, and AB158364, respectively.

    Southern Analysis of the W. scintillans mt Genome

    Total DNA of W. scintillans was prepared from either the liver or eggs of a single animal by QIAgen Genometip (QIAgen). W. scintillans total DNA was digested with Apa I, Bgl II, Nco I, Pst I, Spe I, and Xho I. Six genes, cox1, cox3, cob, nad2, nad3, and rrnS, were the targets of Southern hybridization. Labeled probes with DIG (Digoxigenin) (ca. 300 bp each) were synthesized with a DIG-PCR probing kit (Roche Diagnostics). Sequences of PCR primers are as listed in table 1. Nondigested and digested W. scintillans total DNA, 0.5 μg each, was used for 0.6 % agarose gel electrophoresis in TAE, followed by alkali transfer of the DNA to nylon membrane Hybond N+ (Amersham Biosciences). Southern hybridization was performed with UltraHyb (Roche Diagnostics) as described by the manufacturer. A DIG Nucleic Acid Detection kit (Roche Diagnostics) was used for detection as described by the manufacturer.

    Phylogenetic Analyses

    All mt protein genes, except atp8, were used for phylogenetic analyses. The following complete nucleotide sequence entries were retrieved from GenBank: K. tunicata (U09810), L. bleekeri (AB029616), Inversidens japanensis (female type) (AB055625), Terebratulina retusa (AJ245743), Lumbricus terrestris (U24570), Platynereis dumerilii (AF178678), Limulus polyphemus (AF216203), Artemia franciscana (X69067), Drosophila yakuba (X03240), Lithobius forficatus (AF309492), Homo sapiens (J01415), Balanoglossus carnosus (AF051097), Asterina pectinifera (D16387), and Metridium senile (AF000023).

    The amino acid sequences of each protein gene with its counterparts in O. vulgaris, T. pacificus, and W. scintillans were aligned by application of ClustalX (Thompson et al. 1997) using the default settings. The best-aligned regions were selected by GBLOCKS (Castresana 2000). The selected regions of all protein genes were then concatenated (2,592 sites). For maximum-likelihood (ML) analysis (site-by-site rate variations), Tree-Puzzle version 5.1 (Schmidt et al. 2002) was used under the conditions of the mtREV24 substitution model and one invariable and eight-class discrete gamma distribution model for site-by-site rate variations. In total, 16 % of sites were estimated to be invariable, and from the data, the shape parameter of gamma distribution was estimated to be 0.92. In addition, the ML tree without site-by-site rate variation was estimated using the PROTML routine in MOLPHY version 2.3b (Adachi and Hasegawa 1996a). The topology of the neighbor-joining (NJ) tree, which was constructed with the ML distance matrix estimated by PROTML (D option), was used as the initial tree for the ML tree search by the NNI search routine of PROTML.

    Furthermore, four cephalopods and K. tunicata were used for ML analysis (3,505 sites). All 15 possible unrooted tree topologies for these five taxa were then evaluated by ML analysis with PAML (Yang 1997) (eight-class discrete gamma distribution model) and with Tree-Puzzle (Schmidt et al. 2002) (one invariable and eight-class discrete gamma distribution model).
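
    The figure of 15 topologies follows from the standard count of unrooted, fully bifurcating trees for n labeled taxa, (2n - 5)!!. The sketch below is only a check of that arithmetic for the five taxa used here; the function name is illustrative.

```python
def count_unrooted_binary_trees(n_taxa):
    """Number of distinct unrooted, fully bifurcating topologies for n_taxa labeled taxa.

    Uses the double-factorial formula (2n - 5)!! = 3 * 5 * ... * (2n - 5), valid for n >= 3.
    """
    if n_taxa < 3:
        raise ValueError("at least three taxa are needed for an unrooted tree")
    count = 1
    # Multiply the odd numbers from 3 up to and including 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    # Four cephalopods plus K. tunicata give five taxa.
    print(count_unrooted_binary_trees(5))
```

    Because the count grows very quickly with the number of taxa, exhaustive evaluation of every topology is practical for a five-taxon data set like this one but not for the larger 17-taxon data set.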

    Results and Discussion

    General Features of W. scintillans, T. pacificus, and O. vulgaris mt Genomes

    The O. vulgaris mt genome is 15,744 bp long. It encodes the standard set of metazoan mt genes, and the gene organization is similar to that of the polyplacophoran K. tunicata (Boore and Brown 1994); there are only two differences (translocation of trnD and inversion of trnP) (fig. 2).

    FIG. 2.— Comparison of the genome structures of cephalopod mt genomes. Four cephalopod (O. vulgaris, W. scintillans, T. pacificus, and L. bleekeri [Tomita et al. 2002]) mt gene structures were compared with the K. tunicata mt genome structure (Boore and Brown 1994). Circular genomes are presented linearly to ease comparison of gene organization. Abbreviation for each gene is as noted in figure 1.

    In contrast to the O. vulgaris mt genome, the W. scintillans and T. pacificus mt genomes exhibit several unusual features. The mt genomes are 20,091 bp long and 20,254 bp long, respectively, and both carry the standard sets of metazoan mt genes, but six structural genes—cox3, cox1, cox2, trnD, atp8, and atp6—and the longest noncoding region are duplicated (fig. 2). The duplication patterns of these two genomes are not simple. The first duplicate copy contains a four-gene insertion—trnA, trnN, trnI, and nad3—between cox3 and cox1, whereas the second copy carries a different insertion—trnK, trnR, trnS(gcu), and nad2—between cox3 and cox1. In addition, the two duplicates are separated by the following genes: trnF, trnV, rrnS, trnM, trnC, and trnQ. Otherwise, the mt genome structures of W. scintillans and T. pacificus are nearly identical; only the location of trnM differs between them.
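
    The duplication pattern described above can be read directly from the gene order. The following sketch lists the W. scintillans gene order as assembled from fragments B, C, AS, D, and AL in Materials and Methods, ignoring strand, and then reports which features occur twice and what lies between cox3 and cox1 in each copy. The list and the helper names are illustrative; the gene between nad4 and trnT is written here as nad4L, which is an assumption about the intended gene name.

```python
from collections import Counter

# W. scintillans gene order (circular, strand ignored), assembled from
# fragments B-C-AS-D-AL as given in Materials and Methods.
# "nad4L" is assumed for the gene between nad4 and trnT.
GENE_ORDER = [
    "cox1", "cox2", "trnD", "atp8", "atp6", "nad5", "trnH", "nad4", "nad4L",
    "trnT", "trnS(uga)", "cob", "nad6", "trnP", "nad1", "trnL(uaa)",
    "trnL(uag)", "rrnL", "trnY", "trnW", "trnG", "trnE", "NCR", "cox3",
    "trnA", "trnN", "trnI", "nad3",
    "cox1", "cox2", "trnD", "atp8", "atp6", "trnF", "trnV", "rrnS", "trnM",
    "trnC", "trnQ", "NCR", "cox3",
    "trnK", "trnR", "trnS(gcu)", "nad2",
]


def duplicated_features(order):
    """Return the sorted names of features that occur more than once."""
    counts = Counter(order)
    return sorted(name for name, n in counts.items() if n > 1)


def segments_between(order, start, end):
    """For each occurrence of start, list the genes up to the next end (circular)."""
    if end not in order:
        raise ValueError(f"{end!r} is not in the gene order")
    n = len(order)
    segments = []
    for i, gene in enumerate(order):
        if gene != start:
            continue
        segment = []
        j = (i + 1) % n
        while order[j] != end:
            segment.append(order[j])
            j = (j + 1) % n  # wrap around the circular genome
        segments.append(segment)
    return segments


if __name__ == "__main__":
    print(duplicated_features(GENE_ORDER))
    for segment in segments_between(GENE_ORDER, "cox3", "cox1"):
        print(segment)
```

    With this list, the repeated features are the six structural genes and the long NCR, and the two cox3-to-cox1 intervals contain the two different four-gene insertions described in the text.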

    The nucleotide compositions of O. vulgaris, T. pacificus, and W. scintillans mt genomes are 41.2 % A, 33.2 % T, 7.6 % G, and 17.6 % C for O. vulgaris; 38.4 % A, 34.2 % T, 9.9 % G, and 17.5 % C for T. pacificus; and 35.3 % A, 33.4 % T, 11.6 % G, and 19.2 % C for W. scintillans, respectively. These values are similar to those in the L. bleekeri mt genome (Tomita et al. 2002). The nucleotide compositions of the NCRs of O. vulgaris, T. pacificus, and W. scintillans mt genomes are 44.0 % A, 37.6 % T, 4.6 % G, and 13.8 % C for O. vulgaris; 40.4 % A, 39.3 % T, 5.5 % G, and 14.8 % C for T. pacificus; and 39.0 % A, 35.7 % T, 9.1 % G, and 16.2 % C for W. scintillans, respectively. In all three cephalopod mt genomes, the NCRs are slightly richer in A and T than are other regions.
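
    The comparison of A+T richness can be made explicit by summing the reported percentages. The sketch below uses the values exactly as printed above; the dictionary layout and function name are illustrative. Two of the whole-genome rows as printed do not sum to exactly 100 %, so only the A+T totals are compared.

```python
# Percent base compositions as reported in the text, in the order (A, T, G, C).
WHOLE_GENOME = {
    "O. vulgaris": (41.2, 33.2, 7.6, 17.6),
    "T. pacificus": (38.4, 34.2, 9.9, 17.5),
    "W. scintillans": (35.3, 33.4, 11.6, 19.2),
}
NCR = {
    "O. vulgaris": (44.0, 37.6, 4.6, 13.8),
    "T. pacificus": (40.4, 39.3, 5.5, 14.8),
    "W. scintillans": (39.0, 35.7, 9.1, 16.2),
}


def at_content(composition):
    """Return the A+T percentage, rounded to one decimal place."""
    a, t, _g, _c = composition
    return round(a + t, 1)


if __name__ == "__main__":
    for species in WHOLE_GENOME:
        genome_at = at_content(WHOLE_GENOME[species])
        ncr_at = at_content(NCR[species])
        print(f"{species}: genome A+T {genome_at} %, NCR A+T {ncr_at} %")
```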

    O. vulgaris, T. pacificus, and W. scintillans mt NCRs form stem-and-loop structures at both ends (data not shown). In addition, within their central regions, one or more stem-and-loop structures can also be formed. The NCRs of T. pacificus and W. scintillans mt genomes are well conserved (66.7% to 66.8%). The 5' part is conserved more than the central region and the 3' part of the NCRs between the T. pacificus and W. scintillans mt genomes. The L. bleekeri NCR is less similar to the W. scintillans and T. pacificus NCRs, but these three NCR sequences are easily aligned. The O. vulgaris NCR shares some characteristics in its primary sequence with squid NCRs, but alignment of the O. vulgaris NCR with squid NCRs is not easy. Near the 5' end of the NCRs, a short sequence, 5'-TATATATAATAAACA-3', is conserved between T. pacificus and W. scintillans, and two similar sequences, 5'-TGTATATAATACACG-3' and 5'-TGTATATAATATACA-3', are found near the 5' ends of the L. bleekeri and O. vulgaris NCRs, respectively. In the 3' half, all four cephalopod NCRs carry a C-rich tract. These shared characteristics among cephalopod NCRs possibly contribute to the functions of the NCRs in replication and transcription initiation. However, further studies are required to determine this for certain.
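
    How similar the three quoted 15-nucleotide sequences are can be checked by counting mismatches position by position. The sketch below is a simple ungapped comparison of the sequences exactly as quoted; the labels A, B, and C and the function name are illustrative only.

```python
from itertools import combinations

# The three short sequences quoted in the text (5' to 3').
MOTIFS = {
    "A": "TATATATAATAAACA",  # conserved between T. pacificus and W. scintillans
    "B": "TGTATATAATACACG",  # first of the two similar sequences
    "C": "TGTATATAATATACA",  # second of the two similar sequences
}


def hamming(seq1, seq2):
    """Count mismatched positions between two equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(seq1, seq2))


if __name__ == "__main__":
    for (name1, seq1), (name2, seq2) in combinations(MOTIFS.items(), 2):
        print(f"{name1} vs {name2}: {hamming(seq1, seq2)} mismatches of {len(seq1)}")
```

    Each pair differs at no more than 3 of the 15 positions, which is consistent with describing the sequences as similar.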

    Southern Analysis for Confirmation of W. scintillans Gene Organization

    To confirm gene duplication in the W. scintillans mt genome, Southern analysis was performed. The predicted lengths and locations of the restriction fragments of W. scintillans mtDNA and the predicted hybridization patterns are shown in figure 3A and B.
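
    The logic behind such predictions can be sketched as follows: for a circular genome, a complete digest yields one fragment per cut site, and a probe lights up every fragment that contains a copy of its target. In the sketch below, only the genome length (20,091 bp) comes from this study; the cut sites and probe positions are hypothetical placeholders, not the real restriction map, and the function names are illustrative.

```python
def circular_fragments(genome_length, cut_sites):
    """Fragment lengths from a complete digest of a circular molecule."""
    sites = sorted(set(cut_sites))
    if not sites:
        return [genome_length]  # uncut circle
    lengths = [b - a for a, b in zip(sites, sites[1:])]
    # The last fragment wraps around the origin of the coordinate system.
    lengths.append(genome_length - sites[-1] + sites[0])
    return lengths


def probed_fragment_lengths(genome_length, cut_sites, probe_positions):
    """Lengths of the distinct fragments that contain at least one probe target."""
    sites = sorted(set(cut_sites))
    if not sites:
        return [genome_length]
    hit_starts = set()
    for pos in probe_positions:
        upstream = [s for s in sites if s <= pos]
        # If no site lies upstream, the target sits in the wrap-around fragment.
        hit_starts.add(upstream[-1] if upstream else sites[-1])
    lengths = []
    for start in sorted(hit_starts):
        end = sites[(sites.index(start) + 1) % len(sites)]
        # A single cut site gives one linear fragment of full genome length.
        lengths.append((end - start) % genome_length or genome_length)
    return lengths


if __name__ == "__main__":
    genome = 20091            # W. scintillans mt genome length (from the text)
    two_copies = [500, 12000]  # hypothetical positions of a duplicated gene
    print(probed_fragment_lengths(genome, [4000], two_copies))
    print(probed_fragment_lengths(genome, [2000, 9000, 15000], two_copies))
```

    With a single cut site, both copies of a duplicated gene fall on the same linearized molecule and give one band; with several cut sites, the two copies can fall on different fragments and give two bands. This is the pattern the experiment was designed to detect.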

    FIG. 3.— Southern analysis of the W. scintillans mt genome. (A) Locations of probes (vertical arrows) and restriction fragments of the W. scintillans mt genome. (B) Expected results of the Southern hybridization experiment on the W. scintillans mt genome. Abbreviations for each gene are as follows. III: cox3; I: cox1; 3: nad3; 2: nad2; B: cob; and S: rrnS. Abbreviations for each restriction enzyme are as follows. C: control (noncut); A: Apa I; B: Bgl II; N: Nco I; P: Pst I; S: Spe I; and X: Xho I. (C) Southern hybridization. The results of hybridizations of various probes with W. scintillans total DNA are shown. The noncut DNA (C), Apa I–treated DNA (A), Bgl II–treated DNA (B), Nco I–treated DNA (N), Pst I–treated DNA (P), Spe I–treated DNA (S), and Xho I–treated DNA (X) were loaded for each gel. In the lower column for each hybridization image, the name of the probe is shown. "All" indicates that a mixture of probes for all six genes was used for hybridization.

    With the restriction enzymes that cut W. scintillans mtDNA only once (Apa I and Xho I), all probes derived from cox1, cox3, nad2, nad3, cob, and rrnS hybridized to the same band (fig. 3C). As predicted (fig. 3A and B), the probe specific for cox1, which was inferred to be duplicated, hybridized to two distinct bands of W. scintillans DNA treated with Bgl II, Nco I, or Pst I (fig. 3C). Again, as predicted (fig. 3A and B), the cox3-specific probe also hybridized to two distinct bands of W. scintillans DNA treated with Nco I, Pst I, or Spe I (fig. 3C). Conversely, the other four probes hybridized to a single band (fig. 3C). All the hybridization patterns match the gene organization of the W. scintillans mt genome presented in figures 1 and 2A. Note that the W. scintillans DNA used for the Southern hybridization analysis was prepared from a different individual than that used for sequence determination. In addition, figure 3C shows no bands other than those predicted from the determined nucleotide sequence. Therefore, the observed gene duplication is likely to be a common feature of W. scintillans mt genomes. These results thus support the gene organizations of the W. scintillans and T. pacificus mt genomes shown in figure 2.

    Concerted Evolution

    The sequences of the duplicated regions are nearly identical within each species, but there are large differences when comparing between W. scintillans and T. pacificus (table 2). From the analyses of duplicated noncoding regions in various metazoan mt genomes, it has been suggested that there are concerted evolution processes in metazoan mitochondria (cf. Kumazawa et al. 1998). Because the W. scintillans and T. pacificus mt genomes belonging to different families share gene duplication, and the duplicated regions in the same species have nearly identical sequences, it should be concluded that concerted evolution processes exist within cephalopod mt genomes. Duplicated units in the W. scintillans and T. pacificus mt genomes are not placed in tandem, meaning that a simple slippage model cannot explain their sequence homogeneity; however, inter/intramolecular recombination might provide an adequate explanation. In certain molluscan species, two types of mtDNA (male and female types) coexist within a single cell (e.g., Zouros et al. 1994). Some reports suggest the existence of intermolecular recombination (Ladoukakis and Zouros 2001), supporting our conclusion that recombination is the possible mechanism maintaining the homogeneity of the duplicated sequences in W. scintillans and T. pacificus mt genomes. Furthermore, recombination activity has also been reported from mammalian mitochondria (Thyagarajan, Padua, and Campbell 1996).

    Table 2 Nucleotide Sequence Similarities (%) Between Duplicated Regions of W. scintillans and T. pacificus mt Genomes

    To understand the effect of gene duplication on the evolutionary rate, the relative rate test (RRTree [Robinson-Rechavi and Huchon 2000]) of each protein gene (amino acid sequence level) was performed. When the evolutionary rate is compared among O. vulgaris, L. bleekeri, T. pacificus, and W. scintillans (using K. tunicata as an outgroup), there are no significant differences in the evolutionary rate of any species pair (data not shown). When the evolutionary rate is compared among squid species (using O. vulgaris as an outgroup), only the hypothesis that T. pacificus and W. scintillans nad4 evolved at the same rate is statistically rejected (P < 0.01) (data not shown). These results suggest that neither the duplication of genes nor the concerted evolutionary process seriously affects the evolutionary rate of the genes themselves.

    Conversely, recent independent duplications in the W. scintillans and T. pacificus mt genomes cannot be rejected. This scenario would explain the occurrence of nearly identical sequences between duplicates in a single genome, but different sequences between different species' duplicated regions. However, as we discuss later, for such a situation to be at all possible, at least two duplication events and several subsequent gene-loss events would have needed to take place independently, but in the same order, in both the T. pacificus and W. scintillans lineages. This extent of parallel evolution in the T. pacificus and W. scintillans mt genomes requires far more assumptions than the hypothesis that concerted evolution maintains homogeneity of the duplicated sequences within a species. Therefore, we believe that the model of ancestral gene duplication in T. pacificus and W. scintillans mt genomes followed by a concerted evolution process that homogenized the duplicated sequences is more likely than the model of recent and independent gene duplications in both T. pacificus and W. scintillans mt genomes.

    As shown in table 2, there are four differences between duplicates of the protein genes in the W. scintillans mt genome. These differences are found at the second position of codons, which means that the proteins encoded by the two copies have different amino acid sequences. Such differences could alter the properties of the protein encoded by one copy relative to that encoded by the other. However, all the substituted positions are in poorly conserved regions, suggesting that their effects on the proteins might not be serious. In addition, when the sequenced PCR clones were checked, nucleotide variations were found at different sites within the duplicated genes. The minor nucleotides of PCR clones at various positions in one of the duplicates are identical to the nucleotides at the corresponding positions in the other duplicate. The observed sequence differences between the copies of duplicated regions might be due to polymorphism, although the possibility of PCR errors cannot be ignored. Does this situation suggest relaxed functional constraints in the duplicated genes? To address this question, further analyses of the function of these duplicated genes are needed.

    Evolution of Cephalopod mt Gene Arrangement

    Within both this study and our previous study (Tomita et al. 2002), we have reported four complete cephalopod mt genome nucleotide sequences: L. bleekeri, W. scintillans, T. pacificus, and O. vulgaris. The O. vulgaris mt genome appears to retain a greater level of ancestral gene organization than the other species because the O. vulgaris mt gene organization is nearly identical to that of the polyplacophoran K. tunicata (fig. 2).

    When the O. vulgaris and K. tunicata mt genomes are compared, one ancestral gene location and one derived gene location can be distinguished in each. In the O. vulgaris mt genome, the location of trnD is the ancestral feature, whereas the direction of trnP is the derived feature. Conversely, in the K. tunicata mt genome, the direction of trnP is the ancestral feature, whereas the location of trnD is the derived feature. As Tomita et al. (2002) have discussed, a trnD gene located between cox2 and atp8 is found in the Littorina saxatilis (Mollusca, Gastropoda) mt genome (Wilding, Mill, and Grahame 1999) and in various other metazoan mt genomes, such as several arthropod mt genomes (e.g., Clary and Wolstenholme 1985; Lavrov, Boore, and Brown 2000). This finding suggests that the location of trnD in the O. vulgaris mt genome is ancestral to that in the K. tunicata mt genome. On the other hand, the direction of trnP found in K. tunicata, rather than that observed in O. vulgaris, is found in these other mt genomes (e.g., Clary and Wolstenholme 1985; Lavrov, Boore, and Brown 2000), suggesting that it is ancestral to the direction observed in the O. vulgaris mt genome.

    Using a model in which slippage during replication causes gene duplication, after which random gene loss causes changes in gene organization, differences in the gene organization of K. tunicata and O. vulgaris mt genomes can be explained (fig. 4A). A tRNA-like cloverleaf structure (anticodon AGA) is found between cox2 and atp8 in the K. tunicata mt genome (Boore and Brown 1994). However, we found another possible tRNA-like structure in this region (positions 2810 to 2871, overlapping the tRNASer-like structure [anticodon AGA]). The sequence of this cloverleaf structure is very similar to that of trnD (78% identity), although its anticodon loop is composed of eight nucleotides (5'-TTATTTAA-3') instead of seven nucleotides. Thus, the cloverleaf structure between cox2 and atp8 in the K. tunicata mt genome seems to be a pseudogene of trnD. Pseudogenes of trnD could be created during the rearrangement process, as in the case of the L. bleekeri mt genome, in which the creation of the trnH pseudogene by a similar means has been reported (Sasuga et al. 1999).

    FIG. 4.— Possible pathways by which the W. scintillans and T. pacificus mt gene organizations originated from the ancestral gene organizations found in O. vulgaris and K. tunicata mt genomes. (A) Between O. vulgaris and K. tunicata mt genomes. (B) From the O. vulgaris type mt genome to the W. scintillans mt genome. In both models, one of the two ends of the duplicated unit is under the constraint of being in the noncoding region. The two symbols used in the figure denote, respectively, the seven-tRNA gene cluster that appears upstream of the NCR and the five-tRNA gene cluster that appears between cox3 and nad3 in O. vulgaris and K. tunicata mt genomes, and their corresponding regions in other mt genomes such as the W. scintillans mt genome. Abbreviations of tRNA genes are as in figure 1. Abbreviations of protein and rRNA genes are as follows: C1 to C3 for cox1 to 3, CB for cob, A6 and A8 for atp6 and 8, N1 to N6 and NL for nad1 to 6 and nad4L, and LR and SR for rrnL and rrnS.

    Both W. scintillans and T. pacificus mt gene organizations could have originated from an Octopus-type mt gene organization, through two gene-duplication events followed by the loss of one of the duplicated genes (fig. 4B). This process appears to be a typical sequence of events in the evolution of gene organization. The problem is that the W. scintillans and T. pacificus mt genomes carry six duplicated genes. Because there are only simple sequence differences between the duplicated regions, including those of the long NCRs in which the transcription initiation site can be located (table 2), both duplicated genes could be functional.

    As discussed above (see also figure 4B), many processes (duplications and loss of genes and NCR) are needed to create W. scintillans and T. pacificus mt gene organizations from Octopus-type mt gene organization. However, the position of trnM is the only difference between the W. scintillans and T. pacificus mt genomes. If, as the result of parallel evolution of W. scintillans and T. pacificus mt genomes, the gene duplication and gene organization of W. scintillans and T. pacificus mt genomes were created independently, not just one but many independent, identical gene losses and gene rearrangements would have been needed, although we do not have to presume the existence of a concerted evolution process. For example, parallel evolution on mt gene organization has been suggested in bird mt genomes (Mindell, Sorenson, and Dimcheff 1998). However, we prefer the ancient duplication model rather than the parallel evolution model, because the former requires fewer gene rearrangement/gene loss events than the latter.

    Why Are Duplicated Genes Maintained in the W. scintillans and T. pacificus mt Genomes?

    Several researchers have pointed out that an mt genome with a duplicated functional long NCR has several advantages over one with only a single NCR (e.g., Kumazawa et al. 1998). Because the long NCR contains the initiation region for replication, mt genomes with multiple long NCRs can be replicated from multiple points, whereas those with only a single long NCR can only be replicated from one point. An mt genome with multiple replication origins might, therefore, be capable of faster replication, provided the duplicated region is not too long. In mitochondria, there are multiple copies of mtDNA, and there must be a certain degree of selection pressure between individual mtDNAs. Therefore, mt genomes with multiple replication origins might be expected to proliferate at the expense of those with only a single replication origin.

    Are there any advantages to maintaining duplicate copies of functional structural genes in W. scintillans and T. pacificus mt genomes? Among metazoan mt genomes for which complete nucleotide sequences have been published, there is only one (Venerupis philippinarum [AB065375]) that encodes a duplicated protein gene (cox2). However, in this genome, the two cox2 genes have very different sequences. As we have shown above, this is not the case for the duplicate genes in W. scintillans and T. pacificus mt genomes.

    Suppose that every mt-encoded gene is transcribed at the same rate per gene copy and that the mRNAs of the NADH dehydrogenase subunit genes are all translated at the same rate. Under these two assumptions, the duplicated genes (cox3, cox1, cox2, trnD, atp8, and atp6) should be transcribed twice as frequently as nad2, nad3, and the other NADH dehydrogenase subunit genes. The duplicated protein genes in W. scintillans and T. pacificus mt genomes encode subunits of complex IV and complex V but not of complex I or complex III.
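
    The gene-dosage reasoning above can be written out as a short calculation. This is a minimal sketch that assumes transcript level is simply proportional to gene copy number (the equal-rate assumption stated in the text). The assignment of genes to respiratory complexes is the standard one, and the function names are hypothetical.

```python
# Mt-encoded protein genes grouped by respiratory complex (standard assignment).
COMPLEX_GENES = {
    "I": ["nad1", "nad2", "nad3", "nad4", "nad4L", "nad5", "nad6"],
    "III": ["cob"],
    "IV": ["cox1", "cox2", "cox3"],
    "V": ["atp6", "atp8"],
}

# Genes present in two copies in W. scintillans and T. pacificus, per the text.
DUPLICATED = {"cox1", "cox2", "cox3", "atp6", "atp8", "trnD"}


def copy_number(gene):
    """Return 2 for a duplicated gene and 1 otherwise."""
    return 2 if gene in DUPLICATED else 1


def relative_transcript_level(gene, rate_per_copy=1.0):
    """Assumed model: transcript level scales linearly with copy number."""
    return rate_per_copy * copy_number(gene)


def mean_level_by_complex():
    """Average relative transcript level of the genes in each complex."""
    return {
        name: sum(relative_transcript_level(g) for g in genes) / len(genes)
        for name, genes in COMPLEX_GENES.items()
    }


for name, level in mean_level_by_complex().items():
    print(f"complex {name}: {level:.1f}x")
```

    Under this assumption the complex IV and complex V genes come out at twice the level of the complex I and complex III genes, which is the imbalance the paragraph describes.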

    In vertebrate mitochondria, the transcription frequency of rRNA molecules is controlled differently from that of mRNA and tRNA molecules (e.g., Clayton 1992). In addition to the transcript covering the entire genome, another transcript containing only srRNA, lrRNA, tRNAPhe, and tRNAVal is also synthesized. Thus, in vertebrate mitochondria, the copy numbers of rRNAs are generally maintained at a higher level than those of tRNA and mRNA (Clayton 1992). A similar situation might exist for rRNA expression in the O. vulgaris mt genome. In the W. scintillans and T. pacificus mt genomes, the two rRNA genes are encoded at distinct positions. However, both rrnL and rrnS are flanked at both ends by tRNA gene(s), and the long NCRs are located upstream from the tRNA genes that lie next to the 5' ends of rrnL and rrnS. Because the long NCRs have nearly identical sequences, the number of transcripts of rrnL and rrnS can be controlled in a similar manner.

    Molecular Phylogenetic Analysis

    Quartet-puzzling (QP) analysis with Tree-Puzzle (Schmidt et al. 2002) under the invariable and gamma model (mtREV model) shows monophyly of oegopsids (W. scintillans and T. pacificus) (fig. 5). L. bleekeri, a representative of the myopsids in this tree, is the sister group of the oegopsid squids, and O. vulgaris is the sister taxon of the decapods. Cephalopods form a monophyletic group with K. tunicata (Polyplacophora) and Inversidens (Bivalvia). However, statistical support for the monophyly of the Mollusca is not sufficiently high. On the other hand, Terebratulina retusa, a representative of the Brachiopoda, forms a group with the annelid species, although the statistical support for this group is not particularly high. The monophyly of the lophotrochozoan clade comprising the three groups (Mollusca, Annelida, and Brachiopoda) is well supported in this tree, as in our previous analysis (Tomita et al. 2002), but the relationship among the Mollusca, Annelida, and Brachiopoda could not be resolved from the present data.

    FIG. 5.— ML analysis of 17 metazoans based on the amino acid sequences of concatenated 12-mt protein genes. The ML tree inferred with Tree-Puzzle is shown. Support for the internal branches of the quartet puzzling tree topology (%) (left) and local bootstrap probability of the internal branches of ML tree estimated with PROTML (%) (right) are shown at the node. The log-likelihood of the tree estimated with Tree-Puzzle is –49482.91. The log-likelihood of this tree (±SE) estimated with PROTML is –52542.78 (±696.28). The topologies of the trees obtained with these two methods are identical.

    When we performed an ML analysis on five molluscan species (K. tunicata [outgroup], O. vulgaris, L. bleekeri, T. pacificus, and W. scintillans) using CODEML in PAML (Yang 1997) (eight-class discrete gamma distribution model), we found that W. scintillans and T. pacificus are monophyletic and that L. bleekeri is their sister taxon in the ML tree (lnL = –24077.52). The tree in which L. bleekeri and T. pacificus are monophyletic is the second-best tree (difference of lnL from best tree with SE = –5.49 ± 10.84), and this tree cannot be rejected on the basis of BP values, the Kishino-Hasegawa test (Kishino and Hasegawa 1989), the Shimodaira-Hasegawa test (Shimodaira and Hasegawa 1999), or differences in log-likelihood. On the other hand, in the ML tree constructed with Tree-Puzzle (one invariable and eight-class discrete gamma distribution model), L. bleekeri and T. pacificus appear as sister taxa (lnL = –23817.07). In that analysis, the best tree chosen by CODEML is the second-best tree (difference of lnL from best tree with SE = 12.08 ± 9.65), and it cannot be rejected.
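
    The statement that neither alternative tree can be rejected can be checked informally against the reported numbers. The sketch below is a simplification and not the procedure run by CODEML or Tree-Puzzle: it assumes a normal approximation in which a log-likelihood difference is treated as non-significant when its absolute value is smaller than 1.96 times its standard error. The function name and the threshold are assumptions, and the dictionary labels are ours. Only the two difference ± SE pairs come from the text.

```python
def within_sampling_error(delta_lnl, se, z_crit=1.96):
    """True if the lnL difference is smaller than z_crit standard errors.

    Assumed normal approximation; the published tests (KH, SH) were run
    with dedicated software and are not reproduced here.
    """
    return abs(delta_lnl) < z_crit * se


# (difference of lnL from best tree, SE) as reported in the text.
comparisons = {
    "CODEML analysis, second-best tree": (-5.49, 10.84),
    "Tree-Puzzle analysis, CODEML best tree": (12.08, 9.65),
}

for label, (delta, se) in comparisons.items():
    ratio = abs(delta) / se
    verdict = "cannot be rejected" if within_sampling_error(delta, se) else "rejected"
    print(f"{label}: |delta|/SE = {ratio:.2f} -> {verdict}")
```

    Both ratios fall well below the assumed threshold, so each analysis leaves the other analysis's preferred topology within sampling error, consistent with the unresolved relationship among the three squids.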

    The relationship among W. scintillans, T. pacificus, and L. bleekeri could not be resolved by our analyses, although a closer relationship between W. scintillans and T. pacificus is suggested by the mt genome structures. This lack of resolution, in turn, suggests that the squid radiation occurred over a rather short period. In addition, the monophyly of the Mollusca was not supported. The radiation of the Mollusca, Annelida, and Brachiopoda might also have occurred over a short period.

    Conclusion

    The T. pacificus and W. scintillans mt genomes are the first examples of metazoan mt genomes found to have stably carried duplicated structural genes over a long period. Taken together with the L. bleekeri mt genome (Tomita et al. 2002), which carries a triplicated NCR with nearly identical sequences, these results lead us to conclude that cephalopod mitochondria, or at least squid mitochondria, have some mechanism of concerted evolution. Analyses of other cephalopod mt genome structures will tell us how complicated squid mt genomes really are.

    On the other hand, how such mt genomes with duplicated noncoding regions and structural genes maintain their functions, such as replication and transcription, is an unresolved but important issue. The means of replication, transcription, and other processes within cephalopod mt genomes provide an interesting target for further studies.

    Acknowledgements

    We thank Dr. K. Tsuchiya at Tokyo University of Marine Science and Technology for the identification of squid samples. We also thank Dr. A. Yamagishi at Tokyo University of Pharmacy and Life Science for his valuable comments. This work was supported by grants to T.O. and S.Y. from the Ministry of Education, Culture, Sports, Science and Technology, Japan.

    References

    Adachi, J., and M. Hasegawa. 1996a. MOLPHY 2.3b. Institute of Statistical Mathematics, Tokyo.

    ———. 1996b. Model of amino acid substitution in proteins encoded by mitochondrial DNA. J. Mol. Evol. 42:459–468.

    Azevedo, J. L., and B. C. Hyman. 1993. Molecular characterization of lengthy mitochondrial DNA duplications from the parasitic nematode Romanomermis culicivorax. Genetics 133:933–942.

    Bonnaud, L., R. Boucher-Rodoni, and M. Monnerot. 1997. Phylogeny of cephalopods inferred from mitochondrial DNA sequences. Mol. Phylogenet. Evol. 7:44–54.

    Boore, J. L. 1999. Animal mitochondrial genomes. Nucleic Acids Res. 27:1767–1780.

    Boore, J. L., and W. M. Brown. 1994. Complete DNA sequence of the mitochondrial genome of the black chiton, Katharina tunicata. Genetics 138:423–443.

    Carlini, D. B., R. E. Young, and M. Vecchione. 2001. A molecular phylogeny of the Octopoda (Mollusca: Cephalopoda) evaluated in light of morphological evidence. Mol. Phylogenet. Evol. 21:388–397.

    Castresana, J. 2000. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17:540–552.

    Clary, D. O., and D. R. Wolstenholme. 1985. The mitochondrial DNA molecule of Drosophila yakuba: nucleotide sequence, gene organization, and genetic code. J. Mol. Evol. 22:252–271.

    Clayton, D. A. 1992. Transcription and replication of animal mitochondrial DNAs. Int. Rev. Cytol. 141:217–232.

    Grande, C., J. Templado, J. L. Cervera, and R. Zardoya. 2002. The complete mitochondrial genome of the nudibranch Roboastra europaea (Mollusca: Gastropoda) supports the monophyly of opisthobranchs. Mol. Biol. Evol. 19:1672–1685.

    Hatzoglou, E., G. C. Rodakis, and R. Lecanidou. 1995. Complete sequence and gene organization of the mitochondrial genome of the land snail Albinaria coerulea. Genetics 140:1353–1366.

    Hoffmann, R. J., J. L. Boore, and W. M. Brown. 1992. A novel mitochondrial genome organization for the blue mussel, Mytilus edulis. Genetics 131:397–412.

    Hyman, B. C., and J. L. Azevedo. 1996. Similar evolutionary patterning among repeated and single copy nematode mitochondrial genes. Mol. Biol. Evol. 13:221–232.

    Hyman, B. C., J. L. Beck, and K. C. Weiss. 1988. Sequence amplification and gene rearrangement in parasitic nematode mitochondrial DNA. Genetics 120:707–712.

    Kishino, H., and M. Hasegawa. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in hominoidea. J. Mol. Evol. 29:170–179.

    Kumazawa, Y., H. Ota, M. Nishida, and T. Ozawa. 1998. The complete nucleotide sequence of a snake (Dinodon semicarinatus) mitochondrial genome with two identical control regions. Genetics 150:313–329.

    Kurabayashi, A., and R. Ueshima. 2000. Complete sequence of the mitochondrial DNA of the primitive opisthobranch gastropod Pupa strigosa: systematic implication of the genome organization. Mol. Biol. Evol. 17:266–277.

    Ladoukakis, E. D., and E. Zouros. 2001. Direct evidence for homologous recombination in mussel (Mytilus galloprovincialis) mitochondrial DNA. Mol. Biol. Evol. 18:1168–1175.

    Lavrov, D. V., J. L. Boore, and W. M. Brown. 2000. The complete mitochondrial DNA sequence of the horseshoe crab Limulus polyphemus. Mol. Biol. Evol. 17:813–824.

    Mindell, D. P., M. D. Sorenson, and D. E. Dimcheff. 1998. Multiple independent origins of mitochondrial gene order in birds. Proc. Natl. Acad. Sci. USA 95:10693–10697.

    Robinson-Rechavi, M., and D. Huchon. 2000. RRTree: relative rate tests between groups of sequences on a phylogenetic tree. Bioinformatics 16:296–297.

    Sasuga, J., S. Yokobori, M. Kaifu, T. Ueda, K. Nishikawa, and K. Watanabe. 1999. Gene contents and organization of a mitochondrial DNA segment of the squid Loligo bleekeri. J. Mol. Evol. 48:692–702.

    Schmidt, H. A., K. Strimmer, M. Vingron, and A. von Haeseler. 2002. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:502–504.

    Shimodaira, H., and M. Hasegawa. 1999. Multiple comparison of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16:1114–1116.

    Terrett, J. A., S. Miles, and R. H. Thomas. 1996. Complete DNA sequence of the mitochondrial genome of Cepaea nemoralis (Gastropoda: Pulmonata). J. Mol. Evol. 42:160–168.

    Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876–4882.

    Thyagarajan, B., R. A. Padua, and C. Campbell. 1996. Mammalian mitochondria possess homologous DNA recombination activity. J. Biol. Chem. 271:27536–27543.

    Tomita, K., S. Yokobori, T. Oshima, T. Ueda, and K. Watanabe. 2002. Cephalopod Loligo bleekeri mitochondrial genome: multiplied noncoding regions and transposition of tRNA Genes. J. Mol. Evol. 54:486–500.

    Wilding, C. S., P. J. Mill, and J. Grahame. 1999. Partial sequence of the mitochondrial genome of Littorina saxatilis: relevance to gastropod phylogenetics. J. Mol. Evol. 48:348–359.

    Yamazaki, N., R. Ueshima, J. A. Terrett et al. (12 co-authors). 1997. Evolution of pulmonate gastropod mitochondrial genomes: comparisons of gene organizations of Euhadra, Cepaea and Albinaria and implications of unusual tRNA secondary structures. Genetics 145:749–758.

    Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13:555–556.

    Yokobori, S., Y. Watanabe, and T. Oshima. 2003. Mitochondrial genome of Ciona savignyi (Urochordata, Ascidiacea, Enterogona): comparison of gene arrangement and tRNA genes with Halocynthia roretzi mitochondrial genome. J. Mol. Evol. 57:574–587.

    Zevering, C. E., C. Moritz, A. Heideman, and R. A. Sturm. 1991. Parallel origins of duplications and the formation of pseudogenes in mitochondrial DNA from parthenogenetic lizards (Heteronotia binoei; Gekkonidae). J. Mol. Evol. 33:431–441.

    Zouros, E., A. O. Ball, C. Saavedra, and K. R. Freeman. 1994. An unusual type of mitochondrial DNA inheritance in the blue mussel Mytilus. Proc. Natl. Acad. Sci. USA 91:7463–7467.

    (Shin-ichi Yokobori, Naoya)