Source: 《分子生物学进展》, 2005, Issue 2
Evolution of the AID/APOBEC Family of Polynucleotide (Deoxy)cytidine Deaminases
     Medical Research Council Laboratory of Molecular Biology, Cambridge, UK

    Correspondence: E-mail: silvoc@mrc-lmb.cam.ac.uk.

    Abstract

    The AID/APOBEC family (comprising AID, APOBEC1, APOBEC2, and APOBEC3 subgroups) contains members that can deaminate cytidine in RNA and/or DNA and exhibit diverse physiological functions (AID and APOBEC3 deaminating DNA to trigger pathways in adaptive and innate immunity; APOBEC1 mediating apolipoprotein B RNA editing). The founder member APOBEC1, which has been used as a paradigm, is an RNA-editing enzyme with proposed antecedents in yeast. Here, we have undertaken phylogenetic analysis to glean insight into the primary physiological function of the AID/APOBEC family. We find that although the family forms part of a larger superfamily of deaminases distributed throughout the biological world, the AID/APOBEC family itself is restricted to vertebrates with homologs of AID (a DNA deaminase that triggers antibody gene diversification) and of APOBEC2 (unknown function) identifiable in sequence databases from bony fish, birds, amphibians, and mammals. The cloning of an AID homolog from dogfish reveals that AID extends at least as far back as cartilaginous fish. Like mammalian AID, the pufferfish AID homolog can trigger deoxycytidine deamination in DNA but, consistent with its cold-blooded origin, is thermolabile. The fine specificity of its mutator activity and the biased codon usage in pufferfish IgV genes appear broadly similar to those of their mammalian counterparts, consistent with a coevolution of the antibody mutator and its substrate for the optimal targeting of somatic mutation during antibody maturation. By contrast, APOBEC1 and APOBEC3 are later evolutionary arrivals with orthologs not found in pufferfish (although synteny with mammals is maintained in respect of the flanking loci). We conclude that AID and APOBEC2 are likely to be the ancestral members of the AID/APOBEC family (going back to the beginning of vertebrate speciation) with both APOBEC1 and APOBEC3 being mammal-specific derivatives of AID and a complex set of domain shuffling underpinning the expansion and evolution of the primate APOBEC3s.

    Key Words: AID • APOBEC • antibody gene diversification • hypermutation • RNA editing • DNA deamination • immunity

    Introduction

    The AID/APOBEC protein family comprises several members that are capable of deaminating cytosine to uracil in the context of a single-stranded polynucleotide while fulfilling quite diverse physiological functions. The founder member APOBEC1 is the catalytic component of a complex that edits apolipoprotein B RNA, deaminating C6666→U, thereby creating a premature stop codon and potentiating the tissue-specific production of a truncated apolipoprotein B polypeptide chain (Navaratnam et al. 1993; Teng, Burant, and Davidson 1993). In contrast, AID functions in the adaptive humoral immune system, where it deaminates cytosine residues in the DNA of the immunoglobulin locus, thereby potentiating antibody gene diversification (somatic hypermutation and gene conversion of the immunoglobulin V gene and switch recombination of the IgC gene) (Muramatsu et al. 1999, 2000; Revy et al. 2000; Neuberger et al. 2003). Two members of the APOBEC3 family (APOBEC3G and APOBEC3F) underpin an innate pathway of restriction of retroviral infection, deaminating the cytosines in retroviral first-strand cDNA replication intermediates (Sheehy et al. 2002; Harris et al. 2003; Lecossier et al. 2003; Mangeat et al. 2003; Zhang et al. 2003; Bishop et al. 2004; Liddament et al. 2004; Wiegand et al. 2004; Zheng et al. 2004). The physiological functions of APOBEC2 (Liao et al. 1999; Anant et al. 2001) and of other APOBEC3s are unknown.

    APOBEC1 was the first member of the family to be discovered (Teng, Burant, and Davidson 1993) and has been used as a paradigm for subsequent investigations. However, here we use phylogenetic sequence analysis to gain insight into the likely ancestral function of the AID/APOBEC family and conclude that APOBEC1 is a recent evolutionary arrival and that AID and APOBEC2 are the ancestral family members. It is possible, therefore, that RNA editing is a recently acquired activity for an AID/APOBEC family member and may not have provided the major driving selective force for the evolution of the family.

    Materials and Methods

    DNA/Protein Sequences and Comparisons

    To prepare a database of deaminases, phi-Blast searches (Zhang et al. 1998) were performed using the following deaminases as queries (the number of hits returned from each search is indicated in parentheses; human query sequences were used unless otherwise specified): APOBEC1 (116); AID (118); CDA (565); CDD1 [S. cerevisiae] (438); dCMP DA (435); ADAR1 (336); ADAT1 (148); ADAT1 [S. cerevisiae] (148); ADAT2 (457); and ADAT3 (402). Recursive searches were performed on the nonredundant database of NCBI using a threshold of P < 0.005. The resulting sequences were pooled, and redundant sequences, splice variants, sequences from closely related organisms, and those of bacterial origin were removed, leaving 114 protein sequences from fungi, archaea, and metazoa.

    To minimize the risk that relevant, available genes had been omitted from the pool, Blast searches were performed using human AID/APOBEC sequences as queries on specific EST (frog, chicken, horse, cow, and pig) and genome databases (H. sapiens, M. musculus, R. norvegicus, T. rubripes, D. rerio, C. elegans, D. melanogaster, D. discoideum, S. cerevisiae, S. purpuratus, and C. intestinalis). The final data set consisted of 160 protein sequences belonging to characterized or putative deaminases.

    Owing to the divergence in the primary sequences among the various groups of deaminases, we restricted our phylogenetic analysis to the region containing the zinc-coordinating motif. Most vertebrate deaminases (dCMP, cytosine, and AID/APOBEC) encode the [HC]XE-PCXXC signature of the zinc-coordination motif on a single 150-nt to 300-nt exon: for these deaminases, the analyses were performed using the sequence of this exon. In the case of cytidine deaminases and ADAR/ADAT1s, where the zinc-coordinating motif is encoded on more than one exon, we used a 150–amino acid sequence corresponding to the exon boundaries of the zinc domain of the dCMP, cytosine, and AID/APOBEC deaminases. Regarding the double-domained APOBEC3s, we divided the molecules according to the domain boundaries and considered each domain individually. DNA sequences from the exon encoding the zinc-coordinating domain were used to generate the tree shown in figure 2. Sequences were aligned with ClustalX (Thompson et al. 1997) using the [HC]XE and PCXXC motifs as guides: the resulting alignment was used to generate a phylogenetic tree (the alignments are available as Supplementary Material online).
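
    As a rough illustration of how the [HC]XE and PCXXC signatures could be located in a protein sequence before alignment, the following Python sketch uses a regular expression. The spacer bounds (15 to 40 residues), the function name, and the toy sequence in the usage note are assumptions made for illustration only; the analysis described here relied on ClustalX alignments guided by the motifs, not on this code.

```python
import re

# [HC]XE, then a spacer, then PCXXC. The spacer length is NOT given in the
# text; 15-40 residues is a hypothetical range chosen for this sketch.
ZINC_SIGNATURE = re.compile(r"[HC].E.{15,40}?PC..C")


def find_zinc_signature(protein):
    """Return (start, end) of the first [HC]XE...PCXXC match, or None."""
    match = ZINC_SIGNATURE.search(protein)
    if match is None:
        return None
    return match.start(), match.end()
```

    For example, a synthetic sequence built as "MK" + "HAE" + twenty G residues + "PCAAC" + "LL" yields the span (2, 30), whereas a sequence lacking either signature yields None.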

    FIG. 2.— Phylogenetic relationships within the AID/APOBEC family. Neighbor-joining tree generated from a DNA alignment of the exon encoding the presumed zinc-coordinating motif. The different clusters in the AID/APOBEC family are identified, with the APOBEC3 cluster further divided into Z1a, Z1b, and Z2 domains. The N-terminal and C-terminal domains in the double-domained APOBEC3 sequences are labeled [N] and [C], respectively. The labels at the end of each branch indicate the organism from which the sequence derives (Bos taurus, Bt; Danio rerio, Dr; Equus caballus, Ec; Gallus gallus, Gg; Homo sapiens, Hs; Monodelphis domestica, Md; Mus musculus, Mm; Oryctolagus cuniculus, Oc; Oryzias latipes, Ol; Rattus norvegicus, Rn; Sus scrofa, Ss; Takifugu rubripes, Tr; and Xenopus laevis, Xl). Bootstrap values greater than 50 are shown. Phylogenetic trees with a similar topology were generated using sequences from other exons.

    The MEGA package (Kumar et al. 2001) was used for the phylogenetic analysis. Given the large number of sequences, the relatively low number of amino acids (average length of the sequences used was 80 amino acids), and the relatively high divergence among the sequences, the phylogenetic tree was generated using the neighbor-joining method, and the p-distance was used to measure the pairwise sequence distances (Nei, Kumar, and Takahashi 1998; Takahashi and Nei 2000). Other tree-generating algorithms (minimum evolution, UPGMA, maximum likelihood) resulted in phylogenetic trees with a similar topology regarding the main branching.
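
    The p-distance is simply the proportion of aligned sites at which two sequences differ. The sketch below shows one way it could be computed in Python from a pre-aligned set of sequences. The function names are hypothetical, and the pairwise removal of gapped sites is an assumption, since the gap-handling option used in MEGA is not stated; the neighbor-joining step itself is not reproduced here.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites among aligned, ungapped positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # assumption: gapped sites are skipped pairwise
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment):
    """Map each ordered pair of names to its p-distance.

    `alignment` is a dict of name -> aligned sequence (equal lengths).
    """
    names = list(alignment)
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x in names
        for y in names
    }
```

    A matrix of this kind is the input that a neighbor-joining implementation would then cluster into a tree.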

    Logo alignments (Schneider and Stephens 1990), for consistency given the restricted distribution of AID/APOBEC genes, were generated using only animal sequences.
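
    In a sequence logo, the total height of a column reflects its information content and each letter is scaled by its frequency. The Python sketch below illustrates that calculation for a single gap-free alignment column. It is a simplification under stated assumptions: a 20-letter protein alphabet, no small-sample correction, and hypothetical function names; it is not the software used to draw the logos in this study.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one gap-free alignment column.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    column; the small-sample correction term is omitted in this sketch.
    """
    n = len(column)
    counts = Counter(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return math.log2(alphabet_size) - entropy


def letter_heights(column, alphabet_size=20):
    """Height of each residue in the column: frequency x information."""
    n = len(column)
    info = column_information(column, alphabet_size)
    return {res: (c / n) * info for res, c in Counter(column).items()}
```

    A fully conserved column reaches the maximum of log2(20) bits, whereas a column split evenly between two residues carries one bit less.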

    Analysis of Synteny

    Maps were derived using the assemblies from human (release 34), murine (release 32), chimpanzee (release 22.1.1), rat (release 22.3b.1), chicken (release 22.1.1), pufferfish (release 22.2c.1) and zebrafish (release 22.3b.1) genomes. Where adequate annotation was not available (in particular for the chicken and the pufferfish genomes), the genes flanking the AID/APOBEC1, APOBEC2, and APOBEC3 loci in human and mouse were Blasted against the relevant genomes, and the genomic contigs identified from the results were then compared with the human/rodent loci.

    Assaying Mutator Activity

    The pufferfish AID cDNA was assembled by joining PCR-derived exons from a genomic DNA library (details of the primers used are in table 1 of Supplementary Material online). The prokaryotic expression plasmid was generated by cloning the assembled cDNA as an NcoI/BamHI fragment into pTrc99A (Petersen-Mahrt, Harris, and Neuberger 2002). Mutation analyses were performed as described in Petersen-Mahrt, Harris, and Neuberger (2002) using E. coli strains KL16 (Hfr (PO-45) relA1 spoT1 thi-1) and its ung-1 derivative (BW310). Bacterial cultures were grown at either 18°C for 32 h or 37°C for 14 h.

    Serine Codon Preference in Fish IgV Genes

    The sequences of pufferfish and zebrafish genomic immunoglobulin V genes were retrieved from the Ensembl database and checked for presence of the heptamer/nonamer rearrangement signal sequences at their 3' ends. The V gene sequences were then aligned with ClustalX and the resulting alignment used to generate a tree and assign individual V gene sequences to the different locus and family groups. CDRs were identified by similarity to those in published fish V gene sequences (Widholm et al. 1999; Danilova et al. 2000).

    Cloning of Dogfish AID

    Degenerate primers (see table 1 in Supplementary Material online) were used for RT-PCR amplification of AID cDNA from stimulated spleen cells of spotted dogfish (Scyliorhinus caniculus).

    Results and Discussion

    AID/APOBECs Form a Separate Gene Family Among Deaminases

    Deaminases that work on free base, nucleoside, or nucleotide (cytosine, cytidine/deoxycytidine, dCMP) as well as those working on cytosine or adenine in the context of a polynucleotide target (the AID/APOBEC and ADAR/ADAT families, respectively) all contain a motif (generally characterized by an [H/C]XE and PCXXC signature) that, based on studies of bacterial cytidine deaminases (Betts et al. 1994; Xiang et al. 1996, 1997; Johansson et al. 2002) and of yeast cytosine and cytidine deaminases (Ireton et al. 2002; Ko et al. 2003; Xie et al. 2004), is responsible for coordinating zinc in the vicinity of the active site (Gerber and Keller 2001). A phylogenetic tree was established on the basis of the protein sequence in the vicinity of the presumed zinc-coordinating motif in a pool of 160 eukaryotic deaminase (or putative deaminase) gene sequences as described in Materials and Methods. This motif is encoded by a single exon in most vertebrate dCMP/cytosine/ADAT2/ADAT3/AID/APOBEC deaminases, suggesting it may have evolved as a functional cassette; it is essentially the protein sequence encoded by this exon that was used to construct the phylogenetic tree.

    The tree (fig. 1A) reveals the segregation of the different deaminase genes into four major clusters. One comprises the cytidine deaminases (CDAs) and another the AID/APOBEC homologs. A third comprises a family of putative RNA-binding proteins and many of the adenosine deaminases that work on RNA (ADAR1, 2, and 3) as well as ADAT1 (which is a tRNA modifying adenosine deaminase). The fourth is a functionally heterogeneous grouping in which the known dCMP deaminases, ADAT2s and ADAT3s form distinct clusters. The other members of the group include cytosine deaminase (an activity that has been identified in fungi but not described in metazoa) as well as a variety of sequences encoding genes of unknown function.

    FIG. 1.— Relationships of the deaminase DNA sequences in the vicinity of their presumed zinc coordination motifs. (A) Neighbor-joining tree generated from a protein alignment of the domain that includes the presumed zinc-coordinating motif. Clusters highlighted in dark gray are supported by a bootstrap value greater than 50, those grouped in a medium shading of gray are clustered by function, and the largest groupings in the lightest shading reflect gene families based on sequence similarity. Bootstrap values for selected nodes are indicated. An asterisk indicates the position of yeast CDD1. The labels at the end of each branch indicate the organism from which the sequence derives (Anopheles gambiae, Ag; Ascaris suum, As; Aspergillus terreus, Atr; Arabidopsis thaliana, At; Brugia malayi, Bm; Candida albicans, Ca; Caenorhabditis elegans, Ce; Ciona intestinalis, Ci; Dictyostelium discoideum, Dd; Dirofilaria immitis, Di; Drosophila melanogaster, Dm; Danio rerio, Dr; Encephalitozoon cuniculi, Enc; Echinococcus multilocularis, Em; Gallus gallus, Gg; Homo sapiens, Hs; Monodelphis domestica, Md; Macaca fascicularis, Mf; Mus musculus, Mm; Neurospora crassa, Nc; Oryctolagus cuniculus, Oc; Oryzias latipes, Ol; Oryza sativa, Os; Plasmodium falciparum, Pf; Plasmodium yoelii yoelii, Py; Rattus norvegicus, Rn; Saccharomyces cerevisiae, Sc; Schizosaccharomyces pombe, Sp; Trypanosoma cruzi, Tc; Tetraodon fluviatilis, Tf; Takifugu rubripes, Tr; and Xenopus laevis, Xl). Within each darker shaded area, multiple sequences from the same organism are distinguished by use of asterisks (where their distinct functions are unknown) or by use of digits, where they can be assigned to separate groupings (e.g., 1, 2, 3 for ADAR1, 2, or 3 family members; 1, 2, 3, AID for APOBEC1, 2, 3 or AID family members; and 1, 2, 3...9 for hypothetical plant CDAs) as well as, in the case of double-domained APOBEC3 proteins, by [N] or [C] to indicate whether the sequence derives from the N-terminal or C-terminal zinc motif. CeC33G8 is a gene of unknown function. The list of the genes used is provided as Supplementary Material online. (B) Logo alignments of the major clusters of presumed zinc-coordination motifs. The alignments were generated using only proteins of animal origin. The height of the letter represents the conservation of that given residue. In the ADAR/ADAT1 alignment, the extra exon inserted into the putative catalytic domain has not been included. The numbers of sequences used to generate each logo are AID/APOBECs, 32 (with overrepresentation of APOBEC3 domains avoided by excluding those with greater than 70% similarity to those already included); dCMP deaminases/ADAT2/ADAT3, 26; CDA, 17; and ADAR/ADAT1, 30.

    As expected, the sequence divergence between proteins in the same cluster but derived from different species broadly correlates with the evolutionary separation of the species from which the sequences derive. Thus, in the cytidine deaminase family, most of the observed divergence reflects evolutionary separation of the species from which the enzymes originate, with a group of hypothetical plant cytidine deaminases forming a distinct subbranch. In contrast, much of the divergence observed in the AID/APOBEC family reflects the separation into subfamilies with APOBEC1, APOBEC2, APOBEC3, and AID sequences forming distinct clusters (fig. 1A).

    The zinc coordination motif of the AID/APOBEC family conforms to the HAE consensus common to the other deaminases (except for cytidine deaminases, which display CAE) (fig. 1B). The alignments also highlight distinctive features of the AID/APOBEC family. Thus, several individual amino acids are conserved downstream of the PCXXC motif (e.g., several leucines, an arginine, and a glycine) and a conserved phenylalanine is found two positions downstream of the HAE motif, which has been shown to be important in RNA binding by APOBEC1 (Navaratnam et al. 1995). In addition, we identify a CYX[VI]TW[YF]XS[WS]S consensus that is located on the amino-terminal flank of the PCXXC motif and which is not found in the non-AID/APOBEC pyrimidine deaminases.

    APOBEC2 and AID Homologs Trace Back to Bony Fish

    Members of the AID/APOBEC gene family can be separated into AID, APOBEC1, APOBEC2, and APOBEC3 clusters based on the sequences surrounding the putative zinc coordination motif (fig. 2). Database screening reveals homologs of mammalian AID and APOBEC2 in chicken, frog, and bony fish. Indeed, whereas the genome sequences of pufferfish and zebrafish reveal the presence of a single AID ortholog (Zhao et al. 2004; Saunders and Magor 2004), both species of bony fish harbor two distinct APOBEC2 paralogs (fig. 2). The existence of duplicate APOBEC2s likely reflects the genome duplication that appears to have occurred at the origin of the ray-finned fishes (Taylor et al. 2003; Christoffels et al. 2004).

    Regarding chromosomal location, both the AID and APOBEC2 loci are located in a syntenic region in human, mouse, and chicken. Microsynteny for the AID locus is also maintained in pufferfish (fig. 3A), but the incompleteness of its genome sequence precludes determination of whether microsynteny is also conserved with APOBEC2.

    FIG. 3.— Syntenic relationships of the genomic contexts of (A) AID-APOBEC1 (AID; A1) and (B) APOBEC3 (A3) loci in human, mouse, chicken, and fugu. Vertical bars on the chromosomes represent genes; lines connecting the genes in different species indicate the orthologs/paralogs. Arrowheads depict transcriptional orientation. The region of mouse chromosome 15 that includes the APOBEC3 locus (not shown) exhibits high similarity to the corresponding regions in human and chicken. Synteny at the APOBEC2 locus is also preserved among human, mouse, and chicken (not shown), but synteny with the bony fishes has not been established because of the limited size of the current genomic contigs.

    Pufferfish AID Has a Similar (but Temperature Sensitive) DNA Deaminating Activity to Mammalian AID

    Whereas the function of APOBEC2 is unknown in any species, AID potentiates antibody gene diversification in man and mouse by triggering dC→dU deamination within the immunoglobulin locus with a local sequence preference that is likely responsible for the generation of mutational hotspots (Petersen-Mahrt, Harris, and Neuberger 2002; Bransteitter et al. 2003; Pham et al. 2003; Beale et al. 2004).

    To ascertain whether the gene identifiable in bony fish as the sequence homolog of mammalian AID does indeed exhibit a similar DNA deaminating activity, we exploited a bacterial genetic assay for mutator activity. A pufferfish AID cDNA was expressed in E. coli, but initial experiments monitoring the frequency of rifampicin resistant mutants after growth at 37°C failed to reveal any mutator activity. However, bearing in mind that fish are cold-blooded, we repeated the assay, but this time allowing the bacteria to grow at 18°C (as opposed to 37°C). A clearly enhanced mutation frequency caused by pufferfish AID was now detected (fig. 4A). As expected for mutation through cytosine deamination, this stimulation of mutation was augmented in a background deficient in uracil-DNA glycosylase. As judged by analysis of the distribution of mutations conferring resistance to rifampicin, the fine target specificity of pufferfish AID is broadly similar to that of human AID (fig. 4B).

    FIG. 4.— Temperature-sensitive mutator activity of Fugu AID. (A) Frequencies of rifampicin-resistant mutants in cultures of either Ung+ or Ung– E. coli transformants carrying human or fish AID expression constructs or vector-only controls grown at 18°C or 37°C as indicated. Each point represents the mutation frequency of an independent overnight culture. The median mutation frequency and the fold enhancement resulting from AID expression are indicated. (B) Comparison of the distribution of independent rpoB point mutations identified in rifampicin-resistant cells transformed with empty vector (pTrc99) or with human or Fugu AID expression constructs after 32 hours growth at 18°C.
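
    The fold enhancement reported for such an assay is the ratio of the median mutation frequency of AID-expressing cultures to that of vector-only controls. A minimal Python sketch of that calculation follows; the function name is hypothetical, and the numbers in the usage note are invented purely for illustration and are not data from this study.

```python
from statistics import median


def fold_enhancement(aid_frequencies, vector_frequencies):
    """Ratio of median mutation frequencies (AID cultures / vector controls).

    Each argument is a list with one mutation frequency per independent
    culture, expressed in the same units.
    """
    return median(aid_frequencies) / median(vector_frequencies)
```

    With invented frequencies of 10, 20, and 40 for three AID cultures and 1, 2, and 3 for three control cultures, the medians are 20 and 2, giving a 10-fold enhancement. Using the median rather than the mean limits the influence of occasional jackpot cultures.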

    Biased Codon Usage in Fish IgV Genes Consistent with Conserved Specificity of AID

    Analysis of germline immunoglobulin V gene sequences in mammals has revealed a localized bias in codon utilization. This discovery led to the suggestion that these germline IgV sequences have evolved in the context of the local-target specificity of AID so as to allow optimal targeting of somatic mutation during antibody affinity maturation, although the possibility cannot be excluded that other constraints might explain the biased codon usage (Wagner, Milstein, and Neuberger 1995; Jolly et al. 1996; Kepler 1997). Because the target specificities of pufferfish and human AID appear similar, we asked whether the biased codon utilization characteristic of IgV genes in higher organisms is conserved in pufferfish. Regarding IgVH genes, analysis reveals that in both pufferfish and zebrafish, there is, as in their mammalian counterparts, a clear preference for AGY serine codons in the CDRs (especially CDR1), where somatic mutation is more likely to be beneficial for affinity maturation, but a preference for TCN codons in framework region 1 (fig. 5). Similar findings have also been described for the IgVH genes of another bony fish (Trematomus bernacchii [Oreste and Coscia 2002]), as well as in the amphibian Ambystoma mexicanum (Golub and Charlemagne 1998). With regard to light-chain IgV genes, although biased serine codon usage is also evident in mammals, albeit to a somewhat less dramatic extent than in their IgVH genes (Wagner, Milstein, and Neuberger 1995), a similar bias is only clearly discernible in some of the fish light-chain IgV families. Thus, although the data appear consistent with the idea of coevolution of the local context preference of the AID mutator and the codon usage of the IgV gene target so as to facilitate optimal targeting of somatic mutation of antibodies, this coevolution may not apply with equal stringency to all IgV gene families.

    FIG. 5.— Distribution of TCN and AGY serine codons in fish immunoglobulin gene segments. Fugu and Danio VH and light-chain V genes have been grouped into families and, at each amino acid position, the number of family members containing an AGY codon at that position is depicted above the line, and the number containing a TCN codon is depicted below the line. CDRs are shaded.
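
    The per-position tally of AGY versus TCN serine codons can be sketched in a few lines of Python. The sketch assumes that the V gene sequences of one family have already been aligned in frame, so that every sequence has the same length and codon i occupies the same columns in each; the function name and the toy sequences in the usage note are hypothetical.

```python
AGY_CODONS = {"AGT", "AGC"}
TCN_CODONS = {"TCA", "TCC", "TCG", "TCT"}


def serine_codon_profile(aligned_genes):
    """Count AGY and TCN serine codons at each codon position.

    `aligned_genes` is a list of in-frame, codon-aligned DNA strings of
    equal length. Returns two lists (AGY counts, TCN counts), one entry
    per codon position.
    """
    n_codons = len(aligned_genes[0]) // 3
    agy = [0] * n_codons
    tcn = [0] * n_codons
    for gene in aligned_genes:
        for i in range(n_codons):
            codon = gene[3 * i:3 * i + 3].upper()
            if codon in AGY_CODONS:
                agy[i] += 1
            elif codon in TCN_CODONS:
                tcn[i] += 1
    return agy, tcn
```

    For the toy inputs "AGTTCAGGA" and "AGCTCTAGC", the AGY counts are [2, 0, 1] and the TCN counts are [0, 2, 0]; plotting the first list above a line and the second below it gives a display of the kind described in the caption.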

    An AID Homolog Is Identifiable in Dogfish

    Because rearranging immunoglobulin V genes (IgM, IgNAR, and IgW) have been identified in shark and have been shown to undergo hypermutation (Hinds-Frey et al. 1993; Diaz, Greenberg, and Flajnik 1998; Du Pasquier et al. 1998; Diaz et al. 1999; Lee et al. 2002), we anticipated that an AID homolog would extend back beyond bony fish, at least to cartilaginous fish. Based on sequence alignments of AID cDNAs from different species, we devised a set of degenerate primers for RT-PCR amplification of AID and, using these primers, cloned a partial cDNA of what appears to be a bona fide AID ortholog in dogfish (Scyliorhinus caniculus). This molecule shows 79% similarity to human AID, the main difference from AID in other species being that it comprises the sequence HAE (rather than the usual AID-specific HVE consensus) in the zinc-coordinating domain (fig. 6). The other residues characteristic of the AID/APOBEC gene family are conserved. So far, we have failed to identify an APOBEC2 ortholog in dogfish but cannot conclude that such a homolog is absent.

    FIG. 6.— Alignment of AID proteins from different species. The amino-terminal residues of frog, zebrafish, and pufferfish AID are at present unknown because we have not been able to identify the first exon of the AID gene in these species unambiguously. The sequence of the dogfish AID homolog is incomplete. Identical residues are shaded in dark gray, similar ones are shaded in light gray. Dashes indicate indels.

    APOBEC1 Is a Later Arrival, Likely Derived from AID

    Several screens have been performed to find homologs of APOBEC1 (the first member of the AID/APOBEC family to be identified) (Anant et al. 1997; Anant, Yu, and Davidson 1998; Fujino et al. 1999; Dance et al. 2001). However, our analyses indicate that none of the APOBEC1-related sequences that can be identified in nonmammalian species are true APOBEC1 orthologs. Thus, for example, a yeast cytidine deaminase (CDD1; asterisked in the CDA cluster in figure 1A) has recently been described as an APOBEC1 ortholog (Dance et al. 2001; Xie et al. 2004). Nevertheless, despite a description of its ability to deaminate apolipoprotein B RNA (Dance et al. 2001), we find that it is clearly much more closely related to canonical cytidine deaminases (that work on free cytidine) than to members of any other gene family, including the AID/APOBECs. Indeed, its zinc-coordinating domain displays the signature of the cytidine deaminases, as opposed to that characteristic of the AID/APOBEC family (fig. 1B). Furthermore, not only is there evidence that yeast does not normally exhibit any editing activity that can work on apolipoprotein B RNA (Lellek et al. 2002), but genetic evidence identifies CDD1 as functioning in the free pyrimidine nucleotide salvage pathway in yeast (Kurtz et al. 1999). Thus, APOBEC1 appears to be a mammal-specific gene.

    The phylogenetic relationships of the sequences (figs. 1A and 2) suggest that APOBEC1 (and APOBEC3 [see below]) are most likely to have arisen by duplication of the AID (as opposed to the APOBEC2) locus. The similar location of the intron/exon boundaries in AID, APOBEC3, and APOBEC1 as compared with those in APOBEC2 (fig. 7A) is consistent with this suggestion. The presumed duplication of AID that gave rise to APOBEC1 resulted in the two genes being closely linked on the chromosome. There is, however, a striking contrast between the AID/APOBEC1 loci in primates and rodents. Whereas in human and chimpanzee, AID and APOBEC1 are separated by approximately 1 Mb and are in the same transcriptional orientation, in rodents they are located within approximately 30 kb and are oppositely oriented (fig. 3A). The fact that the chicken genomic region surrounding AID resembles the rodent genomic region suggests that the difference between human and rodent loci originated in a 1 Mb inversion containing the APOBEC1 locus that took place after the rodent/primate divergence.

    FIG. 7.— Evolution and organization of APOBEC3 genes. (A) Exon boundaries of the deaminases used to build the phylogenetic tree in figure 2. Translated and untranslated portions of the transcripts are indicated respectively with filled or empty boxes. The exons encoding the zinc-coordinating domain are shaded in gray; sequences from these exons were used to build the tree shown in figure 2. The gray-filled box in APOBEC1 represents the portion of the last exon without corresponding sequences in the other AID/APOBEC genes. (B) Hypothetical scheme for the origin and divergence of the APOBEC3 family in the mammals starting from two ancestral domains (Z1 [circles] and Z2 [squares]) that either constituted a double-domained APOBEC3 or two single-domained APOBEC3s. The Z1 domain in primates is then envisaged as having diversified into Z1a (open circles) and Z1b (filled circles) subtypes. (C) Organization of the human APOBEC3 locus showing the presence of repetitive elements identified by RepeatMasker analysis.

    The greatest divergence between the AID and APOBEC1 coding sequences is at the C-terminus, with the C-terminal portion of APOBEC1 being 32 amino acids longer. It has been suggested that, although there is no primary sequence similarity, the last two exons of APOBEC1 might correspond to the pseudocatalytic domain at the C-terminus of the E. coli cytidine deaminase (Navaratnam et al. 1998; Xie et al. 2004). However, in view of the fact that the extra amino acids in APOBEC1 have no counterpart in the other AID/APOBEC family members and that APOBEC1 appears to be a later, AID-derived gene, we suggest that this C-terminal portion of APOBEC1 might have evolved to allow APOBEC1-specific interactions with other molecules to facilitate its function in, for example, RNA editing. There might be an analogy here with AID, where mutant analysis has revealed that the C-terminal portion is important for class switch recombination but dispensable for the somatic hypermutation of immunoglobulin genes (Barreto et al. 2003; Ta et al. 2003).

    Evolution of the APOBEC3 Family

    As with APOBEC1, orthologs of APOBEC3 are also restricted to mammals. However, whereas there is a single APOBEC3 gene in mouse and rat as judged from genomic databases (as well as by Southern blot analysis [data not shown]), there are eight APOBEC3 genes in the human. Six of these genes, designated APOBEC3A to APOBEC3G, form a 130-kb cluster on chromosome 22 (Jarmuz et al. 2002). (The sequence originally designated APOBEC3E appears likely, in the light of EST evidence, to encode the second domain of APOBEC3D [see also Wedekind et al. {2003}].) Inspection of databases also reveals two additional APOBEC3 genes. One, which we designate APOBEC3H (equivalent to ARP10 [Wedekind et al. 2003]), is expressed at least in lymphoid tissue and placenta, as judged from EST databases, and is located 14 kb downstream of the APOBEC3G locus. The other (NCBI LOC196469) is on human chromosome 12q24.11 and is likely a pseudogene that originated from a very recent duplication of the APOBEC3G gene. It has two partial putative zinc domains, lacks introns (figure 1 in Supplementary Material online), and has no ESTs unequivocally ascribable to it.

    The region of human chromosome 22 containing the APOBEC3 cluster is syntenic to a segment of murine chromosome 15, the only major difference being the APOBEC3 expansion in human. Interestingly, although APOBEC3 genes are restricted to mammals with homologs absent from the bony fish and chicken genome sequences, synteny between the human, chicken, and pufferfish genomes is evident in the region flanking the APOBEC3 locus. Thus, the genes flanking the human APOBEC3 cluster are conserved in chicken and pufferfish but with the APOBEC3 genes themselves missing, supporting the later origin of the APOBEC3 locus (fig. 3B).

    The draft genome of the chimpanzee reveals that orthologs of all the human APOBEC3 genes are present (but with some uncertainty regarding APOBEC3A), including the pseudogene on chromosome 12, placing the expansion of this locus at the beginning of primate evolution.

    How might the APOBEC3 locus have evolved, given that primates have multiple genes (containing either one or two putative zinc-coordination domains [fig. 7B]), whereas rodents have a sole double-domained APOBEC3? It is not merely a simple case of gene amplification. From the phylogenetic tree shown in figure 2, it appears that the zinc-coordination domains of the APOBEC3s can be grouped into two major clades, labeled Z1 and Z2, with the Z1 group being further divisible into Z1a and Z1b. Thus, for example, whereas the Z1 domains retain the SWS motif upstream of the PC-C, which is common to other AID/APOBEC family members, a threonine residue is found in place of the first serine in Z2 domains.
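
    The SWS versus TWS distinction can be turned into a quick sequence check, sketched below in Python. This is only a heuristic illustration: it assumes that the three-residue motif lies immediately before the PC of the zinc signature, which the text does not state explicitly, and the function name and labels are hypothetical. The Z1/Z2 grouping itself rests on the phylogenetic tree, not on this single feature.

```python
import re

# Assumption for this sketch: the [ST]WS motif directly precedes "PC".
UPSTREAM_MOTIF = re.compile(r"([ST])WSPC")


def classify_zinc_domain(domain_seq):
    """Label a domain by the residue preceding WS-PC, or None if absent."""
    match = UPSTREAM_MOTIF.search(domain_seq)
    if match is None:
        return None
    return "Z1-like (SWS)" if match.group(1) == "S" else "Z2-like (TWS)"
```

    On synthetic fragments, "AAFTSWSPCAAC" is labeled Z1-like and "AALTWSPCAAC" is labeled Z2-like, while a fragment without the motif returns None.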

    The double-domained rodent APOBEC3 is a Z1-Z2 composite, whereas all human APOBEC3s have either a Z1 (single-domained) or Z1-Z1 (double-domained) structure, except for human APOBEC3H, which consists of a single Z2 domain. The likely evolution, therefore, is from a common ancestor that harbored both Z1 and Z2 domains in the form of either single- or double-domained APOBEC3 proteins (fig. 7A). Consistent with this proposal, analysis of EST sequence databases from cow and pig (which diverged before the separation of the rodent and primate lineages [Madsen et al. 2001; Murphy et al. 2001]) reveals that Z1 and Z2 domains are both present in artiodactyls and contribute to both single-domained and double-domained APOBEC3 proteins (fig. 2).
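
    The Z1/Z2 distinction described above lends itself to a simple sequence check. The Python sketch below is illustrative only and is not the procedure used for the phylogenetic analysis. It assumes that the motif can be written as [S/T]-W-S immediately followed by P-C-x(2-4)-C; that spacing, the names MOTIF, domain_types, and architecture, and the toy sequences are all assumptions made for the example. It reports the order of Z1-like and Z2-like motifs in a protein.

```python
import re

# Assumed pattern: S-W-S (Z1-like) or T-W-S (Z2-like) immediately followed by
# P-C-x(2..4)-C. The exact spacing is an assumption made for this sketch.
MOTIF = re.compile(r"([ST])WSPC.{2,4}C")


def domain_types(protein):
    """Return 'Z1' or 'Z2' for each motif hit, from N- to C-terminus."""
    return [
        "Z1" if match.group(1) == "S" else "Z2"
        for match in MOTIF.finditer(protein.upper())
    ]


def architecture(protein):
    """Summarize the motif order, e.g. 'Z1-Z2' for a double-domained protein."""
    return "-".join(domain_types(protein)) or "no motif found"


# Invented toy sequences, NOT real APOBEC3 proteins.
toy_double_domain = "MAAAHAEASWSPCAACAKKKGGGHAEATWSPCAACAKK"
toy_single_domain = "MKKHAEATWSPCAACGG"

print(architecture(toy_double_domain))  # Z1-Z2
print(architecture(toy_single_domain))  # Z2
```

    A motif match of this kind is at best a quick screen: assignment of a domain to Z1a, Z1b, or Z2 rests on the phylogenetic tree, not on a single residue.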

    It is notable that although all the domains of the human APOBEC3 genes, except APOBEC3H, are of the Z1 type, the division of Z1 domains into Z1a and Z1b subgroups reveals a complex set of similarities between the individual Z1 domains of the different APOBEC3 genes (fig. 2 and 6B). In some cases the C-terminal domain is of the Z1a type and in others of the Z1b type. These sequence relationships suggest that there has been substantial shuffling of sequence information between the domains during evolution. Such amplification and reassortment in the primate APOBEC3 locus is likely to have occurred through unequal crossover/recombination. It is possible that retroviral elements facilitated this process, given that their relicts account for 19% of the entire human APOBEC3 locus, with repetitive elements (mainly LTRs from ERV class I) heavily represented in the regions flanking APOBEC3G and APOBEC3H (fig. 7C). This hypothesis is supported by the fact that the APOBEC3 pseudogene on chromosome 12 originated from a retrotranscriptional event. Recently, it has been suggested that evolution of APOBEC3G itself in primates has been driven by positive selection (Sawyer, Emerman, and Malik 2004; Zhang and Webb 2004). Given the demonstrated role of some APOBEC3 members in viral restriction, it may well be that issues pertaining to host/virus interaction have provided the driving force for the rapid expansion of the entire APOBEC3 locus in primates.

    Dynastic Relationships of AID/APOBEC Family Members

    The results presented here support a scenario in which AID and APOBEC2 are the ancestral members of the AID/APOBEC family with APOBEC1 and APOBEC3 being later arrivals, derived from AID, and restricted to mammals. We cannot formally exclude the possibility that APOBEC1 and/or APOBEC3 arose early but were then lost in fish and chickens. However, because bony fish diverged from the tetrapod lineage around 450 MYA and birds diverged around 310 MYA (Benton 1990; Kumar and Hedges 1998), this hypothesis would seem unlikely: it implies that APOBEC1 and APOBEC3 were independently lost in distant lineages.
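
    The parsimony reasoning in this paragraph can be made explicit by counting gain and loss events on a toy tree. The sketch below is an illustration and not part of the analysis reported here. It assumes a three-taxon tree with the branching order given above, unit cost for every gain or loss, and presence/absence calls that simply restate the text; the names TREE, PRESENT, and min_events are hypothetical.

```python
INF = float("inf")

# Toy tree as nested tuples: bony fish branch off first, then birds and mammals.
TREE = ("bony fish", ("birds", "mammals"))

# Presence of an APOBEC1/APOBEC3 ortholog, restating the text (simplified).
PRESENT = {"bony fish": False, "birds": False, "mammals": True}


def min_events(tree, present, root_state=None):
    """Fewest gain/loss events on the tree (small-parsimony count, unit costs).

    root_state=True forces the gene to exist in the common ancestor,
    root_state=False forces it to be absent, None takes the cheaper option.
    """
    def cost(node):
        # Map each state (absent/present) to the minimum events below this node.
        if isinstance(node, str):
            return {s: 0 if s == present[node] else INF for s in (False, True)}
        kids = [cost(child) for child in node]
        return {
            s: sum(min(k[t] + (s != t) for t in (False, True)) for k in kids)
            for s in (False, True)
        }

    table = cost(tree)
    if root_state is None:
        return min(table.values())
    return table[root_state]


print(min_events(TREE, PRESENT, root_state=False))  # 1
print(min_events(TREE, PRESENT, root_state=True))   # 2
```

    With the gene absent from the common ancestor, a single gain on the mammalian branch explains the observed distribution. With the gene present in the ancestor, two independent losses are needed in addition to the earlier gain, which is the sense in which the early-origin hypothesis is less parsimonious.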

    Although AID homologs can be traced back to cartilaginous fish, and APOBEC2 homologs are at least found in bony fish, homologs are not identifiable for either protein among nonchordates, nor are they present in the genome of Ciona intestinalis, an invertebrate chordate (Dehal et al. 2002; Azumi et al. 2003; the present work). Thus, the phylogenetic sequence analysis provides no indication as to whether AID evolved from APOBEC2 or vice versa. It will certainly be interesting in the future to gain insight into the physiological function of APOBEC2.

    Supplementary Material

    The GenBank accession number for the dogfish AID is AY705386. The list of all the genes used in the phylogenetic analysis is available as Supplementary Material on the MBE Web site, as well as the alignments, a schematic representation of the exon structures of the genes used in figure 2, and a dot matrix comparison of APOBEC3G with the APOBEC3 pseudogene.

    Acknowledgements

    We thank C. Rada for helpful discussions. Scyliorhinus canicula material was kindly provided by A. Goostrey from the Scottish Fish Immunology Research Centre at the University of Aberdeen. S.G.C. was supported by a Marie Curie Fellowship of the European Community programme FP5 under contract number MCFI-2002-01357, C.J.F.T. by a studentship from the Boehringer Ingelheim Fonds, and S.K.P.-M. in part by a grant from the Arthritis Research Campaign. The work was also supported in part by a grant from the Leukaemia Research Fund to M.S.N.

    References

    Anant, S., S. A. Martin, H. Yu, A. J. MacGinnitie, E. Devaney, and N. O. Davidson. 1997. A cytidine deaminase expressed in the post-infective L3 stage of the filarial nematode, Brugia pahangi, has a novel RNA-binding activity. Mol. Biochem. Parasitol. 88:105–114.

    Anant, S., D. Mukhopadhyay, V. Sankaranand, S. Kennedy, J. O. Henderson, and N. O. Davidson. 2001. ARCD-1, an apobec-1-related cytidine deaminase, exerts a dominant negative effect on C to U RNA editing. Am. J. Physiol. Cell Physiol. 281:C1904–1916.

    Anant, S., H. Yu, and N. O. Davidson. 1998. Evolutionary origins of the mammalian apolipoprotein B RNA editing enzyme, apobec-1: structural homology inferred from analysis of a cloned chicken small intestinal cytidine deaminase. Biol. Chem. 379:1075–1081.

    Azumi, K., R. De Santis, A. De Tomaso et al. (18 co-authors). 2003. Genomic analysis of immunity in a Urochordate and the emergence of the vertebrate immune system: "waiting for Godot". Immunogenetics 55:570–581.

    Barreto, V., B. Reina-San-Martin, A. R. Ramiro, K. M. McBride, and M. C. Nussenzweig. 2003. C-terminal deletion of AID uncouples class switch recombination from somatic hypermutation and gene conversion. Mol. Cell 12:501–508.

    Beale, R. C., S. K. Petersen-Mahrt, I. N. Watt, R. S. Harris, C. Rada, and M. S. Neuberger. 2004. Comparison of the differential context-dependence of DNA deamination by APOBEC enzymes: correlation with mutation spectra in vivo. J. Mol. Biol. 337:585–596.

    Benton, M. J. 1990. Phylogeny of the major tetrapod groups: morphological data and divergence dates. J. Mol. Evol. 30:409–424.

    Betts, L., S. Xiang, S. A. Short, R. Wolfenden, and C. W. Carter Jr. 1994. Cytidine deaminase. The 2.3 Å crystal structure of an enzyme: transition-state analog complex. J. Mol. Biol. 235:635–656.

    Bishop, K. N., R. K. Holmes, A. M. Sheehy, N. O. Davidson, S. J. Cho, and M. H. Malim. 2004. Cytidine deamination of retroviral DNA by diverse APOBEC proteins. Curr. Biol. 14:1392–1396.

    Bransteitter, R., P. Pham, M. D. Scharff, and M. F. Goodman. 2003. Activation-induced cytidine deaminase deaminates deoxycytidine on single-stranded DNA but requires the action of RNase. Proc. Natl. Acad. Sci. USA 100:4102–4107.

    Christoffels, A., E. G. Koh, J. M. Chia, S. Brenner, S. Aparicio, and B. Venkatesh. 2004. Fugu genome analysis provides evidence for a whole-genome duplication early during the evolution of ray-finned fishes. Mol. Biol. Evol. 21:1146–1151.

    Dance, G. S., P. Beemiller, Y. Yang, D. V. Mater, I. S. Mian, and H. C. Smith. 2001. Identification of the yeast cytidine deaminase CDD1 as an orphan C→U RNA editase. Nucleic Acids Res. 29:1772–1780.

    Danilova, N., V. S. Hohman, E. H. Kim, and L. A. Steiner. 2000. Immunoglobulin variable-region diversity in the zebrafish. Immunogenetics 52:81–91.

    Dehal, P., Y. Satou, R. K. Campbell et al. (87 co-authors). 2002. The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science 298:2157–2167.

    Diaz, M., A. S. Greenberg, and M. F. Flajnik. 1998. Somatic hypermutation of the new antigen receptor gene (NAR) in the nurse shark does not generate the repertoire: possible role in antigen-driven reactions in the absence of germinal centers. Proc. Natl. Acad. Sci. USA 95:14343–14348.

    Diaz, M., J. Velez, M. Singh, J. Cerny, and M. F. Flajnik. 1999. Mutational pattern of the nurse shark antigen receptor gene (NAR) is similar to that of mammalian Ig genes and to spontaneous mutations in evolution: the translesion synthesis model of somatic hypermutation. Int. Immunol. 11:825–833.

    Du Pasquier, L., M. Wilson, A. S. Greenberg, and M. F. Flajnik. 1998. Somatic mutation in ectothermic vertebrates: musings on selection and origins. Curr. Top. Microbiol. Immunol. 229:199–216.

    Fujino, T., N. Navaratnam, A. Jarmuz, A. von Haeseler, and J. Scott. 1999. C→U editing of apolipoprotein B mRNA in marsupials: identification and characterisation of APOBEC-1 from the American opossum Monodelphus domestica. Nucleic Acids Res. 27:2662–2671.

    Gerber, A. P., and W. Keller. 2001. RNA editing by base deamination: more enzymes, more targets, new mysteries. Trends Biochem. Sci. 26:376–384.

    Golub, R., and J. Charlemagne. 1998. Structure, diversity, and repertoire of VH families in the Mexican axolotl. J. Immunol. 160:1233–1239.

    Harris, R. S., K. N. Bishop, A. M. Sheehy, H. M. Craig, S. K. Petersen-Mahrt, I. N. Watt, M. S. Neuberger, and M. H. Malim. 2003. DNA deamination mediates innate immunity to retroviral infection. Cell 113:803–809.

    Hinds-Frey, K. R., H. Nishikata, R. T. Litman, and G. W. Litman. 1993. Somatic variation precedes extensive diversification of germline sequences and combinatorial joining in the evolution of immunoglobulin heavy chain diversity. J. Exp. Med. 178:815–824.

    Ireton, G. C., G. McDermott, M. E. Black, and B. L. Stoddard. 2002. The structure of Escherichia coli cytosine deaminase. J. Mol. Biol. 315:687–697.

    Jarmuz, A., A. Chester, J. Bayliss, J. Gisbourne, I. Dunham, J. Scott, and N. Navaratnam. 2002. An anthropoid-specific locus of orphan C to U RNA-editing enzymes on chromosome 22. Genomics 79:285–296.

    Johansson, E., N. Mejlhede, J. Neuhard, and S. Larsen. 2002. Crystal structure of the tetrameric cytidine deaminase from Bacillus subtilis at 2.0 Å resolution. Biochemistry 41:2563–2570.

    Jolly, C. J., S. D. Wagner, C. Rada, N. Klix, C. Milstein, and M. S. Neuberger. 1996. The targeting of somatic hypermutation. Semin. Immunol. 8:159–168.

    Kepler, T. B. 1997. Codon bias and plasticity in immunoglobulins. Mol. Biol. Evol. 14:637–643.

    Ko, T. P., J. J. Lin, C. Y. Hu, Y. H. Hsu, A. H. Wang, and S. H. Liaw. 2003. Crystal structure of yeast cytosine deaminase: insights into enzyme mechanism and evolution. J. Biol. Chem. 278:19111–19117.

    Kumar, S., and S. B. Hedges. 1998. A molecular timescale for vertebrate evolution. Nature 392:917–920.

    Kumar, S., K. Tamura, I. B. Jakobsen, and M. Nei. 2001. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17:1244–1245.

    Kurtz, J. E., F. Exinger, P. Erbs, and R. Jund. 1999. New insights into the pyrimidine salvage pathway of Saccharomyces cerevisiae: requirement of six genes for cytidine metabolism. Curr. Genet. 36:130–136.

    Lecossier, D., F. Bouchonnet, F. Clavel, and A. J. Hance. 2003. Hypermutation of HIV-1 DNA in the absence of the Vif protein. Science 300:1112.

    Lee, S. S., D. Tranchina, Y. Ohta, M. F. Flajnik, and E. Hsu. 2002. Hypermutation in shark immunoglobulin light chain genes results in contiguous substitutions. Immunity 16:571–582.

    Lellek, H., S. Welker, I. Diehl, R. Kirsten, and J. Greeve. 2002. Reconstitution of mRNA editing in yeast using a Gal4-apoB-Gal80 fusion transcript as the selectable marker. J. Biol. Chem. 277:23638–23644.

    Liao, W., S. H. Hong, B. H. Chan, F. B. Rudolph, S. C. Clark, and L. Chan. 1999. APOBEC-2, a cardiac- and skeletal muscle-specific member of the cytidine deaminase supergene family. Biochem. Biophys. Res. Commun. 260:398–404.

    Liddament, M. T., W. L. Brown, A. J. Schumacher, and R. S. Harris. 2004. APOBEC3F properties and hypermutation preferences indicate activity against HIV-1 in vivo. Curr. Biol. 14:1385–1391.

    Madsen, O., M. Scally, C. J. Douady, D. J. Kao, R. W. DeBry, R. Adkins, H. M. Amrine, M. J. Stanhope, W. W. de Jong, and M. S. Springer. 2001. Parallel adaptive radiations in two major clades of placental mammals. Nature 409:610–614.

    Mangeat, B., P. Turelli, G. Caron, M. Friedli, L. Perrin, and D. Trono. 2003. Broad antiretroviral defence by human APOBEC3G through lethal editing of nascent reverse transcripts. Nature 424:99–103.

    Muramatsu, M., K. Kinoshita, S. Fagarasan, S. Yamada, Y. Shinkai, and T. Honjo. 2000. Class switch recombination and hypermutation require activation-induced cytidine deaminase (AID), a potential RNA editing enzyme. Cell 102:553–563.

    Muramatsu, M., V. S. Sankaranand, S. Anant, M. Sugai, K. Kinoshita, N. O. Davidson, and T. Honjo. 1999. Specific expression of activation-induced cytidine deaminase (AID), a novel member of the RNA-editing deaminase family in germinal center B cells. J. Biol. Chem. 274:18470–18476.

    Murphy, W. J., E. Eizirik, S. J. O'Brien et al. (11 co-authors). 2001. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294:2348–2351.

    Navaratnam, N., S. Bhattacharya, T. Fujino, D. Patel, A. L. Jarmuz, and J. Scott. 1995. Evolutionary origins of apoB mRNA editing: catalysis by a cytidine deaminase that has acquired a novel RNA-binding motif at its active site. Cell 81:187–195.

    Navaratnam, N., T. Fujino, J. Bayliss, A. Jarmuz, A. How, N. Richardson, A. Somasekaram, S. Bhattacharya, C. Carter, and J. Scott. 1998. Escherichia coli cytidine deaminase provides a molecular model for ApoB RNA editing and a mechanism for RNA substrate recognition. J. Mol. Biol. 275:695–714.

    Navaratnam, N., J. R. Morrison, S. Bhattacharya, D. Patel, T. Funahashi, F. Giannoni, B. B. Teng, N. O. Davidson, and J. Scott. 1993. The p27 catalytic subunit of the apolipoprotein B mRNA editing enzyme is a cytidine deaminase. J. Biol. Chem. 268:20709–20712.

    Nei, M., S. Kumar, and K. Takahashi. 1998. The optimization principle in phylogenetic analysis tends to give incorrect topologies when the number of nucleotides or amino acids used is small. Proc. Natl. Acad. Sci. USA 95:12390–12397.

    Neuberger, M. S., R. S. Harris, J. Di Noia, and S. K. Petersen-Mahrt. 2003. Immunity through DNA deamination. Trends Biochem. Sci. 28:305–312.

    Oreste, U., and M. Coscia. 2002. Specific features of immunoglobulin VH genes of the Antarctic teleost Trematomus bernacchii. Gene 295:199–204.

    Petersen-Mahrt, S. K., R. S. Harris, and M. S. Neuberger. 2002. AID mutates E. coli suggesting a DNA deamination mechanism for antibody diversification. Nature 418:99–103.

    Pham, P., R. Bransteitter, J. Petruska, and M. F. Goodman. 2003. Processive AID-catalysed cytosine deamination on single-stranded DNA simulates somatic hypermutation. Nature 424:103–107.

    Revy, P., T. Muto, Y. Levy et al. (21 co-authors). 2000. Activation-induced cytidine deaminase (AID) deficiency causes the autosomal recessive form of the Hyper-IgM syndrome (HIGM2). Cell 102:565–575.

    Saunders, H. L., and B. G. Magor. 2004. Cloning and expression of the AID gene in the channel catfish. Dev. Comp. Immunol. 28:657–663.

    Sawyer, S. L., M. Emerman, and H. S. Malik. 2004. Ancient adaptive evolution of the primate antiviral DNA-editing enzyme APOBEC3G. PLoS Biol. 2:E275.

    Schneider, T. D., and R. M. Stephens. 1990. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 18:6097–6100.

    Sheehy, A. M., N. C. Gaddis, J. D. Choi, and M. H. Malim. 2002. Isolation of a human gene that inhibits HIV-1 infection and is suppressed by the viral Vif protein. Nature 418:646–650.

    Ta, V. T., H. Nagaoka, N. Catalan et al. (12 co-authors). 2003. AID mutant analyses indicate requirement for class-switch-specific cofactors. Nat. Immunol. 4:843–848.

    Takahashi, K., and M. Nei. 2000. Efficiencies of fast algorithms of phylogenetic inference under the criteria of maximum parsimony, minimum evolution, and maximum likelihood when a large number of sequences are used. Mol. Biol. Evol. 17:1251–1258.

    Taylor, J. S., I. Braasch, T. Frickey, A. Meyer, and Y. Van de Peer. 2003. Genome duplication, a trait shared by 22000 species of ray-finned fish. Genome Res. 13:382–390.

    Teng, B., C. F. Burant, and N. O. Davidson. 1993. Molecular cloning of an apolipoprotein B messenger RNA editing protein. Science 260:1816–1819.

    Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876–4882.

    Wagner, S. D., C. Milstein, and M. S. Neuberger. 1995. Codon bias targets mutation. Nature 376:732.

    Wedekind, J. E., G. S. Dance, M. P. Sowden, and H. C. Smith. 2003. Messenger RNA editing in mammals: new members of the APOBEC family seeking roles in the family business. Trends Genet. 19:207–216.

    Widholm, H., A. S. Lundback, A. Daggfeldt, B. Magnadottir, G. W. Warr, and L. Pilstrom. 1999. Light chain variable region diversity in Atlantic cod (Gadus morhua L.). Dev. Comp. Immunol. 23:231–240.

    Wiegand, H. L., B. P. Doehle, H. P. Bogerd, and B. R. Cullen. 2004. A second human antiretroviral factor, APOBEC3F, is suppressed by the HIV-1 and HIV-2 Vif proteins. EMBO J. 23:2451–2458.

    Xiang, S., S. A. Short, R. Wolfenden, and C. W. Carter Jr. 1996. Cytidine deaminase complexed to 3-deazacytidine: a "valence buffer" in zinc enzyme catalysis. Biochemistry 35:1335–1341.

    ———. 1997. The structure of the cytidine deaminase-product complex provides evidence for efficient proton transfer and ground-state destabilization. Biochemistry 36:4768–4774.

    Xie, K., M. P. Sowden, G. S. Dance, A. T. Torelli, H. C. Smith, and J. E. Wedekind. 2004. The structure of a yeast RNA-editing deaminase provides insight into the fold and function of activation-induced deaminase and APOBEC-1. Proc. Natl. Acad. Sci. USA 101:8114–8119.

    Zhang, H., B. Yang, R. J. Pomerantz, C. Zhang, S. C. Arunachalam, and L. Gao. 2003. The cytidine deaminase CEM15 induces hypermutation in newly synthesized HIV-1 DNA. Nature 424:94–98.

    Zhang, J., and D. M. Webb. 2004. Rapid evolution of primate antiviral enzyme APOBEC3G. Hum. Mol. Genet.

    Zhang, Z., A. A. Schaffer, W. Miller, T. L. Madden, D. J. Lipman, E. V. Koonin, and S. F. Altschul. 1998. Protein sequence similarity searches using patterns as seeds. Nucleic Acids Res. 26:3986–3990.

    Zhao, Y., Q. Pan-Hammarstrom, Z. Zhao, and L. Hammarstrom. 2005. Identification of the activation-induced cytidine deaminase gene from zebrafish: an evolutionary analysis. Dev. Comp. Immunol. 29:61–71.

    Zheng, Y. H., D. Irwin, T. Kurosu, K. Tokunaga, T. Sata, and B. M. Peterlin. 2004. Human APOBEC3F is another host factor that blocks human immunodeficiency virus type 1 replication. J. Virol. 78:6073–6076.

    (Silvestro G. Conticello)