Coenzyme A Biosynthesis: Reconstruction of the Pathway in Archaea and an Evolutionary Scenario Based on Comparative Genomics
     Lehrstuhl für Genetik, Wissenschaftszentrum Weihenstephan, Technische Universität München, Freising, Germany

    E-mail: genschel@wzw.tum.de.

    Abstract

    Coenzyme A (CoA) holds a central position in cellular metabolism and therefore can be assumed to be an ancient molecule. Starting from the known E. coli and human enzymes required for the biosynthesis of CoA, phylogenetic profiles and chromosomal proximity methods enabled an almost complete reconstruction of archaeal CoA biosynthesis. This includes the identification of strong candidates for archaeal pantothenate synthetase and pantothenate kinase, which are unrelated to the corresponding bacterial or eukaryotic enzymes. According to this reconstruction, the topology of CoA synthesis from common precursors is essentially conserved across the three domains of life. The CoA pathway is conserved to varying degrees in eukaryotic pathogens like Giardia lamblia or Plasmodium falciparum, indicating that these pathogens have individual uptake-mechanisms for different CoA precursors. Phylogenetic analysis and phyletic distribution of the CoA biosynthetic enzymes suggest that the enzymes required for the synthesis of phosphopantothenate were recruited independently in the bacterial and archaeal lineages by convergent evolution, and that eukaryotes inherited the genes for the synthesis of pantothenate (vitamin B5) from bacteria. Homologues to bacterial enzymes involved in pantothenate biosynthesis are present in a subset of archaeal genomes. The phylogenies of these enzymes indicate that they were acquired from bacterial thermophiles through horizontal gene transfer. Monophyly can be inferred for each of the enzymes catalyzing the four ultimate steps of CoA synthesis, the conversion of phosphopantothenate into CoA. The results support the notion that CoA was initially synthesized from a prebiotic precursor, most likely pantothenate or a related compound.

    Key Words: cofactor biosynthesis · evolution of metabolic pathways · horizontal gene transfer · pantothenate synthetase · pantothenate kinase

    Introduction

    Coenzyme A (CoA) is an essential cofactor in numerous metabolic and energy-yielding reactions and is involved in the regulation of key metabolic enzymes (Abiko 1975). CoA is also the source of 4'-phosphopantetheine, which is required in several biosynthetic pathways, among them fatty acid and polyketide syntheses (Kleinkauf 2000). It was estimated that about 4% of all enzymes utilize CoA, CoA thioesters, or 4'-phosphopantetheine as substrates (Begley, Kinsland, and Strauss 2001). The CoA-precursors pantothenate (vitamin B5; Miller and Schlesinger 1993) and pantetheine (Keefe, Newton, and Miller 1995) were demonstrated to be likely prebiotic compounds. Prebiotic availability of CoA-precursors and the central role of CoA in metabolism suggest it was already present at a very early stage of the evolution of life.

    Pantothenate is produced by prokaryotes, fungi, and plants only, while animals must obtain it from their diet (Smith and Song 1996). Several pathogenic bacteria lack de novo biosynthesis of pantothenate and rely on scavenging exogenous pantothenate (Gerdes et al. 2002). In contrast, almost all prokaryotes and eukaryotes are able to convert pantothenate into CoA. In bacteria, de novo biosynthesis of CoA comprises nine steps (fig. 1) (Jackowski 1996). Briefly, β-alanine is produced from aspartate by the action of aspartate α-decarboxylase (EC 4.1.1.11; ADC). Ketopantoate hydroxymethyltransferase (EC 2.1.2.11; KPHMT) converts α-ketoisovalerate to ketopantoate. The latter is reduced to pantoate by ketopantoate reductase (EC 1.1.1.169; KPR). Pantothenate synthetase (EC 6.3.2.1; PS) catalyzes the condensation of pantoate and β-alanine to pantothenate. Phosphorylation of pantothenate catalyzed by pantothenate kinase (EC 2.7.1.33; PANK) gives 4'-phosphopantothenate. This is condensed with cysteine by phosphopantothenoylcysteine synthetase (EC 6.3.2.5; PPCS) to yield 4'-phospho-N-pantothenoylcysteine. In the next step, 4'-phosphopantetheine is produced by the action of phosphopantothenoylcysteine decarboxylase (EC 4.1.1.36; PPCDC). Transfer of an adenylyl-group to 4'-phosphopantetheine is catalyzed by phosphopantetheine adenylyltransferase (EC 2.7.7.3; PPAT). Finally, the resulting dephospho-CoA is phosphorylated by dephospho-coenzyme A kinase (EC 2.7.1.24; DPCK) to give CoA.
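
    The nine steps listed above can be written down as a small lookup structure, which makes it easy to ask which enzymes separate a given compound from CoA. The following is only an illustrative sketch: the enzyme abbreviations, EC numbers, and compounds come from the text, whereas the ASCII compound spellings, the tuple layout, and the function name enzymes_from are assumptions made for this example. Cofactors such as ATP are left out.

```python
# Nine steps of bacterial CoA biosynthesis as listed in the text (fig. 1).
# Each entry: (enzyme, EC number, substrates, product). Compound names are
# spelled in ASCII here purely for convenience.
PATHWAY = [
    ("ADC", "4.1.1.11", ("aspartate",), "beta-alanine"),
    ("KPHMT", "2.1.2.11", ("alpha-ketoisovalerate",), "ketopantoate"),
    ("KPR", "1.1.1.169", ("ketopantoate",), "pantoate"),
    ("PS", "6.3.2.1", ("pantoate", "beta-alanine"), "pantothenate"),
    ("PANK", "2.7.1.33", ("pantothenate",), "4'-phosphopantothenate"),
    ("PPCS", "6.3.2.5", ("4'-phosphopantothenate", "cysteine"),
     "4'-phospho-N-pantothenoylcysteine"),
    ("PPCDC", "4.1.1.36", ("4'-phospho-N-pantothenoylcysteine",),
     "4'-phosphopantetheine"),
    ("PPAT", "2.7.7.3", ("4'-phosphopantetheine",), "dephospho-CoA"),
    ("DPCK", "2.7.1.24", ("dephospho-CoA",), "CoA"),
]


def enzymes_from(start, end="CoA"):
    """List the enzymes that lead from compound `start` to compound `end`."""
    route, current = [], start
    while current != end:
        # Find the step that consumes the current compound.
        step = next((s for s in PATHWAY if current in s[2]), None)
        if step is None:
            raise ValueError(f"no listed enzyme uses {current!r}")
        route.append(step[0])
        current = step[3]
    return route
```

    Calling enzymes_from("pantothenate") returns the five enzymes PANK, PPCS, PPCDC, PPAT, and DPCK, in line with the five-step conversion of pantothenate into CoA described for mammals.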

    FIG. 1. Bacterial biosynthesis of coenzyme A. See text for abbreviations of enzyme names

    Recently, cloning and characterization of PPCS (Strauss et al. 2001) and PPCDC (Kupke et al. 2000) completed the set of biosynthetic genes involved in the production of pantothenate and CoA in E. coli (for a review see Begley, Kinsland, and Strauss 2001). Also, genes encoding the five steps of the mammalian CoA biosynthetic pathway were recently identified (Rock et al. 2000; Daugherty et al. 2002). Several groups commented on the conservation of CoA biosynthetic enzymes among distantly related species. PS from bacteria, yeasts and plants were shown to be similar (Genschel et al. 1999). Comparison of the E. coli bifunctional PPCS/PPCDC enzyme with eukaryotic and archaeal homologues revealed a high degree of conservation for PPCDC and some conservation for PPCS (Kupke et al. 2001; Daugherty et al. 2002). Also, homologues of E. coli DPCK were detected in a range of bacterial and eukaryotic genomes (Mishra, Park, and Drueckhammer 2001). In contrast, bacterial and eukaryotic versions of PANK were found to be unrelated (Calder et al. 1999). Differences between human and bacterial versions of CoA biosynthetic enzymes were evaluated in order to identify novel drug targets (Gerdes et al. 2002). However, so far no systematic attempt has been made to elucidate the evolutionary relationships of CoA biosynthetic genes among bacteria, archaea, and eukaryotes.

    Here, comparative genomics is used to visualize the mosaic of conserved CoA biosynthetic genes and to reconstitute the so far unexplored CoA biosynthetic pathway in archaea. Based on the phylogenetic distribution and the evolutionary histories of CoA biosynthetic genes, two phases can be distinguished in the evolution of the pathway: CoA biosynthesis from phosphopantothenate was complete in the universal ancestor, while pathways for the synthesis of phosphopantothenate arose only after the separation of bacteria and archaea.

    Materials and Methods

    Translated amino acid sequences of the E. coli or human CoA biosynthetic genes were used to query the National Center for Biotechnology Information (NCBI) non-redundant peptide sequence database by using the Blast algorithm (Altschul et al. 1990). Protein sequences with an expectation value (E) of 10⁻³ or less were scored as putative homologues and were retrieved from NCBI (http://www.ncbi.nlm.nih.gov). Searches for distantly related homologues were performed using the Psi-Blast algorithm (Altschul et al. 1997). Position-specific sequence matrices (PSSMs) were generated in iterative Psi-Blast searches against the NCBI non-redundant sequence database and then used to query specific subsets of that database. Orthologous relationships between proteins and functional classifications were also derived from the database of Clusters of Orthologous Groups of proteins (COGs and KOGs) (Tatusov et al. 2001, 2003).
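
    The E-value criterion described above amounts to a simple filter over search hits. The sketch below assumes that hits have already been parsed into dictionaries with the keys "subject" and "evalue"; that record layout, the function name putative_homologues, and the example hits are hypothetical and are not taken from the original analysis.

```python
E_THRESHOLD = 1e-3  # cut-off stated in the text


def putative_homologues(hits, threshold=E_THRESHOLD):
    """Keep hits with an E-value at or below the threshold, best hit first."""
    kept = [h for h in hits if h["evalue"] <= threshold]
    return sorted(kept, key=lambda h: h["evalue"])


# Hypothetical hit records, for illustration only.
example_hits = [
    {"subject": "protein_A", "evalue": 2e-40},
    {"subject": "protein_B", "evalue": 0.11},
    {"subject": "protein_C", "evalue": 1e-3},
]
best_first = [h["subject"] for h in putative_homologues(example_hits)]
```

    With these invented records, protein_B is discarded because its E-value exceeds the threshold, while protein_C is retained because the criterion is "10⁻³ or less".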

    Functional links to archaeal homologues of CoA biosynthetic genes were inferred by chromosomal proximity using established criteria (Overbeek et al. 1999; Yanai, Mellor, and DeLisi 2002). Links based on mere proximity were inferred in the same way except that all genes occurring within 3 kb of the conserved CoA genes were analyzed. Genome contexts of 16 completely sequenced archaeal genomes (8/2003) were inspected at Entrez Genome (http://www.ncbi.nlm.nih.gov:80/genomes/static/a_g.html).
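
    The proximity criterion (genes within 3 kb of a conserved CoA gene) can be illustrated as follows. This is a minimal sketch under stated assumptions: genes are represented by hypothetical start and end coordinates on a linear sequence, the distance is taken as the number of bases between the two genes, and strand, operon structure, and circular chromosomes are ignored. The gene names and coordinates are invented.

```python
def gap_bp(a, b):
    """Bases separating two genes on the same sequence; 0 if they overlap."""
    return max(0, max(a["start"], b["start"]) - min(a["end"], b["end"]))


def neighbours_within(anchor_name, genes, window_bp=3000):
    """Names of all genes lying within `window_bp` of the anchor gene."""
    anchor = next(g for g in genes if g["name"] == anchor_name)
    return [
        g["name"]
        for g in genes
        if g["name"] != anchor_name and gap_bp(anchor, g) <= window_bp
    ]


# Hypothetical gene coordinates (bp), for illustration only.
example_genes = [
    {"name": "anchor_gene", "start": 10000, "end": 11200},
    {"name": "gene_x", "start": 11500, "end": 12400},   # 300 bp downstream
    {"name": "gene_y", "start": 15000, "end": 16000},   # 3,800 bp downstream
    {"name": "gene_z", "start": 6500, "end": 7500},     # 2,500 bp upstream
]
linked = neighbours_within("anchor_gene", example_genes)
```

    The criteria of Overbeek et al. (1999) and Yanai, Mellor, and DeLisi (2002) additionally require conservation of the gene neighbourhood across genomes, which this sketch does not attempt to reproduce.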

    Protein sequences were aligned using T-Coffee (Notredame, Higgins, and Heringa 2000) or ClustalX (Chenna et al. 2003) and edited to remove positions with gaps. Phylogenetic trees were constructed based on T-Coffee alignments using the MOLPHY package (Adachi and Hasegawa 1996).
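
    Removing alignment positions with gaps, as described above, can be sketched in a few lines. The example assumes that the alignment is held as a dictionary of equal-length gapped strings and that "-" or "." marks a gap; the function name strip_gap_columns and the toy sequences are hypothetical, and the original editing may well have been done with other tools.

```python
def strip_gap_columns(aligned, gap_chars="-."):
    """Drop every alignment column in which at least one sequence has a gap."""
    names = list(aligned)
    rows = [aligned[n] for n in names]
    if len({len(r) for r in rows}) > 1:
        raise ValueError("aligned sequences must have equal length")
    # zip(*rows) iterates over alignment columns.
    kept = [col for col in zip(*rows) if not any(c in gap_chars for c in col)]
    return {n: "".join(col[i] for col in kept) for i, n in enumerate(names)}


# Toy alignment, for illustration only.
toy_alignment = {"seq1": "MK-LV", "seq2": "MKALV", "seq3": "M-ALV"}
ungapped = strip_gap_columns(toy_alignment)
```

    In the toy alignment, columns 2 and 3 each contain a gap in one sequence, so only three columns survive.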

    Results

    Occurrence and Phyletic Distribution of Pantothenate and Coenzyme A Biosynthetic Genes

    The distribution of E. coli and human genes involved in CoA biosynthesis across 47 genomes is summarized in figure 2. Apart from some pathogenic species that lack the pantothenate pathway, the nine E. coli enzymes required for de novo CoA synthesis are essentially conserved across the 20 completely sequenced bacterial genomes considered. All bacterial genomes also contain homologues to human PPCDC and to the DPCK domain of human CoA synthetase. There are no highly conserved bacterial homologues to human PANK, PPCS, and PPAT. One exception is Bacillus anthracis, which contains a homologue to human PANK but not to E. coli PANK. This is likely due to lateral gene transfer because bacterial and eukaryotic PANK enzymes are unrelated (Calder et al. 1999) and other Bacillus species have the bacterial PANK.

    FIG. 2. Phylogenetic profile of E. coli and human CoA biosynthetic genes across 47 completely sequenced genomes. Within each complete genome considered here, the best scoring Blast hit yielding an E-value of 10⁻³ or below was considered as a putative homologue, while "no hit" indicates lack of an entry satisfying this E-value threshold. The putative homologues retrieved in this way were scored according to pairwise amino acid identity in the best local Blast alignment. None of the homologues showed less than 20% amino acid identity in these alignments. The bifunctional enzymes PPCS/PPCDC from E. coli and PPAT/DPCK (CoA synthetase) from human were divided into individual domains, which were used in separate Blast searches (see below). The boxes indicate groups of species in which these bifunctional enzymes are conserved. In the genomes of Arabidopsis and Drosophila, a uridine kinase domain was detected with the E. coli PANK query. These hits, marked by an asterisk (*), are considered false positives. The accession numbers of the E. coli query sequences used for database searches are: P31664 (ADC), P31057 (KPHMT), P77728 (KPR), P31663 (PS), P15044 (PANK), P24285 (bifunctional PPCS/PPCDC, PPCS domain: Ser181–Arg406, PPCDC domain: Met1–Phe180), P23875 (PPAT), P36679 (DPCK). The accession numbers of the human query sequences are: NP_683879 (PANK, isoform 1b), NP_078940 (PPCS), BAB55151 (PPCDC), AAL50813 (CoA synthetase, PPAT domain: Val180–Leu350, DPCK domain: Tyr351–Asp564).
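
    The identity score used in figure 2 can be illustrated with a short function that counts identical positions in a pairwise alignment. The sketch assumes the usual Blast convention of dividing the number of identical positions by the full alignment length, gap columns included; the function name percent_identity and the two aligned strings are invented for this example.

```python
def percent_identity(aligned_query, aligned_subject, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if not aligned_query or len(aligned_query) != len(aligned_subject):
        raise ValueError("need two non-empty aligned strings of equal length")
    matches = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != gap
    )
    return 100.0 * matches / len(aligned_query)


# Invented local alignment: 6 identical positions in 8 columns.
identity = percent_identity("MKTAYIAK", "MKSAY-AK")
```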

    In the 16 sequenced archaeal genomes, homologues to E. coli KPHMT are found in eight non-methanogenic species. Six archaeal genomes also contain homologues to KPR. The structure of E. coli KPHMT was recently solved and seven conserved residues were found to be involved in substrate binding (von Delft et al. 2003). Likewise, mutagenesis studies of E. coli KPR revealed two active site residues that were confirmed by the crystal structure of the enzyme (Matak-Vinkovic et al. 2001). All of these residues are conserved in the archaeal KPHMTs and KPRs. The PPCS and PPCDC domains of the bifunctional E. coli enzyme were used in separate Blast searches, and both PPCS and PPCDC are conserved in the available archaeal genomes. The fusion of PPCS (C-terminal) and PPCDC (N-terminal) is conserved in most bacteria and in all archaea. Most archaea also contain obvious homologues to the human PPAT domain but not to E. coli PPAT. Thus, archaea share some CoA biosynthetic enzymes with bacteria and at least one enzyme with eukaryotes. However, conserved genes encoding ADC, PS, PANK, and DPCK are generally lacking in archaea. Moreover, methanogenic archaea lack KPHMT and KPR, that is, they do not contain any homologues to the known genes for the synthesis of phosphopantothenate.

    The pattern of CoA biosynthetic genes conserved in eukaryotic parasites can be used to predict the CoA precursor that these pathogens obtain from their respective hosts. Sustained cultivation in vitro of Giardia lamblia (Adam 2001), Encephalitozoon cuniculi (Visvesvara 2002), and Plasmodium falciparum (Schuster 2002) requires complex media including serum or serum fractions. Therefore, the nutritional requirements for CoA or CoA precursors have not been determined experimentally in these organisms. G. lamblia, a common cause of diarrhea in humans, contains homologues to human PANK, PPAT, and DPCK. However, PPCS or PPCDC cannot be detected in G. lamblia by using Blast or Psi-Blast. As pantetheine can be phosphorylated to 4'-phosphopantetheine by PANK (Abiko 1967), this pattern indicates that G. lamblia is able to convert pantetheine, but not pantothenate, into CoA. Given the universal conservation of PPCS and PPCDC (see below), the possibility that G. lamblia has unrelated enzymes for the synthesis of phosphopantetheine seems remote. E. cuniculi is known to cause opportunistic infections in immunocompromised patients. This pathogen contains a homologue to DPCK from E. coli but not to any other CoA biosynthetic enzyme, suggesting that it has an uptake-mechanism for dephospho-CoA or CoA. Similarly, bacterial parasites like Rickettsia were implied to depend on import of external CoA (Daugherty et al. 2002). Transporter-mediated uptake of CoA has been demonstrated for isolated mitochondria from plants (Neuburger, Day, and Douce 1984) and animals (Tahiliani 1991), raising the possibility that mitochondria and some intracellular parasites share a conserved CoA transporter. The phylogenetic pattern of CoA genes in Plasmodium falciparum, the causal agent of human malaria, is similar to that found in animals, indicating that P. falciparum, like animals, is able to produce CoA from exogenous pantothenate. While PPAT from P. falciparum is not detected using Blast (fig. 2), iterative Psi-Blast searches consistently identify one weak homologue to human PPAT (E = 10⁻⁶, second iteration, 17% identity over 112 amino acids at the N-terminus of the human PPAT domain). This P. falciparum PPAT-homologue (gi|23612449) is a hypothetical protein of 1,337 amino acids that contains a weakly conserved nucleotidyltransferase domain (KOG3351) found in eukaryotic PPATs. PPAT was previously demonstrated to be essential in E. coli (Gerdes et al. 2002) and in yeast (Giaever et al. 2002). Because of its low similarity to human PPAT, the P. falciparum PPAT-homologue might, therefore, have potential as an antimalarial drug target.
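
    The reasoning used in this paragraph, inferring the earliest usable CoA precursor from the set of enzymes detected in a genome, can be made explicit in a short sketch. The enzyme requirements per precursor follow the text (including the phosphorylation of pantetheine by PANK); the table layout and the function name earliest_precursor are assumptions, and the weak Psi-Blast match in P. falciparum is treated here as a PPAT for the sake of the example.

```python
# Enzymes needed to convert each candidate precursor into CoA,
# ordered from the earliest precursor to CoA itself.
REQUIREMENTS = [
    ("pantothenate", {"PANK", "PPCS", "PPCDC", "PPAT", "DPCK"}),
    ("pantetheine", {"PANK", "PPAT", "DPCK"}),
    ("dephospho-CoA", {"DPCK"}),
    ("CoA", set()),
]


def earliest_precursor(enzymes_present):
    """Return the earliest precursor that the detected enzymes can convert to CoA."""
    present = set(enzymes_present)
    for precursor, needed in REQUIREMENTS:
        if needed <= present:  # all required enzymes were detected
            return precursor
    return "CoA"  # not reached: the empty requirement always matches


# Enzyme sets as described in the text.
giardia = earliest_precursor({"PANK", "PPAT", "DPCK"})
encephalitozoon = earliest_precursor({"DPCK"})
plasmodium = earliest_precursor({"PANK", "PPCS", "PPCDC", "PPAT", "DPCK"})
```

    Such a prediction only states what the detected enzymes would permit; as noted above, the actual nutritional requirements of these organisms have not been determined experimentally.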

    The genomes from the eukaryotic crown group (fungi, plants, animals) all contain highly conserved homologues to the five human enzymes required for conversion of pantothenate into CoA. The human bifunctional CoA synthetase containing PPAT and DPCK domains (Daugherty et al. 2002) also occurs in other metazoa. In addition to their bifunctional CoA synthetase, metazoa have monofunctional DPCK enzymes while PPAT is only found as part of the bifunctional enzyme. All other eukaryotes have monofunctional PPAT and DPCK enzymes. In agreement with their ability to synthesize pantothenate, fungi contain homologues to bacterial KPHMT, KPR, and PS. Similarly, KPHMT and PS homologues are found in plants.

    Eukaryotes and archaea lack homologues to ADC, which is required for the synthesis of β-alanine in bacteria. In Saccharomyces cerevisiae, β-alanine is synthesized from spermine through the action of an amine oxidase, FMS1, and two aldehyde dehydrogenases, ALD2 and ALD3 (White, Gunyuzlu, and Toyn 2001; White et al. 2003). Homologues to FMS1 are found in eukaryotes only, mainly in fungi and plants, suggesting that β-alanine generally may be derived from spermine in fungi and plants (White, Gunyuzlu, and Toyn 2001). Thus, distinct pathways for the production of β-alanine operate in bacteria and eukaryotes. The key enzymes involved in these pathways are not conserved in archaea, which may, therefore, synthesize β-alanine using an unrelated enzyme or by yet another route.

    Reconstitution of Archaeal Pantothenate and Coenzyme A Biosynthesis

    Archaea lack highly conserved homologues to E. coli or human ADC, PS, PANK, and DPCK, and methanogens also lack KPHMT and KPR. The possibility that archaeal genomes might contain more distantly related homologues to these enzymes was investigated by using iterative Psi-Blast searches against the archaeal subset of the NCBI peptide sequence database. In addition, Blast searches were carried out using the archaeal homologues to KPHMT and KPR as queries. These approaches produced no new archaeal matches to ADC, KPHMT, PS, or PANK. One additional archaeal KPR-homologue was detected in Aeropyrum pernix by similarity to archaeal KPR-homologues. Psi-Blast detected weak similarity between E. coli PS and PPAT from Pyrococcus species (E = 0.11, eighth iteration). This is plausible, as both PS and PPAT belong to the HIGH superfamily of nucleotidyltransferases (NTases) (Bork et al. 1995; von Delft et al. 2001). In summary, Blast or Psi-Blast searches cannot identify any of the missing archaeal genes involved in the synthesis of phosphopantothenate.

    The COG database groups eukaryotic and putative archaeal PPATs into COG1019 (predicted NTases). More distantly related archaeal members of COG1019, which are not identified in figure 2, can be detected in the second and third Psi-Blast iterations using the human PPAT domain as a query sequence. COG0237 (dephospho-CoA kinase) includes the conserved bacterial and eukaryotic DPCKs and also includes a family of nucleotide kinases conserved in archaea. While archaeal COG0237 members were not detected by Blast (fig. 2), they were straightforwardly detected in Psi-Blast searches with PSSMs derived from E. coli DPCK or the human DPCK domain (E = 10⁻⁶ to 10⁻⁸, second iteration), supporting the view that the archaeal members of COG0237 are DPCKs.

    The presence of conserved CoA biosynthetic genes indicates that several non-methanogenic archaea produce pantoate in the same way as bacteria and that all archaea convert 4'-phosphopantothenate into CoA by using PPCS, PPCDC, PPAT, and DPCK activities. Thus, at least a subset of archaeal species should be able to convert pantoate into phosphopantothenate. However, no candidate genes for this conversion were found by homology to PS or PANK. The simplest explanation for this finding is that unrelated genes encode PS and PANK activities in archaea. This possibility was explored by inferring functional links to archaeal genes for KPHMT, KPR, and PPCS/PPCDC using the chromosomal proximity method (table 1). Archaeal genes for PPCS/PPCDC (COG0452) were found to be chromosomally linked to members of COG0388 (amidohydrolases), COG1701 (uncharacterized protein conserved in archaea), and COG1829 (predicted archaeal kinase). In addition, archaeal KPHMT genes (COG0413) are linked to COG1701 and COG1829 by physical proximity.

    Table 1 Functional Links to Archaeal CoA Biosynthetic Genes Inferred by Conserved Chromosomal Proximity.

    COG1701 and COG1829 are very likely to represent archaeal forms of PS and PANK, respectively. COG1701 and COG1829 genes are clustered with genes for PPCS/PPCDC and, in a different set of genomes, also with genes for KPHMT. Taken together, these links suggest that COG1701 and COG1829, like PPCS/PPCDC and KPHMT, have a function in CoA biosynthesis. COG1701 and COG1829 genes have identical phylogenetic distributions. They are present in all archaea except in Thermoplasma, but they are absent from bacteria and eukaryotes. Also, COG1701 and COG1829 genes are functionally linked by chromosomal proximity (table 1). This is a strong indication that COG1701 and COG1829 genes act in the same pathway. Using Psi-Blast, COG1701 enzymes reveal a distant relationship to acetolactate synthases. Interestingly, acetolactate synthase acts in the branched chain amino acid pathway, upstream of pantothenate synthetase, and is required for the production of α-ketoisovalerate, the precursor to both valine and pantothenate. COG1829 members are distantly related to mevalonate and homoserine kinases of the GHMP-kinase family, and searching the Pfam protein families database (Bateman et al. 2002) confirms that COG1829 proteins belong to this kinase superfamily. Finally, the COG1701 gene from Methanococcus jannaschii (MJ0209) was cloned and shown to complement an E. coli panC mutant that lacks PS activity (U. Genschel, unpublished data).
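
    The observation that two gene families have identical phylogenetic distributions can be checked mechanically once presence and absence have been tabulated. The sketch below assumes a dictionary that maps each family to the set of genomes containing it; the function name identical_profiles and the genome labels are placeholders and do not reproduce the actual distribution data.

```python
from itertools import combinations


def identical_profiles(presence):
    """Return pairs of gene families found in exactly the same set of genomes."""
    return [
        (a, b)
        for a, b in combinations(sorted(presence), 2)
        if set(presence[a]) == set(presence[b])
    ]


# Placeholder presence table; genome labels are invented.
example_presence = {
    "COG1701": {"archaeon_1", "archaeon_2", "archaeon_3"},
    "COG1829": {"archaeon_1", "archaeon_2", "archaeon_3"},
    "COG0413": {"archaeon_1", "bacterium_1"},
}
co_distributed = identical_profiles(example_presence)
```

    Identical profiles are suggestive rather than conclusive, which is why the argument above also draws on chromosomal proximity and on the complementation experiment.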

    The following picture of archaeal CoA biosynthesis emerges from the above results. Bacterial enzymes for the synthesis of pantoate, KPHMT and KPR, are conserved in several archaea but are absent from methanogens and Thermoplasma. Enzymes for the synthesis of pantoate in methanogens could not be identified using homology or non-homology methods. In all likelihood, pantoate is converted into phosphopantothenate by the predicted archaeal forms of PS (COG1701) and PANK (COG1829). This confirms that the set of enzymes for synthesis of phosphopantothenate in methanogens is unrelated to the corresponding bacterial enzymes. Thermoplasma lack bacterial or archaeal enzymes for the synthesis of pantothenate and may be dependent on exogenous pantothenate. All archaea considered in this study have bifunctional PPCS/PPCDC (COG0452), PPAT (COG1019), and DPCK (COG0237) enzymes for the conversion of phosphopantothenate into CoA. Thus, at least in several non-methanogenic archaea, the topology of the synthesis of CoA from α-ketoisovalerate is conserved with that in bacteria or eukaryotes.

    Phylogenetic Relationships of CoA Biosynthetic Enzymes

    Phylogenetic analysis was carried out for individual enzymes of pantothenate and CoA synthesis to shed light on the origin of eukaryotic CoA genes and to reveal cases of horizontal gene transfer that are not apparent from the mere presence or absence of these genes in the genome. The scope of ancient horizontal gene transfers and the criteria for the detection of such events were recently reviewed (Brown 2003). The enzymes KPHMT, KPR, and PS catalyze the first three steps in CoA biosynthesis, leading from α-ketoisovalerate via ketopantoate and pantoate to pantothenate (fig. 1). All three enzymes are widely distributed in bacteria, and much higher genetic diversity for KPHMT, KPR, and PS is represented in bacteria than in archaea or eukaryotes (fig. 3). Therefore, it is likely that KPHMT, KPR, and PS originated in the bacterial domain. The archaeal homologues to KPHMT are clustered in one branch with Thermotoga maritima (fig. 3a), pointing to an ancient horizontal gene transfer event. If KPHMT had been vertically inherited, it would be expected that archaeal KPHMTs are significantly more similar to each other than to their closest bacterial homologue. However, the pairwise similarity scores between Thermotoga and archaeal KPHMTs are within the range of the scores obtained within the archaeal group, and some archaeal KPHMTs are more similar to the enzyme from Thermotoga than to homologues in other archaea. The assumption that KPHMT was present in the common ancestor of bacteria and archaea would also require that the gene was subsequently lost in many archaeal species. A similar situation is found in the case of KPR, which catalyzes the reduction of ketopantoate to pantoate. Archaeal KPRs are clustered with the homologue from Aquifex aeolicus in the tree shown in figure 3b. As is the case for archaeal KPHMTs, archaeal KPRs are not more conserved among each other than with some bacterial KPR homologues. Previous analyses of the genomes of T. maritima (Nelson et al. 1999) and A. aeolicus (Aravind et al. 1998) indicated that horizontal gene transfer occurred extensively between hyperthermophilic bacteria and archaea. The fact that the distributions and phylogenetic positions of archaeal KPHMT and KPR genes are more parsimoniously accounted for by horizontal gene transfer than by vertical inheritance suggests that a subset of archaea received the genes for the synthesis of pantoate from bacterial thermophiles after the separation of the bacterial and archaeal lineages.
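
    The similarity argument made above for KPHMT and KPR compares two sets of pairwise scores: those between the bacterial sequence and each archaeal sequence, and those among the archaeal sequences themselves. The following sketch expresses that comparison as a simple range check. It is a heuristic illustration, not a statistical test and not the procedure used in the analysis; the function name and all scores are invented.

```python
def outgroup_within_ingroup_range(cross_scores, within_scores):
    """True if every outgroup-to-ingroup score lies inside the ingroup's own range."""
    low, high = min(within_scores), max(within_scores)
    return all(low <= s <= high for s in cross_scores)


# Invented similarity scores, for illustration only.
within_archaea = [388, 410, 455, 520]      # archaeon vs. archaeon
thermophile_vs_archaea = [402, 431, 476]   # candidate donor vs. archaea
distant_bacterium_vs_archaea = [150, 210]  # unrelated bacterium vs. archaea

candidate_overlaps = outgroup_within_ingroup_range(
    thermophile_vs_archaea, within_archaea
)
distant_overlaps = outgroup_within_ingroup_range(
    distant_bacterium_vs_archaea, within_archaea
)
```

    Under vertical inheritance one would expect the cross-domain scores to fall clearly below the within-group scores, as in the second invented case.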

    FIG. 3. Phylogenetic relationships of the enzymes of pantothenate biosynthesis: (a) KPHMT, (b) KPR, and (c) PS. Neighbor-joining trees were generated from maximum likelihood distances using the JTT-F matrix. The neighbor-joining trees were then refined by local rearrangement searches using the ProtML algorithm. The scale bar indicates 100 substitutions for each tree. Branches supported at a bootstrap proportion of 90% are labeled with a dot. Clustering of archaeal sequences with homologues from thermophile bacteria is supported at a bootstrap proportion of 87% in the case of KPHMT, and at 90% in the case of KPR. For clarity, some sequences from groups of highly related species were excluded from the trees

    Plants and fungi contain homologues to bacterial KPHMT and PS. Homologues to KPR are present in fungi but absent from plants and several bacterial species. Species that contain KPHMT and PS but lack a KPR homologue might possess an unrelated form of KPR or depend on acetohydroxy acid reductoisomerase (EC 1.1.1.86), which was shown to possess significant KPR activity in bacteria (Primerano and Burns 1983). S. pombe apparently obtained genes for KPHMT and PS from -proteobacteria. Both S. pombe genes appear in the -proteobacterial clusters in the phylogenetic trees shown in figures 3a and c, respectively. Also, the panBC operon, which encodes KPHMT and PS in many proteobacteria, is conserved on chromosome I of S. pombe. The high similarity between S. pombe and -proteobacterial KPHMT and PS suggests a comparatively recent horizontal gene transfer event that might have displaced pre-existing genes for KPHMT and PS in S. pombe. The remaining eukaryotic KPHMTs in figure 3a are monophyletic, indicating that the gene might have been present in the common ancestor of plants and fungi. In contrast, plant and fungal sequences for PS are polyphyletic (fig. 3c). Eukaryotic KPHMT, KPR, and PS show no specific affinity to the respective homologues from α-proteobacteria or cyanobacteria, the presumed progenitors of mitochondria and chloroplasts.

    There are unrelated forms of PANK found in bacteria (COG1072) and eukaryotes (COG5146). Neither bacterial nor eukaryotic PANKs are present in archaea, and an unrelated archaeal PANK (COG1829) could be predicted by using comparative genomics (see above). This suggests that unrelated forms of PANK were recruited independently after the separation of the domains. Interestingly, homologues to these PANK forms were not detected in bacteria such as Aquifex, Thermotoga, Deinococcus, Caulobacter, Pseudomonas, Helicobacter, and Synechocystis. However, these species would be expected to contain some form of PANK because the remaining CoA pathway is conserved in them (fig. 2). Psi-Blast searches identify distant homologues to E. coli PANK in Synechocystis and Deinococcus. These homologues are annotated as putative phosphoribulokinase and uridine kinase, respectively (E = 10⁻²⁷ to 10⁻²⁴, second iteration). However, no homologues can be detected in the other species mentioned above. While the identity of PANK in archaea and in some bacteria remains to be demonstrated experimentally, these observations indicate clearly that there are at least three and possibly more unrelated forms of PANK.

    Based on protein sequence comparisons and structural characteristics, the four final enzymes of CoA biosynthesis, PPCS, PPCDC, PPAT, and DPCK, are monophyletic. Conservation of these enzymes across all three domains can be established by Blast or Psi-Blast searches. The E. coli PPCS domain is well conserved in prokaryotes, but no eukaryotic PPCS is detected using this query in figure 2. This is in accordance with the observation that human PPCS is more similar to the PPCS domain from M. jannaschii than from E. coli (Daugherty et al. 2002). Indeed, using the M. jannaschii PPCS domain as a query, Psi-Blast detects all eukaryotic PPCS enzymes (second iteration, E ≤ 10⁻²³), producing no false positive hits. PPCDC is well conserved in all three domains of life, which directly supports monophyly of this enzyme. PPCS and PPCDC are organized in a bifunctional enzyme in almost all prokaryotes, whereas Enterococci, Streptococci, and eukaryotes have monofunctional PPCS and PPCDC enzymes.

    The protein sequences of bifunctional PPCS/PPCDC enzymes were aligned using T-Coffee, and the resulting alignment was split according to biochemical evidence for the E. coli PPCS (Ser181–Arg406) and PPCDC (Met1–Phe180) domains (Kupke 2001, 2002). The PPCS and PPCDC domains were then realigned with monofunctional PPCS and PPCDC sequences, respectively. The PPCS alignment shows five invariant sites including Lys289 from the E. coli PPCS domain, which was previously shown to be functionally essential (Kupke 2002). E. coli PPCS has a strong preference for CTP as a cofactor (Strauss et al. 2001), whereas human PPCS uses ATP more efficiently than CTP (Daugherty et al. 2002). The model of the ATP binding site derived from the crystal structure of human PPCS (Manoj et al. 2003) implicates Tyr176, Phe230, and Asn257 in cofactor binding. Phe230 and Asn257 correspond to invariant or highly conserved sites, respectively, while Tyr176 is conserved in eukaryotes but is changed to isoleucine or valine in prokaryotes, including bacterial species with monofunctional PPCS. On this basis, archaeal PPCS would be expected to use CTP as was shown for the E. coli enzyme. In E. coli PPCDC, Asn125 and Cys158 are essential for enzyme activity (Kupke 2001). Asn125 is conserved in eukaryotes and in most bacteria but is changed to His in archaea and actinobacteria. Cys158 is part of the proposed substrate recognition clamp of E. coli PPCDC, which spans residues Pro151–Met166 (Kupke 2001). This motif is largely conserved in bacteria and eukaryotes, except in actinobacteria where Cys158 is changed to Gly. In contrast, archaeal PPCDCs lack residues corresponding to Gly154–Gly159 (T-Coffee) or Gln156–Ile161 (ClustalX) of E. coli PPCDC, depending on the alignment method used. One possible explanation is that these exchanges, affecting both Asn125 and Cys158, are compensatory and restore activity.
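
    Splitting an alignment at domain boundaries that are defined by residue numbers of a reference sequence requires mapping those residue numbers to alignment columns, because gaps shift the positions. The sketch below shows one way to do this. It uses a toy alignment with a seven-residue reference rather than the real E. coli boundaries (Met1–Phe180 and Ser181–Arg406), and the function names and sequences are hypothetical.

```python
def residue_to_column(aligned_ref, residue_number, gap="-"):
    """0-based alignment column of a 1-based residue of the gapped reference."""
    seen = 0
    for col, ch in enumerate(aligned_ref):
        if ch != gap:
            seen += 1
            if seen == residue_number:
                return col
    raise ValueError("residue number exceeds reference length")


def slice_domain(aligned, ref_name, first_res, last_res):
    """Cut all aligned sequences to the columns spanning first_res..last_res."""
    ref = aligned[ref_name]
    start = residue_to_column(ref, first_res)
    end = residue_to_column(ref, last_res) + 1
    return {name: seq[start:end] for name, seq in aligned.items()}


# Toy alignment; the reference has residues M1 K2 T3 A4 Y5 L6 V7.
toy = {"reference": "MK-TAY--LV", "other": "MKQTA-GGLV"}
first_domain = slice_domain(toy, "reference", 1, 3)
second_domain = slice_domain(toy, "reference", 4, 7)
```

    Columns in which the reference has a gap between the last residue of one domain and the first residue of the next would belong to neither slice under this scheme, which is one reason why realigning the separated domains, as done here, is sensible.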

    Phylogenetic trees constructed from separate PPCS and PPCDC alignments support domain monophyly for bacteria, archaea, and eukaryotes (fig. 4). The position of streptococcal and enterococcal taxa between the archaeal and eukaryotic domains in the PPCS tree (fig. 4a) is the only exception from strict domain monophyly. The PPCS and PPCDC trees show similar topologies for the PPCS and PPCDC domains from prokaryotic bifunctional enzymes, suggesting that fission and recombination of PPCS and PPCDC was a very rare event in the evolution of prokaryotes. Eukaryotes appear as a sister domain to archaea in the case of PPCS (fig. 4a) and as a sister domain to bacteria in the case of PPCDC (fig. 4b).

    FIG. 4. Phylogenetic relationships of PPCS (a) and PPCDC (b) enzymes. The trees were constructed as described in figure 3. The scale bar indicates 100 substitutions for each tree. Branches supported at a bootstrap proportion of 90% are labeled with a dot. Monophyly of the archaeal, bacterial, and eukaryotic domains is supported in both trees, except for the position of PPCS from Streptococcus and Enterococcus

    PPAT from E. coli (Geerlof, Lewendon, and Shaw 1999) or human (Daugherty et al. 2002) belongs to the superfamily of NTases characterized by the conserved HIGH motif. Monophyly of this group was inferred by sequence and structure comparisons (Bork et al. 1995). However, two distinct forms of PPAT are represented in bacteria (COG0669) and in archaea and eukaryotes (COG1019). Also, Psi-Blast searches reveal an affinity of eukaryotic PPAT to certain bacterial NTases, but not to bacterial PPAT. Analysis of the phyletic patterns of NTases and structurally related proteins suggested that at least four members of this superfamily were present in the universal ancestor and that diversification of NTases predated the universal ancestor (Aravind, Anantharaman, and Koonin 2002). Bacterial and archaeal PPATs may, therefore, originate from distinct ancestral NTases, which would explain the observed sequence divergence between them. Phylogenetic tree analysis of the conserved archaeal and eukaryotic PPATs supports monophyly of these domains (data not shown), indicating that the eukaryotic ancestor inherited PPAT from archaea.

    Archaeal DPCKs show only weak sequence conservation with bacterial or eukaryotic DPCKs, but they can be identified using Psi-Blast (see above). Additionally, based on reciprocal best hits in complete genomes, archaeal DPCKs are grouped with bacterial and eukaryotic DPCKs into COG0237. Thus, there is sufficient homology to predict archaeal DPCKs, none of which has been confirmed experimentally, and to suggest common ancestry for all DPCKs. Metazoa have both a monofunctional DPCK and a bifunctional CoA synthetase that contains a DPCK domain. Eukaryotic monofunctional DPCKs and metazoan DPCK domains are found in two distinct clusters in a phylogenetic tree constructed from bacterial and eukaryotic DPCKs (data not shown). This indicates that the two metazoan forms of DPCK are not derived from gene duplication in the ancestor of metazoa but were inherited independently from bacteria.
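
    The reciprocal-best-hit criterion mentioned above can be illustrated with a short sketch for two genomes. It is a simplification under stated assumptions: the gene identifiers and similarity scores are invented, and the actual COG procedure combines best hits across several genomes rather than a single pair.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict mapping (query, subject) -> similarity score, e.g. a bit score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs in which each gene is the other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Hypothetical scores for searches of genome A against genome B and back.
a_vs_b = {
    ("archA_1", "bactB_1"): 85.0,
    ("archA_1", "bactB_2"): 40.0,
    ("archA_2", "bactB_2"): 55.0,
}
b_vs_a = {
    ("bactB_1", "archA_1"): 83.0,
    ("bactB_2", "archA_1"): 42.0,
    ("bactB_2", "archA_2"): 38.0,
}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

    Here only archA_1 and bactB_1 are retained: archA_2 finds bactB_2 as its best hit, but the reverse search from bactB_2 prefers archA_1, so that pair is not reciprocal.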

    Conclusions

    Phylogenetic profiling revealed a mosaic of orthologous relationships of CoA biosynthetic genes in bacteria, archaea, and eukaryotes. The set of CoA pathway enzymes from E. coli is widely conserved among bacteria, suggesting that it represents the ancestral CoA pathway in bacteria. Similarly, the human CoA enzymes are well conserved within the eukaryotic domain. The ability to correctly project the CoA pathway onto distantly related species was recently demonstrated by the reconstitution of the Arabidopsis thaliana pathway leading from pantothenate to CoA (Kupke, Hernández-Acosta, and Culiáñez-Macià 2003). However, based on homology, only the four ultimate CoA enzymes can be identified in archaea. In addition, KPHMT and KPR homologues are found in non-methanogenic archaeal species. This suggests that archaea have unrelated enzymes with PS and PANK activities and that methanogens additionally have unrelated enzymes for the synthesis of pantoate. Archaeal PS and PANK were predicted by chromosomal proximity and found to be unrelated to the bacterial or eukaryotic functional analogues, indicating that convergent evolution acted in the bacterial and archaeal lineages. Methanogens contain no homologues to KPHMT and KPR, while the KPHMT and KPR homologues in non-methanogens were likely acquired by horizontal gene transfer from bacterial thermophiles. Therefore, the enzymes required for the synthesis of phosphopantothenate were recruited independently in bacteria and archaea.
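
    The phyletic pattern summarized in this paragraph can be written down as a small presence/absence table. The sketch below is a simplified illustration that encodes only what the paragraph states about homologues detectable from the known enzymes; the three group labels and the function name are illustrative choices, and the table is not the full phylogenetic profile of the study.

```python
# Pathway steps in the order of the bacterial route to CoA.
PATHWAY = ["KPHMT", "KPR", "PS", "PANK", "PPCS", "PPCDC", "PPAT", "DPCK"]

# Enzymes with a homologue detectable in each group, as described in the text.
PROFILE = {
    "bacteria (E. coli set)": set(PATHWAY),
    "non-methanogenic archaea": {"KPHMT", "KPR", "PPCS", "PPCDC", "PPAT", "DPCK"},
    "methanogenic archaea": {"PPCS", "PPCDC", "PPAT", "DPCK"},
}


def missing_steps(profile, pathway=PATHWAY):
    """For each group, list the pathway steps without a detectable homologue."""
    return {
        group: [enzyme for enzyme in pathway if enzyme not in present]
        for group, present in profile.items()
    }


gaps = missing_steps(PROFILE)
for group, steps in gaps.items():
    print(group, steps)
```

    The gaps reported for the two archaeal groups are the steps for which unrelated enzymes have to be postulated: PS and PANK in all archaea, and additionally KPHMT and KPR in methanogens.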

    Eukaryotes inherited their genes for pantothenate synthesis from bacteria, whereas eukaryotic genes for CoA biosynthesis were partly derived from bacteria and partly from archaea. Eukaryotic PPCS and PPCDC enzymes are related to the archaeal PPCS and bacterial PPCDC domains, respectively. Given that the bifunctional PPCS/PPCDC enzyme is nearly universally conserved among prokaryotes, this finding is best explained by assuming that the eukaryotic ancestor initially possessed both a bacterial and an archaeal copy of the bifunctional PPCS/PPCDC. Subsequently, one domain was lost from each bifunctional copy to generate the monofunctional PPCS and PPCDC enzymes found in eukaryotes today. This explanation is based on the view that eukaryotes arose through some form of fusion of an archaeon with a bacterium, a view that figures in several models aiming to explain the transition from prokaryotes to eukaryotes (Martin et al. 2001; Brown 2003). Interestingly, phylogenetic analysis relates eukaryotic PPCS specifically to the PPCS domain of methanogens (fig. 4a). This is best accounted for by the ‘hydrogen hypothesis’ (Martin and Müller 1998), which derives a hydrogen-dependent methanogenic archaeon as the most likely host that engulfed a bacterial symbiont to give rise to the first eukaryote.

    There are many CoA-dependent enzymes with universal phyletic distribution, for example, citrate synthase (EC 2.3.3.1), the alpha- and beta-subunits of succinyl-CoA synthetase (EC 6.2.1.5), and acetyl-CoA synthetase (EC 6.2.1.1), that would be expected to have been present in the universal ancestor. Hence, CoA must have been available already in the RNA world or at a very early stage of the universal ancestor. Experimental evidence implicated pantothenate and pantetheine, but not dephospho-CoA or CoA, as potential prebiotic compounds (Miller and Schlesinger 1993; Keefe, Newton, and Miller 1995). In principle, CoA might initially have been synthesized from prebiotic pantetheine, a possibility supported by the ancient origin of PPAT and other NTases (Aravind, Anantharaman, and Koonin 2002). Alternatively, the most ancient step of the CoA pathway could have been the synthesis of phosphopantetheine from prebiotic pantothenate. It may be possible to discriminate between these alternatives if comparative genome analysis can reveal whether phosphopantetheine-dependent or CoA-dependent enzymes are the more ancient.

    Acknowledgements

    Work in this laboratory on pantothenate and CoA biosynthesis is funded by the Deutsche Forschungsgemeinschaft (GE 1204/2–1).

    Literature Cited

    Abiko, Y. 1967. Investigations on pantothenic acid and its related compounds. IX. Biochemical studies. 4. Separation and substrate specificity of pantothenate kinase and phosphopantothenoylcysteine synthetase. J. Biochem. (Tokyo) 61:290-299.

    Abiko, Y. 1975. Metabolism of coenzyme A. Pp. 1–25 in D. M. Greenberg, ed. Metabolism of sulphur compounds. Academic Press, New York.

    Adachi, J., and M. Hasegawa. 1996. Computer science monographs, No. 28. MOLPHY version 2.3: programs for molecular phylogenetics based on maximum likelihood. Institute of Statistical Mathematics, Tokyo.

    Adam, R. D. 2001. Biology of Giardia lamblia. Clin. Microbiol. Rev. 14:447-475.

    Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.

    Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped Blast and PSI-Blast: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.

    Aravind, L., V. Anantharaman, and E. V. Koonin. 2002. Monophyly of class I aminoacyl tRNA synthetase, USPA, ETFP, photolyase, and PP-ATPase nucleotide-binding domains: implications for protein evolution in the RNA world. Proteins 48:1-14.

    Aravind, L., R. L. Tatusov, Y. I. Wolf, D. R. Walker, and E. V. Koonin. 1998. Evidence for massive gene exchange between archaeal and bacterial hyperthermophiles. Trends Genet. 14:442-444.

    Bateman, A., E. Birney, L. Cerruti, R. Durbin, L. Etwiller, S. R. Eddy, S. Griffiths-Jones, K. L. Howe, M. Marshall, and E. L. Sonnhammer. 2002. The Pfam protein families database. Nucleic Acids Res. 30:276-280.

    Begley, T. P., C. Kinsland, and E. Strauss. 2001. The biosynthesis of coenzyme A in bacteria. Vitam. Horm. 61:157-171.

    Bork, P., L. Holm, E. V. Koonin, and C. Sander. 1995. The cytidylyltransferase superfamily: identification of the nucleotide-binding site and fold prediction. Proteins 22:259-266.

    Brown, J. R. 2003. Ancient horizontal gene transfer. Nat. Rev. Genet. 4:121-132.

    Calder, R. B., R. S. B. Williams, G. Ramaswamy, C. O. Rock, E. Campbell, S. E. Unkles, J. R. Kinghorn, and S. Jackowski. 1999. Cloning and characterization of a eukaryotic pantothenate kinase gene (panK) from Aspergillus nidulans. J. Biol. Chem. 274:2014-2020.

    Chenna, R., H. Sugawara, T. Koike, R. Lopez, T. J. Gibson, D. G. Higgins, and J. D. Thompson. 2003. Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res. 31:3497-3500.

    Daugherty, M., B. Polanuyer, M. Farrell, M. Scholle, A. Lykidis, V. de Crécy-Lagard, and A. Osterman. 2002. Complete reconstitution of the human coenzyme A biosynthetic pathway via comparative genomics. J. Biol. Chem. 277:21431-21439.

    Geerlof, A., A. Lewendon, and W. V. Shaw. 1999. Purification and characterization of phosphopantetheine adenylyltransferase from Escherichia coli. J. Biol. Chem. 274:27105-27111.

    Genschel, U., C. A. Powell, C. Abell, and A. G. Smith. 1999. The final step of pantothenate biosynthesis in higher plants: cloning and characterization of pantothenate synthetase from Lotus japonicus and Oryza sativum (rice). Biochem. J. 341:669-678.

    Gerdes, S. Y., M. D. Scholle, and M. D'Souza, et al. (16 co-authors). 2002. From genetic footprinting to antimicrobial drug targets: examples in cofactor biosynthesis. J. Bacteriol. 184:4555-4572.

    Giaever, G., A. M. Chu, and L. Ni, et al. (73 co-authors). 2002. Functional profiling of the Saccharomyces cerevisiae genome. Nature 418:387-391.

    Jackowski, S. 1996. Biosynthesis of pantothenate and coenzyme A. Pp. 687–694 in F. C. Neidhardt, ed. Escherichia coli and Salmonella typhimurium: cellular and molecular biology. American Society for Microbiology, Washington, DC.

    Keefe, A. D., G. L. Newton, and S. L. Miller. 1995. A possible prebiotic synthesis of pantetheine, a precursor to coenzyme A. Nature 373:683-685.

    Kleinkauf, H. 2000. The role of 4'-phosphopantetheine in the biosynthesis of fatty acids, polyketides and peptides. Biofactors 11:91-92.

    Kupke, T. 2001. Molecular characterization of the 4'-phosphopantothenoylcysteine decarboxylase domain of bacterial Dfp flavoproteins. J. Biol. Chem. 276:27597-27604.

    Kupke, T. 2002. Molecular characterization of the 4'-phosphopantothenoylcysteine synthetase domain of bacterial Dfp flavoproteins. J. Biol. Chem. 277:36137-36145.

    Kupke, T., P. Hernández-Acosta, and F. A. Culiáñez-Macià. 2003. 4'-phosphopantetheine and coenzyme A biosynthesis in plants. J. Biol. Chem. 278:38229-38237.

    Kupke, T., P. Hernández-Acosta, S. Steinbacher, and F. A. Culiáñez-Macià. 2001. Arabidopsis thaliana flavoprotein AtHAL3a catalyzes the decarboxylation of 4'-phosphopantothenoylcysteine to 4'-phosphopantetheine, a key step in coenzyme A biosynthesis. J. Biol. Chem. 276:19190-19196.

    Kupke, T., M. Uebele, D. Schmid, G. Jung, M. Blaesse, and S. Steinbacher. 2000. Molecular characterization of lantibiotic-synthesizing enzyme EpiD reveals a function for bacterial Dfp proteins in coenzyme A biosynthesis. J. Biol. Chem. 275:31838-31846.

    Manoj, N., E. Strauss, T. P. Begley, and S. E. Ealick. 2003. Structure of human phosphopantothenoylcysteine synthetase at 2.3 Å resolution. Structure 11:927-936.

    Martin, W., M. Hoffmeister, C. Rotte, and K. Henze. 2001. An overview of endosymbiotic models for the origins of eukaryotes, their ATP-producing organelles (mitochondria and hydrogenosomes), and their heterotrophic lifestyle. Biol. Chem. 382:1521-1539.

    Martin, W., and M. Müller. 1998. The hydrogen hypothesis for the first eukaryote. Nature 392:37-41.

    Matak-Vinkovic, D., M. Vinkovic, S. A. Saldanha, J. L. Ashurst, F. von Delft, T. Inoue, R. N. Miguel, A. G. Smith, T. L. Blundell, and C. Abell. 2001. Crystal structure of Escherichia coli ketopantoate reductase at 1.7 Å resolution and insight into the enzyme mechanism. Biochemistry 40:14493-14500.

    Miller, S. L., and G. Schlesinger. 1993. Prebiotic syntheses of vitamin coenzymes: II. Pantoic acid, pantothenic acid, and the composition of coenzyme A. J. Mol. Evol. 36:308-314.

    Mishra, P. K., P. K. Park, and D. G. Drueckhammer. 2001. Identification of yacE (coaE) as the structural gene for dephosphocoenzyme A kinase in Escherichia coli K-12. J. Bacteriol. 183:2774-2778.

    Nelson, K. E., R. A. Clayton, and S. R. Gill, et al. (29 co-authors). 1999. Evidence for lateral gene transfer between archaea and bacteria from genome sequence of Thermotoga maritima. Nature 399:323-329.

    Neuburger, M., D. A. Day, and R. Douce. 1984. Transport of coenzyme A in plant mitochondria. Arch. Biochem. Biophys. 229:253-258.

    Notredame, C., D. G. Higgins, and J. Heringa. 2000. T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302:205-217.

    Overbeek, R., M. Fonstein, M. D'Souza, G. D. Pusch, and N. Maltsev. 1999. The use of gene clusters to infer functional coupling. Proc. Natl. Acad. Sci. USA 96:2896-2901.

    Primerano, D. A., and R. O. Burns. 1983. Role of acetohydroxy acid isomeroreductase in biosynthesis of pantothenic acid in Salmonella typhimurium. J. Bacteriol. 153:259-269.

    Rock, C. O., R. B. Calder, M. A. Karim, and S. Jackowski. 2000. Pantothenate kinase regulation of the intracellular concentration of coenzyme A. J. Biol. Chem. 275:1377-1383.

    Schuster, F. L. 2002. Cultivation of Plasmodium spp. Clin. Microbiol. Rev. 15:355-364.

    Smith, C. M., and W. O. Song. 1996. Comparative nutrition of pantothenic acid. J. Nutr. Biochem. 7:312-321.

    Strauss, E., C. Kinsland, Y. Ge, F. W. McLafferty, and T. P. Begley. 2001. Phosphopantothenoylcysteine synthetase from Escherichia coli. J. Biol. Chem. 276:13513-13516.

    Tahiliani, A. G. 1991. Evidence for net uptake and efflux of mitochondrial coenzyme A. Biochim. Biophys. Acta 1067:29-37.

    Tatusov, R. L., N. D. Fedorova, and J. D. Jackson, et al. (17 co-authors). 2003. The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4:41.

    Tatusov, R. L., D. A. Natale, I. V. Garkavtsev, T. A. Tatusova, U. T. Shankavaram, B. S. Rao, B. Kiryutin, M. Y. Galperin, N. D. Fedorova, and E. V. Koonin. 2001. The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 29:22-28.

    Visvesvara, G. S. 2002. In vitro cultivation of microsporidia of clinical importance. Clin. Microbiol. Rev. 15:401-413.

    von Delft, F., T. Inoue, and S. A. Saldanha, et al. (11 co-authors). 2003. Structure of E. coli ketopantoate hydroxymethyl transferase complexed with ketopantoate and Mg2+, solved by locating 160 selenomethionine sites. Structure 11:985-996.

    von Delft, F., A. Lewendon, V. Dhanaraj, T. L. Blundell, C. Abell, and A. G. Smith. 2001. The crystal structure of E. coli pantothenate synthetase confirms it as a member of the cytidylyltransferase superfamily. Structure 9:439-450.

    White, W. H., P. L. Gunyuzlu, and J. H. Toyn. 2001. Saccharomyces cerevisiae is capable of de novo pantothenic acid biosynthesis involving a novel pathway of beta-alanine production from spermine. J. Biol. Chem. 276:10794-10800.

    White, W. H., P. L. Skatrud, Z. Xue, and J. H. Toyn. 2003. Specialization of function among aldehyde dehydrogenases: the ALD2 and ALD3 genes are required for beta-alanine biosynthesis in Saccharomyces cerevisiae. Genetics 163:69-77.

    Yanai, I., J. C. Mellor, and C. DeLisi. 2002. Identifying functional links between genes using conserved chromosomal proximity. Trends Genet. 18:176-179.

    (Ulrich Genschel)