The majority of Escherichia coli mRNAs undergo post-transcriptional mo

http://www.100md.com Nucleic Acids Research (medical journal)

     Department of Genetics, University of Georgia, Athens, GA 30602, USA

    *To whom correspondence should be addressed. Tel: +1 706 542 8000; Fax: +1 706 542 3910; Email: skushner@uga.edu

    ABSTRACT

    Polyadenylation of RNAs by poly(A) polymerase I (PAP I) in Escherichia coli plays a significant role in mRNA decay and general RNA quality control. However, many important features of this system, including the prevalence of polyadenylated mRNAs in the bacterium, are still poorly understood. By comparing the transcriptomes of wild-type and pcnB deletion strains using macroarray analysis, we demonstrate that >90% of E.coli open reading frames (ORFs) transcribed during exponential growth undergo some degree of polyadenylation by PAP I, either as full-length transcripts or decay intermediates. Detailed analysis of over 240 transcripts suggests that Rho-independent transcription terminators serve as polyadenylation signals. Conversely, mRNAs terminated in a Rho-dependent fashion are probably not substrates for PAP I, but can be modified by the addition of long polynucleotide tails through the biosynthetic activity of polynucleotide phosphorylase (PNPase). Furthermore, real-time PCR analysis indicates that the extent of polyadenylation of individual full-length transcripts such as lpp and ompA varies significantly in wild-type cells. The data presented here demonstrate that polyadenylation in E.coli occurs much more frequently than previously envisioned.

    INTRODUCTION

    Since the identification of the structural gene for Escherichia coli poly(A) polymerase I (PAP I) in 1992 (1), a variety of experiments have shown that polyadenylation plays an integral role in E.coli RNA metabolism (2–7). Specifically, the deletion of the structural gene for PAP I (pcnB) leads to a 90% reduction in poly(A) levels along with increased mRNA half-lives (2,5). Conversely, overproduction of PAP I significantly reduces mRNA stability and leads to inviability (5). Furthermore, polyadenylation in E.coli has been implicated in the general RNA quality control of transcripts, helping to remove defective RNAs and stable breakdown products (6). However, unlike in eukaryotes, the importance of polyadenylation in E.coli RNA metabolism has often been downplayed because it is believed that only a limited number of mRNAs are post-transcriptionally modified. For example, although it is estimated that between 3200 and 3300 of the 4290 genes in E.coli are expressed in exponentially growing cells (8,9), only a few mRNAs have been directly shown to be polyadenylated (3,5,10–14).

    A further complication in understanding prokaryotic polyadenylation has been the observation that while polynucleotide phosphorylase (PNPase), a 3'→5' exonuclease, plays an important role in mRNA decay (15), it also functions biosynthetically in vivo to add heteropolymeric tails to the 3' ends of RNA transcripts (11). In fact, PNPase seems to function as the primary mechanism for the post-transcriptional modification of mRNAs in a variety of prokaryotes (16,17). Interestingly, in vivo PAP I-synthesized tails contain exclusively A residues and have been found either after Rho-independent transcription terminators or attached to mRNA decay products (5,12,18). In contrast, PNPase-synthesized tails are primarily heteropolymeric (they contain all 4 nt but are 50% A) and are usually distributed throughout the coding sequences (5,12,18).

    In order to obtain a better overview of the extent and significance of post-transcriptional modification of E.coli mRNAs, we have carried out a genome-wide analysis to identify polyadenylation targets. We show here that >90% of the ORFs transcribed in exponentially growing cells undergo some degree of polyadenylation. Specific array results were confirmed by a combination of northern blot analysis, kinetic RT–PCR, real-time PCR, and cDNA cloning and sequencing. The data strongly suggest that Rho-independent transcription terminators serve as polyadenylation signals not only for the ORF immediately upstream, but in the case of polycistronic transcripts for all ORFs within the transcription unit. In contrast, operons that are terminated in a Rho-dependent fashion appear to be preferentially modified by PNPase.

    MATERIALS AND METHODS

    Bacterial strains and plasmids

    The E.coli strains used in this study were all derived from MG1693 (thyA715 rph-1) (19). SK7988 (pcnB::kanr thyA715 rph-1) (2), SK10019 (pnp683::strr/spcr thyA715 rph-1) (9) and SK9124 (thyA715 rph-1/pBMK11) (5) have been described previously. Plasmid pBMK11 (Cmr pcnB+) encodes the pcnB gene under the control of the lac promoter (5), while pWSK29 is a low copy number cloning vector (20).

    Growth of bacterial strains and isolation of total RNA

    Bacterial strains were routinely grown in Luria broth supplemented with thymine (50 μg/ml) at 37°C with shaking. When appropriate, chloramphenicol (20 μg/ml) was added to the medium. Expression of the pcnB gene in SK9124 was induced with IPTG (350 μM) as described before (5). The optical density of the cultures was measured using a Klett-Summerson colorimeter with a green filter (No. 42). Total RNA was isolated from cells grown to 50 Klett units above background (1 × 10^8 cells/ml) as described before (2). All RNA preparations were further treated with DNase I using the DNA-free™ kit (Ambion, Austin, TX, USA) to remove any possible DNA contamination.

    cDNA labeling

    33P-labeled strippable cDNAs were prepared using the Endo-Free™ RT kit (Ambion) as described earlier (9). Oligo(dT)20 and gene-specific primers (GSPs) (Sigma-Genosys, The Woodlands, TX, USA) were used to generate cDNAs to identify and estimate polyadenylated and steady-state mRNA levels, respectively. Initial reverse transcriptions of total RNA from the wild-type (MG1693) and the pcnB (SK7988) strains using oligo(dT)20 primers generated almost identical amounts of cDNAs, as estimated by liquid scintillation counting. This was surprising since a pcnB deletion strain has been shown to have only 10% of the wild-type poly(A) level (2,5). In addition, hybridization of these cDNAs to Panorama E.coli macroarrays yielded almost identical hybridization patterns (data not shown), indicating a significant level of non-specific cDNA synthesis.

    Accordingly, the total RNA samples (20 μg) were first passed through Dynabeads (Dynabeads® mRNA direct™ kit, Dynal®), not only to enrich for polyadenylated RNAs but also to eliminate RNAs that might serve as non-specific primers, prior to reverse transcription with oligo(dT)20 primers. Based on liquid scintillation counting, the amount of labeled cDNA increased 3-fold in the wild-type strain (MG1693) and 30-fold in the PAP I-induced strain (SK9124) (5) compared to the pcnB mutant (SK7988). These cDNAs were further analyzed on a 6% denaturing polyacrylamide gel that showed no difference in the cDNA profiles obtained with RNA from SK7988 in the presence or absence of the oligo(dT)20 primer (Supplementary Figure S1, lanes 1 and 2). In contrast, the amounts of cDNAs of various lengths were increased significantly in both MG1693 (Supplementary Figure S1, lane 3) and SK9124 (Supplementary Figure S1, lane 4).

    DNA array experiments

    Panorama E.coli macroarrays (Sigma-Genosys) were used to compare the relative polyadenylated and steady-state mRNA levels in various genetic backgrounds. Membranes were hybridized to the respective cDNAs, washed, and exposed on the same PhosphorImager screen as described previously (9). The membranes hybridized with poly(A)-enriched cDNAs were exposed for 120 h and the membranes hybridized with cDNAs obtained using GSPs were exposed for 24 h. The exposed screens were scanned at a pixel density of 88 μm using a Molecular Dynamics Storm 840 series PhosphorImager. Each membrane was used up to six times following stripping of the previously hybridized probe using the Strip-EZ™ RT kit (Ambion) as per the manufacturer's instructions.

    Background subtraction

    In typical macroarray experiments comparing the steady-state mRNA levels, the pixel counts of a particular gene not expressed under the experimental conditions employed are considered as background and subtracted from the pixel counts of the rest of the genes to obtain net expression. However, since E.coli contains an A/T rich genome, we were concerned about potential non-specific priming employing short encoded poly(A) tracts and long A/G regions, which are relatively numerous in the organism. To assess the significance of these technical problems, we first carried out a search for internal poly(A) tracts of 9 nt or longer within the E.coli genome using GCG's FIND PATTERNS program. This number was chosen because internal poly(A) tracts of <9 nt would not serve efficiently as templates under the experimental conditions employed for the reverse transcription reactions. Our search identified fewer than 100 such sequences (data not shown). In order to deal with the issue of non-specific priming at long A/G tracts, we attempted to identify a transcript to use for background subtraction that was actively transcribed, was not post-transcriptionally modified (either by PAP I or PNPase) in the wild-type strain, and contained an extensive A/G tract.
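
The tract search described above can be sketched in a few lines. This is only an illustration using regular expressions, not the GCG FIND PATTERNS program used in the study; the function names and the example sequence are hypothetical.

```python
import re

def find_tracts(sequence, bases="A", min_len=9):
    """Return (start, length) for every run of `bases` at least `min_len` nt long."""
    pattern = re.compile("[%s]{%d,}" % (bases, min_len))
    return [(m.start(), len(m.group())) for m in pattern.finditer(sequence.upper())]

def longest_tract(sequence, bases="AG"):
    """Length of the longest uninterrupted run made up only of `bases`."""
    runs = re.findall("[%s]+" % bases, sequence.upper())
    return max((len(run) for run in runs), default=0)

# Hypothetical sequence (not taken from the E.coli genome):
# a 10 nt poly(A) tract followed by an 11 nt A/G tract.
example = "GCT" + "A" * 10 + "CGTT" + "AGGAGAAGGAG" + "CT"

print(find_tracts(example))            # internal poly(A) tracts of 9 nt or longer
print(longest_tract(example))          # longest A/G tract in the sequence
```

Calling `find_tracts(sequence, bases="AG", min_len=n)` would list A/G tracts of a chosen minimum length in the same way.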

    We chose the rplY mRNA, which contains a long A/G tract immediately upstream of the AUG start codon that could serve as a template for oligo(dT) primed reverse transcription. In addition, the stability of this transcript was not affected in the absence of either PNPase or RNase II (9), indicating that it was probably not degraded through a poly(A)-dependent decay pathway. To test these assumptions directly, we used RT–PCR to clone oligo(dT)17-primed rplY cDNAs from either MG1693 or SK9124. Subsequent DNA sequencing demonstrated that while the 3' ends of all of the cDNAs contained 14–18 nt of contiguous A residues, they were located primarily within the A/G region immediately upstream of the AUG start codon and might therefore have been generated through non-specific priming by oligo(dT)17 (data not shown). Based on these observations, we used the average pixel count of the rplY spots that hybridized to oligo(dT)20 derived cDNAs for background subtraction in each array.

    Data processing and analysis

    The analysis of the PhosphorImager files and data processing were carried out using Array Vision, version 5.1 (Imaging Research Inc., St Catharines, Ontario, Canada) and Excel, as described previously (9). In brief, relative mRNA abundance (either polyadenylated or steady-state level) for each ORF in each strain was obtained from two replicate experiments providing a total of four determinations. The average value of the four determinations for each ORF was considered as its expression level. The pixel density of the rplY mRNA was used as the background hybridization level for oligo(dT)-dependent cDNA hybridization. For the GSP-dependent cDNA hybridization, the pixel value for the lacZ mRNA was used for background subtraction (9). A t-test was applied to determine the consistency of the ratios across replicate hybridizations. At least a 2-fold change in the spot intensities with a P-value of ≤0.05 (95% confidence level) was considered significant and reported in this investigation. As a means of examining the experimental variability between the biological replications, we calculated the Pearson correlation coefficient for each of the strains, which varied between 0.92 and 0.96.
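
The averaging, background subtraction and replicate-correlation steps can be illustrated with a minimal sketch. It is not the Array Vision/Excel workflow used in the study: the function names and pixel counts are hypothetical, and the t-test step is not reproduced here.

```python
from math import sqrt
from statistics import mean

def net_signal(spot_counts, background_counts):
    """Mean of the replicate spots minus the mean background, floored at zero."""
    return max(mean(spot_counts) - mean(background_counts), 0.0)

def is_detected(spot_counts, background_counts, min_fold=2.0):
    """True when the mean spot signal is at least `min_fold` times the background."""
    return mean(spot_counts) >= min_fold * mean(background_counts)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical pixel counts: four determinations per ORF
# (two replicate experiments, duplicate spots on each array).
background = [100, 110, 90, 100]   # e.g. the rplY spots on an oligo(dT) array
orf_spots = [420, 380, 400, 400]

print(net_signal(orf_spots, background))
print(is_detected(orf_spots, background))

# Hypothetical net signals for the same ORFs in two replicate experiments.
replicate_1 = [300.0, 120.0, 950.0, 40.0]
replicate_2 = [280.0, 150.0, 900.0, 55.0]
print(round(pearson(replicate_1, replicate_2), 3))
```

The 2-fold detection threshold is exposed as `min_fold` so that other cut-offs can be tried without changing the functions.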

    Identification of Rho-independent transcription terminators

    The presence of a Rho-independent transcription terminator was determined manually by identifying predicted secondary structures within no more than 80 nt downstream from the 3' ends of annotated transcription units following the criteria of Lesnik et al. (22).

    Dot-blot and northern analysis

    Dot-blots were performed to compare the total in vivo poly(A) level as described before (5). The northern blot procedure used to measure the steady-state levels of the transcripts was as described previously (23).

    Primers used

    The AP containing a multiple cloning site (MCS) and the AAP (Abridged Adapter Primer), homologous to the MCS of AP, have been described previously (5). TRACER-3 contains an oligo(dT)20 sequence at its 3' end and an MCS at the 5' end. RACER-2 is complementary to the 5' MCS of TRACER-3. All the primers except the TaqMan® probes were synthesized by MWG-Biotech AG. The TaqMan® minor groove binder (MGB) probes LPP-P-AB (for lpp) and 23S-P-AB (for 23S rRNA), containing 5' 6-FAM and a 3' nonfluorescent quencher with an MGB, were synthesized by Applied Biosystems. The nucleotide sequences of all the primers used in this study are available upon request.

    Quantification of polyadenylated mRNAs by RT–PCR

    The fold-change of polyadenylated rpsO, cspA, cspC and cspE mRNAs was determined using the kinetic RT–PCR method as described previously (5) with the following modifications. The AP primed cDNAs from the wild-type (MG1693) and PAP I-induced (SK9124) strains were amplified using 5' GSPs (RPSO185A for rpsO, CSPA90 for cspA, CSPC-5' for cspC, and CSPE744 for cspE, 5' end-labeled) and AAP (3' primer) for either 6, 12 and 18 cycles for rpsO and cspC or 10, 20 and 30 cycles for cspA and cspE cDNA. Two microlitres of each PCR mixture were separated on a 1.5–2% agarose gel, the gel was dried, and the amplification products were quantified using a PhosphorImager and ImageQuant software. The fold-change in the polyadenylated transcript levels represents the highest ratio of pixel counts for the amplification products from the PAP I-induced strain versus the wild-type control.
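
As an illustration of the last step, the sketch below picks the highest induced/wild-type ratio across the sampled cycle numbers. The function name and the pixel counts are hypothetical; skipping cycles without a wild-type signal is an assumption made here to avoid division by zero.

```python
def kinetic_fold_change(induced_counts, wild_type_counts):
    """Return (cycle, ratio) for the highest induced/wild-type pixel-count ratio.

    Both arguments map PCR cycle number -> pixel count of the amplification
    product. Cycles with no wild-type signal are skipped.
    """
    ratios = {
        cycle: induced_counts[cycle] / wild_type_counts[cycle]
        for cycle in induced_counts
        if wild_type_counts.get(cycle, 0) > 0
    }
    if not ratios:
        raise ValueError("no cycle with a measurable wild-type signal")
    best_cycle = max(ratios, key=ratios.get)
    return best_cycle, ratios[best_cycle]

# Hypothetical pixel counts after 6, 12 and 18 cycles.
wild_type = {6: 200, 12: 1600, 18: 9000}
induced = {6: 1000, 12: 12800, 18: 27000}

print(kinetic_fold_change(induced, wild_type))
```

In this made-up example the ratio falls at the highest cycle number, as would be expected once amplification of the more abundant template approaches a plateau.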

    Real-time PCR

    Real-time PCR was used to quantitate the extent of full-length mRNA polyadenylation for the lpp, ompA, yjgG and ymgC mRNAs in an AB7500 real-time PCR System (Applied Biosystems) using the comparative CT (ΔΔCT) method as described by the manufacturer. All cDNAs for real-time PCR were generated using either the Endo-Free™ RT kit (Ambion) or Superscript™ III reverse transcriptase at 48°C. Only 95–100 nt of the 3' end of each cDNA were amplified to permit effective PCR using either the SYBR Green® I reagent (ompA, yjgG, ymgC) or custom TaqMan® gene expression assays as specified by the manufacturer. The ompA, yjgG and ymgC mRNAs were reverse transcribed using either an oligo(dT)20 primer (representing polyadenylated transcripts) or a 3' GSP (OMPA2074 for ompA, YJGG-455R for yjgG and YMGC-355R for ymgC, representing total mRNA). Both cDNAs (2–4 μl) were amplified in separate PCRs employing SYBR Green® I PCR Mastermix (Applied Biosystems) in the presence of common 5' and 3' GSPs (OMPA1067 and OMPA2074 for ompA, YJGG-322F and YJGG-455R for yjgG, and YMGC-233F and YMGC-355R for ymgC).

    For lpp, cDNAs were derived from total RNA (2 μg) using either the TRACER-3 primer (representing polyadenylated mRNAs) or a 3' GSP (LPP538-RACER-2 containing the 3' 20 nt of lpp and the 5' MCS identical to TRACER-3, representing total lpp mRNA). Both cDNAs were amplified in separate PCRs using TaqMan® Universal PCR Mastermix (Applied Biosystems) in the presence of a common 5' GSP (LPP439), TaqMan® probe (LPP-P-AB) and 3' primer (RACER-2, complementary to the 5' MCS of TRACER-3 and LPP-538-RACER-2). 23S rRNA was used as the endogenous control for both transcripts (details available on request). All reactions were run in duplicate. PCR product sizes were validated by separating 3 μl of each PCR in 8% non-denaturing polyacrylamide gels. The level of polyadenylated transcripts in each genetic background was expressed as the percentage of total transcripts of the corresponding mRNA.
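
One way the final percentage could be derived from threshold cycles is sketched below. It assumes the standard comparative CT arithmetic (a perfect doubling of product per cycle, with each target CT normalized to its 23S rRNA control); the exact calculation used in the study is not given in the text, and the function names and CT values are hypothetical.

```python
def delta_ct(ct_target, ct_reference):
    """CT of the target normalized to the endogenous control (e.g. 23S rRNA)."""
    return ct_target - ct_reference

def percent_polyadenylated(ct_polya, ct_ref_polya, ct_total, ct_ref_total):
    """Polyadenylated transcripts as a percentage of total transcripts.

    ct_polya / ct_ref_polya: target and control CTs in the reaction that
    represents polyadenylated mRNA (oligo(dT)-primed cDNA).
    ct_total / ct_ref_total: target and control CTs in the reaction that
    represents total mRNA (GSP-primed cDNA).
    Assumes 100% amplification efficiency (2-fold increase per cycle).
    """
    ddct = delta_ct(ct_polya, ct_ref_polya) - delta_ct(ct_total, ct_ref_total)
    return 100.0 * 2.0 ** (-ddct)

# Hypothetical CT values: the polyadenylated fraction crosses the threshold
# four cycles later than total mRNA after normalization.
print(percent_polyadenylated(ct_polya=28.0, ct_ref_polya=10.0,
                             ct_total=24.0, ct_ref_total=10.0))
```

A difference of four cycles corresponds to a 16-fold lower template amount, so the sketch reports 6.25% for these made-up values.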

    Cloning and sequencing of cDNAs

    AP primed cDNAs derived from total RNA were amplified using a 5' GSP and 3' AAP and cloned into pWSK29 (20) for DNA sequencing as described before (11). rplY cDNAs were amplified using primers RPLY-CLA (56 nt upstream of the ATG) and 3' AAP. The PCR products were cloned either as ClaI–XbaI (ClaI part of RPLY-CLA primer) or EcoRI–XbaI (EcoRI part of rplY coding sequence) fragments for DNA sequencing. Four different GSPs (PCNB-ECOR, PCNB92, PCNB979S and PCNB1379S) were used for pcnB mRNA and one GSP (OMPA-PST) was used for ompA mRNA (Figure 3). pcnB cDNAs were cloned either as EcoRI–XbaI or KpnI–XbaI fragments. The ompA cDNAs were cloned either as PstI–XbaI or BamHI–XbaI fragments. The EcoRI, PstI and XbaI sites were introduced into the primers (PCNB-ECOR, OMPA-PST and AAP, respectively), while the KpnI and BamHI sites were part of the pcnB and ompA coding sequences, respectively (Figure 3). All DNA sequencing was carried out using an automated sequencer (Applied Biosystems 3730xl DNA analyzer).

    RESULTS

    Genome-wide identification of polyadenylated transcripts

    Previous experiments have indicated that PAP I is responsible for 90% of the total polyadenylation in E.coli (2,5), with the rest arising from the biosynthetic activity of PNPase (11). While PAP I only synthesizes homopolymeric poly(A) tails in vivo, PNPase-generated tails are primarily adenosine-rich heteropolymers (11,12). Since both kinds of tails have been isolated using oligo(dT)-driven RT–PCR (5,11,12), we hypothesized that a genome-wide analysis of cDNAs generated using oligo(dT) primers would facilitate the identification of all mRNA targets that were post-transcriptionally modified by PAP I and/or PNPase. We deemed this comparison would be valid since there is only a minimal growth difference between wild-type and PAP I-deficient strains (a doubling time of 30 min in the wild-type control compared to 32 min for the PAP I mutant, data not shown).

    It should also be noted that many ORFs in E.coli are synthesized as parts of polycistronic transcripts. Thus it was possible that some ORFs detected on the arrays would arise from a single polycistronic transcript that had been polyadenylated. For the analysis presented here, we have assumed that such polycistronic cDNAs constitute a very small fraction of the labeled material for the following reasons. In the first place, it is well known that polycistronic mRNAs are rapidly processed into smaller units, such that under steady-state conditions the primary polycistronic transcript makes up only a very small fraction of the total mRNA for a particular operon. In addition, since the arrays contain each full-length ORF, they will detect both full-length and decay intermediates that have been polyadenylated.

    Accordingly, E.coli macroarrays (Panorama, Sigma-Genosys) were used to assay all 4290 ORFs in a wild-type strain (MG1693). In order to minimize non-specific hybridization arising from priming in templates with long A/G rich or short internal poly(A) sequences, the average pixel count of the rplY (encoding the ribosomal protein L25) spots was used for background subtraction (see Materials and Methods). Previous macroarray studies using E.coli GSP-dependent cDNAs have shown that 74–77% (3175–3303) of the ORFs are expressed in exponentially growing wild-type strains (8,9). In our experiments, using oligo(dT)20 derived cDNAs, 68 ± 2% (2917 ± 86) of all ORFs (4290) in the wild-type strain exhibited hybridization levels at least 2-fold above background, indicating that the corresponding transcripts were post-transcriptionally modified by PAP I and/or PNPase. Thus, transcripts from >90% of the ORFs expressed during exponential growth were post-transcriptionally modified.

    All transcribed ORFs are polyadenylated when PAP I is overexpressed

    Previously we have shown that overproduction of PAP I from a controlled expression plasmid for 15 min led to a 90 ± 10-fold increase in the total in vivo poly(A) level with no change in the cell's growth rate (5). Accordingly, as a positive control, we compared the oligo(dT)20 generated transcriptomes from a PAP I-induced strain (SK9124) with a wild-type strain (MG1693). As expected, the majority of the ORFs that were detected in the wild-type strain showed increased hybridization (2- to 50-fold) after PAP I induction (Figure 1, Supplementary Figure S2). In fact, the total number of ORFs that were detected increased to 77 ± 2%, which was identical to the number of ORFs detected with GSP-dependent cDNAs in a wild-type strain (8,9). Overall the hybridization intensities for the SK9124 array increased 21 ± 0.1-fold over the wild-type (MG1693) array. The hybridization levels of 90% of the polyadenylated wild-type ORFs increased between 2- and 10-fold, while 5% of the ORFs showed an increase of 10- to 50-fold in poly(A) levels in SK9124 compared to the wild-type control. The poly(A) levels for the rest of the ORFs remained either unchanged or increased <2-fold above the wild-type level. Interestingly, five genes (tolA, yceO, yidJ, yicF and rfaS) showed a significant decrease in the level of polyadenylated transcripts following PAP I induction.

    Figure 1 Scatter plot of the signal intensities of all the ORFs showing increased polyadenylation levels following transient PAP I induction in SK9124 compared to the wild-type strain (MG1693). Each point represents average pixel counts from two replicate experiments (four spots).

    PAP I is responsible for the post-transcriptional modification of the majority of E.coli mRNAs

    In order to determine if PNPase played a significant role in the post-transcriptional modification of E.coli mRNAs, we hybridized an array with cDNAs obtained from a pcnB strain (SK7988), since PNPase is solely responsible for the post-transcriptional addition of polynucleotide tails in the absence of PAP I (11). In this case, only 48 ORFs (1.1%) were detected in the pcnB deletion strain (Supplementary Table S1). The pixel counts of these ORFs were significantly above background, but were much lower (1.5- to 6.3-fold) compared to the wild-type strain. Since it was possible that these species were detected because of cDNA synthesis resulting from inefficient priming utilizing A/G rich regions or short encoded poly(A) tracts within their coding sequences, we examined these ORFs for the presence of such sequences. The longest A/G tracts that were observed varied between 5 and 13 nt. In contrast, the rplY mRNA that was used for background subtraction contained an A/G tract of 21/26 nt (see Materials and Methods). In addition, no significant internal poly(A) tracts were present in any of the 48 ORFs.

    Although this observation indicated that the post-transcriptional modification of these ORFs arose through the activity of PNPase (11), the only common feature for the majority of these transcripts (45/48) was that they were terminated in a Rho-dependent fashion either as a monocistronic transcript or as part of a polycistronic transcript. In addition, five were IS inserts (four IS5 and one IS1) and five encoded ribosomal protein genes (rplN, rplE, rplF, rpsE and rpmB). Eight of the 48 transcripts were monocistronic. At least one of the mRNAs (ompA) has also been shown to be a substrate for PAP I (2,5).

    Validation of array data

    Even with the most careful experimental design and the application of rigorous statistical analysis, some of the data in a global gene expression analysis may be in error because of possible technical as well as biological problems (9,24,25). Accordingly, we sought to validate our results by a direct comparison of the macroarray results with the published data on the polyadenylation of individual E.coli mRNAs. For example, it has been shown by cDNA sequencing that the 3' ends of the lpp, rpsO, trxA, rpsT and rmf mRNAs are modified by either PAP I or PNPase (5,10–12,14,18). In fact, all of these ORFs except rmf showed above background levels of hybridization in the wild-type array (MG1693). Since the rmf mRNA has been shown to only contain short (<6 nt) poly(A) tails (10), the stringent enrichment for polyadenylated transcripts (see Materials and Methods) may have led to the exclusion of this ORF. Furthermore, the acnA transcript, which was shown by direct sequencing not to contain poly(A) tails (10), was also not detected in the array probed with oligo(dT)20 generated cDNAs (data not shown).

    Next we compared the polyadenylation levels of selected transcripts in the wild-type and PAP I-induced strains using kinetic RT–PCR. The increases in the level of polyadenylated transcripts for lpp (11-fold) and rpsT (5-fold) after PAP I induction as indicated by the macroarray data were consistent with previous RT–PCR estimations under identical conditions (5,11,18). Similarly, the increase in the amount of polyadenylated rpsO mRNA (ribosomal protein S15), a previously identified substrate for PAP I (5), was identical in both the macroarray and RT–PCR experiments (Table 1). We also compared the levels of polyadenylated cspA, cspC and cspE transcripts, since out of the nine csp (cold shock protein) genes (cspA–cspI) in E.coli, these are the ones expressed at 37°C (26,27) and are predicted to be polyadenylated (28). The kinetic RT–PCR results were in good agreement with the array data for these genes (Table 1). In addition, the cspB, cspD, cspF, cspG, cspH and cspI mRNAs were, as predicted, not detected on the arrays in either of the strains (data not shown).

    Table 1 Change in polyadenylated transcript levels between wild-type (MG1693) and PAP I induced (SK9124) strains as determined using array and kinetic RT–PCR analysis

    Polyadenylation levels of full-length transcripts

    The array data presented above helped us to identify all the ORFs that were polyadenylated (either as full-length or breakdown products). However, they showed neither the total level of polyadenylation nor the fraction of full-length polyadenylated transcripts for individual ORFs. While we would ideally like to know the extent of polyadenylation for all the ORFs, at the present time it is both technically challenging and cost prohibitive to make this determination. Accordingly, we chose two model transcripts (lpp and ompA) to determine what fraction of their full-length mRNAs were polyadenylated using real-time PCR. These two mRNAs were picked because we have previously shown that they are both targets for PAP I (2,5). In addition, ompA was also one of the 48 ORFs detected in the pcnB deletion strain (Supplementary Table S1). We included the yjgG and ymgC mRNAs in the real-time PCR study as negative controls. These transcripts were not polyadenylated based on their below background level pixel counts in the wild-type array after hybridization to oligo(dT)20-primed cDNAs, but had above background pixel counts after hybridization with GSP-dependent cDNAs (data not shown).

    For direct comparison with the array data, oligo(dT)20-primed cDNAs were used in the real-time PCR experiments, as described in the Materials and Methods section. In agreement with the array data, no polyadenylated yjgG and ymgC mRNAs were detected in the wild-type strain (Table 2), while very low levels were detected following a 15 min induction of PAP I (Table 2). In contrast, ompA showed a higher percentage of polyadenylated transcripts compared to lpp in both the wild-type (MG1693) and pcnB deletion (SK7988) strains (Table 2). Polyadenylation of full-length transcripts for both mRNAs increased significantly after PAP I induction (SK9124, Table 2). It should be noted that quantification of polyadenylated full-length lpp mRNAs, using oligo(dT)17-primed cDNAs generated at a lower annealing temperature (42°C versus 48°C) resulted in a higher percentage of polyadenylated transcripts in the wild-type and PAP I-induced strains (Table 2).

    Table 2 Quantification of polyadenylated full-length transcripts using real-time PCR

    Relationship between steady-state mRNA levels and increases in polyadenylation

    Various studies have suggested that increased polyadenylation leads to more rapid decay of E.coli mRNAs (2,3,5). These results would predict a concomitant reduction in the steady-state levels of the mRNAs that are polyadenylated. To test this hypothesis directly, we hybridized the arrays with cDNAs derived from either MG1693 (wild-type) or SK9124 (PAP I-induced) using GSPs. Surprisingly, the steady-state transcript levels of the majority of ORFs (68 ± 2%) were identical, within experimental error, in the wild-type and PAP I-induced strains. Only 8% of the ORFs showed either a decrease or increase of at least 2-fold in their steady-state levels in the PAP I-induced strain compared to the wild-type control. Within this group of ORFs, the steady-state transcript levels of 97 genes were decreased, while 228 ORFs showed increased levels. Interestingly, some of the transcripts that showed significant changes in their steady-state transcript levels were either not polyadenylated or showed no change in poly(A) level (Supplementary Tables S2A and S2B).
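
The grouping of ORFs by change in steady-state level can be sketched as follows, using a signed fold-change in which decreases are reported as negative values. The function names, ORF names and signal values are hypothetical and serve only to illustrate the 2-fold cut-off. For reference, the counts given above (97 decreased plus 228 increased) total 325 of the 4290 ORFs, or about 8%.

```python
def signed_fold_change(induced, wild_type):
    """+x for an x-fold increase, -x for an x-fold decrease (inputs must be > 0)."""
    if induced <= 0 or wild_type <= 0:
        raise ValueError("net signals must be positive")
    ratio = induced / wild_type
    return ratio if ratio >= 1.0 else -1.0 / ratio

def classify(induced, wild_type, threshold=2.0):
    """Label an ORF as increased, decreased or unchanged at the given cut-off."""
    change = signed_fold_change(induced, wild_type)
    if change >= threshold:
        return "increased"
    if change <= -threshold:
        return "decreased"
    return "unchanged"

# Hypothetical net signals (PAP I-induced, wild-type) for placeholder ORFs.
signals = {"orfA": (640, 200), "orfB": (150, 300), "orfC": (210, 200)}
summary = {name: classify(*pair) for name, pair in signals.items()}
print(summary)

# Share of ORFs changed at least 2-fold, using the counts reported in the text.
print(round(100 * (97 + 228) / 4290, 1))
```

The significance filter applied in the study is not included; only the fold-change threshold is shown.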

    In order to determine if the changes in steady-state levels that were observed using the arrays were valid, we used northern blot analysis to directly examine a number of specific mRNAs. Following 15 min of PAP I induction, mRNAs from three different classes were analyzed: (i) those that had unchanged steady-state levels; (ii) those that had increased steady-state levels; and (iii) those that had decreased steady-state transcript levels. As shown in Table 3, there was excellent agreement between the array data and the northern blot results for all of the mRNAs that were tested.

    Table 3 Comparison of steady-state levels of specific mRNAs in a PAP I-induced strain compared to the wild-type control (MG1693) as determined by DNA macroarray and northern blot analysis

    However, these results seemed to contradict our expectations of reduced steady-state levels based on the shorter half-lives we observed after a 15 min induction (5). In particular, we previously reported that the half-lives of specific transcripts such as lpp, ompA, trxA, and rpsO were decreased 1.5- to 4-fold under conditions in which there was a 90-fold increase in the in vivo poly(A) level (5). As such, we carefully reexamined the half-life data and noticed that changes were not observed until 5–8 min after the addition of rifampicin (20–25 min of increased PAP I activity) (5). Accordingly, we hypothesized that the 15 min period over which PAP I was overproduced in the presence of ongoing transcription was too short to allow new steady-state levels to be established. To investigate this possibility directly, the steady-state mRNA levels of lpp, ompA, trxA and rpsO were determined for up to 45 min after PAP I induction. Under these circumstances, the doubling time of the PAP I-induced strain increased marginally from 30 min before induction to 35 min after induction (data not shown), while the steady-state levels of all the transcripts tested decreased between 2.5- and 5.0-fold (Figure 2). These data suggest that more than 15 min of PAP I overexpression is required for transcripts to reach new steady-state levels.

    Figure 2 Comparison of steady-state levels of representative mRNAs in wild-type (MG1693) and PAP I-induced (SK9124) strains employing northern blot analysis. Cell cultures were grown to 50 Klett units (1 × 10^8 cells/ml) above background (0 min). Subsequently IPTG (350 μM) was added and the cultures were grown for additional times. Total RNA was isolated at the times (min after IPTG addition) indicated at the top of the blot. Five micrograms of total RNA were loaded in each lane and separated in 6% polyacrylamide/7 M urea gels. The transcripts were probed as described in Materials and Methods. The relative quantity of each full-length transcript was determined using a Molecular Dynamics PhosphorImager. The wild-type level of each mRNA at 0 min was set at 1 and the corresponding fold-changes are noted at the bottom of each lane.

    Polyadenylation patterns of transcripts encoding related proteins

    Another question we wanted to address was how polyadenylation affects mRNAs encoding proteins involved in the same cellular process. We chose ribosomal protein mRNAs for this study because they represent a group of ORFs that are highly expressed but have diverse organizational structures. In addition, one of the mRNAs (rpsO) has served as a model substrate for the in vivo analysis of polyadenylation (3,11,29). Seven of the 55 ribosomal protein genes are transcribed as monocistronic mRNAs, while the rest of the genes are embedded in 14 different operons including one containing 11 genes (Table 4). In addition, five of the monocistronic and 11 of the polycistronic mRNAs terminate with Rho-independent transcription terminators. Interestingly, in wild-type E.coli 50/55 of the ribosomal protein ORFs were polyadenylated, irrespective of their mode of termination. Only five mRNAs (rplY, rplT, rpmG, rpsM and rpmJ) had no detectable poly(A) tails within the experimental parameters of the array experiment. In addition, transcripts of genes within the same operon tended to demonstrate comparable changes in their polyadenylation levels following PAP I induction (Table 4). In contrast, there were no significant changes (<2-fold) in steady-state mRNA levels for the majority of the ribosomal protein genes after PAP I induction except for rpsU (–2.0-fold), rplT (+3.2-fold) and rplQ (–2.0-fold) (data not shown).

    Table 4 Polyadenylation patterns of ribosomal protein mRNAs

    Rho-independent transcription terminators appear to be targets for polyadenylation

    Previous experiments have indicated that a Rho-independent transcription terminator might play a significant role in mRNA polyadenylation in wild-type E.coli (12). To determine whether this was a general pattern for E.coli transcripts, we examined the 3' nucleotide sequences of more than 200 ORFs, chosen as those with the highest pixel counts among both polyadenylated and non-polyadenylated transcripts, for the presence or absence of a Rho-independent transcription terminator (Table 5; Supplementary Tables S3 and S4). The majority of the transcripts in each group were part of operons, some of which contained up to 12 ORFs. In agreement with our previous results (12), of the ORFs showing high levels of post-transcriptional modification, 72% were linked to a Rho-independent transcription terminator, either as a monocistronic mRNA or as part of a polycistronic mRNA. In most cases, every gene within a polycistronic transcript that was terminated in a Rho-independent fashion was post-transcriptionally modified. In contrast, all but one of the abundant transcripts analyzed that were not post-transcriptionally modified lacked a Rho-independent transcription terminator (Table 5; Supplementary Table S4).

    Table 5 Analysis of ORFs for the presence or absence of Rho-independent transcription terminators and post-transcriptional modification

    Analysis of post-transcriptional modification of specific transcripts containing or lacking a Rho-independent transcription terminator

    The data presented in Table 5 did not provide insights into the nature of the post-transcriptional modifications associated with each of the transcripts. However, our previous sequencing experiments with transcripts terminating in a Rho-independent fashion (lpp and rpsO) showed that the majority of the poly(A) tails were homopolymeric and occurred after the terminator (5,11,12). In contrast, a transcript terminated in a Rho-dependent fashion (trxA) contained mainly heteropolymeric tails distributed throughout the coding sequence (12).
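
    The distinction between the two tail types can be expressed as a short sketch. The function name and the strict rule used here (a tail is homopolymeric only if every residue is A) are assumptions made for illustration; they are not the classification criteria stated in Materials and Methods, and the example sequences are hypothetical.

```python
def classify_tail(tail):
    """Classify a 3' tail as 'homopolymeric' (A only) or 'heteropolymeric'.

    Accepts RNA or cDNA-style input; T is treated as U. The strict
    'A only' rule is an illustrative assumption.
    """
    seq = tail.upper().replace("T", "U")
    if not seq:
        raise ValueError("tail sequence is empty")
    unexpected = set(seq) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return "homopolymeric" if set(seq) == {"A"} else "heteropolymeric"


# Hypothetical example tails, not sequences from this study.
print(classify_tail("A" * 17))
print(classify_tail("AAGAAUAACAAGA"))
```

    A looser rule, for example one that tolerates an occasional non-A residue, could be substituted without changing the overall approach.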

    Here we have extended this analysis to two additional transcripts derived from the wild-type strain (MG1693) using RT–PCR cloning and DNA sequencing of the 3' ends. In one case, we analyzed the ompA mRNA, which contains a distinct Rho-independent transcription terminator and whose half-life is directly related to the level of polyadenylation (5). The second transcript chosen was the pcnB-folK operon, in which the pcnB coding sequence overlaps the downstream folK gene and the operon is terminated in a Rho-dependent fashion (Figure 3).

    Figure 3 Graphical presentation of the post-transcriptional modification sites in the pcnB-folK (A) and ompA (B) mRNAs. cDNAs from MG1693 (wild-type) were cloned and sequenced as described in Materials and Methods section. The 3' end positions of transcripts with post-transcriptional modifications (homo- or heteropolymeric tails) are shown in brackets. The half-arrows indicate the position of the 5' primers used for PCR amplification. The restriction sites for KpnI (K), PstI (P), and BamHI (B) are shown. The intercistronic overlap (ATGA) between the pcnB translation stop codon and the folK translation start codon is shown. The figure is not drawn to scale.

    As predicted, all 30 independently derived pcnB cDNA clones contained heteropolymeric tails ranging in length from 27 to over 600 nt (seven clones, <50 nt; seven clones, 50–100 nt; seven clones, 101–200 nt; seven clones, 201–300 nt; one clone, 358 nt; and one clone, >600 nt; data not shown). They were distributed generally in the 3' region of the pcnB coding sequence and throughout the folK coding region (Figure 3A). Since the heteropolymeric tails in wild-type E.coli have been shown to be added by PNPase (11), we also cloned and sequenced pcnB cDNA clones isolated from a pnp683 strain (SK10019). As expected, all 12 of the independently derived cDNA clones had homopolymeric poly(A) tails ranging in size from 17 to 27 nt (data not shown), which were located in the same regions where the heteropolymeric tails were observed in the wild-type strain (Figure 3A).

    With ompA, a very different result was obtained (Figure 3B). In this case, 77% (23/30) of the clones sequenced contained poly(A) tails ranging in length from 17 to 31 nt. The remaining 23% (7/30) of the clones contained heteropolymeric tails (21–68 nt). Ten of the tails, including one heteropolymeric one, were found after the Rho-independent transcription terminator. The rest of the tails were distributed throughout the coding region, with the majority being located near the 5' end of the transcript.
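
    The clone counts reported for the two wild-type transcripts can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the variable and function names are illustrative.

```python
def percent(count, total):
    """Percentage of total, rounded to the nearest whole number."""
    return round(100 * count / total)


# Clone counts as reported in the text for wild-type (MG1693) cDNAs.
ompa_clones = {"poly(A)": 23, "heteropolymeric": 7}
pcnb_length_bins = {"<50": 7, "50-100": 7, "101-200": 7,
                    "201-300": 7, "358": 1, ">600": 1}

ompa_total = sum(ompa_clones.values())
pcnb_total = sum(pcnb_length_bins.values())

print("ompA clones:", ompa_total)
print("pcnB-folK clones:", pcnb_total)
for tail_type, count in ompa_clones.items():
    print(tail_type, percent(count, ompa_total), "%")
```

    Both totals come to 30 clones, and the ompA proportions round to the 77% and 23% given above.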

    DISCUSSION

    While polyadenylation has been implicated in processing, quality control and RNA decay in both prokaryotes (2,5,6,30) and eukaryotes (31), its significance in bacteria has previously been minimized because of the limited knowledge of how many mRNAs undergo post-transcriptional modification. Here we have provided strong evidence for the first time that polyadenylation of mRNAs is a very general phenomenon (90% of the ORFs) in exponentially growing E.coli. In fact, it appears that any transcribed mRNA can be polyadenylated, since the number of ORFs with polyadenylated transcripts increased to 100% after a transient increase in PAP I level. While the array data did not measure the extent to which each transcript was polyadenylated, the 2- to 50-fold increase in the hybridization intensity for specific ORFs after PAP I overproduction (Figure 1, Supplementary Figure S2) suggests that the apparent low levels of polyadenylated transcripts observed using oligo(dT) affinity chromatography (32) result from low intracellular levels of PAP I rather than limited accessibility to potential substrates.

    In addition, it should be noted that both the level and average length of poly(A) tails are reduced dramatically by the degradative activity of 3'→5' exonucleases such as PNPase and/or RNase II (2,10,18,33), resulting in the majority of the poly(A) tails being <15 nt in length in the wild-type strain (5,11). Thus it seems likely that the extent of polyadenylation in exponentially growing cells is significantly underestimated because the average tail length, kept short by the activities of PNPase and RNase II, is not long enough to be enriched using a stringent oligo(dT) approach. The 1.7-fold increase in the detection of full-length polyadenylated lpp mRNAs using real-time PCR when the cDNAs were synthesized using less stringent conditions supports this hypothesis (Table 2).

    Furthermore, the detection of only 48/3300 expressed ORFs in the absence of PAP I (pcnB deletion strain, Supplementary Table S1) confirmed our earlier results obtained with poly(A) sizing assays, which suggested that PAP I accounts for 90% of the poly(A) tails in exponentially growing E.coli (2,5). Does this observation mean that PNPase only plays a minor role in the post-transcriptional modification of mRNAs? We think that the answer is probably no. In the first place, cloning and sequencing analysis of the lpp, rpsO, ompA, trxA and pcnB-folK mRNAs (5,11,12; Figure 3) strongly suggests that transcripts without a Rho-independent transcription terminator are modified primarily by PNPase. Since 50% of the annotated protein-encoding transcriptional units are predicted to be terminated in a Rho-dependent fashion (22), there are a large number of potential targets for post-transcriptional modification by PNPase. In addition, it is also likely that PNPase-synthesized tails containing short sequences of contiguous A residues were excluded during enrichment with Dynabeads. Furthermore, in the absence of PAP I, the polymerizing efficiency of PNPase may be reduced considering that these proteins normally exist as a complex in wild-type E.coli (12).

    Finally, based on sequencing of cDNAs derived from a variety of mRNAs, we have observed that every mRNA studied so far can be a substrate for PNPase-catalyzed polynucleotidylation (5,11,12) (Figure 3). While the current sample size is still relatively small, it would appear that any transcript that is a substrate for the 3'→5' nucleolytic activity of PNPase can also be post-transcriptionally modified with the addition of heteropolymeric tails. It thus seems probable that the addition of heteropolymeric tails to mRNA decay intermediates is quite common even though its biological function is not currently understood. What is particularly interesting is that most of these tails are longer than 100 nt, compared to the majority of homopolymeric tails, which are shorter than 50 nt (5,12) (data not shown).

    The data presented above also lend further support to our earlier observation that Rho-independent transcription terminators play a significant role in the polyadenylation of E.coli mRNAs by PAP I (12). For example, examination of over 100 ORFs that were highly expressed but not polyadenylated in wild-type E.coli showed that only one of them (0.9%) contained a Rho-independent transcription terminator (Table 5, Supplementary Table S4). In contrast, an identical analysis of over 130 highly polyadenylated ORFs revealed that 72% of them were terminated in a Rho-independent fashion (Table 5, Supplementary Table S3). In addition, the direct sequencing of cDNAs demonstrated the presence of mainly poly(A) tails on the ompA mRNA which contains a Rho-independent transcription terminator compared to the absence of such tails on the pcnB-folK dicistronic mRNA which terminates in a Rho-dependent fashion (Figure 3). These data are in agreement with our previous sequencing analysis of the lpp, rpsO and trxA mRNAs (5,11,12).

    Our results further suggest that if a polycistronic operon containing a Rho-independent transcription terminator is a substrate for PAP I, the processed portions of the transcript are also substrates for the enzyme (Table 4, Supplementary Table S3). One possible explanation for this phenomenon relates to the fact that polyadenylation at Rho-independent transcription terminators appears to involve the action of a multi-protein complex that includes PAP I, PNPase and the RNA binding protein Hfq (12). If this complex had some form of transient interaction with the RNase E-based degradosome, it would be possible that the processing products of a polycistronic transcript would be transferred to the polyadenylation complex. The recent paper suggesting that Hfq can also interact with RNase E (34) provides some support for the idea of swapping of RNA molecules between RNase E and PAP I through an Hfq-mediated exchange. In contrast, RNA molecules terminated in a Rho-dependent fashion are not substrates for Hfq, resulting in the lack of polyadenylation for the full-length molecule or its decay intermediates. Taken together, we conclude that a transcript with a Rho-independent transcription terminator is likely to be polyadenylated in vivo. In contrast, there is a <28% chance for polyadenylation of a transcript terminated in a Rho-dependent fashion (Table 5).

    A somewhat unexpected observation concerned the relationship between steady-state transcript levels and polyadenylation. In particular, the steady-state levels of only 327 ORFs (8%) changed more than 2-fold following 15 min of PAP I induction (Supplementary Tables S2A and S2B). Furthermore, we were unable to find any direct correlation between the changes in the steady-state level for a particular transcript and its apparent level of polyadenylation. Thus, how does one explain the unchanged steady-state transcript levels of 68% of the ORFs including lpp, ompA, rpsO and trxA mRNAs (Table 3), when it was previously shown that half-lives decreased significantly following a 15 min induction of PAP I (5)?

    The data presented in Figure 2 have provided some insights that may partially answer this question. Specifically, under conditions where transcription is permitted to continue, it takes 30 min for the new steady-state levels to be achieved. Once this adaptation has occurred, decreases between 2.5- and 5.0-fold were observed for all four mRNAs tested by northern analysis (Figure 2). However, to address this question unequivocally, it will be necessary to carry out a genome-wide experiment using RNAs isolated after a 30 min induction.

    In conclusion, it should be noted that our macroarray data have not addressed the important question of what fraction of full-length mRNAs are polyadenylated in wild-type E.coli. The fact that there were significant differences in the pixel counts of the various polyadenylated transcripts suggested that there might be variations in the fraction of each transcript that was post-transcriptionally modified. This idea was supported by the real-time PCR analysis that showed a 3-fold difference in the polyadenylation of the full-length lpp and ompA mRNAs (Table 2). Taken together, it would appear that polyadenylation occurs more frequently than previously envisioned, probably helping the cell distinguish productive from nonproductive RNAs, thereby increasing the efficiency of the RNA degradative machinery.

    SUPPLEMENTARY DATA

    Supplementary Data are available at NAR Online.

    ACKNOWLEDGEMENTS

    This work was supported in part by a grant (GM57220) to S.R.K. from the National Institute of General Medical Sciences. Funding to pay the Open Access publication charges for this article was provided by GM57220.

    REFERENCES

    Cao, G.-J. and Sarkar, N. (1992) Identification of the gene for an Escherichia coli poly(A) polymerase. Proc. Natl Acad. Sci. USA, 89, 10380–10384.

    O'Hara, E.B., Chekanova, J.A., Ingle, C.A., Kushner, Z.R., Peters, E., Kushner, S.R. (1995) Polyadenylylation helps regulate mRNA decay in Escherichia coli. Proc. Natl Acad. Sci. USA, 92, 1807–1811.

    Hajnsdorf, E., Braun, F., Haugel-Nielsen, J., Régnier, P. (1995) Polyadenylylation destabilizes the rpsO mRNA of Escherichia coli. Proc. Natl Acad. Sci. USA, 92, 3973–3977.

    Coburn, G.A. and Mackie, G.A. (1996) Differential sensitivities of portions of the mRNA for ribosomal protein S20 to 3'-exonucleases is dependent on oligoadenylation and RNA secondary structure. J. Biol. Chem., 271, 15776–15781.

    Mohanty, B.K. and Kushner, S.R. (1999) Analysis of the function of Escherichia coli poly(A) polymerase I in RNA metabolism. Mol. Microbiol., 34, 1094–1108.

    Li, Z., Pandit, S., Deutscher, M.P. (1998) Polyadenylation of stable RNA precursors in vivo. Proc. Natl Acad. Sci. USA, 95, 12158–12162.

    Reimers, S., Pandit, S., Deutscher, M.P. (2002) RNA quality control: Degradation of defective transfer RNA. EMBO J., 21, 1132–1138.

    Tao, H., Bausch, C., Richmond, C., Blattner, F.R., Conway, T. (1999) Functional genomics: Expression analysis of Escherichia coli growing on minimal and rich media. J. Bacteriol., 181, 6425–6440.

    Mohanty, B.K. and Kushner, S.R. (2003) Genomic analysis in Escherichia coli demonstrates differential roles for polynucleotide phosphorylase and RNase II in mRNA abundance and decay. Mol. Microbiol., 50, 645–658.

    Aiso, T., Yoshida, H., Wada, A., Ohki, R. (2005) Modulation of mRNA stability participates in stationary-phase-specific expression of ribosome modulation factor. J. Bacteriol., 187, 1951–1958.

    Mohanty, B.K. and Kushner, S.R. (2000) Polynucleotide phosphorylase functions both as a 3'–5' exonuclease and a poly(A) polymerase in Escherichia coli. Proc. Natl Acad. Sci. USA, 97, 11966–11971.

    Mohanty, B.K., Maples, V.F., Kushner, S.R. (2004) The Sm-like protein Hfq regulates polyadenylation dependent mRNA decay in Escherichia coli. Mol. Microbiol., 54, 905–920.

    Marujo, P.E., Hajnsdorf, E., Le Derout, J., Andrade, R., Arraiano, C.M., Régnier, P. (2000) RNase II removes the oligo(A) tails that destabilize the rpsO mRNA of Escherichia coli. RNA, 6, 1185–1193.

    Coburn, G.A. and Mackie, G.A. (1998) Reconstitution of the degradation of the mRNA for ribosomal protein S20 with purified enzymes. J. Mol. Biol., 279, 1061–1074.

    Donovan, W.P. and Kushner, S.R. (1986) Polynucleotide phosphorylase and ribonuclease II are required for cell viability and mRNA turnover in Escherichia coli K-12. Proc. Natl Acad. Sci. USA, 83, 120–124.

    Rott, R., Zipor, G., Portnoy, V., Liveanu, V., Schuster, G. (2003) RNA polyadenylation and degradation in cyanobacteria are similar to the chloroplast but different from Escherichia coli. J. Biol. Chem., 278, 15771–15777.

    Sohlberg, B., Huang, J., Cohen, S.N. (2003) The Streptomyces coelicolor polynucleotide phosphorylase homologue, and not the putative poly(A) polymerase, can polyadenylate RNA. J. Bacteriol., 185, 7273–7278.

    Mohanty, B.K. and Kushner, S.R. (2000) Polynucleotide phosphorylase, RNase II and RNase E play different roles in the in vivo modulation of polyadenylation in Escherichia coli. Mol. Microbiol., 36, 982–994.

    Arraiano, C.M., Yancey, S.D., Kushner, S.R. (1988) Stabilization of discrete mRNA breakdown products in ams pnp rnb multiple mutants of Escherichia coli K-12. J. Bacteriol., 170, 4625–4633.

    Wang, R.F. and Kushner, S.R. (1991) Construction of versatile low-copy-number vectors for cloning, sequencing and expression in Escherichia coli. Gene, 100, 195–199.

    Uemura, Y., Isono, S., Isono, K. (1990) Cloning, characterization, and physical location of the rplY gene which encodes ribosomal protein L25 in Escherichia coli K12. Mol. Gen. Genet., 226, 341–344.

    Lesnik, E.A., Sampath, R., Levene, H.B., Henderson, T.J., McNeil, J.A., Ecker, D.J. (2001) Prediction of rho-independent transcriptional terminators in Escherichia coli. Nucleic Acids Res., 29, 3583–3594.

    Mohanty, B.K. and Kushner, S.R. (2002) Polyadenylation of Escherichia coli transcripts plays an integral role in regulating intracellular levels of polynucleotide phosphorylase and RNase E. Mol. Microbiol., 45, 1315–1324.

    Richmond, C.S., Glasner, J.D., Mau, R., Jin, H., Blattner, F.R. (1999) Genome-wide expression profiling in Escherichia coli K-12. Nucleic Acids Res., 27, 3821–3835.

    Machl, A.W., Schaab, C., Ivanov, I. (2002) Improving DNA array data quality by minimising ‘neighbourhood’ effects. Nucleic Acids Res., 30, 127.

    Mitta, M., Fang, L., Inouye, M. (1997) Deletion analysis of cspA of Escherichia coli: requirement of the AT-rich UP element for cspA transcription and the downstream box in the coding region for its cold shock induction. Mol. Microbiol., 26, 321–335.

    Bae, W.H., Xia, B., Inouye, M., Severinov, K. (2000) Escherichia coli CspA-family RNA chaperones are transcription antiterminators. Proc. Natl Acad. Sci. USA, 97, 7784–7789.

    Yamanaka, K. and Inouye, M. (2001) Selective mRNA degradation by polynucleotide phosphorylase in cold shock adaptation in Escherichia coli. J. Bacteriol., 183, 2808–2816.

    Haugel-Nielsen, J., Hajnsdorf, E., Régnier, P. (1996) The rpsO mRNA of Escherichia coli is polyadenylated at multiple sites resulting from endonucleolytic processing and exonucleolytic degradation. EMBO J., 15, 3144–3152.

    Li, Z., Reimers, S., Pandit, S., Deutscher, M.P. (2002) RNA quality control: degradation of defective transfer RNA. EMBO J., 21, 1132–1138.

    Chanfreau, G.F. (2005) Cutting genetic noise by polyadenylation-induced RNA degradation. Trends Cell Biol., 15, 635–637.

    Cao, G.-J. and Sarkar, N. (1992) Poly(A) RNA in Escherichia coli: Nucleotide sequence at the junction of the lpp transcript and the polyadenylate moiety. Proc. Natl Acad. Sci. USA, 89, 7546–7550.

    Cao, G.J., Kalapos, M.P., Sarkar, N. (1997) Polyadenylated mRNA in Escherichia coli: Modulation of poly(A) levels by polynucleotide phosphorylase and ribonuclease II. Biochimie, 79, 211–220.

    Morita, T., Maki, K., Aiba, H. (2005) RNase E-based ribonucleoprotein complexes: mechanical basis of mRNA destabilization mediated by bacterial noncoding RNAs. Genes Dev., 19, 2176–2186. (Bijoy K. Mohanty and Sidney R. Kushner*)
