Nucleic Acids Research, 2004, Issue 16
Determination and augmentation of RNA sequence specificity of the Nova
     1 Laboratory of Molecular Neuro-Oncology, 2 Laboratory of Molecular Biophysics and 3 Howard Hughes Medical Institute, The Rockefeller University, 1230 York Avenue, New York, NY 10021, USA

    * To whom correspondence should be addressed. Tel: +1 212 327 7460; Fax: +1 212 327 7109; Email: darnelr@mail.rockefeller.edu

    ABSTRACT

    The Nova onconeural antigens are implicated in the pathogenesis of paraneoplastic opsoclonus-myoclonus-ataxia (POMA). The Nova antigens are neuron-specific RNA-binding proteins harboring three repeats of the K-homology (KH) motif; they have been implicated in the regulation of alternative splicing of a host of genes involved in inhibitory synaptic transmission. Although the third Nova KH domain (KH3) has been extensively characterized using biochemical and crystallographic techniques, the roles of the KH1 and KH2 domains remain unclear. Furthermore, the specificity determinants that distinguish the Nova KH domains from those of the closely related hnRNP E and hnRNP K proteins are undefined. We demonstrate through the use of RNA selection and biochemical analysis that the sequence specificity of the Nova KH1/2 domains is similar to that of Nova KH3. We also show that mutagenesis of a Nova KH domain to render it similar to the KH domains of the heterogeneous nuclear ribonucleoprotein E (hnRNP E) and hnRNP K allows it to recognize longer RNA sequences. These data yield important insights into KH domain function and suggest a strategy by which to engineer KH domains with novel sequence preferences.

    INTRODUCTION

    RNA-binding proteins (RBPs) are widely involved in the post-transcriptional processing of RNAs, playing key roles in exon–intron splicing, polyadenylation, nuclear export, subcellular localization, translational control, stabilization/degradation and sequence editing (1). In general, these phenomena are directed by the presence of specific sequences within the RNAs. Recruitment and assembly of the multicomponent complexes that process the RNAs involve the recognition of these sequences by RBPs. Specificity of protein–RNA interactions therefore lies at the heart of the regulation of the cell's activities. An understanding of the mechanisms by which the proteins encode specificity is essential in order to decipher the usually complex, multistep schemes by which RNA processing events are regulated.

    Among the various RNA-binding motifs that have been described, the K-homology (KH) domain is one of the most commonly found, used by numerous proteins representing all genera of life. Originally identified in heterogeneous nuclear ribonucleoprotein K (hnRNP K) (2), from which they derive their name, KH domains span 70 amino acids that fold into a conserved βααββα motif, including an invariant Gly-X-X-Gly loop between the first and second α-helices and a loop of variable length between the second and third β-strands (3,4). KH domain proteins for which biological functions and physiological RNA targets have been described include the Nova proteins, implicated in the regulation of pre-mRNA splicing (5,6); the hnRNP E and hnRNP K proteins, implicated in mRNA stabilization and translational control (7–9); the zipcode-binding protein 1 (ZBP-1), implicated in mRNA subcellular localization (10,11); and the fragile X mental retardation syndrome protein (FMRP), implicated in translational regulation (12–15). Several KH domain proteins have been shown to interact with single-stranded DNAs, including DDP1 (16), far upstream element binding protein (17,18) and hnRNP K (19,20). A number of these proteins contain multiple KH domains, although the purpose served by multiple domains within a single protein remains unclear.

    Similar to the hnRNP E and hnRNP K proteins, the Nova-1 and Nova-2 proteins each possess three KH domains, with the first and second domains in tandem arrangement, followed by a large spacer region, with the third domain near the C-terminal end of the protein. The Nova proteins are restricted in expression to the central nervous system (21,22). Patients with paraneoplastic opsoclonus-myoclonus-ataxia (POMA), a neurodegenerative syndrome whose hallmark feature is a loss of inhibitory motor control, harbor high-titer autoantibodies specific for the KH domains, and these antibodies disrupt the ability of Nova to bind to RNA (21,23,24). POMA is believed to arise through the inappropriate targeting of Nova-expressing neurons by the immune system (25). Nova knockout mice have lethal neurodegenerative phenotypes suggestive of POMA, with defects seen in the alternative splicing of a number of genes involved in inhibitory synaptic transmission, including inhibitory glycine receptor α2 subunit (GlyRα2), GABAA receptor γ2 subunit (GABAARγ2), gephyrin and Jnk2 (5,6).

    Both the characterization of putative physiological targets and RNA selection experiments indicate a preference for cytosine-rich sequences by the hnRNP E and hnRNP K proteins (7,9,26–30). In contrast, RNA selection with Nova proteins revealed a predilection for the tetranucleotide UCAU, with Nova-1 particularly preferring multiple repeats of UCAU (22,24). This finding was validated by the biochemical characterization of UCAU-repeat intronic sequences in the GlyRα2 and GABAARγ2 pre-mRNAs through which Nova-1 appears to regulate the alternative splicing of the relevant exons (5,31), and by a screen for Nova targets using CLIP (cross-linking/IP), which identified additional YCAY-rich splicing targets (6). RNA selection with the C-terminal portion of Nova spanning the third KH domain resulted in the identification of a high-affinity consensus stem–loop ligand with a UCAY tetranucleotide in the loop (32). High-resolution X-ray crystallographic studies of Nova KH3 bound to this consensus RNA confirmed that the protein interacts specifically with the UCAY element (33). The KH3 domain binds and recognizes its RNA ligand by using the invariant Gly-X-X-Gly loop and the variable loop as the arms of a ‘molecular vise’ to pinion the single-stranded RNA, with side chains from the first and second α-helices and the second and third β-strands serving to recognize the RNA bases. This mechanism has subsequently been shown also to be utilized by the branchpoint sequence binding protein/splicing factor 1 protein in the recognition of its target, the intron branch point sequence (34,35).

    Despite the progress in understanding the mechanism of KH domain–RNA interactions, the means by which RBPs harboring multiple KH domains specify the target sequences through which they regulate RNA metabolism are not fully understood. It remains unclear why the Nova proteins and the hnRNP E and hnRNP K proteins, which have a high degree of sequence and structural homology, recognize distinct sequence classes. To explore this issue, we have undertaken a structure/function analysis of KH domains of these proteins.

    MATERIALS AND METHODS

    Protein expression and purification

    Nova-1 protein expression constructs were PCR-engineered and -mutagenized from the original murine Nova-1 cDNA clone. The wild-type Nova-1 KH1/2 construct, spanning the canonical KH1 and KH2 domains as defined in Lewis et al. (4) and excluding 24 intervening amino acids corresponding to the alternatively spliced exon 4 (residues 49–149, 174–241), was engineered in the pGEX-6P plasmid, expressed in Escherichia coli as glutathione S-transferase (GST)-fusion proteins, and purified by sequential glutathione–Sepharose chromatography, PreScission protease cleavage to remove GST tags and Blue-Sepharose chromatography (all reagents were from Amersham Pharmacia). The wild-type and mutant Nova-1 extended KH3 domains (residues 423–510) were prepared in an identical fashion. The full-length murine hnRNP E1 coding sequence was cloned from a mouse genomic DNA preparation. Wild-type and mutant hnRNP E1 KH1/2 proteins (residues 13–166) were PCR-engineered and -mutagenized in the pGEX-6P plasmid, expressed in E.coli as GST-fusion proteins and purified by glutathione–Sepharose chromatography.

    RNA selection experiments

    The synthetic oligonucleotide templates 5'-TCCCGCTCGTCGTCT CCGCATCGTCCTCCCT-3' and 5'-TCCCGCTCGTCGTCT CCGCATCGTCCTCCCT-3', where ‘N’ indicates random incorporation of all four nucleotides, were each prepared for first round transcription by using Klenow fragment (Amersham Pharmacia) and the oligonucleotide primer 5GL, 5'-GAAATTAATACGACTCACTATAGGGAGGACGATGCGG-3'. Each template was transcribed with recombinant T7 polymerase (Stratagene) and [α-32P]UTP (Amersham Pharmacia) to yield several nanomoles of full-length product, which was size-purified with 10% denaturing PAGE and used for the first round selection. RNA selection was performed essentially as described previously (32,36). Briefly, protein–RNA-binding reactions were carried out in buffer SBB using an excess of RNA to protein, and partitioning was performed with 0.45 μm pore size nitrocellulose filters (Millipore). Selected RNA was extracted from filters with urea/phenol/chloroform treatment and reverse transcribed with the oligonucleotide primer 3GL, 5'-TCCCGCTCGTCGTCTG-3', and AMV-RT (Promega). PCR amplification of the selected clones for subsequent rounds of transcription and selection used mildly mutagenic conditions, with primers 5GL and 3GL, 1 mM dNTPs and 7.5 mM Mg(OAc)2. All oligonucleotides were from QIAGEN Operon.

    Preparation of RNAs

    RNA selection clones serving as templates for RNA synthesis were PCR-amplified with primers 5GL and 3GL, and transcribed with T7 polymerase (Stratagene) and [α-32P]UTP (Amersham Pharmacia). The RNA products were size-purified with 10 or 15% denaturing PAGE. Wild-type and mutant 10048 RNAs (based on the 23mer template GCGCGGAUCAGUCACCCAAGCGC) and 17mer polycytidylate RNA (C17) were prepared by end-labeling commercially synthesized RNAs (Dharmacon Research) with [γ-32P]ATP (Amersham Pharmacia), followed by size-purification with 20% denaturing PAGE.

    Nitrocellulose filter binding assays

    Dissociation constants of protein–RNA interactions were measured by a nitrocellulose filter binding assay (37). Reactions of 50 μl, each containing 50–100 fmol of RNA body-labeled or end-labeled with 32P and Nova-1 KH1/2, KH3 or hnRNP E1 protein in 3-fold dilutions ranging from micromolar to nanomolar concentrations, were assembled in buffer SBB and incubated for at least 10 min at room temperature, then filtered through 0.45 μm pore size nitrocellulose filters (Millipore) and washed. Dissociation constants were determined graphically by plotting the fraction of bound RNA versus the log of the protein concentration (38). Filter binding assays were repeated for verification of dissociation constants.
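
    The graphical determination described above amounts to locating the protein concentration at which half of the RNA is retained on the filter. The following is a minimal sketch of that read-out, not the procedure of reference (38). It assumes a simple 1:1 binding isotherm with the RNA concentration far below the Kd and fractions already normalized to the maximal retention; the function names and the synthetic dilution series are illustrative assumptions.

```python
import math

def fraction_bound(protein_nM, kd_nM):
    """Simple 1:1 binding isotherm (assumes RNA concentration << Kd)."""
    return protein_nM / (protein_nM + kd_nM)

def estimate_kd(concs_nM, fractions):
    """Interpolate on a log10 concentration axis to find the protein
    concentration at which 50% of the RNA is bound."""
    points = sorted(zip(concs_nM, fractions))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= 0.5 <= f_hi:
            if f_hi == f_lo:
                return c_lo
            t = (0.5 - f_lo) / (f_hi - f_lo)
            log_kd = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_kd
    raise ValueError("binding curve does not cross 50% bound")

# Synthetic 3-fold dilution series (nM), generated from an ASSUMED Kd of 400 nM;
# these are not measurements from the paper.
concs = [10000 / 3 ** i for i in range(9)]
fracs = [fraction_bound(c, 400.0) for c in concs]
kd_est = estimate_kd(concs, fracs)
```

    With real filter binding data, a nonlinear least-squares fit of the same isotherm would use all points rather than only the two that bracket the midpoint; the interpolation here simply mirrors reading the half-maximal point off a semi-log plot.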

    RESULTS

    RNA selection for Nova-1 KH1/2

    We set out to analyze the RNA-binding properties of the Nova KH1 and KH2 domains. We had previously found that the recombinant Nova KH1 domain was largely insoluble and that the Nova KH2 domain failed to function as a high-affinity RNA-interacting module on its own (unpublished data). We generated the recombinant Nova-1 KH1 and KH2 domains as a single unit to facilitate high-affinity RNA binding, taking advantage of the domains' tandem arrangement in the full-length protein. Different isoforms of Nova-1 variously possess or lack a stretch of 24 amino acids between KH1 and KH2 encoded by an alternatively spliced exon (exon 4). Noting the absence of these 24 residues in Nova-2 protein, we engineered a Nova-1 KH1/2-expressing construct spanning the canonical KH1 and KH2 domains as defined by Lewis et al. (4) but excluding exon 4 (residues 49–149, 174–241).

    We then carried out RNA selection with expressed and purified Nova KH1/2. KH1/2 protein bound the round 0 pool of RNA, i.e. a randomized RNA library, with Kd 5 μM (Figure 1B), suggesting that the protein was functional for RNA binding and possessed a low background affinity for the RNA pool in the binding conditions of the RNA selection experiment. To determine the sequence preferences of the KH1/2 domains, we used an RNA library with 40 random positions and performed RNA selection for eight rounds. We observed a consistently improving affinity of Nova KH1/2 for the RNA pools through progressive rounds of selection (data not shown). We chose to clone and sequence the RNA pool after the eighth round of selection, as we had begun to observe an increasing level of background binding to the nitrocellulose filters used for RNA selection, a phenomenon previously seen with the KH3 RNA selection (32). Forty percent of the clones (10/25) could be organized and aligned into a strong consensus group (Group 1, Figure 1A). Several of these clones were represented multiple times in the group, suggesting that the RNA selection experiment was converging upon a set of optimal binders. We performed nitrocellulose filter binding assays with representative clones from the consensus group against the KH1/2 protein, measuring Kd 300–500 nM (Figure 1B and data not shown). A distinct set of clones (6/25) could be organized into a second consensus group (Group 2, Figure 1A). Group 2 clones bound with poor affinity (Kd 2 μM or worse; Figure 1B and data not shown) and were not pursued further. Although these RNA selection results identified RNA ligands (Group 1) with at most 10-fold improvement in binding over random sequences, this order of signal-to-noise ratio is similar to what was previously seen with Nova KH3 RNA selection (Kd 500 nM) (32). We cannot rule out the possibility that there exist higher affinity ligands with rare representation in our RNA pool that could have emerged with further selection.

    Figure 1. (A) The results of the RNA selection experiment with Nova-1 KH1/2 protein. Twenty-five clones were sequenced from the RNA pool after eight rounds of selection; shown are the clones conforming to the two consensus groups, Group 1 and Group 2. Lowercase letters indicate the flanking fixed sequences of the RNAs. The core conserved element of each consensus is indicated in boldface type. Potential base pairing sequences are underlined. (B) Nitrocellulose filter binding assays of a representative clone conforming to the Group 1 consensus (shown on the left), a 23mer version of the Group 1 consensus termed 10048 (see Figure 3), a representative Group 2 clone, and the beginning Round 0 RNA pool from the RNA selection experiment against Nova-1 KH1/2 protein. Depicted in the representation of the Group 1 consensus is the stem–loop configuration predicted by the mfold RNA secondary structure software. Potential extra base pairs are indicated by the dashed lines. The tandem UCAN elements in the loop region are marked.

    Figure 3. Hairpin structures of 10048 (KH1/2 consensus) and 10021 (KH3 consensus) on the left-hand side, tables of the mutational analysis of each RNA on the right-hand side. Point mutations indicated are categorized into three groups according to their effects on the dissociation constant of the KH1/2-10048 or KH3-10021 interactions. Data for KH3-10021 interactions are replicated from (32).

    Notably, Group 1 RNAs display complementary stretches of nucleotides (underlined in Figure 1A) that upon base pairing would adopt a common hairpin configuration, which was independently predicted by the mfold RNA secondary structure prediction software (39,40). Such a predicted stem–loop structural consensus is similar to that seen in structural studies of the KH3 RNA selection experiment (clone 10021), although that RNA structure was notable for the presence of a longer-than-expected stem due to non-canonical base pairing (32). Similar non-canonical base pairing may lengthen the stem in the Group 1 RNA consensus; another possibility suggested by mfold is base pairing of two consecutive guanines in the proximal loop to two consecutive cytosines in the distal loop to create an extended stem with an AW bulge (Figure 1B). A second feature of the predicted consensus is an absolutely conserved eight-base UCAGUCAC sequence in the loop, suggestive of two adjacent UCAY-like or UCAN recognition sites (Figure 1B). Group 2 RNAs are also predicted to fold into a hairpin configuration, with a single UCAU element present in the loop (Figure 1A). Although Group 2 RNAs bind to KH1/2 with lower affinity than Group 1 RNAs, they likewise point to a predilection of KH1/2 for UCAY-containing sequences.
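
    The reading of the conserved UCAGUCAC element as two adjacent UCAN sites, only one of which is a strict UCAY/YCAY match, can be checked with a short motif scan. This is an illustrative sketch using IUPAC-style codes (N = any base, Y = C or U); the helper name `find_sites` is an assumption introduced here and is not part of the original analysis.

```python
import re

# Lookahead patterns so that overlapping as well as adjacent sites are reported.
UCAN = re.compile(r"(?=(UCA[ACGU]))")    # N = any nucleotide
YCAY = re.compile(r"(?=([CU]CA[CU]))")   # Y = pyrimidine (C or U)

def find_sites(pattern, rna):
    """Return (0-based start, matched tetranucleotide) for every site."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(rna)]

loop_core = "UCAGUCAC"  # conserved Group 1 element quoted in the text
ucan_sites = find_sites(UCAN, loop_core)
ycay_sites = find_sites(YCAY, loop_core)
```

    On this element the scan reports two UCAN sites (UCAG and UCAC) but a single YCAY site (UCAC), which is why the text describes the first element as "UCAY-like".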

    We generated a 23-base version of the Group 1 consensus as a minimal RNA template with which to assess KH1/2 binding. This RNA, termed 10048, encodes a predicted stem–loop RNA with the UCAGUCAC element in the loop and a stem that is 4 bp (mfold v. 3.0; ΔG = –5.1 kcal/mol) in length (Figure 3). Nova-1 KH1/2 bound 10048 with a Kd similar to that of a full-length consensus clone (400 nM; Figure 1B).
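
    The 4 bp stem quoted for 10048 can be sanity-checked by pairing the two ends of the 23mer sequence given in Materials and Methods. The sketch below is not mfold and computes no free energy; it only counts consecutive Watson-Crick pairs from the termini inward, and the function name and minimum-loop cutoff are assumptions made for illustration.

```python
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}

def terminal_stem_length(rna, min_loop=3):
    """Count consecutive Watson-Crick pairs formed between the 5' and 3'
    ends, moving inward, while leaving at least `min_loop` unpaired bases."""
    i, j, n = 0, len(rna) - 1, 0
    while j - i > min_loop and (rna[i], rna[j]) in PAIRS:
        i, j, n = i + 1, j - 1, n + 1
    return n

rna_10048 = "GCGCGGAUCAGUCACCCAAGCGC"  # 23mer from Materials and Methods
stem_bp = terminal_stem_length(rna_10048)
loop = rna_10048[stem_bp:len(rna_10048) - stem_bp]
```

    The terminal GCGC stretches pair to give four base pairs, leaving a 15-nucleotide loop that contains the UCAGUCAC element. Non-canonical or extended pairing of the kind discussed for Figure 1B is deliberately ignored by this check.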

    Nova KH1/2 has a sequence preference similar to that of KH3

    Using 10048 as a template, we generated point mutants covering all possible single base changes at each position in the conserved UCAGUCAC sequence. Each of these mutants was assessed for Nova-1 KH1/2 binding using a nitrocellulose filter binding assay. As shown in Figure 2, most nucleotide changes at the second, third, sixth and seventh positions in the eight-base sequence (UCAGUCAC) severely disrupted binding by KH1/2. The first, fourth and fifth positions (flanking the central CA dinucleotides) were more accommodating of change, particularly with regard to pyrimidines.
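
    Covering all possible single base changes across the eight-base element gives 8 × 3 = 24 mutant RNAs. A minimal sketch of that enumeration follows; the labeling scheme (wild-type base, motif position, new base) is a hypothetical convention introduced here and does not reproduce the naming used in Figure 2.

```python
BASES = "ACGU"

def single_point_mutants(rna, motif):
    """Yield (label, mutant sequence) for every single-base change within
    the first occurrence of `motif` in `rna`."""
    start = rna.index(motif)
    for offset, wild_type in enumerate(motif):
        for base in BASES:
            if base != wild_type:
                pos = start + offset
                # Hypothetical label, e.g. "C2A" = motif position 2, C -> A
                label = f"{wild_type}{offset + 1}{base}"
                yield label, rna[:pos] + base + rna[pos + 1:]

rna_10048 = "GCGCGGAUCAGUCACCCAAGCGC"
mutants = dict(single_point_mutants(rna_10048, "UCAGUCAC"))
```

    Each mutant keeps the flanking stem sequences intact, so any change in binding can be attributed to the altered loop position, assuming the mutation does not itself create new base pairing.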

    Figure 2. Nitrocellulose filter binding assays of wild-type and mutant 10048 RNAs against Nova-1 KH1/2 protein. Each mutant 10048 RNA harbors a single nucleotide change in the UCAGUCAC sequence. The eight panels include all possible mutants in each of the eight positions of the UCAGUCAC sequence.

    In the previous work with the Nova KH3 domain, we found that the central CA dinucleotide in the UCAY-containing RNA consensus, 10021, was intolerant of mutation (32), and crystallographic studies showed that these nucleotides mediate most of the contacts with the protein (33). The flanking nucleotides, in contrast, were somewhat tolerant to mutation. Indeed, if we group the mutations for both the KH1/2 ligand, 10048, and the KH3 ligand, 10021, into three categories—those with mild inhibition of RNA binding, those with moderate inhibition and those with severe inhibition—there emerges a striking similarity between the patterns of mutations for the second UCAN element in 10048 (UCAC) and the UCAY element in 10021 (UCAC; Figure 3). Taken together, these findings indicate that the KH1 domain, the KH2 domain or both have the same core tetranucleotide sequence specificity as the Nova KH3 domain. Moreover, they suggest that the Nova domains bind and recognize RNA by a highly similar molecular mechanism, i.e. by the use of analogous amino acid–nucleotide contacts.

    Given full-length Nova-1 protein's binding preference for UCAU-repeat RNA, we hypothesize that the Nova KH domains engage in coordinate binding to UCAU-repeat elements in the physiological RNA targets of the protein. In separate studies, we have determined that all three Nova KH domains contribute to Nova's regulation of alternative splicing (K. Musunuru, C. E. Engelhard, C. E. Fraser, R. Zhong and R. B. Darnell, manuscript in preparation). Many KH domain proteins contain three or more KH domains, with vigilin being the extreme example with 15 KH domains; it seems likely that many of these proteins use their domains in a coordinate manner to interact with their ligands.

    A conserved arginine in hnRNP E KH domains is important for RNA binding

    Despite their structural similarities to the Nova proteins, the hnRNP E and hnRNP K proteins exhibit a binding preference for polycytidylate elements in their physiological targets (e.g. the DICE control element in the 15-lipoxygenase mRNA 3'-untranslated region) and in their RNA selection ligands (29,30). As a first step toward understanding the difference in sequence preferences between the Nova proteins and the hnRNP E and hnRNP K proteins, we generated an alignment of the KH domains of the mammalian Nova-1 and Nova-2 proteins, the four mammalian hnRNP E variants and hnRNP K, the Drosophila orthologs of Nova, hnRNP E and hnRNP K (Pasilla, Mub and Bancal, respectively), and the yeast Pbp2p protein that appears to represent a phylogenetic ancestor of the other proteins (Figure 4A). In examining this alignment, we observed that the residue in position 32 is invariably arginine in the KH domains of the mammalian and Drosophila orthologs of hnRNP E and hnRNP K, but it is not conserved in the Nova orthologs. This pattern implied a functional role for Arg-32 in the former proteins but not in the latter. The location of the corresponding residue (glutamine) in the Nova KH3-RNA co-crystal structure places it on the exterior face of the second α-helix, in proximity to the molecular vise but in a position where it does not contact the RNA ligand, due to the trajectory of the sugar-phosphate backbone of the stem–loop sweeping away from the domain (Figure 4B). A different trajectory for the RNA, extending the protein–RNA interaction surface further along the KH domain (see dashed arrow in Figure 4B), would allow Arg-32 to make contacts with RNA. These observations suggested that Arg-32 might be important for the RNA specificity of the hnRNP E and hnRNP K proteins.

    Figure 4. (A) Sequence alignment of all KH domains of every annotated Nova, hnRNP E and hnRNP K isoform and ortholog in mammals, Drosophila and Saccharomyces cerevisiae available in NCBI GenBank. The secondary structure elements of the canonical KH domain (three α-helices and three β-strands) as well as the two conserved loops, the invariant Gly-X-X-Gly loop (I) and the loop of variable length (V), are shown at the top. Also indicated is position 32; the conserved arginines in that position are shaded and boxed. (B) Representation of the KH3-RNA co-crystal structure, showing the actual and potential RNA-binding surfaces. The solid arrow indicates the actual trajectory of the RNA, and the dashed arrow indicates the potential trajectory of RNA that would place it in contact with an extended surface of the KH domain, including the amino acid residue in position 32.

    We therefore tested RNA binding of recombinant wild-type hnRNP E1 KH1/2 protein (analogous to the Nova-1 KH1/2 protein) or mutant protein with arginine to lysine (R→K) changes engineered in position 32 of each of its KH domains. Since minimal RNA ligands that bind hnRNP E1 KH1/2 with high affinity have not yet been identified, we tested binding of the proteins against 17mer polycytidylate RNA (C17) in nitrocellulose filter binding assays. We found that the R(32)K mutant protein displays moderate inhibition of C17 binding compared to the wild-type protein (Kd 500 versus 50 nM; Figure 5A). These data confirm that Arg-32 contributes to RNA binding by hnRNP E1, presumably by interacting with one or more cytosine bases in the RNA ligand. We therefore postulate that the KH domains of hnRNP E and hnRNP K utilize a surface extending beyond the molecular vise to bind RNA.
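
    To put the "moderate inhibition" in quantitative terms, the ratio of the two dissociation constants can be converted into an approximate change in binding free energy via ΔΔG = RT ln(Kd,mutant / Kd,wild-type). The sketch below uses the Kd values quoted above and assumes a temperature of 25 °C, since the text specifies only "room temperature"; the function names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def fold_change(kd_mut_nM, kd_wt_nM):
    """Fold loss of affinity of the mutant relative to wild type."""
    return kd_mut_nM / kd_wt_nM

def ddg_kcal(kd_mut_nM, kd_wt_nM, temp_K=298.15):
    """ddG = RT ln(Kd_mut / Kd_wt); positive values mean weaker binding.
    temp_K = 298.15 K is an ASSUMED value for room temperature."""
    return R_KCAL * temp_K * math.log(kd_mut_nM / kd_wt_nM)

r32k_fold = fold_change(500.0, 50.0)  # R(32)K versus wild-type hnRNP E1 KH1/2
r32k_ddg = ddg_kcal(500.0, 50.0)
```

    Under these assumptions, the 10-fold difference corresponds to roughly 1.4 kcal/mol. Because the mutation was introduced into both KH domains of the construct, this figure cannot be assigned to a single Arg-32 side chain.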

    Figure 5. (A) Nitrocellulose filter binding assays of polycytidylate (C17) RNA against wild-type hnRNP E1 KH1/2 protein and mutant protein harboring R(32)K mutations in both KH domains. (B) Nitrocellulose filter binding assays of UCAY-containing ligand of Nova-1 KH3.

    RNA selection for Nova-1 KH3 Q(32)R mutant

    To further evaluate the possibility that Arg-32 provides an extended RNA-binding surface in some KH domains, we assessed the effect of a Q(32)R substitution in Nova KH3. The Q(32)R mutation had a relatively small effect on KH3 binding to its RNA ligand (wild-type Kd 400 nM versus mutant Kd 700 nM, Figure 5B), consistent with the Nova KH3-RNA co-crystal structure showing no involvement of Gln-32 in RNA binding. We reasoned, however, that the Q(32)R change might have two significant consequences: to give the Nova KH3 domain an extended RNA-binding surface and give it the potential to recognize additional bases beyond the UCAY tetranucleotide motif, and to make the KH3 domain more similar to the hnRNP E and hnRNP K domains, with an enhanced affinity for cytosine-rich RNA. To test this idea, we carried out RNA selection with the Nova-1 KH3 Q(32)R mutant, using conditions similar to those used for the wild-type KH3 domain in previously published work (32). After eight rounds of RNA selection using a library with 25 random positions, we found that about half of the sequenced clones (18/40) could be organized and aligned into a strong consensus group (Figure 6A), with the consensus comprising a six-base UCAUAA motif.
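
    Organizing sequenced clones into a consensus group can be approximated, as a first pass, by asking which short sequence is shared by the largest number of clones. The sketch below illustrates that idea only; the sequences are hypothetical toy inputs, not the clones of Figure 6A, and the function name is an assumption.

```python
from collections import Counter

def kmer_clone_counts(clones, k):
    """Count, for each k-mer, how many clones contain it at least once."""
    counts = Counter()
    for seq in clones:
        # A set ensures each clone contributes at most once per k-mer.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts

# HYPOTHETICAL toy sequences for illustration; not data from the paper.
toy_clones = [
    "GGAUCAUAACC",
    "CUCAUAAGGA",
    "AAUCAUAAUG",
    "GCGCGCGCAA",
]
top_kmer, n_clones = kmer_clone_counts(toy_clones, 6).most_common(1)[0]
```

    A count of this kind ignores secondary structure and positional context, so it complements rather than replaces manual alignment of the selected clones.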

    Figure 6. (A) The results of RNA selection experiment with Nova-1 KH3 Q(32)R mutant protein. Forty clones were sequenced from the RNA pool after eight rounds of selection; shown are the clones conforming to the consensus. Lowercase letters indicate the flanking fixed sequences of the RNAs. The core conserved element of the consensus is indicated in boldface type. (B) Nitrocellulose filter binding assays of Nova-1 KH3 Q(32)R consensus clone against Nova-1 Q(32)R and wild-type KH3 proteins. (C) Nitrocellulose filter binding assays of Nova-1 KH3 Q(32)R protein against the consensus clone with various mutations in the UCAUAA motif, as well as polycytidylate (C17) RNA.

    We performed nitrocellulose filter binding assays with a representative clone (#4 from Figure 6A) from the consensus group against the Q(32)R mutant KH3 protein and the wild-type protein. The Q(32)R protein bound the consensus RNA significantly better (Kd 400 nM) than the wild-type protein (Kd > 2 μM; Figure 6B), indicating that Arg-32 is critical for recognition of the UCAUAA motif. Conversely, various mutations along the six-base motif all impair binding by the Q(32)R protein (Figure 6C), including mutations of the newly selected consensus adenine residues, establishing the requirement of the extended UCAUAA sequence for optimal RNA binding. These data are consistent with the Q(32)R KH3 domain engaging the RNA via an extended surface (comprising the molecular vise and the second α-helix) and using Arg-32 to augment the protein's RNA-binding specificity.

    Finally, we tested the ability of the Q(32)R mutant KH3 protein to bind to 17mer polycytidylate RNA (C17) by nitrocellulose filter binding assay and were unable to observe any significant affinity (Figure 6C). Thus, it would appear that rather than conferring a similarity to the hnRNP E and hnRNP K domains, the Q(32)R mutation in the Nova KH domain unexpectedly confers a binding preference for a novel sequence not previously observed for either Nova KH domains or those of hnRNP E and hnRNP K. A complete understanding of the sequence specificity of the latter proteins awaits X-ray or NMR structures of their KH domains in complex with cognate cytosine-rich RNA ligands.

    DISCUSSION

    Our studies point to the utility of RNA selection as a means by which to determine the sequence preferences of uncharacterized KH domains. Whereas a number of groups have used RNA selection to help in identifying physiological targets of a number of KH domain proteins, including the Nova proteins (22,24), hnRNP E (29), FMRP (41), PSI (42), Sam68 (43) and vigilin (44), the use of the technique to study individual domains or subsets of domains from such proteins has been more limited. Our work demonstrates that at least two of the Nova KH domains share a preference for UCAY tetranucleotide sequences. In addition, we have previously shown using RNA selection that the third and fourth KH domains of ZBP-1 (which harbors a total of four KH domains) confer the protein's specificity for its physiological ligand, the zipcode element in the 3'-untranslated region of the β-actin mRNA (45). In these studies, RNA selection has proven to be particularly advantageous in identifying optimal high-affinity RNA ligands with which to undertake structural analysis of KH domains bound to RNA.

    We have previously used this strategy to determine high-resolution X-ray crystal structures of the Nova KH3 domain bound to UCAY-containing ligands (33) and suggest that this approach can be more widely applied to the study of KH domain proteins and, indeed, RBPs of all types. Recently, we have applied this strategy to co-crystallize a 25-nucleotide hairpin RNA conforming to the Nova KH1/2 Group 1 consensus identified here (i.e. harboring the UCAGUCAC sequence) together with the Nova-1 KH1/2 domains (L. Malinina, M. Teplova, K. Musunuru, A. Teplov, S. K. Burley, R. B. Darnell and D. J. Patel, manuscript in preparation). The crystal structure confirms that the RNA ligand adopts a hairpin configuration and that the UCAY element engages in critical interactions with the KH1/2 protein, as predicted by our mutagenesis data.

    Although our attempts at using structure-based sequence alignments to dissect the determinants of RNA recognition among KH domains were unsuccessful, our efforts nevertheless suggest a strategy by which to engineer KH domains with altered sequence preferences. Mapping of KH domain sequences onto structural models of KH–RNA complexes identifies those residues that engage in actual and potential contacts with RNA. One can undertake mutagenesis of those residues either to alter the character of the residues so that they are likely to interact with different nucleotide bases or, if they do not already interact with RNA, to alter the favorability of the local surface for supporting such interactions. Even if it proves difficult to predict a priori the consequences of introducing the mutations, one can use RNA selection to provide a read-out of the specificity of the altered KH domain. By proceeding in this fashion, we were able to extend the binding preference of the Nova KH3 domain from a four-nucleotide sequence to a six-nucleotide sequence. We envisage using our novel KH3 domain and other domains of this kind as molecular tools with which to better understand the physiological function of the Nova proteins and other RBPs.

    ACKNOWLEDGEMENTS

    We thank Kirk Jensen, Jennifer Darnell, Stephen Burley, Dinshaw Patel, Caroline Groft, Rahul Deo, Joseph Marcotrigiano, Marianna Teplova and Lucy Malinina for useful discussions and critical reading of the manuscript. This work was supported in part by the National Institutes of Health and the Howard Hughes Medical Institute. K.M. was supported by the Weill Cornell/Rockefeller/Sloan-Kettering Tri-Institutional MD-PhD program, NIH MSTP grant GM07739 and NIH (NINDS) grants R01 NS040955 and NS34389. R.B.D. is an investigator of the Howard Hughes Medical Institute.

    REFERENCES

    Gesteland,R.F., Cech,T.R. and Atkins,J.F. ( (1999) ) The RNA World, 2nd edn. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

    Siomi,H., Matunis,M.J., Michael,W.M. and Dreyfuss,G. ( (1993) ) The pre-mRNA binding K protein contains a novel evolutionarily conserved motif. Nucleic Acids Res., , 21, , 1193–1198.

    Musco,G., Stier,G., Joseph,C., Castiglione Morelli,M.A., Nilges,M., Gibson,T.J. and Pastore,A. ( (1996) ) Three-dimensional structure and stability of the KH domain: molecular insights into the fragile X syndrome. Cell, , 85, , 237–245.

    Lewis,H.A., Chen,H., Edo,C., Buckanovich,R.J., Yang,Y.Y., Musunuru,K., Zhong,R., Darnell,R.B. and Burley,S.K. ( (1999) ) Crystal structures of Nova-1 and Nova-2 K-homology RNA-binding domains. Structure, , 7, , 191–203.

    Jensen,K.B., Dredge,B.K., Stefani,G., Zhong,R., Buckanovich,R.J., Okano,H.J., Yang,Y.Y. and Darnell,R.B. ( (2000) ) Nova-1 regulates neuron-specific alternative splicing and is essential for neuronal viability. Neuron, , 25, , 359–371.

    Ule,J., Jensen,K.B., Ruggiu,M., Mele,A., Ule,A. and Darnell,R.B. ( (2003) ) CLIP identifies Nova-regulated RNA networks in the brain. Science, , 302, , 1212–1215.

    Kiledjian,M., Wang,X. and Liebhaber,S.A. (1995) Identification of two KH domain proteins in the alpha-globin mRNP stability complex. EMBO J., 14, 4357–4364.

    Gamarnik,A.V. and Andino,R. (1997) Two functional complexes formed by KH domain containing proteins with the 5' noncoding region of poliovirus RNA. RNA, 3, 882–892.

    Ostareck,D.H., Ostareck-Lederer,A., Wilm,M., Thiele,B.J., Mann,M. and Hentze,M.W. (1997) mRNA silencing in erythroid differentiation: hnRNP K and hnRNP E1 regulate 15-lipoxygenase translation from the 3' end. Cell, 89, 597–606.

    Ross,A.F., Oleynikov,Y., Kislauskis,E.H., Taneja,K.L. and Singer,R.H. (1997) Characterization of a beta-actin mRNA zipcode-binding protein. Mol. Cell. Biol., 17, 2158–2165.

    Zhang,H., Eom,T., Oleynikov,Y., Shenoy,S.M., Liebelt,D.A., Dictenberg,J.B., Singer,R.H. and Bassell,G.J. (2001) Neurotrophin-induced transport of a beta-actin mRNP complex increases beta-actin levels and stimulates growth cone motility. Neuron, 31, 261–275.

    Pieretti,M., Zhang,F., Fu,Y., Warren,S., Oostra,B., Caskey,C. and Nelson,D. (1991) Absence of expression of the FMR-1 gene in fragile X syndrome. Cell, 66, 817–822.

    Ashley,C.T., Wilkinson,K.D., Reines,D. and Warren,S.T. (1993) FMR1 protein: conserved RNP family domains and selective RNA binding. Science, 262, 563–566.

    Siomi,H., Siomi,M., Nussbaum,R. and Dreyfuss,G. (1993) The protein product of the fragile X gene, FMR1, has characteristics of an RNA-binding protein. Cell, 74, 291–298.

    Bardoni,B., Schenck,A. and Mandel,J.L. (2001) The fragile X mental retardation protein. Brain Res. Bull., 56, 375–382.

    Cortes,A. and Azorin,F. (2000) DDP1, a heterochromatin-associated multi-KH-domain protein of Drosophila melanogaster, interacts specifically with centromeric satellite DNA sequences. Mol. Cell. Biol., 20, 3860–3869.

    He,L., Weber,A. and Levens,D. (2000) Nuclear targeting determinants of the far upstream element binding protein, a c-myc transcription factor. Nucleic Acids Res., 28, 4558–4565.

    Braddock,D.T., Louis,J.M., Baber,J.L., Levens,D. and Clore,G.M. (2002) Structure and dynamics of KH domains from FBP bound to single-stranded DNA. Nature, 415, 1051–1056.

    Gaillard,C., Cabannes,E. and Strauss,F. (1994) Identity of the RNA-binding protein K of hnRNP particles with protein H16, a sequence-specific single strand DNA-binding protein. Nucleic Acids Res., 22, 4183–4186.

    Braddock,D.T., Baber,J.L., Levens,D. and Clore,G.M. (2002) Molecular basis of sequence-specific single-stranded DNA recognition by KH domains: solution structure of a complex between hnRNP K KH3 and single-stranded DNA. EMBO J., 21, 3476–3485.

    Buckanovich,R.J., Posner,J.B. and Darnell,R.B. (1993) Nova, the paraneoplastic Ri antigen, is homologous to an RNA-binding protein and is specifically expressed in the developing motor system. Neuron, 11, 657–672.

    Yang,Y.Y., Yin,G.L. and Darnell,R.B. (1998) The neuronal RNA binding protein Nova-2 is implicated as the autoantigen targeted in POMA patients with dementia. Proc. Natl Acad. Sci. USA, 95, 13254–13259.

    Buckanovich,R.J., Yang,Y.Y. and Darnell,R.B. (1996) The onconeural antigen Nova-1 is a neuron-specific RNA-binding protein, the activity of which is inhibited by paraneoplastic antibodies. J. Neurosci., 16, 1114–1122.

    Buckanovich,R.J. and Darnell,R.B. (1997) The neuronal RNA binding protein Nova-1 recognizes specific RNA targets in vitro and in vivo. Mol. Cell. Biol., 17, 3194–3201.

    Musunuru,K. and Darnell,R.B. (2001) Paraneoplastic neurologic disease antigens: RNA-binding proteins and signaling proteins in neuronal degeneration. Annu. Rev. Neurosci., 24, 239–262.

    Matunis,M.J., Michael,W.M. and Dreyfuss,G. (1992) Characterization and primary structure of the poly(C)-binding heterogeneous nuclear ribonucleoprotein complex K protein. Mol. Cell. Biol., 12, 164–171.

    Aasheim,H.C., Loukianova,T., Deggerdal,A. and Smeland,E.B. (1994) Tissue specific expression and cDNA structure of a human transcript encoding a nucleic acid binding protein related to the pre-mRNA binding protein K. Nucleic Acids Res., 22, 959–964.

    Leffers,H., Dejgaard,K. and Celis,J.E. (1995) Characterisation of two major cellular poly(rC)-binding human proteins, each containing three K-homologous (KH) domains. Eur. J. Biochem., 230, 447–453.

    Thisted,T., Lyakhov,D.L. and Liebhaber,S.A. (2001) Optimized RNA targets of two closely related triple KH domain proteins, heterogeneous nuclear ribonucleoprotein K and alphaCP-2KL, suggest distinct modes of RNA recognition. J. Biol. Chem., 276, 17484–17496.

    Reimann,I., Huth,A., Thiele,H. and Thiele,B.J. (2002) Suppression of 15-lipoxygenase synthesis by hnRNP E1 is dependent on repetitive nature of LOX mRNA 3'-UTR control element DICE. J. Mol. Biol., 315, 965–974.

    Dredge,B.K. and Darnell,R.B. (2003) Nova regulates GABA(A) receptor gamma2 alternative splicing via a distal downstream UCAU-rich intronic splicing enhancer. Mol. Cell. Biol., 23, 4687–4700.

    Jensen,K.B., Musunuru,K., Lewis,H.A., Burley,S.K. and Darnell,R.B. (2000) The tetranucleotide UCAY directs the specific recognition of RNA by the Nova KH3 domain. Proc. Natl Acad. Sci. USA, 97, 5740–5745.

    Lewis,H.A., Musunuru,K., Jensen,K.B., Edo,C., Chen,H., Darnell,R.B. and Burley,S.K. (2000) Sequence-specific RNA binding by a Nova KH domain: implications for paraneoplastic disease and the fragile X syndrome. Cell, 100, 323–332.

    Berglund,J., Chua,K., Abovich,N., Reed,R. and Rosbash,M. (1997) The splicing factor BBP interacts specifically with the pre-mRNA branchpoint sequence UACUAAC. Cell, 89, 781–787.

    Liu,Z., Luyten,I., Bottomley,M.J., Messias,A.C., Houngninou-Molango,S., Sprangers,R., Zanier,K., Kramer,A. and Sattler,M. (2001) Structural basis for recognition of the intron branch site RNA by splicing factor 1. Science, 294, 1098–1102.

    Tuerk,C. and Gold,L. (1990) Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science, 249, 505–510.

    Carey,J., Cameron,V., de Haseth,P.L. and Uhlenbeck,O.C. (1983) Sequence-specific interaction of R17 coat protein with its ribonucleic acid binding site. Biochemistry, 22, 2601–2610.

    Irvine,D., Tuerk,C. and Gold,L. (1991) SELEXION. Systematic evolution of ligands by exponential enrichment with integrated optimization by non-linear analysis. J. Mol. Biol., 222, 739–761.

    Mathews,D.H., Sabina,J., Zuker,M. and Turner,D.H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol., 288, 911–940.

    Zuker,M., Mathews,D.H. and Turner,D.H. (1999) Algorithms and thermodynamics for RNA secondary structure prediction: a practical guide. In Barciszewski,J. and Clark,B.F.C. (eds), RNA Biochemistry and Biotechnology. Kluwer Academic Publishers, Dordrecht, The Netherlands.

    Darnell,J.C., Jensen,K.B., Jin,P., Brown,V., Warren,S.T. and Darnell,R.B. (2001) Fragile X mental retardation protein targets G quartet mRNAs important for neuronal function. Cell, 107, 489–499.

    Amarasinghe,A.K., MacDiarmid,R., Adams,M.D. and Rio,D.C. (2001) An in vitro-selected RNA-binding site for the KH domain protein PSI acts as a splicing inhibitor element. RNA, 7, 1239–1253.

    Lin,Q., Taylor,S.J. and Shalloway,D. (1997) Specificity and determinants of Sam68 RNA binding. Implications for the biological function of K homology domains. J. Biol. Chem., 272, 27274–27280.

    Kanamori,H., Dodson,R.E. and Shapiro,D.J. (1998) In vitro genetic analysis of the RNA binding site of vigilin, a multi-KH-domain protein. Mol. Cell. Biol., 18, 3991–4003.

    Farina,K.L., Huttelmaier,S., Musunuru,K., Darnell,R. and Singer,R.H. (2003) Two ZBP1 KH domains facilitate beta-actin mRNA localization, granule formation and cytoskeletal attachment. J. Cell. Biol., 160, 77–87.

    (Kiran Musunuru1,2 and Robert B. Darnell1)