DNA Sequence-Based Subtyping and Evolutionary Analysis of Selected Salmonella enterica Serotypes
     Department of Food Science

    Department of Population Medicine and Diagnostic Sciences, Cornell University, Ithaca, New York 14853

    ABSTRACT

    While serotyping and phage typing have been used widely to characterize Salmonella isolates, sensitive subtyping methods that allow for evolutionary analyses are essential for examining Salmonella transmission, ecology, and evolution. A set of 25 Salmonella enterica isolates, representing five clinically relevant serotypes (serotypes Agona, Heidelberg, Schwarzengrund, Typhimurium, and Typhimurium var. Copenhagen) was initially used to develop a multilocus sequence typing (MLST) scheme for Salmonella targeting seven housekeeping and virulence genes (panB, fimA, aceK, mdh, icdA, manB, and spaN). A total of eight MLST types were found among the 25 isolates sequenced. A good correlation between MLST types and Salmonella serotypes was observed; only one serotype Typhimurium var. Copenhagen isolate displayed an MLST type otherwise typical for serotype Typhimurium isolates. Since manB, fimA, and mdh allowed for the highest subtype discrimination among the initial 25 isolates, we chose these three genes to perform DNA sequencing of an additional 41 Salmonella isolates representing a larger diversity of serotypes. This "three-gene sequence typing scheme" allowed discrimination of 25 sequence types (STs) among a total of 66 isolates; STs correlated well with serotypes and allowed within-serotype differentiation for 9 of the 12 serotypes characterized. Phylogenetic analyses showed that serotypes Kentucky and Newport could each be separated into two distinct, statistically well supported evolutionary lineages. Our results show that a three-gene sequence typing scheme allows for accurate serotype prediction and for limited subtype discrimination among clinically relevant serotypes of Salmonella. Three-gene sequence typing also supports the notion that Salmonella serotypes represent both monophyletic and polyphyletic lineages.

    INTRODUCTION

    Salmonella is an important zoonotic pathogen. According to the Centers for Disease Control and Prevention, an estimated 1.4 million cases of nontyphoidal salmonellosis in humans occur annually in the United States (20). Within the genus Salmonella, almost 2,500 serotypes can be differentiated using the standard Kauffmann-White scheme (4). Serotypes within subspecies I (Salmonella enterica subsp. enterica) are responsible for the vast majority of salmonellosis cases in warm-blooded animals (7). These serotypes differ widely in a variety of features, most notably their host range and the severity and type of disease they typically cause (24). For example, Salmonella enterica serotype Typhimurium causes gastroenteritis in a multitude of hosts, whereas Salmonella enterica serotype Dublin causes enteric fever and induces abortion primarily in cattle (33).

    While serotyping has been widely used to differentiate Salmonella subtypes, this method has limited discriminatory power and does not reveal the genetic relationships of strains within the same or different serotypes (32). More-discriminatory methods for subtyping of Salmonella isolates include phage typing (8) as well as pulsed-field gel electrophoresis (PFGE) (1). Multilocus enzyme electrophoresis (MLEE) has also been used successfully to subtype Salmonella isolates and to study the evolution and population genetics of various Salmonella serotypes (2, 28, 31, 32). However, MLEE is technically difficult and hard to standardize between laboratories and thus does not represent a subtyping method suitable for routine surveillance (34). The advent of automated DNA sequencing technology has led to the development and implementation of DNA sequence-based subtyping techniques, such as multilocus sequence typing (MLST). MLST is based on the concepts of MLEE except that allelic types are determined from nucleotide sequences of housekeeping genes rather than by the electrophoretic mobilities of the enzymes they encode (19). One key advantage of MLST over MLEE and other banding-pattern-based subtyping techniques is that the sequence data generated are nonambiguous and can be readily compared between laboratories, thus facilitating global, large-scale surveillance (19, 36). MLST methods have been used to subtype and explore the evolutionary relationships of a variety of bacterial pathogens, including Campylobacter jejuni, Vibrio cholerae, Listeria monocytogenes, Streptococcus agalactiae, and Salmonella enterica (9, 15-17, 30).

    The changing epidemiology of Salmonella infections (21) and the emergence of new Salmonella strains (e.g., multidrug-resistant Salmonella serotype Typhimurium DT 104 and multidrug-resistant Salmonella enterica serotype Newport) make it imperative to develop new Salmonella subtyping methods that not only allow for sensitive subtype discrimination but also provide data that can be used for evolutionary analyses of Salmonella. In addition, molecular subtyping methods for Salmonella should allow for serotype prediction, thus obviating the need for maintenance of specialized serotype reagents for Salmonella. Thus, our goal was to develop an MLST scheme for Salmonella enterica serotypes that (i) provides sensitive subtype discrimination, (ii) reliably predicts Salmonella serotypes, and (iii) provides data that can be used for evolutionary analyses.

    MATERIALS AND METHODS

    Salmonella isolates. An initial set of 25 Salmonella isolates (supplemental Table S1, available at http://www.foodscience.cornell.edu/wiedmann/Sukhnanand%20Supplementary.txt) representing 5 isolates each of Salmonella serotype Agona, Salmonella serotype Heidelberg, Salmonella serotype Schwarzengrund, Salmonella serotype Typhimurium, and Salmonella serotype Typhimurium var. Copenhagen was used for the development of a seven-gene MLST scheme (19, 36) as described below. Most of these isolates were obtained from cattle as part of a field study of serogroup B Salmonella infections in dairy herds in New York State (35), representing the most prevalent serotypes found in this study. Some isolates in this set were obtained from other hosts, including birds (n = 4), horses (n = 1), and other mammals (n = 1).

    An additional 41 Salmonella isolates were added to our initial collection of 25 isolates to yield a collection of 66 Salmonella isolates (supplemental Table S2, available at http://www.foodscience.cornell.edu/wiedmann/Sukhnanand%20Supplementary.txt) representing a greater serotype diversity. We specifically used 2001 Public Health Laboratory Information System (PHLIS) surveillance data (6) to identify the five Salmonella serotypes most commonly isolated from human clinical cases, animal clinical cases, and environmental sources. The serotypes represented (in addition to the serotypes represented among the initial 25 isolates) included serotypes Montevideo, Newport, Kentucky, Enteritidis, Dublin, Senftenberg, and Javiana. Bovine and avian isolates representing most of these serotypes were obtained from the Cornell University Animal Health Diagnostic Laboratory Salmonella strain collection. One human Salmonella serotype Senftenberg isolate and six human Salmonella serotype Javiana isolates were obtained from the New York State Department of Health. A three-gene sequence-typing scheme was used to characterize these additional 41 isolates.

    Salmonella serotyping of animal isolates was performed at the National Veterinary Services Laboratories (USDA-APHIS-VS, Ames, IA); serotyping of human isolates was performed at the New York State Department of Health.

    Gene selection and primer design. A total of seven genes located around the Salmonella chromosome were chosen as targets for the initial development of a seven-gene MLST scheme (Table 1). Primers were obtained from the published literature (3, 17) or designed based on published sequences available in GenBank (Table 2). Primer design was performed using the PrimerSelect software program (DNAStar, Madison, WI). Primers designed for panB, fimA, icdA, spaN, and aceK amplified the complete coding domain sequence, while the published primers for manB and mdh amplified only parts of the respective open reading frames (65.1 and 90.5%, respectively).

    For the additional 41 isolates, manB and mdh were amplified with a second set of primers (F2 and R3a for manB; ssF and ssR for mdh [Table 2]). These primers were designed based on published sequences as well as on the sequence data obtained for these two genes for the initial 25 isolates (for manB).

    DNA sequencing. Salmonella lysates for PCR amplification were prepared as described by Furrer et al. (10). All PCR amplifications were performed with either Thermus aquaticus (Taq) DNA polymerase (Promega, Madison, WI) or AmpliTaq Gold (Applied Biosystems, Foster City, CA). PCR conditions used for the initial set of 25 isolates are listed in Table 3. PCR conditions for the additional 41 isolates were optimized based on the conditions listed in Table 3; for example, annealing temperatures had to be reduced for some serotypes in order to achieve PCR amplification.

    PCR products were purified using the QIAquick PCR purification kit (QIAGEN, Inc., Chatsworth, CA). DNA was quantified using the Fluorescent DNA Quantitation kit (Bio-Rad, Hercules, CA) and a Fusion automated microplate reader (Perkin-Elmer Life and Analytical Sciences, Inc., Boston, MA). Sequencing of PCR products was performed at Cornell University's BioResource Center using an ABI 3700 automated sequencer (Applied Biosystems, Foster City, CA) and forward and reverse PCR primers. Additional internal primers (Table 2) were used to determine the complete coding sequences for icdA, spaN, and aceK with double coverage. DNA sequences were proofread and assembled with the Seqman software program (DNAStar, Madison, WI). Sequences were aligned using the Clustal W algorithm in MegAlign (DNAStar).

    If the PCR product yield was insufficient for sequencing, PCR products were cloned into pCR 2.1-TOPO using the TOPO TA cloning kit (Invitrogen, Carlsbad, CA). Plasmids were purified using the QIAquick Plasmid Prep kit (QIAGEN, Inc., Chatsworth, CA), and the presence of inserts was confirmed by EcoRI restriction digestion. Plasmid DNA was quantified with the NanoDrop (Rockland, DE) ND-1000 spectrophotometer, and plasmids were sequenced using M13 forward and reverse primers. Three plasmid inserts were sequenced for each isolate and gene in order to allow for correction of sequence errors due to nucleotide misincorporation during PCR amplification. This cloning-and-sequencing approach had to be used to generate DNA sequences for manB for eight isolates (FSL S5-267, FSL S5-273, FSL S5-280, FSL S5-358, FSL S5-364, FSL S5-365, FSL S5-366, and FSL S5-367) and for mdh for two isolates (FSL S5-273 and FSL S5-291).
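    The source does not say how the three clone sequences per isolate and gene were reconciled. One plausible approach, shown here only as a minimal sketch, is a per-position majority vote over the aligned reads. The function name `majority_consensus` and the example reads are hypothetical.

```python
from collections import Counter


def majority_consensus(clone_seqs):
    """Return the per-position majority base across aligned clone sequences.

    Positions without a strict majority are reported as 'N'.
    """
    if len({len(s) for s in clone_seqs}) != 1:
        raise ValueError("clone sequences must be aligned to equal length")
    consensus = []
    for column in zip(*clone_seqs):
        base, count = Counter(column).most_common(1)[0]
        # Require a strict majority so that ties are flagged, not guessed.
        consensus.append(base if count * 2 > len(column) else "N")
    return "".join(consensus)


# Hypothetical reads: the third clone carries a single PCR-induced error.
clones = ["ATGGCTAAC", "ATGGCTAAC", "ATGACTAAC"]
consensus = majority_consensus(clones)
```

    With three clones, a misincorporation present in only one insert is outvoted by the other two, which is the rationale the text gives for sequencing three inserts.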

    MLST typing. While full MLST targeting seven different genes (19, 36) was performed on an initial set of 25 isolates, a more economical three-gene sequence typing scheme was developed and applied to our collection of 66 Salmonella isolates. Allele assignments for individual genes were determined using alignments of the complete coding sequence or available coding region for a given gene. Different alleles were assigned to sequences that differed by at least one nucleotide. Sequence types (STs) were determined from a concatenated code of allele assignments for individual genes.
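    The allele and ST assignment rules described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' software: allele and ST numbers are assigned in order of first appearance (the numbering in the paper's tables may differ), and the isolate names and sequences are hypothetical toy data.

```python
def assign_alleles(seqs_by_isolate):
    """Number distinct sequences 1, 2, 3, ... in order of first appearance.

    Any difference of at least one nucleotide yields a new allele number.
    """
    allele_of_seq = {}
    alleles = {}
    for isolate, seq in seqs_by_isolate.items():
        if seq not in allele_of_seq:
            allele_of_seq[seq] = len(allele_of_seq) + 1
        alleles[isolate] = allele_of_seq[seq]
    return alleles


def assign_sequence_types(gene_tables, gene_order):
    """Map each isolate to (ST number, allele profile).

    The profile is the tuple of allele numbers in gene_order; each distinct
    profile receives its own ST number.
    """
    per_gene = {g: assign_alleles(gene_tables[g]) for g in gene_order}
    st_of_profile = {}
    result = {}
    for isolate in gene_tables[gene_order[0]]:
        profile = tuple(per_gene[g][isolate] for g in gene_order)
        if profile not in st_of_profile:
            st_of_profile[profile] = len(st_of_profile) + 1
        result[isolate] = (st_of_profile[profile], profile)
    return result


# Hypothetical aligned sequences for three isolates and three genes.
toy = {
    "fimA": {"iso1": "ATGA", "iso2": "ATGA", "iso3": "ATGC"},
    "mdh": {"iso1": "GGCT", "iso2": "GGCT", "iso3": "GGCT"},
    "manB": {"iso1": "TTAG", "iso2": "TTGG", "iso3": "TTAG"},
}
sts = assign_sequence_types(toy, ["fimA", "mdh", "manB"])
```

    In this toy data set mdh does not separate any isolates, yet the combined profile still distinguishes all three, which mirrors how a concatenated allele code can be more discriminatory than any single gene.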

    Evolutionary analyses. Descriptive evolutionary statistics, such as G+C content and percentage of polymorphism, were calculated using the DNASTAR software package or DnaSP (version 3.99) (29). The dN/dS (number of nonsynonymous substitutions per nonsynonymous site/number of synonymous substitutions per synonymous site) ratios for each gene were calculated using the Molecular Evolution Genetics Analysis software program (MEGA, version 2.1) (18). Genes with dN/dS ratios greater than 0.1 were tested for positive selection using the Phylogenetic Analysis by Maximum Likelihood software program (PAML, version 3.13) (37). The RETICULATE software program (13) was used to test for evidence of reticulate evolution by using separate alignments of the concatenated sequences for both the seven-gene MLST (25 isolates) and the three-gene sequence-typing scheme (66 isolates).
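    For orientation, the two descriptive statistics named above can be computed from an alignment as in the sketch below. It assumes that "percentage of polymorphism" means the share of alignment columns with more than one state; the exact definitions used by DNASTAR and DnaSP (for example, their treatment of gaps) are not given in the text, and the function names and sequences are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous A/C/G/T bases of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / acgt


def percent_polymorphic(aligned_seqs):
    """Percentage of alignment columns that show more than one state.

    Assumption: a gap character counts as its own state.
    """
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    n_sites = len(aligned_seqs[0])
    variable = sum(1 for column in zip(*aligned_seqs) if len(set(column)) > 1)
    return 100.0 * variable / n_sites


# Hypothetical 10-bp alignment with two variable columns.
alignment = ["ATGCATGCAT", "ATGCATGCAA", "ATGGATGCAT"]
gc = gc_content(alignment[0])
variation = percent_polymorphic(alignment)
```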

    To determine the appropriate model of evolution for each gene, we first generated likelihood ratio scores for each gene in PAUP, version 4.0b10 (Sinauer Associates, Sunderland, MA) using modelblock3 (obtained from http://workshop.molecularevolution.org/software/modeltest/modelblock3.php). Likelihood ratio scores were then exported into the MODELTEST software program (26) to determine the correct model of evolution using a hierarchical likelihood ratio test. Once the correct model of evolution was determined, maximum-likelihood phylogenetic trees were built with and without a molecular clock imposed in PAUP. Likelihood scores generated from these trees were used to conduct a likelihood ratio test to determine whether the nucleotide substitutions observed for a given gene followed a molecular clock. Phylogenetic trees were built using both the maximum-likelihood and Bayesian methods. Maximum-likelihood trees were constructed in PAUP using 100 bootstrap replicates. Bayesian phylogenetic trees were built using MRBAYES (version 3.0) (12). For each tree, Markov chains were run at least three independent times to determine the proper "burn-in" time. Posterior probabilities, which represent the probability that a specific node is observed, were recorded. All phylogenetic trees were rooted using sequences from an Escherichia coli O157:H7 strain (GenBank accession no. NC002695) (11), which served as an outgroup. Phylogenetic trees generated from either MRBAYES or PAUP were viewed using TreeView, version 32 (22). In all trees, the branch length of the outgroup was collapsed by an order of 10 or 100 so as to best view the topology of the tree.
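    The molecular clock comparison described above is, in its standard textbook form, a likelihood ratio test between the clock-constrained and unconstrained trees. The notation below is introduced here for illustration and does not appear in the source; n denotes the number of sequences in the tree.

```latex
\[
  \Lambda \;=\; 2\left(\ln L_{\text{no clock}} - \ln L_{\text{clock}}\right),
  \qquad
  \Lambda \;\overset{\text{approx.}}{\sim}\; \chi^{2}_{\,n-2}
  \quad \text{under the clock hypothesis.}
\]
```

    The n - 2 degrees of freedom follow from the difference in free parameters: an unrooted tree has 2n - 3 branch lengths, whereas a clock tree has n - 1 node heights. A value of the statistic above the chosen chi-square critical value rejects the molecular clock for that gene.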

    World Wide Web-based data access. Detailed isolate source information, as well as all sequence data and allele assignments from this study, are accessible through the PathogenTracker, version 2.0, website (http://www.pathogentracker.net).

    RESULTS

    Genetic diversity and allelic profiles of MLST genes. DNA sequence data generated for the seven MLST target genes included a total of 6,665 bp; primers designed for icdA and panB (Table 2), however, did not amplify these genes for the five Salmonella serotype Schwarzengrund isolates tested. The seven-gene MLST scheme allowed differentiation of eight MLST types among the 25 Salmonella isolates tested (Table 4). Sequence data for individual genes allowed differentiation of four to seven allelic types for the respective genes (Table 4). The sequence variation for each gene ranged from 1.6 to 4.9%; panB exhibited the highest sequence variation (4.9% [Table 5]). Sequence variation was not positively correlated with the discriminatory ability of the gene. For example, panB, which showed the highest sequence variation, differentiated only four allelic types.

    Based on the initial MLST results described above, we selected the three genes that provided the best subtype discrimination (fimA, mdh, and manB) as the basis for characterizing an additional set of 41 Salmonella isolates representing greater serotype diversity. Among the initial 25 isolates, fimA, mdh, and manB allowed differentiation of five, five, and seven allelic types, respectively. Based on these three genes, we were able to define eight STs among our initial set of 25 isolates; this was the same as the number of MLST types that could be defined based on data for all seven genes. We thus concluded that these genes would provide appropriate targets for a more economical "three-gene sequence-typing scheme." This three-gene sequence-typing scheme allowed differentiation of a total of 25 STs among our larger set of 66 isolates (Table 6). Individual genes allowed discrimination of 11 to 17 allelic types, and the sequence variation for each gene ranged from 2.5 to 5.8% (Table 5).

    Sequencing of manB for the additional 41 Salmonella isolates revealed consistent double peaks in the electropherograms for three avian Salmonella serotype Montevideo isolates, which led us to hypothesize that these strains may carry a duplicated manB gene. PCR amplification and subsequent cloning and sequencing of PCR products indeed revealed the presence of two distinct manB alleles within each of these three isolates, which we designated manB1 and manB2. Subsequent PCR with allele-specific primers followed by sequencing of PCR products further confirmed the presence of this manB duplication. Phylogenetic analyses of all manB alleles (Fig. 1) showed that the manB1 allele found among these isolates was identical to the single manB allele found in the three bovine serotype Montevideo isolates. The manB2 allele found in the three avian isolates, on the other hand, was identical to the manB allele present in all six serotype Javiana isolates.

    Differentiation of Salmonella serotypes by MLST typing. Seven-gene MLST typing of the initial 25 Salmonella isolates showed that MLST types were 100% predictive for serotypes Agona, Heidelberg, and Schwarzengrund. Among serotype Typhimurium and serotype Typhimurium var. Copenhagen isolates, only one serotype Typhimurium var. Copenhagen isolate displayed an MLST type otherwise typical for Typhimurium isolates; repeat serotyping of this serotype Typhimurium var. Copenhagen isolate confirmed this serotype designation. For some serotypes, MLST typing was also able to discriminate subtypes within a given serotype. Serotypes Agona and Typhimurium both contained multiple MLST types (two and three, respectively).

    Three-gene sequence-typing data for our set of 66 isolates further confirmed that STs were unique for all serotypes other than the one serotype Typhimurium var. Copenhagen isolate discussed above. Within-serotype differentiation was observed for 9 out of 12 serotypes analyzed (Table 6); serotypes Typhimurium, Dublin, Javiana, and Newport each contained three STs.
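    Whether STs are "unique for serotypes" can be checked mechanically by grouping isolates by ST and listing any ST that spans more than one serotype. The sketch below uses hypothetical ST numbers and records, not the data in Table 6.

```python
from collections import defaultdict


def serotypes_per_st(records):
    """Group (sequence_type, serotype) pairs into a dict of ST -> set of serotypes."""
    by_st = defaultdict(set)
    for st, serotype in records:
        by_st[st].add(serotype)
    return dict(by_st)


def ambiguous_sts(records):
    """Return only the STs whose isolates belong to more than one serotype."""
    grouped = serotypes_per_st(records)
    return {st: sero for st, sero in grouped.items() if len(sero) > 1}


# Hypothetical records: ST 3 is shared by two closely related serotypes.
records = [
    (1, "Agona"),
    (2, "Agona"),
    (3, "Typhimurium"),
    (3, "Typhimurium var. Copenhagen"),
    (4, "Heidelberg"),
]
shared = ambiguous_sts(records)
```

    An empty result would mean that every ST predicts exactly one serotype; a serotype appearing under several STs (as Agona does in the toy records) indicates within-serotype differentiation.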

    Evolutionary characteristics of MLST genes. Models of evolution determined for each gene are listed in Table 7 along with results from molecular clock tests. More than half of our genes follow the Hasegawa-Kishino-Yano (HKY) model. This model allows the frequencies of each nucleotide to differ and allows transitions and transversions to have different substitution rates (23). fimA, aceK, and icdA follow variants of the Kimura models (K80, K81), which are constrained versions of the HKY model, i.e., nucleotide frequencies are assumed to be equal (23). Most genes also follow a molecular clock, which implies that the underlying mutation rates for these genes are constant (23).

    dN/dS ratios for the housekeeping genes aceK, icdA, mdh, and panB were <0.1 (Table 5). Since manB and the virulence genes spaN and fimA showed dN/dS ratios of >0.1, we used a likelihood ratio test implemented in PAML to determine whether any of these three genes contained amino acid residues that are under positive selection (38). While none of these genes showed evidence of significant positive selection, eight and nine positively selected sites were identified within manB and spaN, respectively. Since there was no indication of overall positive selection among these genes, we regarded these findings as preliminary, requiring additional confirmation on larger data sets.

    Evolutionary relationships among Salmonella serotypes. A concatenated alignment of all seven MLST genes sequenced for 25 isolates was used to test for evidence of reticulate evolution (i.e., recombination and/or repeated mutation) using RETICULATE (Fig. 2). The overall compatibility was 0.99, with a neighborhood similarity score of 0.98; the neighborhood similarity score was significantly higher than that for a randomized matrix, indicating that the overall pattern of compatibility and incompatibility between sites is not random and that the order of sites along the nucleotide alignment has increased clustering of compatible and incompatible sites. Reticulate analysis was also performed for a concatenated alignment of the fimA, mdh, and manB sequences obtained for all 66 isolates (Fig. 3). The overall compatibility was 0.77, with a neighborhood similarity score of 0.72; this neighborhood similarity score was also significantly higher than that for the corresponding randomized matrix. Based on these data, we concluded that only a limited number of incompatible sites was present between and within genes, thus allowing for construction of meaningful phylogenetic trees based on concatenated sequence data.
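    To make the notion of site compatibility concrete, the sketch below computes an overall compatibility score for a toy alignment. It is a simplification and does not reproduce RETICULATE: it considers only informative sites with exactly two states and applies the four-gamete criterion, and it omits the neighborhood similarity score and its randomization test. All names and sequences are hypothetical.

```python
from itertools import combinations


def informative_biallelic_columns(aligned_seqs):
    """Return columns with exactly two states, each present at least twice."""
    columns = []
    for column in zip(*aligned_seqs):
        counts = {state: column.count(state) for state in set(column)}
        if len(counts) == 2 and min(counts.values()) >= 2:
            columns.append(column)
    return columns


def compatible(col_a, col_b):
    """Four-gamete criterion: two biallelic sites fit one tree without
    repeated mutation or recombination if fewer than four state pairs occur."""
    return len(set(zip(col_a, col_b))) < 4


def overall_compatibility(aligned_seqs):
    """Fraction of informative site pairs that are mutually compatible."""
    pairs = list(combinations(informative_biallelic_columns(aligned_seqs), 2))
    if not pairs:
        raise ValueError("need at least two informative biallelic sites")
    return sum(compatible(a, b) for a, b in pairs) / len(pairs)


# Hypothetical alignment: sites 1 and 2 agree; site 3 conflicts with both.
toy_alignment = ["ACG", "ACT", "TGG", "TGT"]
score = overall_compatibility(toy_alignment)
```

    A score near 1, such as the 0.99 reported for the seven-gene alignment, means that nearly all site pairs can be explained by a single tree.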

    Construction of phylogenetic trees based on concatenated sequences was performed using Bayesian and maximum-likelihood methods. A maximum-likelihood phylogenetic tree for our initial 25 Salmonella isolates, based on a concatenated sequence of all seven MLST genes, showed monophyletic lineages for serotypes Schwarzengrund, Heidelberg, and Agona as well as a fourth monophyletic lineage containing serotypes Typhimurium and Typhimurium var. Copenhagen (Fig. 4). These monophyletic lineages were all supported by both high posterior probabilities (1.00) and high bootstrap values. Bayesian phylogenetic analysis of concatenated fimA, mdh, and manB sequences representing all sequence types also showed that most Salmonella serotypes represent monophyletic lineages (Fig. 5). One exception is Salmonella serotype Newport, which groups into two statistically well supported separate lineages; one of these lineages (which clustered with Salmonella serotype Agona) contained only bovine isolates, whereas the other lineage contained only avian isolates. In addition, Salmonella serotype Kentucky also grouped into two distinct lineages. Our phylogenetic analyses furthermore identified two major branches of Salmonella serotypes. Serotypes Typhimurium, Typhimurium var. Copenhagen, Dublin, Enteritidis, and Agona were found on one branch, and the rest of the serotypes fell on the other branch. However, the posterior probabilities showed that these two branches were not well supported (<0.50). Our analyses also identified multiple lineages supported by posterior probabilities larger than 0.90, including two lineages which included multiple serotypes; one of these lineages included serotypes Dublin and Enteritidis (both of which are classified into serogroup D1), while the other included serotypes Javiana, Montevideo, and Schwarzengrund as well as two serotype Kentucky isolates, representing a variety of different serogroups (Fig. 5).

    DISCUSSION

    Using an initial collection of 25 Salmonella isolates, we developed a seven-gene MLST scheme targeting a combination of housekeeping and virulence genes. Based on the initial data obtained with this seven-gene MLST scheme, we chose three genes with the highest discriminatory ability to develop and apply a more economical three-gene sequence-typing scheme using 66 Salmonella isolates, representing 12 serotypes. Our results show that (i) a three-gene sequence-typing scheme allows for serotype prediction and for limited subtype discrimination within serotypes and (ii) Salmonella serotypes represent both monophyletic and polyphyletic lineages.

    A three-gene sequence-typing scheme allows for serotype prediction and for limited subtype discrimination within a serotype. While MLST sequence typing schemes have been published for a variety of different pathogens (5, 9, 15, 16), only limited information is available on the use of MLST methods for subtype discrimination of salmonellae (17). Overall, our data indicate that both a seven-gene MLST and a three-gene sequence-typing scheme allow limited within-serotype discrimination for salmonellae; both schemes allowed for discrimination of only eight STs among our initial set of 25 Salmonella enterica isolates representing five serotypes. Sensitive subtyping of salmonellae with high within-serotype discriminatory ability has been documented for a variety of other subtyping techniques, such as MLEE, phage typing, and PFGE (1, 8, 33), indicating that these methods may provide for more-sensitive subtype discrimination for Salmonella enterica. In addition to a low level of discriminatory power, Salmonella MLST schemes also face the challenge of designing appropriate primers that allow for PCR amplification and sequencing of isolates representing at least the majority of common serotypes. For example, both the panB and icdA primers described here did not allow for amplification of the respective genes in serotype Schwarzengrund isolates. Similarly, Kotetishvili et al. (17) reported that the four primer sets used in their Salmonella sequence-typing study allowed for successful amplification of the respective genes in only 75 to 94% of the Salmonella isolates tested. The lack of MLST Salmonella papers in the primary literature, despite the recent boom in MLST subtyping, might be related to these types of technical issues. However, the availability of genome sequences for various Salmonella serotypes (24) may aid in the design of better universal primers for sequence-based Salmonella subtyping.

    Even though sequence typing allowed for only limited within-serotype subtype discrimination, analyses of the seven-gene MLST and three-gene sequence-typing data both provided reliable prediction of serotypes. Only 1 out of 25 three-gene STs contained isolates from more than one serotype; isolates within this ST represented serotypes Typhimurium and Typhimurium var. Copenhagen, two very similar serotypes (27). Even sequencing of seven genes did not allow for differentiation of these serotypes. In other studies, differentiation between serotype Typhimurium and serotype Typhimurium variants has been possible with high-resolution subtyping techniques, such as phage typing (27). Interestingly, as discussed in more detail below, analyses of data from the three-gene sequence typing also allowed the definition of two distinct monophyletic lineages within serotypes Kentucky and Newport, indicating that sequence typing provides improved subtype discrimination as well as relevant evolutionary and biological information beyond that associated with serotyping.

    Based on our data reported here, we propose that a three-gene sequence-typing scheme targeting fimA, manB, and mdh allows for prediction of the most common Salmonella enterica serotypes as well as for limited within-serotype discrimination (9 of the 12 serotypes in our study included multiple three-gene STs). The genes targeted in this scheme were shown to offer subtype discrimination equal to that of an initial seven-gene MLST; they include a virulence gene (fimA) as well as housekeeping genes that have previously been shown to be useful for determining phylogenetic relationships between various Salmonella subspecies (mdh [3]) and that have allowed for sensitive subtype discrimination among clinical and environmental isolates (manB [17]). Interestingly, our data also showed that gene duplication of manB has occurred in at least some serotype Montevideo isolates, complicating analyses of manB sequence data for these isolates.

    Salmonella serotypes represent both monophyletic and polyphyletic lineages. Unlike banding-pattern-based subtyping methods (e.g., PFGE, ribotyping, randomly amplified polymorphic DNA), DNA-sequencing-based subtyping data can also be used to probe the evolutionary history of the isolates sequenced. Initial analyses for reticulate evolution revealed only limited evidence of reticulate evolution (recombination or repeated mutation) in concatenated alignments of the seven genes sequenced for 25 isolates or the three genes sequenced for 66 isolates. This is consistent with previous studies, which indicated that S. enterica basically shows a clonal population structure (2, 3). Since deviations from neutral selection can also hinder an accurate phylogenetic signal, we also tested for evidence of positive selection among the genes sequenced. dN/dS ratios revealed no evidence of positive selection within four of the housekeeping genes, which were included in our gene selection according to standard MLST practice (19). The virulence genes spaN and fimA as well as manB showed dN/dS ratios of >0.1; however, hypothesis testing of positive selection by PAML found no evidence for significant positive selection within these genes. Since the genes used in both our seven-gene MLST and our three-gene sequence-typing scheme did not show statistically significant evidence for either reticulate evolution or positive selection, we concluded that a concatenated sequence could be used to infer the phylogeny of our Salmonella isolates (14).

    Phylogenetic analyses demonstrated that the majority of serotypes included in our study represented monophyletic lineages. Only serotypes Newport and Kentucky represented polyphyletic lineages. This is consistent with a previous report (2), which also showed, based on MLEE data, that some Salmonella serotypes, including serotype Newport, represent polyphyletic lineages. This study (2) also reported that the two major lineages of serotype Newport differ in their frequency of association with disease in humans versus animals. Interestingly, in our study, isolates in one lineage were associated exclusively with isolation from avian sources (STs 12 and 13), while the other lineage (ST 11) represented bovine isolates. The clustering of serogroups in our phylogenetic tree based on the three-gene sequence-typing data also correlated well with a previous phylogenetic analysis of Salmonella enterica isolates based on gene content microarray analyses (25); for example, like our data, the phylogenetic analyses reported by this group (25) also grouped the serogroup D1 serotype Javiana into a distinct lineage separated from the other D1 serotypes Dublin and Enteritidis. This further supports the argument that the evolutionary relationships revealed by analysis of sequence data from our three-gene sequence typing correctly represent Salmonella phylogenetic relationships.

    Conclusions. Our data show that a seven-gene MLST as well as more economical three-gene sequence-typing schemes allow for reliable prediction of the most common Salmonella serotypes and allow for some subtype classification within Salmonella serotypes. While further verification of our data on isolates representing a larger serotype diversity will be necessary, our data indicate that sequence-based subtyping may have the potential to replace classical serotyping. Our data also appear to indicate that MLST schemes have only a limited ability to provide the sensitive within-serotype subtype discrimination achieved with other subtyping methods, such as PFGE and phage typing. In contrast to our findings, Kotetishvili et al. (17) reported that a four-gene sequence-typing scheme allowed for more-sensitive subtype discrimination than PFGE, including sensitive within-serotype discrimination. The data of Kotetishvili et al. (17) indicated that even sequencing of a single gene (manB), which was also used in our study reported here, allowed for sensitive subtype discrimination within many serotypes. These discrepancies between our studies will need to be resolved to allow for a conclusive decision on the value of MLST for Salmonella subtyping. Our observation of manB gene duplication, however, emphasizes the need for careful evaluation of sequencing electropherograms when interpreting DNA-sequencing data for this gene.

    ACKNOWLEDGMENTS

    This project was funded in part by federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under contract N01-AI-30054. This project was also funded in part by USDA Special Research Grant 2002-34459-11758.

    We thank Kendra Nightingale and Ruth Zadoks for helpful discussions and Nellie Dumas and the New York State Department of Health for human Salmonella isolates.

    REFERENCES

    Baggesen, D. L., D. Sandvang, and F. M. Aarestrup. 2000. Characterization of Salmonella enterica serovar typhimurium DT104 isolated from Denmark and comparison with isolates from Europe and the United States. J. Clin. Microbiol. 38:1581-1586.

    Beltran, P., J. M. Musser, R. Helmuth, J. J. Farmer III, W. M. Frerichs, I. K. Wachsmuth, K. Ferris, A. C. McWhorter, J. G. Wells, A. Cravioto, and R. K. Selander. 1988. Toward a population genetic analysis of Salmonella: genetic diversity and relationships among strains of serotypes S. choleraesuis, S. derby, S. dublin, S. enteritidis, S. heidelberg, S. infantis, S. newport, and S. typhimurium. Proc. Natl. Acad. Sci. USA 85:7753-7757.

    Boyd, E. F., K. Nelson, F. S. Wang, T. S. Whittam, and R. K. Selander. 1994. Molecular genetic basis of allelic polymorphism in malate dehydrogenase (mdh) in natural populations of Escherichia coli and Salmonella enterica. Proc. Natl. Acad. Sci. USA 91:1280-1284.

    Brenner, F. W., R. G. Villar, F. J. Angulo, R. Tauxe, and B. Swaminathan. 2000. Salmonella nomenclature. J. Clin. Microbiol. 38:2465-2467.

    Cai, S., D. Y. Kabuki, A. Y. Kuaye, T. G. Cargioli, M. S. Chung, R. Nielsen, and M. Wiedmann. 2002. Rational design of DNA sequence-based strategies for subtyping Listeria monocytogenes. J. Clin. Microbiol. 40:3319-3325.

    Centers for Disease Control and Prevention. 2001. Salmonella Annual Summary 2001. [Online.] http://www.cdc.gov/ncidod/dbmd/phlisdata/salmtab/2001/SalmonellaAnnualSummary2001.pdf.

    Chan, K., S. Baker, C. C. Kim, C. S. Detweiler, G. Dougan, and S. Falkow. 2003. Genomic comparison of Salmonella enterica serovars and Salmonella bongori by use of an S. enterica serovar Typhimurium DNA microarray. J. Bacteriol. 185:553-563.

    Demczuk, W., G. Soule, C. Clark, H.-W. Ackermann, R. Easy, R. Khakhria, F. Rodgers, and R. Ahmed. 2003. Phage-based typing scheme for Salmonella enterica serovar Heidelberg, a causative agent of food poisonings in Canada. J. Clin. Microbiol. 41:4279-4284.

    Dingle, K. E., F. M. Colles, D. R. Wareing, R. Ure, A. J. Fox, F. E. Bolton, H. J. Bootsma, R. J. Willems, R. Urwin, and M. C. Maiden. 2001. Multilocus sequence typing system for Campylobacter jejuni. J. Clin. Microbiol. 39:14-23.

    Furrer, B., U. Candrian, C. Hoefelein, and J. Luethy. 1991. Detection and identification of Listeria monocytogenes in cooked sausage products and in milk by in vitro amplification of haemolysin fragments. J. Appl. Bacteriol. 70:372-379.

    Hayashi, T. M., K. Makino, M. Ohnishi, K. Kurokawa, K. Ishii, K. Yokoyama, C. G. Han, E. Ohtsubo, K. Nakayama, T. Murata, M. Tanaka, T. Tobe, T. Iida, H. Takami, T. Honda, C. Sasakawa, N. Ogasawara, T. Yasunaga, S. Kuhara, T. Shiba, M. Hattori, and H. Shinagawa. 2001. Complete genome sequence of enterohemorrhagic Escherichia coli O157:H7 and genomic comparison with a laboratory strain K-12. DNA Res. 8:11-22.

    Huelsenbeck, J. P., and F. Ronquist. 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17:754-755.

    Jakobsen, I. B., and S. Easteal. 1996. A program for calculating and displaying compatibility matrices as an aid in determining reticulate evolution in molecular sequences. Comput. Appl. Biosci. 12:291-295.

    Jakobsen, I. B., S. R. Wilson, and S. Easteal. 1997. The partition matrix: exploring variable phylogenetic signals along nucleotide sequence alignments. Mol. Biol. Evol. 14:474-484.

    Jones, N., J. F. Bohnsack, S. Takahashi, K. A. Oliver, M. S. Chan, F. Kunst, P. Glaser, C. Rusniok, D. W. Crook, R. M. Harding, N. Bisharat, and B. G. Spratt. 2003. Multilocus sequence typing system for group B streptococcus. J. Clin. Microbiol. 41:2530-2536.

    Kotetishvili, M., O. C. Stine, Y. Chen, A. Kreger, A. Sulakvelidze, S. Sozhamannan, and J. G. Morris, Jr. 2003. Multilocus sequence typing has better discriminatory ability for typing Vibrio cholerae than does pulsed-field gel electrophoresis and provides a measure of phylogenetic relatedness. J. Clin. Microbiol. 41:2191-2196.

    Kotetishvili, M., O. C. Stine, A. Kreger, J. G. Morris, Jr., and A. Sulakvelidze. 2002. Multilocus sequence typing for characterization of clinical and environmental Salmonella strains. J. Clin. Microbiol. 40:1626-1635.

    Kumar, S., K. Tamura, I. B. Jakobsen, and M. Nei. 2001. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17:1244-1245.

    Maiden, M. C. J., J. A. Bygraves, E. Feil, G. Morelli, J. E. Russell, R. Urwin, Q. Zhang, J. Zhou, K. Zurth, D. A. Caugant, I. M. Feavers, M. Achtman, and B. G. Spratt. 1998. Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc. Natl. Acad. Sci. USA 95:3140-3145.

    Mead, P. S., L. Slutsker, V. Dietz, L. F. McCaig, J. S. Bresee, C. Shapiro, P. M. Griffin, and R. V. Tauxe. 1999. Food-related illness and death in the United States. Emerg. Infect. Dis. 5:607-625.

    Olsen, S. J., R. Bishop, F. W. Brenner, T. H. Roels, N. Bean, R. V. Tauxe, and L. Slutsker. 2001. The changing epidemiology of salmonella: trends in serotypes isolated from humans in the United States, 1987-1997. J. Infect. Dis. 183:753-761.

    Page, R. D. 1996. TreeView: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 12:357-358.

    Page, R. D., and E. C. Holmes. 1998. Molecular evolution: a phylogenetic approach. Blackwell Science, Oxford, United Kingdom.

    Porwollik, S., and M. McClelland. 2003. Lateral gene transfer in Salmonella. Microb. Infect. 5:977-989.

    Porwollik, S., E. F. Boyd, C. Choy, P. Cheng, L. Florea, E. Proctor, and M. McClelland. 2004. Characterization of Salmonella enterica subspecies I genovars by use of microarrays. J. Bacteriol. 186:5883-5898.

    Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817-818.

    Rabsch, W., H. L. Andrews, R. A. Kingsley, R. Prager, H. Tschape, L. G. Adams, and A. J. Baumler. 2002. Salmonella enterica serotype Typhimurium and its host-adapted variants. Infect. Immun. 70:2249-2255.

    Reeves, M. W., G. M. Evins, A. A. Heiba, B. D. Plikaytis, and J. J. Farmer III. 1989. Clonal nature of Salmonella typhi and its genetic relatedness to other salmonellae as shown by multilocus enzyme electrophoresis, and proposal of Salmonella bongori comb. nov. J. Clin. Microbiol. 27:313-320.

    Rozas, J., and R. Rozas. 1999. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174-175.

    Salcedo, C., L. Arreaza, B. Alcala, L. de la Fuente, and J. A. Vazquez. 2003. Development of a multilocus sequence typing method for analysis of Listeria monocytogenes clones. J. Clin. Microbiol. 41:757-762.

    Selander, R. K., P. Beltran, N. H. Smith, R. M. Barker, P. B. Crichton, D. C. Old, J. M. Musser, and T. S. Whittam. 1990. Genetic population structure, clonal phylogeny, and pathogenicity of Salmonella paratyphi B. Infect. Immun. 58:1891-1901.

    Selander, R. K., P. Beltran, N. H. Smith, R. Helmuth, F. A. Rubin, D. J. Kopecko, K. Ferris, B. D. Tall, A. Cravioto, and J. M. Musser. 1990. Evolutionary genetic relationships of clones of Salmonella serovars that cause human typhoid and other enteric fevers. Infect. Immun. 58:2262-2275.

    Selander, R. K., N. H. Smith, J. Li, P. Beltran, K. E. Ferris, D. J. Kopecko, and F. A. Rubin. 1992. Molecular evolutionary genetics of the cattle-adapted serovar Salmonella dublin. J. Bacteriol. 174:3587-3592.

    Urwin, R., and M. C. Maiden. 2003. Multi-locus sequence typing: a tool for global epidemiology. Trends Microbiol. 11:479-487.

    Warnick, L. D., K. Kanistanon, P. L. McDonough, and L. Power. 2003. Effect of previous antimicrobial treatment on fecal shedding of Salmonella enterica subsp. enterica serogroup B in New York dairy herds with recent clinical salmonellosis. Prev. Vet. Med. 56:285-297.

    Wiedmann, M. 2002. Subtyping of bacterial foodborne pathogens. Nutr. Rev. 60:201-208.

    Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13:555-556.

    Yang, Z., R. Nielsen, N. Goldman, and A. M. Pedersen. 2000. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431-449.

    (Sharinne Sukhnanand, Sam )