    Nucleic Acids Research, 2005, Issue 22
Distortion of quantitative genomic and expression hybridization by Cot-1 DNA
     Laboratories of Genomic Disorders, Children's Mercy Hospital, University of Missouri-Kansas City School of Medicine KS, USA
     1Human Molecular Genetics, Children's Mercy Hospital, University of Missouri-Kansas City School of Medicine KS, USA

    *To whom correspondence should be addressed. Tel: +1 816 983 6511; Fax: +1 816 983 6515; Email: progan@cmh.edu

    ABSTRACT

    Cross-hybridization of repetitive sequences in genomic and expression arrays is reported to be suppressed with repeat-blocking nucleic acids (Cot-1 DNA). Contrary to expectation, we demonstrated that Cot-1 also enhanced non-specific hybridization between probes and genomic targets. When added to target DNA, Cot-1 enhanced hybridization (2.2- to 3-fold) to genomic probes containing conserved repetitive elements. In addition to repetitive sequences, Cot-1 was found to be enriched for linked single copy (sc) sequences. Adventitious association between these sequences and probes distorts quantitative measurements of the probes hybridized to desired genomic targets. Quantitative microarray hybridization studies using Cot-1 DNA are also susceptible to these effects, especially for probes that map to genomic regions containing conserved repetitive sequences. Hybridization measurements with such probes are less reproducible in the presence of Cot-1 than for probes derived from sc regions or regions containing divergent repeat elements, a finding with significant ramifications for genomic and expression microarray studies. We mitigated the requirement for Cot-1 either by hybridizing with computationally defined sc probes lacking repeats or by substituting synthetic repetitive elements complementary to sequences in genomic probes.

    INTRODUCTION

    Genome-wide analysis of gene expression and locus copy number has been facilitated by microarray and array-based comparative genomic hybridization. Persistent questions regarding reproducibility of these techniques have been raised by cross-validation studies in different laboratories (1–5). Strategies to mitigate variability in the results obtained from replicate studies have focused on standardizing technical factors, such as array production, RNA synthesis, labeling, hybridization, scanning and data analysis (6–8). Zakharkin et al. (9) suggest that biological differences among samples are the largest source of this variability and these other factors contribute to a lesser degree.

    The use of repetitive sequence-enriched (Cot-1) DNA to suppress non-specific cross hybridization between repetitive elements present in the probe with other locations in the genome (or transcriptome) is a common requirement for most microarray hybridization studies. In humans, the Cot-1 fraction is highly concentrated in families of interspersed repetitive elements, such as SINEs and LINEs (10,11). Commercial procedures for Cot-1 DNA preparation iterate denaturation and re-annealing of genomic DNA, and are monitored by enrichment for Alu elements (3-fold excess over the corresponding level in the normal genome) and L1 elements (4-fold excess). Current quality control procedures do not determine the precise composition of Cot-1 DNA.

    While the Cot-1 fraction appears to suppress repetitive sequence hybridization, it also increases experimental noise (12). We investigated the possibility that differences in Cot-1 composition could be a major source of variability in results from genomic hybridization studies. Here, the role of Cot-1 in genomic hybridization is elucidated by quantitative microsphere hybridization (QMH) (H. L. Newkirk, P. K. Rogan, M. Miralles and J. H. Knoll, manuscript submitted) using sequence-defined, genomic single copy (sc) probes (13) and probes composed of contiguous sc and repetitive genomic sequences. We find that Cot-1 promotes the formation of stable duplexes containing paralogous repetitive sequences often unrelated to the probe, thereby altering accurate quantification of sc sequence hybridization. We eliminated this effect by developing probes lacking repetitive sequences and by suppressing cross-hybridization with synthetic repetitive elements.

    MATERIALS AND METHODS

    Quantitative microsphere hybridization

    Probe selection, synthesis and microsphere conjugation

    Sc or mixed sc and repetitive sequence probes were designed as described previously (13–15). They include: two chromosome 9q34 probes with sc intervals from within ABL1, designed and validated previously (14; H. L. Newkirk, P. K. Rogan, M. Miralles and J. H. Knoll, manuscript submitted), namely (i) ABL1a (May 2004, chr 9: 130 623 551–130 625 854) with divergent AluJo/Sx/L2 repeats and (ii) ABL1b (chr 9: 130 627 353–130 628 735) with divergent AluJo repeats; (iii) a 1823 bp chromosome 9 probe with Alu/MER1 repetitive sequences, ABL1AluMER1 (chr 9: 130 621 702–130 623 525); (iv) a 98 bp sc segment of a TEKT3 intron (chr 17: 15 149 108–15 149 206); (v) a 101 bp sc segment of a PMP22 intron (chr 17: 15 073 475–15 073 576); and (vi) a 93 bp sc segment of a HOXB1 intron (chr 17: 43 964 237–43 964 330). Probes for genomic reconstruction experiments included: (vii) HOXB1b (chr 17: 43 963 396–43 965 681) and (viii) C1QTNF7 (chr 4: 15 141 452–15 141 500). Repetitive sequences found within probes were defined as divergent, based on percent sequence differences (>12%), percent deletion (>4%) and/or percent insertion (>4%) relative to consensus family members (www.girinst.org). Probes were synthesized and coupled to microspheres as described previously (H. L. Newkirk, P. K. Rogan, M. Miralles and J. H. Knoll, manuscript submitted).
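
    The divergence criteria above can be expressed as a simple rule. The following is a minimal sketch, assuming that a repeat counts as divergent when any one of the three thresholds is exceeded (our reading of "and/or"); the class name, field names and example values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class RepeatAnnotation:
    """One repeat element compared against its family consensus (hypothetical record)."""
    name: str
    pct_divergence: float  # percent sequence difference from consensus
    pct_deletion: float    # percent of consensus deleted
    pct_insertion: float   # percent inserted relative to consensus


def is_divergent(rep, div_cutoff=12.0, del_cutoff=4.0, ins_cutoff=4.0):
    """Return True if any threshold stated in the text (>12%, >4%, >4%) is exceeded."""
    return (
        rep.pct_divergence > div_cutoff
        or rep.pct_deletion > del_cutoff
        or rep.pct_insertion > ins_cutoff
    )


# Illustrative values only; these are not measurements from the paper.
example_old = RepeatAnnotation("AluJo", 15.2, 1.0, 0.5)
example_young = RepeatAnnotation("AluSx", 6.0, 0.5, 0.2)
print(is_divergent(example_old), is_divergent(example_young))
```

    Under this reading, a repeat with low sequence divergence but a large deletion would still be classified as divergent.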

    Genomic target preparation

    Genomic template was prepared using four methanol-acetic acid fixed cell pellets derived from cytogenetic preparations of bone marrow samples as described previously (H. L. Newkirk, P. K. Rogan, M. Miralles and J. H. Knoll, manuscript submitted). One microgram of genomic and pUC19 DNA was nick-translated with biotin-16 dUTP to obtain products of 100–350 bp and 50–300 bp in length, respectively (16). One microgram of each Cot-1 DNA (from Manufacturers I and R) was nick-translated with digoxygenin-11 dUTP to obtain products of 50–300 bp.

    Hybridization reactions and flow cytometry

    Nick-translated DNA (50 ng) was diluted in 40 μl of 1.5× TMAC hybridization buffer (3 mol/l tetramethylammonium chloride, 50 mmol/l Tris–HCl, pH 8.0, 1 g/l Sarkosyl) containing 10 000 probe-coupled microspheres. Reactions were assembled with the components listed in Table 1. Hybridization and detection of reactions were carried out as described (H. L. Newkirk, P. K. Rogan, M. Miralles and J. H. Knoll, manuscript submitted). In brief, reactions were denatured for 3 min and hybridized overnight at 50°C. Hybridized microspheres were washed and stained with a reporter molecule, streptavidin phycoerythrin (SPE; Molecular Probes, Eugene, OR) and/or anti-digoxygenin-fluorescein isothiocyanate (FITC; Molecular Probes), followed by flow cytometric analysis (FACSCalibur, Becton Dickinson, San Jose, CA). Approximately 5000 microspheres were analyzed per reaction. Hybridization was quantified from the SPE and/or FITC mean fluorescence intensity (measured in channels FL2 and/or FL1, respectively), which corresponds to the quantities of genomic target (FL2) and of Cot-1 DNA (FL1) bound by probe. Calibration studies with conjugated probes and labeled targets containing identical sequences demonstrated that changes in mean fluorescence intensity were linearly related to the amount of target hybridized. The FL1 and FL2 channel background fluorescence was separately determined in each hybridization experiment using a negative control containing all reaction components except target DNA. Optimal photomultiplier voltages were set as described previously; data collection and analysis were performed with manufacturer-supplied CellQuest software (H. L. Newkirk, P. K. Rogan, M. Miralles and J. H. Knoll, manuscript submitted).
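
    Because mean fluorescence intensity is stated to be linearly related to the amount of target hybridized, the effect of Cot-1 can be summarized as a ratio of intensities. The sketch below assumes that the negative-control background is subtracted from each channel before the ratio is taken; the text does not state whether this was done, and the function name and numbers are hypothetical.

```python
def fold_change(mfi_with_cot1, mfi_without_cot1, background=0.0):
    """Ratio of background-corrected mean fluorescence intensities.

    Assumption: background from the no-target negative control is subtracted
    from both measurements before the ratio is computed.
    """
    net_with = mfi_with_cot1 - background
    net_without = mfi_without_cot1 - background
    if net_without <= 0:
        raise ValueError("reference intensity does not exceed background")
    return net_with / net_without


# Illustrative numbers only (not values from Table 1).
print(fold_change(230.0, 120.0, background=10.0))
```

    A ratio above 1 indicates more labeled target bound in the presence of Cot-1, and a ratio below 1 indicates less.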

    Table 1 QMH (mean fluorescence)

    Recovery of probe-hybridized DNA fragments

    Aliquots (35 μl) of genomic hybridizations with ABL1a (Table 1, reactions 19 and 20) were washed with 250 μl of 0.1× SSC, 1% SDS and pelleted by centrifugation (13 000 g); this wash was repeated twice. The hybridized genomic sequences were heat denatured at 95°C for 5 min and snap-cooled followed by centrifugation (13 000 g) at 4°C for 3 min. Recovered sequences were used as target for quantitative PCR (QPCR) and for hybridization to microsphere-coupled ABL1AluMER1 (Table 1, reactions 21 and 22).

    Synthetic repetitive DNA

    Synthetic repetitive DNA was prepared from genomic regions selected based on the families of repetitive sequences contained within them, since each is enriched in the Cot-1 manufacturing process. However, any representative genomic region containing sc regions adjacent to moderate or high copy number repetitive elements could have been employed. To demonstrate that repeat elements in genomic probes could be suppressed at locations beyond the desired target interval, we prepared a probe containing a 1.1 kb LTR element centered between two 400 bp sc regions on chromosome 4p (chr 4: 15 139 704–15 141 581) located upstream of the C1QTNF7 gene (Figure 2A). Subsequently, repetitive sequences situated within the ABL1a probe region were synthesized for blocking its repeat elements; ABL1a contains a 280 bp AluJo repeat, a 300 bp AluSx repeat and an 830 bp L2 element segment (Figure 2C). We also used a 2286 bp segment on chromosome 17q located 5' of HOXB1 containing a 306 bp AluSx repeat and 154 bp L1 truncated sequence (chr 17: 43 963 396–43 965 681) as a probe (Figure 2B). Primers that amplified unique sequences immediately flanking these elements (Table 2, HOXB1AluL1 and C1QTNF7LTR) were developed for PCR amplification of each repeat sequence and of the target product (Table 2, HOXB1b and C1QTNF7). Genomic DNA (Promega) probes were amplified using Pfx (Invitrogen); amplification products were then electrophoresed and extracted by micro-spin column centrifugation. Hybridization reactions (Table 1, reactions 23–33) evaluated the effect of the synthetic repetitive PCR products hybridized to homologous PCR product, and/or genomic DNA, in the presence and absence of Cot-1 DNA. Reactions were hybridized, washed, stained with SPE and then analyzed by flow cytometry.

    Figure 2 Synthetic repetitive products and probes used in suppression of cross-hybridization to genomic templates. (A) Synthetic repetitive sequence from a 1.1 kb LTR element between two 400 bp sc intervals upstream of the C1QTNF7 gene on chromosome 4p. The genomic target region for the sc microsphere-coupled probe is downstream of the LTR element. (B) A single synthetic repetitive DNA product contained both the AluSx (306 bp) and L1 (154 bp) repetitive sequences within a 2286 bp segment on chromosome 17q upstream of HOXB1 (HOXB1b). The probe coupled to the microsphere included both sc and repetitive sequences. (C) Individual synthetic DNA products derived from ABL1a contained a 280 bp AluJo repeat, a 300 bp AluSx repeat and an 830 bp L2 element segment. The microsphere-coupled probe contained the entire 2303 bp region.

    Table 2 Probes and primers used in this study

    Quantitative PCR

    QPCR and data analysis were performed using the Chromo4 QPCR system (BioRad Laboratories, Hercules, CA). Primers and amplified intervals were verified for unique genomic representation using BLAT (17) and WU-BLAST (Table 2). Each 50 μl reaction contained 0.5 μM of each primer, 50 ng Cot-1 template or positive control human genomic DNA (Promega, Madison, WI) and 25 μl 2XQTSybrG master mix (Qiagen, Valencia, CA). Genomic DNA was nicked using DNase to generate fragments from 50 to 300 bp, and a negative control contained all reaction components except for DNA. Thermal cycling conditions were 95°C for 15 min, 45 cycles of amplification (94°C for 15 s, 61°C for 30 s (data acquisition), 72°C for 30 s), followed by 72°C for 5 s with a decrease in temperature by 20°C every second for the generation of a melt curve. A calibration curve used to determine the amount of input target sequence in the recovered genomic template was generated by varying the amounts of normal genomic template (1, 2, 4, 10 and 20 ng) and by determining the CT values for each reaction.
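
    The calibration step can be illustrated with a short calculation. This is a sketch under stated assumptions, not the Chromo4 software's procedure: it fits CT against log10 of the input quantity by ordinary least squares and inverts the line to interpolate an unknown. The standard quantities (1, 2, 4, 10 and 20 ng) come from the text; the CT values are synthetic, generated for ideal doubling per cycle, and the function names are hypothetical.

```python
import math


def fit_standard_curve(quantities_ng, ct_values):
    """Least-squares fit of CT = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate input template (ng) from a CT value."""
    return 10 ** ((ct - intercept) / slope)


# Standard amounts as listed in the Methods.
standards_ng = [1, 2, 4, 10, 20]
# Synthetic CT values assuming perfect doubling each cycle (an assumption, not data).
standard_ct = [30.0 - math.log2(q) for q in standards_ng]

slope, intercept = fit_standard_curve(standards_ng, standard_ct)
print(round(slope, 2))                                   # about -3.32 for ideal efficiency
print(round(quantity_from_ct(28.0, slope, intercept), 2))
```

    With real data the fitted slope would deviate from the ideal value, and the deviation gives an estimate of amplification efficiency.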

    The composition of sequences recovered from the ABL1a product hybridization (1 μl; Table 1, reactions 21 and 22) was determined by QPCR. Primer sets were utilized that amplified several sequences from within the ABL1 region, which were not necessarily homologous to this probe, including ABL1a and ABL1c (chr 9: 130 709 665–130 711 469), ABL1d (chr 9: 130 699 324–130 700 596) as well as primers specific for other unlinked genomic regions such as DNJA3Alu (chr 16: 4 421 138–4 421 200), containing an Alu repeat located 5' of the DNAJ3 gene, TEKT3 and HOXB1 (Table 2). Reactions were performed as described above. A positive control (human genomic DNA) was run for each primer set to represent the initial quantity of genomic DNA originally added to QMH reactions (50 ng). Molar ratios of target sequences recovered from QMH were determined from the quantity of initial template in test samples (interpolated from its CT value cross-referenced against the standard calibration curve) in the presence and absence of Cot-1 DNA.

    RESULTS

    Quantitative microsphere hybridization with Cot-1 DNA

    A FISH-validated, mixed sc and repetitive sequence probe, ABL1a, from the 5' end of IVS1b of the ABL1 gene containing divergent AluJo/Sx/L2 repeats (chr 9: 130 623 551–130 625 854) was hybridized with nick-translated genomic DNA (H. L. Newkirk, P. K. Rogan, M. Miralles and J. H. Knoll, manuscript submitted). Although we had expected that commercially prepared Cot-1 DNA would suppress repetitive sequence hybridization, the mean fluorescence (or hybridization) intensity of labeled genomic target was consistently and significantly increased by 2.2-fold when Cot-1 was included in replicate hybridizations of ABL1a with nick-translated genomic DNA (Table 1, reactions 1 and 2). Sc probes derived from chromosome 17 genes, PMP22 (chr 17: 15 073 475–15 073 576) and TEKT3 (chr 17: 15 149 108–15 149 206), showed smaller but reproducible increases of 1.08- and 1.14-fold in hybridization intensity in the presence of Cot-1 DNA (Table 1, reactions 3–6). These experiments suggested that the effects due to Cot-1 are related to the composition of repetitive sequences surrounding these sc intervals. An sc probe from HOXB1 (chr 17: 43 964 237–43 964 330) consistently exhibited a small decrease in hybridization intensity with addition of Cot-1 DNA (Table 1, reactions 7–12), with intensities 0.84- to 0.92-fold of those obtained without Cot-1 for the genomic samples tested. The HOXB1 interval is practically devoid of repetitive sequences (UCSC Genome Browser, May 2004 assembly; http://genome.ucsc.edu). The region circumscribing ABL1a contains highly dense, conserved and abundant interspersed SINE (AluJo, AluSx) and less conserved LINE (L2) elements. The TEKT3 and PMP22 intervals contain shorter, less abundant and more divergent classes of repeat elements (MIR, MER and L2).

    The degree to which addition of Cot-1 DNA altered target hybridization to the ABL1a probe was determined by comparing hybridizations of biotin-labeled target DNA (detected with streptavidin-phycoerythrin in the FL2 channel), a biotin-labeled negative control target (pUC19 plasmid) and each of these with digoxygenin-labeled Cot-1 DNA (detected by FITC-conjugated anti-digoxygenin in the FL1 channel). The presence of Cot-1 resulted in a 2-fold increase in the mean fluorescence intensity for ABL1a hybridized to biotin-labeled homologous genomic target sequence. However, the amount of labeled Cot-1 sequence bound substantially exceeded that necessary for suppression of repetitive sequences in ABL1a, based on a 50-fold increase in intensity relative to reactions in which Cot-1 sequences were omitted (Table 1, reactions 13 and 14). Cot-1 binding appears sequence-specific, since hybridization of ABL1a to pUC19 exhibited background level signals (<101) regardless of whether Cot-1 was present (Table 1, reaction 15). These findings suggest that homologous sequences in Cot-1 are directly binding to the ABL1a probe. Because the ABL1a sequence presumably represents only a small proportion of the Cot-1 target, it alone cannot account for the increase in observed hybridization.

    To determine if the increased signal was related to the quantity of Cot-1 DNA, varying amounts of digoxygenin-labeled Cot-1 DNA added to a fixed quantity (50 ng) of biotin-labeled genomic target were hybridized to ABL1b, which is a mixed sc and repetitive probe. ABL1b contains two divergent AluJo repetitive sequences (chr 9: 130 627 353–130 628 735). By doubling the amount of Cot-1 from 50 to 100 ng in the reaction, probe hybridization to Cot-1 increased by 1.8-fold and to homologous target by 1.3-fold (Table 1, reactions 16 and 17). Similarly, 150 ng of labeled Cot-1 DNA increased hybridization to ABL1b by 3-fold over 50 ng Cot-1, and by 1.5-fold to target DNA (Table 1, reactions 17 and 18). Even though the stoichiometric addition of Cot-1 DNA dilutes the homologous biotinylated target between 2- and 4-fold, the corresponding hybridization intensity is unexpectedly increased 1.5-fold.
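
    The 2- to 4-fold dilution quoted above follows from the masses in the reaction. The sketch below assumes that dilution is taken as total nucleic acid mass divided by labeled target mass; the function name is hypothetical.

```python
TARGET_NG = 50.0  # fixed quantity of biotin-labeled genomic target (from the text)


def target_dilution(cot1_ng, target_ng=TARGET_NG):
    """Fold dilution of labeled target after adding Cot-1 (total mass / target mass)."""
    return (target_ng + cot1_ng) / target_ng


# Cot-1 amounts tested in reactions 16-18.
for cot1_ng in (50.0, 100.0, 150.0):
    print(cot1_ng, target_dilution(cot1_ng))
```

    If Cot-1 acted only as a competitor, the target signal would be expected to fall as this dilution factor rises, rather than increase as observed.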

    The correlation of Cot-1 concentration with hybridization intensity suggested that this reaction component promoted the formation of duplex structures containing other sequences besides the probe and desired genomic target. To determine the composition of Cot-1 derived sequences bound to probes, products were denatured and recovered after hybridization to ABL1a-coupled microspheres (Table 1, reactions 19 and 20). These products were used as target sequences in subsequent hybridizations to a non-overlapping sc and repetitive microsphere-conjugated probe, ABL1AluMER1, containing Alu elements (AluJb, AluSq, Charlie1 and AluSx) and MER1 sequences localized 2.3 kb centromeric to ABL1a. Given the genomic location of ABL1AluMER1, we did not expect it to be present in the recovered nick-translated genomic products. However, the labeled Cot-1 fraction was found to be the source of the recovered ABL1AluMER1 sequence, based on an 11-fold increase in mean fluorescence intensity in the FL1 channel (Table 1, reactions 21 and 22). Repetitive sequences adjacent to hybridized ABL1a in Cot-1 DNA appear to nucleate hybridization to genomic sequences by forming networks of repetitive and sc sequence elements (Figure 1, examples 1 and 2). We evaluated this possibility by QPCR analysis of sequences present in recovered hybridization products.

    Figure 1 Potential structures produced in QMH hybridization in the presence of Cot-1 DNA. Example 1 depicts the hybridization of a microsphere-conjugated probe hybridizing to target genomic sc sequence. Example 2 illustrates the hybridization of the probe to sc sequences contained within the Cot-1 DNA fraction. Example 3 indicates a duplex formed between a sc sequence derived from Cot-1 DNA which bridged hybridization to a repetitive sequence within Cot-1 DNA. Example 4 shows a similar event in which the partial duplexes are bridged by unlinked sc sequences within the target genome.

    Analysis of hybridized sequences by QPCR

    We determined the content of sc sequences in Cot-1 that were homologous to our probes by QPCR. A 100 bp sc segment of ABL1a was amplified from 500 ng samples of Cot-1 DNA and control genomic DNA (Table 2). Based on their respective CT values, the Cot-1 fractions from Manufacturers I and R exhibited a 14- and 2-fold increase, respectively, in the amount of ABL1a hybridized (or a 2.5 and 1.7 molar increase) relative to its normal genomic composition (Table 3, reactions 1–3). ABL1a sequences were recovered after hybridization to determine the composition of genomic and Cot-1 derived sequences hybridized to this probe (Table 1, reactions 21 and 22). ABL1a sequences were increased by 128-fold in a hybridized sample containing both target and Cot-1 DNA (Table 3, reactions 13 and 14). Recovered sequences identical to ABL1AluMER1 from hybridizations containing Cot-1 were 139-fold more abundant than those found in duplicate reactions lacking Cot-1 (Table 3, reactions 15 and 16). We also detected repetitive sequences that are closely related to ABL1AluMER1 in recovered hybridization products. An Alu element with 92% similarity, DNJA3Alu (5' to DNJA3 gene; chr 16: 4 421 138–4 421 200), was found in the hybridization reaction containing Cot-1, but not in the reaction lacking Cot-1, indicating that Cot-1 was the source of this contaminating sequence (Table 3, reactions 17 and 18). Other sc genomic segments were not detected in the products recovered from hybridization to the ABL1a probe.

    Table 3 Quantification of recovered hybridization targets

    Cot-1 derived sequences hybridized to RRP4-1.6a, a sequence linked to ABL1 (Table 2), contained both homologous sc and repetitive sequences, despite the fact that this sc probe had been validated by FISH (13). Moderately and highly abundant MIR, L2 and L1 repeat elements surround this sequence in the genome. QPCR demonstrated higher concentrations of repetitive sequences recovered from upstream (5') and downstream (3') amplicons relative to a short RRP4-1.6a product derived from within the sc interval (Table 3, reactions 4–12). Comparison of CT values indicates that sc sequences bordering genomic repeats (RRP4-1.6a5' and RRP4-1.6a3') are only 6.8-fold more abundant in genomic DNA than in the Cot-1 fraction for Manufacturer R (and similarly, for Manufacturer I). As expected, the internal sc RRP4-1.6a sequence is considerably more abundant in genomic DNA than in Cot-1 (24-fold), but nevertheless can still be detected in Cot-1 (Table 3, reactions 7–9). Enrichment for SINEs and LINEs during Cot-1 preparation results in accretion of linked sc sequences, which during hybridization can potentially anneal to the conjugated probe or to actual sc target sequences in labeled genomic DNA.
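
    Fold differences derived from CT comparisons can be related to cycle differences with a simple conversion. The sketch below assumes ideal amplification (a doubling of product per cycle); the study itself interpolated quantities from a calibration curve, so this is only an approximation for orientation, and the function names are hypothetical.

```python
import math


def fold_from_ct(ct_a, ct_b, efficiency=2.0):
    """Abundance of template A relative to template B from their CT values.

    Assumption: product increases by `efficiency` per cycle (2.0 = ideal doubling).
    A lower CT means more starting template.
    """
    return efficiency ** (ct_b - ct_a)


def cycles_for_fold(fold, efficiency=2.0):
    """CT difference corresponding to a given fold difference."""
    return math.log(fold) / math.log(efficiency)


# Fold values quoted in the Results, converted to cycle differences.
for fold in (6.8, 24.0, 128.0):
    print(fold, round(cycles_for_fold(fold), 2))
```

    Under ideal doubling, the 128-fold enrichment reported for recovered ABL1a sequences corresponds to a difference of seven cycles.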

    Suppression of cross-hybridization with synthetic repetitive DNA

    We reversed the hybridization effect of Cot-1 DNA at three different genomic loci by substituting an excess of purified, synthetic DNA(s) prepared specifically from the repetitive elements adjacent to sc sequences (Figure 2). A 1.9 kb amplification product was synthesized containing an LTR-like repetitive element and a sc sequence upstream of C1QTNF7 on chromosome 4. The addition of the purified synthetic LTR-like element, C1QTNF7LTR, had no effect on the self-hybridization of this product to coupled microspheres, whereas the addition of Cot-1 DNA increased the mean fluorescence by 1.2-fold (Table 1, reactions 23–25). We also used C1QTNF7LTR to block hybridization of repetitive sequences in nick-translated genomic DNA in the presence and absence of Cot-1 DNA and obtained similar results (Table 1, reactions 26–28). Hybridization of AluSx and L1 repetitive sequences was suppressed within a 2.3 kb region on chromosome 17 upstream of the HOXB1 locus (HOXB1b) using a synthetic PCR product, HOXB1AluL1, containing these sequences. Hybridization of the HOXB1b PCR product and corresponding microsphere-coupled probe in the presence of HOXB1AluL1 effectively blocked repetitive sequence within the amplified target, and, in fact, reduced hybridization intensity by 0.3-fold, presumably because of the reduction in target length (Table 1, reactions 32 and 33). Hybridization of repetitive sequences was also effectively suppressed in comparable genomic hybridizations to ABL1a coupled to microspheres by addition of synthetic Alu and L2 elements from within this target region (Table 1, reactions 29–31).

    Impact of Cot-1 in microarray hybridization

    The nearly universal inclusion of Cot-1 for repeat sequence suppression in published hybridization studies raises the question of how this reagent affects quantitative measures of expression and/or genomic copy number. We evaluated the variability in dual-label hybridization intensities across a set of replicate target samples hybridized to arrays of cloned probes in expression studies which utilized Cot-1 (source data from the GEO database: http://www.ncbi.nlm.nih.gov/projects/geo). We first analyzed results from cDNA probes of genes used in our microsphere hybridization assay (including ABL1a, HOXB1 and TEKT3), then subsequently the hybridization profiles for several gene sequences located in genomic environments distinguished by their repetitive sequence composition, i.e. that were either densely (Table 4, bottom) or sparsely (Table 4, top) populated with repetitive sequences. Replicate Cy3/Cy5 intensity ratios are significantly more variable for sequences occurring within repeat-dense genomic intervals relative to probes derived from genomic regions containing fewer, more divergent repetitive sequences. For example, ABL1 was found to exhibit both increased and decreased expression using the same test sample in different replicates (e.g. Database record GDS751/20213 which displays a sample variance of 0.30, corresponding to P = 0.18; Table 4), analogous to the distortion in hybridization we observed with microsphere-conjugated ABL1a. In contrast, HOXB1 showed little variability in log ratio intensities among replicate expression array studies using the same test sample (GDS223/31555; sample variance = 0.001 corresponding to P < 0.0001), consistent with our results for this locus. This suggests that sc sequences in Cot-1 hybridize to probes, nucleating the formation of mixed sc and repetitive sequence networks that capture labeled repetitive sequences from target cDNA. In microarray studies, Cot-1 thus distorts the hybridization of cloned probes enriched for interspersed repetitive sequences by forming complex hybridization networks in a manner analogous to what is observed in QMH.
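
    The replicate variability described above can be summarized as a sample variance of log intensity ratios. The following sketch assumes base-2 logarithms and the unbiased (n - 1) sample variance; the text does not specify either choice or the statistical test behind the quoted P values, and the ratio values below are hypothetical, not GEO data.

```python
import math
import statistics


def log_ratio_variance(intensity_ratios):
    """Sample variance of log2 intensity ratios across replicate hybridizations."""
    log_ratios = [math.log2(r) for r in intensity_ratios]
    return statistics.variance(log_ratios)


# Hypothetical replicate ratios for two probes (illustration only).
repeat_dense_probe = [0.6, 1.1, 1.9, 0.8]     # swings between decreased and increased
repeat_sparse_probe = [1.02, 0.98, 1.01, 0.99]

print(log_ratio_variance(repeat_dense_probe))
print(log_ratio_variance(repeat_sparse_probe))
```

    A probe whose replicate ratios fall on both sides of 1 yields a much larger variance than one whose ratios cluster tightly, which is the pattern contrasted for ABL1 and HOXB1.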

    Table 4 Variation among replicate microarray studies for different genes

    DISCUSSION

    We have demonstrated that sequences present in Cot-1 DNA can significantly alter the amount of labeled genomic target detected in hybridization reactions with homologous probes. Rather than suppressing cross-hybridization, Cot-1 enhanced hybridization to probes containing repetitive sequences by as much as 3-fold. Our results suggest that unlabeled Cot-1 DNA sequences bridge sc and repetitive sequences in sequence specific probes and complementary target sequences. Repetitive sequences linked to homologous sc sequences in the Cot-1 fraction can nucleate subsequent hybridization of labeled repetitive sequences in genomic targets. The addition of Cot-1 DNA to probe hybridizations with labeled genomic templates catalyzes the formation of a network of heteroduplexes homologous to the probe and elsewhere in the genome (Figure 1, example 2). ‘Partial’ duplexes containing both sc and repetitive sequences (Figure 1, example 3) are facilitated by the addition of Cot-1 DNA through labeled sc genomic targets to linked repetitive elements (Figure 1, example 4). Labeled repetitive sequences linked to sc genomic target DNA sequences can also alter hybridization intensities, but not to the same extent that Cot-1 does, owing to its enrichment for both sc and interspersed repetitive sequences.

    Since the advent of microarray and array CGH technologies, many researchers have noted concerns about experimental reproducibility (4,5). Perhaps the largest source of variation in relation to cross-hybridization stems from repetitive sequences (7). However, many researchers believe this issue is addressed by blocking repetitive elements with Cot-1 DNA prior to hybridizing cDNA to an array (6,7). Dong et al. (18) found ‘some regions of non-repetitive sequences were sufficiently homologous to repetitive sequences to hybridize to the human Cot-1 DNA fraction’ and proposed that this was responsible for skewing hybridization intensities in their microarray results. Cot-1 affects the reproducibility of hybridization assays by promoting the formation of repetitive sequence bridges between probes and unrelated, labeled genomic targets. It also contains sc sequences that compete with labeled targets for probe sites. A more extensive genome-wide analysis is warranted to identify other genomic regions that are more likely to be susceptible to this source of systematic error.

    The repetitive component in Cot-1 DNA is fractionated based on re-association kinetics rather than being explicitly defined based on sequence composition. Because it is not contaminated with sc sequences, sequence-defined synthetic repetitive DNA is more effective at blocking cross-hybridization by repetitive sequences in probes to paralogous repetitive genomic targets. Another advantage of a locus-specific synthetic reagent is that divergent repetitive sequences or repeat families that are underrepresented in the Cot-1 DNA fraction can be synthesized, thereby providing a more accurate and comprehensive repertoire of genomic repeat sequences free of sc sequences.

    Nevertheless, replacement of Cot-1 with a synthetic repetitive DNA reagent (patent pending) that includes all known repetitive elements throughout the genome is probably precluded based on the cost and logistical challenges inherent in its preparation. We suggest that further processing of the Cot-1 fraction may provide a means of significantly reducing the proportion of contaminating sc sequences. Since the re-annealed sequences in Cot-1 are predominantly repetitive in nature, these sequences will co-purify with linked single-stranded sequences, which are composed of sc and non-overlapping repetitive components adjacent to the re-annealed repetitive sequences in the genome. Treatment of these mixed duplex and single-stranded structures with an obligatory processive exonuclease (e.g. Mung Bean nuclease or exonuclease I) should trim single-stranded sequences protruding from duplex DNA. These enzymes should not cleave at mismatched nucleotides (which are common among related members of the same repetitive sequence family) within single-stranded bubbles or within nicked duplexes; however, they will digest single-stranded repetitive and/or sc sequences. This will particularly impact the representation of repetitive elements that commonly show 5' or 3' genomic truncation (e.g. as observed in L1 retrotransposons) (11). Loss of these sequences could be mitigated by addition of the corresponding synthetic DNA reagents. However, it should be noted that this treatment of Cot-1 DNA will not eliminate all sc sequences as some of the sc sequences may have re-annealed. Re-annealing of sc sequences, however, would not be kinetically favored.

    We suggest that substitution of a partial or completely synthetic blocking reagent composed of defined repetitive sequences in place of Cot-1 DNA could improve the reproducibility of expression microarray and array comparative genomic hybridizations. This should ultimately lead to standardization of experimental conditions in these widely used procedures. Sc probe technology itself could circumvent the requirement for suppression of repetitive sequence cross-hybridization by selecting sc probes that maximize the genomic separation between neighboring repetitive elements.

    NOTE ADDED IN PROOF

    A preliminary study demonstrated that cross-hybridization of repetitive sequences is significantly diminished by addition of Mung Bean nuclease-treated Cot-1 DNA to labeled genomic DNA, as proposed. Results were confirmed by comparing CT values of single copy sequences in nuclease-treated and untreated Cot-1 DNA.

    ACKNOWLEDGEMENTS

    We thank Drs Christopher Gocke and Thomas Schneider for their valuable comments and Dr David Zwick, Ruth Morgan and Lisa Wheeler of the Children's Mercy Hospital Flow Cytometry laboratory for their expert advice. Research was supported in part by 1R21CA95167-02 and by the Katherine B. Richardson Trust (Grant #4186). Funding to pay the Open Access publication charges for this article was provided by the Katherine B. Richardson Trust.

    REFERENCES

    1. Oostlander, A., Meijer, G., Ylstra, B. (2004) Microarray-based comparative genomic hybridization and its applications in human genetics. Clin. Genet., 66, 488–495.

    2. van Hijum, S.A., de Jong, A., Baerends, R.J., Karsens, H.A., Kramer, N.E., Larsen, R., den Hengst, C.D., Albers, C.J., Kok, J., Kuipers, O.P. (2005) A generally applicable validation scheme for the assessment of factors involved in reproducibility and quality of DNA-microarray data. BMC Genomics, 6, 77.

    3. Bammler, T., Beyer, R.P., Bhattacharya, S., Boorman, G.A., Boyles, A., Bradford, B.U., Bumgarner, R.E., Bushel, P.R., Chaturvedi, K., Choi, D., et al. (2005) Standardizing global gene expression analysis between laboratories and across platforms. Nature Methods, 2, 351–356.

    4. Sherlock, G. (2005) Of FISH and chips. Nature Methods, 2, 329–330.

    5. Dobbin, K.K., Beer, D.G., Meyerson, M., Yeatman, T.J., Gerald, W.L., Jacobson, J.W., Conley, B., Buetow, K.H., Heiskanen, M., Simon, R.M., et al. (2005) Interlaboratory comparability study of cancer gene expression analysis using oligonucleotide microarrays. Clin. Cancer Res., 11, 565–572.

    6. Li, X., Weikuan, G., Mohan, S., Baylink, D. (2002) DNA microarrays: their use and misuse. Microcirculation, 9, 13–22.

    7. Wren, J., Kulkarni, A., Joslin, J., Butow, R., Garner, H. (2002) Cross-hybridization on PCR-spotted microarrays. IEEE Eng. Med. Biol., 71–75.

    8. Marshall, E. (2004) Getting the noise out of gene arrays. Science, 306, 630–631.

    9. Zakharkin, S., Kim, K., Mehta, T., Chen, L., Barnes, S., Scheirer, K., Parrish, R., Allison, D., Page, G. (2005) Sources of variation in Affymetrix microarray experiments. BMC Bioinformatics, 6, 214–225.

    10. Britten, R. and Kohne, D. (1968) Repeated sequences in DNA. Science, 161, 529–540.

    11. Rogan, P.K., Pan, J., Weissman, S.M. (1987) L1 repeat elements in the human epsilon-G gamma globin gene intergenic region: sequence analysis and concerted evolution within this family. Mol. Biol. Evol., 4, 327–42.

    12. Carter, N., Fiegler, H., Piper, J. (2002) Comparative analysis of comparative genomic hybridization microarray technologies: report of a workshop sponsored by the Wellcome Trust. Cytometry, 49, 43–48.

    13. Rogan, P., Cazcarro, P., Knoll, J. (2001) Sequence-based design of single-copy genomic DNA probes for fluorescence in situ hybridization. Genome Res., 11, 1086–1094.

    14. Rogan, P. and Knoll, J. (2003) Sequence-based in situ detection of chromosomal abnormalities at high resolution. Am. J. Med. Genet., 121, 245–257.

    15. Knoll, J.H.M. and Rogan, P.K. US Patent 6,828,097 (2004).

    16. Knoll, J. and Lichter, P. (2005) In situ hybridization to metaphase chromosomes and interphase nuclei. In Dracopoli, N., Haines, J., Korf, B., Moir, D., Morton, C., Seidman, C., Seidman, J., Smith, D. (Eds.). Current Protocols in Human Genetics, NY Green-Wiley Vol. 1, Unit 4.3.

    17. Kent, W. (2002) BLAT—the BLAST-like alignment tool. Genome Res., 12, 656–664.

    18. Dong, S., Wang, E., Hsie, L., Cao, Y., Chen, X., Gingeras, T.R. (2001) Flexible use of high-density oligonucleotide arrays for single-nucleotide polymorphism discovery and validation. Genome Res., 11, 1418–1424.

    (Heather L. Newkirk, Joan H.M. Knoll and )