Joel Rozowsky, Ghia Euskirchen, Raymond K Auerbach, Zhengdong D Zhang, Theodore Gibson, Robert Bjornson, Nicholas Carriero, Michael Snyder, Mark B Gerstein.
Nature Biotechnology 27, 66-75 (2009) | doi: 10.1038/nbt.1518 | PMID: 19122651
Chromatin immunoprecipitation (ChIP) followed by tag sequencing (ChIP-seq) using high-throughput next-generation instrumentation is fast replacing chromatin immunoprecipitation followed by genome tiling array analysis (ChIP-chip) as the preferred approach for mapping of sites of transcription-factor binding and chromatin modification. Using two deeply sequenced data sets for human RNA polymerase II and STAT1, each with matching input-DNA controls, we describe a general scoring approach to address unique challenges in ChIP-seq data analysis. Our approach is based on the observation that sites of potential binding are strongly correlated with signal peaks in the control, likely revealing features of open chromatin. We develop a two-pass strategy called PeakSeq to compensate for this. The first pass identifies putative binding sites and compensates for genomic variation in the 'mappability' of sequences. The second pass filters out sites not significantly enriched compared to the normalized control, computing precise enrichments and significances. Our scoring procedure enables us to optimize experimental design by estimating the depth of sequencing required for a desired level of coverage and demonstrating that more than two replicates provide only a marginal gain in information.
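The second pass described above (filtering candidate sites against a normalized control) can be illustrated with a small sketch. This is not the PeakSeq implementation: the function names, the through-the-origin least-squares scaling, the one-sided binomial test, and the 0.05 cutoff are all assumptions chosen for illustration, and the counts are made up.

```python
from math import comb

def scaling_factor(chip_bins, control_bins):
    """Least-squares slope through the origin of ChIP counts on control counts.

    Assumption: a single genome-wide factor is enough to put the control
    on the same scale as the ChIP sample.
    """
    num = sum(c * k for c, k in zip(chip_bins, control_bins))
    den = sum(k * k for k in control_bins)
    return num / den

def binomial_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def score_regions(regions, alpha_scale, p_cutoff=0.05):
    """Keep candidate regions whose ChIP count exceeds the scaled control.

    regions: iterable of (name, chip_count, control_count) tuples
             (hypothetical first-pass output).
    Returns a list of (name, enrichment, p_value) for regions that pass.
    """
    kept = []
    for name, chip, control in regions:
        scaled = round(alpha_scale * control)
        n = chip + scaled
        # Null hypothesis: a tag in this region is equally likely to come
        # from the ChIP sample or from the scaled control.
        p_value = binomial_tail(chip, n) if n > 0 else 1.0
        enrichment = chip / max(scaled, 1)
        if p_value < p_cutoff:
            kept.append((name, enrichment, p_value))
    return kept

# Made-up counts: "site_a" is enriched, "site_b" matches the control.
candidates = [("site_a", 30, 5), ("site_b", 10, 10)]
for name, enrichment, p_value in score_regions(candidates, alpha_scale=1.0):
    print(name, enrichment, p_value)
```

A real analysis would also correct the p-values for multiple testing across all candidate regions; that step is omitted here to keep the sketch short.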
Shin-ichi Hashimoto, Wei Qu, Budrul Ahsan, Katsumi Ogoshi, Atsushi Sasaki, Yoichiro Nakatani, Yongjun Lee, Masako Ogawa, Akio Ametani, Yutaka Suzuki, Sumio Sugano, Clarence C. Lee, Robert C. Nutter, Shinichi Morishita, Kouji Matsushima.
PLoS ONE 4, e4108 (2009) | doi: 10.1371/journal.pone.0004108 | PMID: 19119315
Massively parallel, tag-based sequencing systems, such as the SOLiD system, hold the promise of revolutionizing the study of whole genome gene expression due to the number of data points that can be generated in a simple and cost-effective manner. We describe the development of a 5′-end transcriptome workflow for the SOLiD system and demonstrate the advantages in sensitivity and dynamic range offered by this tag-based application over traditional approaches for the study of whole genome gene expression. 5′-end transcriptome analysis was used to study whole genome gene expression within a colon cancer cell line, HT-29, treated with the DNA methyltransferase inhibitor, 5-aza-2′-deoxycytidine (5Aza). More than 20 million 25-base 5′-end tags were obtained from untreated and 5Aza-treated cells and matched to sequences within the human genome. Seventy-three percent of the mapped unique tags were associated with RefSeq cDNA sequences, corresponding to approximately 14,000 different protein-coding genes in this single cell type. The level of expression of these genes ranged from 0.02 to 4,704 transcripts per cell. The sensitivity of a single sequence run of the SOLiD platform was 100- to 1,000-fold greater than that observed from 5′-end SAGE data generated from the analysis of 70,000 tags obtained by Sanger sequencing. The high-resolution 5′-end gene expression profiling presented in this study will not only provide novel insight into the transcriptional machinery but should also serve as a basis for a better understanding of cell biology.
Nalvo F. Almeida, Shuangchun Yan, Magdalen Lindeberg, David J. Studholme, David J. Schneider, Bradford Condon, Haijie Liu, Carlos J. Viana, Andrew Warren, Clive Evans, Eric Kemen, Dan MacLean, Aurelie Angot, Gregory B. Martin, Jonathan D. Jones, Alan Collmer, Joao C. Setubal, Boris A. Vinatzer.
Molecular Plant-Microbe Interactions 22, 52-62 (2009) | DOI: 10.1094/MPMI-22-1-0052 | PMID: 19061402
Diverse gene products including phytotoxins, pathogen-associated molecular patterns, and type III secreted effectors influence interactions between Pseudomonas syringae strains and plants, with additional, as yet uncharacterized factors likely contributing as well. Of particular interest are those interactions governing pathogen-host specificity. Comparative genomics of closely related pathogens with different host specificity represents an excellent approach for identification of genes contributing to host-range determination. A draft genome sequence of Pseudomonas syringae pv. tomato T1, which is pathogenic on tomato but nonpathogenic on Arabidopsis thaliana, was obtained for this purpose and compared with the genome of the closely related A. thaliana and tomato model pathogen P. syringae pv. tomato DC3000. Although the overall genetic content of each of the two genomes appears to be highly similar, the repertoire of effectors was found to diverge significantly. Several P. syringae pv. tomato T1 effectors absent from strain DC3000 were confirmed to be translocated into plants, with the well-studied effector AvrRpt2 representing a likely candidate for host-range determination. However, the presence of avrRpt2 was not found to be sufficient to explain A. thaliana resistance to P. syringae pv. tomato T1, suggesting that other effectors and possibly type III secretion system–independent factors also play a role in this interaction.
Yolanda Schaerli, Robert C. Wootton, Tom Robinson, Viktor Stein, Christopher Dunsby, Mark A. A. Neil, Paul M. W. French, Andrew J. deMello, Chris Abell, Florian Hollfelder.
Anal. Chem. 81, 302–306 (2009) | DOI: 10.1021/ac802038c | PMID: 19055421
We present a high-throughput microfluidic device for continuous-flow polymerase chain reaction (PCR) in water-in-oil droplets of nanoliter volumes. The circular design of this device allows droplets to pass through alternating temperature zones and complete 34 cycles of PCR in only 17 min, avoiding temperature cycling of the entire device. The temperatures for the applied two-temperature PCR protocol can be adjusted according to requirements of template and primers. These temperatures were determined with fluorescence lifetime imaging (FLIM) inside the droplets, exploiting the temperature-dependent fluorescence lifetime of rhodamine B. The successful amplification of an 85-base-pair-long template from four different start concentrations was demonstrated. Analysis of the product by gel electrophoresis, sequencing, and real-time PCR showed that the amplification is specific and the amplification factors of up to 5 × 10^6-fold are comparable to amplification factors obtained in a benchtop PCR machine. The high efficiency allows amplification from a single molecule of DNA per droplet. This device holds promise for convenient integration with other microfluidic devices and adds a critical missing component to the laboratory-on-a-chip toolkit.
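The figures quoted in this abstract (34 cycles, 17 min, amplification of up to 5 × 10^6-fold) can be turned into a rough per-cycle picture. The sketch below is only back-of-the-envelope arithmetic under one loud assumption that the abstract does not state: that every cycle multiplies the template by the same factor. The function names are hypothetical.

```python
def seconds_per_cycle(total_minutes, cycles):
    """Average time one droplet spends on a single PCR cycle."""
    return total_minutes * 60 / cycles

def per_cycle_factor(amplification, cycles):
    """Per-cycle multiplication factor.

    Assumption: amplification is uniform across cycles, so
    amplification = factor ** cycles.
    """
    return amplification ** (1 / cycles)

cycles = 34            # from the abstract
total_minutes = 17     # from the abstract
amplification = 5e6    # upper amplification factor from the abstract

print("seconds per cycle:", seconds_per_cycle(total_minutes, cycles))
print("per-cycle factor:", round(per_cycle_factor(amplification, cycles), 3))
# Perfect doubling in every cycle would be the theoretical ceiling.
print("ideal amplification:", 2 ** cycles)
```

Under that assumption the device averages 30 s per cycle and a per-cycle factor of roughly 1.57, compared with 2.0 for perfect doubling. In practice efficiency usually drops in late cycles as reagents are consumed, so the early-cycle factor is likely higher than this average.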