Masaca's Blog 2

Monologue, diary, grumbling, nonsense, memoranda... Call it whatever you like (lol).

San Diego Business Trip (4)

2009-03-21 16:14:29 | Abroad
I have arrived at Narita Airport.

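The flight from San Francisco was comfortable: the seat-belt sign came on a few times along the way, but there was no serious turbulence. If I had one complaint, it would be that the little boy in the family of three sitting next to me cried on and off for most of the flight, so I could not get any proper sleep. He is a small child, so it can't be helped, but it was obvious even to an onlooker that he was sleepy yet did not want to sleep, and since he was being made to sleep anyway, I felt a little sorry for him. It was manageable with just one child, but two or three fussing at once would be pandemonium.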

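So right now I am waiting for the direct bus. It looks like I will be home a little before 7 in the evening.

Man, overseas conference trips really are exhausting...

P.S. Movies I watched on the plane
Outbound: Departures
Return: 007: Quantum of Solace, Changeling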

San Diego Business Trip (3)

2009-03-21 01:47:59 | Abroad
I said I would write, and yet today is already the day I fly home...

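On Tuesday, when the conference did not start until the afternoon, I finally managed to meet A-san, whom I had missed last time. Although time was short, A-san drove me around the city of San Diego as a guide and even joined me for lunch. A-san, thank you very much. What surprised me was that A-san is, of all things, returning to Japan next week. I had no idea, but apparently their child has already been born, and they decided to take this as the occasion to go home. A-san has also secured a job back in Japan despite this recession. I can only admire that; it is what you would expect from someone who built up a solid record of achievements over 12 years in the US.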

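After lunch, A-san dropped me off at the conference hotel, and we parted there. And so the conference finally began. Before the sessions started, I met up again after a long while with PM and SL, who had looked after me in Boston. As last year, the first day was a session and discussion with four Next Generation Sequencer manufacturers. Each company presented its performance improvements, results, and future direction, but there was no new information worth noting. There are simply too many conferences of this kind and announcements come so frequently that it can't be helped, but it was a bit of a letdown. Afterwards, at the reception party, I got to know several Japanese attendees, whom I had hardly seen at all last year. This year some people had come from quite unexpected places, which was a bit of a surprise. After the reception I met up with PM and SL again, and over dinner on the terrace of a restaurant in the conference hotel, the same one we used last year, we were able to talk at length about how work is progressing. After that, since I was staying at a different hotel, I decided to walk back, figuring that 1.2 miles was no great distance. But when I headed straight north through the hotel grounds, everything was fenced off and I could not get out, so I had to go back to the front of the roundabout and leave through the proper exit. That cost me extra time, and I ended up walking for more than 30 minutes with a bag containing a heavy MacBook on my shoulder. When I reached my room I showered right away and went to bed, but by then it was already past 10 p.m... Another exhausting day.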

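On the second day, the Breakfast session started at 7:30 a.m., so I got up early to make it and walked to the venue. In daylight I could cut straight across the lawn and take the shortest route, so I arrived in about 20 minutes. At the venue I listened to a company session while having fruit, croissants, and coffee, and a little later the day's sessions began. The second day is the main one, and the schedule was packed. Even lunch was a luncheon session, where you listen to talks while you eat. With only short coffee breaks and a break of a little under an hour after lunch, I listened to the talks while fighting my jet-lagged head, and during the breaks I strolled along the beachside path, which was pleasant at just the right temperature under the intense resort sunshine. The afternoon session after the coffee break was my main reason for coming: it covered the development status and capabilities of next-next-generation analysis instruments, followed by discussion, and I was able to pick up a good deal of information. After that, at another reception party, I was able to talk with PM and many other people. On top of that, I had arranged in advance, on a conference call the other day, to talk with GS that evening, so I slipped out of the party partway through, met up with GS, and held a work meeting over dinner at the same restaurant as the night before. GS turned out to be quite easygoing in person and, to my surprise, was formerly a subordinate of WJE, who had looked after me in Boston... What's more, GS had met PK at a different meeting just last week, and I marveled at what a small world it is. GS kindly drove me back to my hotel in a rental car, so that night I got back without walking through the dark park.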

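Day three. The Breakfast session again started at 7:30 a.m., which meant yet another early start... By now my fatigue was peaking, and many of the sessions were outside my own field, so I got through them somehow, dozing off along the way. Then came the luncheon session and the final case study session, and the conference ended without incident. At long last, work was over. In the evening I went to La Jolla with people from a certain company whom I had gotten to know; we strolled along the coast, had dinner, and returned to the hotel. I then packed my suitcase for the next day's flight home and went to bed with everything ready. However, my flight was at the early hour of 7:49 a.m., so I had to get up early, and perhaps because of the tension I woke up at 2 a.m. and from then on kept drifting in and out of sleep... Jet lag had a lot to do with it, but I was a wreck...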

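And so, this morning. I checked out of the hotel a little after 5 a.m., took the taxi I had ordered straight to the airport, and went to the United ticket counter. There, although I did not really understand why, it seemed there was some trouble with my scheduled flight, and I was sent with other passengers to a different counter. After going through the procedures there, what I was handed was a slip of paper that did not look like the usual ticket at all... There was nothing to be done, so I went through security and on to the boarding gate, where boarding for the 6:18 AM flight to San Francisco was already almost finished... Apparently I had arrived in time for a flight an hour and a half earlier than mine. With nothing else to do, I went to hand the puzzling slip of paper to the woman at the counter, who had time on her hands now that boarding was nearly done, and she suddenly called my name. Startled, I answered "Yes" and handed over the paper, and she gestured for me to come round to the boarding lane. Thinking "???", I went round and, lo and behold, was handed a ticket for the 6:18 flight to San Francisco that was just finishing boarding and told "Go". When I asked "This flight?", I got a crisp "Yes!"... I do not know why, but I was put on a ridiculously early flight. As a result I arrived at San Francisco International Airport more than three hours before boarding was due to start. I retraced the long, long corridor I had walked on the way out and finally reached the gate at the very far end. But there was nobody to be seen at the ANA counter or in the gate lounge... I was writing this entry in the lounge when, at 9 a.m., a woman finally appeared at the ticket counter and began making announcements, so I got in line first and got my ticket, and now I am back in the lounge writing again...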

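I am due to take the 11:50 a.m. flight to Narita and arrive there a little after 3 p.m. on the following day, the 21st. So I should be home around 6 in the evening... I plan to take the direct bus again to the stop in front of my home. I hope there is a bus at a convenient time...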

So this has not been much of a travelogue at all, but it is already coming to an end. My apologies for such a poor showing.

San Diego Business Trip (2)

2009-03-17 12:21:01 | Abroad
I have arrived safely at the hotel.

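On the plane I wasted no time: I had a beer and ate dinner while watching "Departures", took some melatonin, and promptly went to sleep. It was by no means a sound sleep, though; I kept waking up and dozing off again... Still, I slept for most of the time the cabin lights were off.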

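We arrived at San Francisco International Airport 25 minutes ahead of schedule, but that is where the trouble began... The all-important flight to San Diego was delayed and delayed again... Apparently the aircraft arriving from Los Angeles was to be used for the San Diego flight, and because that flight was badly delayed, the San Diego flight was inevitably delayed too... Even so, what was initially shown as a one-hour delay kept slipping, until in the end the flight scheduled for 12:15 p.m. was pushed back to 1:57 p.m., and I was dozing off in the gate lounge, worn out from waiting... When it was finally announced that boarding could begin, some of the passengers actually applauded. Perhaps because it was so late, the San Diego flight was then given top priority at every step, and the time from leaving the gate to takeoff was remarkably short! We took off right away, headed straight for San Diego, and landed without having to wait our turn. Oh, and one of the flight attendants on this United flight to San Diego was a Japanese woman. You don't see that every day.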

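Once at San Diego International Airport, I grabbed my checked baggage in no time, headed straight for the taxi stand, and took a taxi straight to the hotel. The driver kindly told me about several tourist spots, and also pointed out where the conference venue is (the hotel I stayed at last year) and how to get there from my hotel this time. I then checked in without any trouble. I connected to the hotel's wireless LAN right away, checked my e-mail and so on, and then decided to walk to the nearest supermarket. However...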

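That supermarket was a long way off... Not only far, but up a steep hill, and my calves were rock-hard in no time. I finally made it to the supermarket. It was not just a supermarket but a kind of suburban shopping mall, with a pharmacy and all sorts of other shops lined up. I did a little shopping and bought a Big Mac meal at the McDonald's there for dinner. Then, after returning to the hotel, I went to the nearest liquor store, which I had located beforehand, and bought some beer as a nightcap. I have just had a shower and am ready for bed. NCIS is on TV for the first time in a while, which reminds me of my stay in Cambridge.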

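Tomorrow's conference starts in the afternoon, so I plan to take it easy in the morning, have an early lunch with A-san, who lives here, and head to the venue in time for the afternoon start. I have a separate appointment in the evening, so it looks like I won't have to worry about meals tomorrow.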

And so a very lo-o-ong day is finally drawing to a close.

San Diego Business Trip (1)

2009-03-16 13:58:21 | Abroad
With a commute that takes four hours round trip every day, I just have not had the time or energy to post, and it has been quite a while...

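I only got back from Boston at the end of January, and yet today I am off on another overseas business trip. The destination is San Diego, the same place as my first-ever trip abroad last year. The purpose is also the same: attending a conference. This time, though, I am not presenting. Instead, I have to listen to every presentation and write a report.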

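So right now I am on the direct bus to Narita Airport. My flight departs at 17:05. In an attempt to get over jet lag as fast as possible, I got up this morning even earlier than my usual 5:30, in a futile effort to shift my daily rhythm forward by eight hours.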

And so another travelogue begins. It only covers four or five days, but I hope you will come along.

Papers of Note from In Sequence, Feb 2009 (8)

2009-03-11 20:43:37 | Science News
  • The deep evolution of metazoan microRNAs.
    Benjamin M. Wheeler, Alysha M. Heimberg, Vanessa N. Moy, Erik A. Sperling, Thomas W. Holstein, Steffen Heber, Kevin J. Peterson.
    Evolution & Development 11, 50-68 (2009) | doi:10.1111/j.1525-142X.2008.00302.x | PMID:19196333
    microRNAs (miRNAs) are approximately 22-nucleotide noncoding RNA regulatory genes that are key players in cellular differentiation and homeostasis. They might also play important roles in shaping metazoan macroevolution. Previous studies have shown that miRNAs are continuously being added to metazoan genomes through time, and, once integrated into gene regulatory networks, show only rare mutations within the primary sequence of the mature gene product and are only rarely secondarily lost. However, because the conclusions from these studies were largely based on phylogenetic conservation of miRNAs between model systems like Drosophila and the taxon of interest, it was unclear if these trends would describe most miRNAs in most metazoan taxa. Here, we describe the shared complement of miRNAs among 18 animal species using a combination of 454 sequencing of small RNA libraries with genomic searches. We show that the evolutionary trends elucidated from the model systems are generally true for all miRNA families and metazoan taxa explored: the continuous addition of miRNA families with only rare substitutions to the mature sequence, and only rare instances of secondary loss. Despite this conservation, we document evolutionary stable shifts to the determination of position 1 of the mature sequence, a phenomenon we call seed shifting, as well as the ability to post-transcriptionally edit the 5' end of the mature read, changing the identity of the seed sequence and possibly the repertoire of downstream targets. Finally, we describe a novel type of miRNA in demosponges that, although shows a different pre-miRNA structure, still shows remarkable conservation of the mature sequence in the two sponge species analyzed. We propose that miRNAs might be excellent phylogenetic markers, and suggest that the advent of morphological complexity might have its roots in miRNA innovation.

  • Coastal Synechococcus metagenome reveals major roles for horizontal gene transfer and plasmids in population diversity.
    B. Palenik, Q. Ren, V. Tai, I. T. Paulsen.
    Environmental Microbiology 11, 349-359 (2009) | doi:10.1111/j.1462-2920.2008.01772.x | PMID:19196269
    The extent to which cultured strains represent the genetic diversity of a population of microorganisms is poorly understood. Because they do not require culturing, metagenomic approaches have the potential to reveal the genetic diversity of the microbes actually present in an environment. From coastal California seawater, a complex and diverse environment, the marine cyanobacteria of the genus Synechococcus were enriched by flow cytometry-based sorting and the population metagenome was analysed with 454 sequencing technology. The sequence data were compared with model Synechococcus genomes, including those of two coastal strains, one isolated from the same and one from a very similar environment. The natural population metagenome had high sequence identity to most genes from the coastal model strains but diverged greatly from these genomes in multiple regions of atypical trinucleotide content that encoded diverse functions. These results can be explained by extensive horizontal gene transfer presumably with large differences in horizontally transferred genetic material between different strains. Some assembled contigs showed the presence of novel open reading frames not found in the model genomes, but these could not yet be unambiguously assigned to a Synechococcus clade. At least three distinct mobile DNA elements (plasmids) not found in model strain genomes were detected in the assembled contigs, suggesting for the first time their likely importance in marine cyanobacterial populations and possible role in horizontal gene transfer.

  • Down-regulation of Gfi-1 expression by TGF-β is important for differentiation of Th17 and CD103+ inducible regulatory T cells.
    Jinfang Zhu, Todd S. Davidson, Gang Wei, Dragana Jankovic, Kairong Cui, Dustin E. Schones, Liying Guo, Keji Zhao, Ethan M. Shevach, William E. Paul.
    The Journal of Experimental Medicine 206, 329-341 (2009) | doi:10.1084/jem.20081666 | PMID:19188499
    Growth factor independent 1 (Gfi-1), a transcriptional repressor, is transiently induced during T cell activation. Interleukin (IL) 4 further induces Gfi-1, resulting in optimal Th2 cell expansion. We report a second important function of Gfi-1 in CD4 T cells: prevention of alternative differentiation by Th2 cells, and inhibition of differentiation of naive CD4 T cells to either Th17 or inducible regulatory T (iTreg) cells. In Gfi1–/– Th2 cells, the Rorc, Il23r, and Cd103 loci showed histone 3 lysine 4 trimethylation modifications that were lacking in wild-type Th2 cells, implying that Gfi-1 is critical for epigenetic regulation of Th17 and iTreg cell–related genes in Th2 cells. Enforced Gfi-1 expression inhibited IL-17 production and iTreg cell differentiation. Furthermore, a key inducer of both Th17 and iTreg cell differentiation, transforming growth factor β, repressed Gfi-1 expression, implying a reciprocal negative regulation of CD4 T cell fate determination. Chromatin immunoprecipitation showed direct binding of the Gfi-1–lysine-specific demethylase 1 repressive complex to the intergenic region of Il17a/Il17f loci and to intron 1 of Cd103. T cell–specific Gfi1 conditional knockout mice displayed a striking delay in the onset of experimental allergic encephalitis correlated with a dramatic increase of Foxp3+CD103+ CD4 T cells. Thus, Gfi-1 plays a critical role both in enhancing Th2 cell expansion and in repressing induction of Th17 and CD103+ iTreg cells.

  • Genome-Wide Analysis In Vivo of Translation with Nucleotide Resolution Using Ribosome Profiling.
    Nicholas T. Ingolia, Sina Ghaemmaghami, John R. S. Newman, Jonathan S. Weissman.
    Science, Science Express | doi:10.1126/science.1168978 | PMID:19213877
    Techniques for systematically monitoring protein translation have lagged far behind methods for measuring mRNA levels. Here, we present a ribosome profiling strategy, based on deep sequencing of ribosome protected mRNA fragments, that enables genome-wide investigation of translation with sub-codon resolution. We used this technique to monitor translation in budding yeast under both rich and starvation conditions. These studies defined the protein sequences being translated and found extensive translational control both for determining absolute protein abundance and for responding to environmental stress. We also observed distinct phases during translation involving a large decrease in ribosome density going from early to late peptide elongation as well as widespread, regulated initiation at non-AUG codons. Ribosome profiling is readily adaptable to other organisms, making high-precision investigation of protein translation experimentally accessible.

Papers of Note from In Sequence, Feb 2009 (7)

2009-03-11 20:42:29 | Science News
  • Evaluation of the bacterial diversity in cecal contents of laying hens fed various molting diets by using bacterial tag-encoded FLX amplicon pyrosequencing.
    T. R. Callaway, S. E. Dowd, R. D. Wolcott, Y. Sun, J. L. McReynolds, T. S. Edrington, J. A. Byrd, R. C. Anderson, N. Krueger, D. J. Nisbet.
    Poult Sci 88, 298-302 (2009) | doi:10.3382/ps.2008-00222 | PMID:19151343
    Laying hens are typically induced to molt to begin a new egg-laying cycle by withdrawing feed for up to 12 to 14 d. Fasted hens are more susceptible to colonization and tissue invasion by Salmonella enterica serovar Enteritidis. Much of this increased incidence in fasted hens is thought to be due to changes in the native intestinal microflora. An alternative to feed withdrawal involves feeding alfalfa meal crumble to hens, which is indigestible by poultry but provides fermentable substrate to the intestinal microbial population and reduces Salmonella colonization of hens compared with feed withdrawal. The present study was designed to quantify differences in the cecal microbial population of hens (n = 12) fed a typical layer ration, undergoing feed withdrawal, or being fed alfalfa crumble by using a novel tag bacterial diversity amplification method. Bacteroides, Prevotella, and Clostridium were the most common genera isolated from all treatment groups. Only the ceca of hens undergoing feed withdrawal (n = 4) contained Salmonella. The number of genera present was greatest in the alfalfa crumble-fed group and least in the feed withdrawal group (78 vs. 54 genera, respectively). Overall, the microbial diversity was least and Lactobacillus populations were not found in the hens undergoing feed withdrawal, which could explain much of these hens’ sensitivity to colonization by Salmonella.

  • Repetitive sequence variation and dynamics in the ribosomal DNA array of Saccharomyces cerevisiae as revealed by whole-genome resequencing.
    Stephen A. James, Michael J.T. O'Kelly, David M. Carter, Robert P. Davey, Alexander van Oudenaarden, Ian N. Roberts.
    Genome Res., Advance Online Articles | doi:10.1101/gr.084517.108 | PMID:19141593
    Ribosomal DNA (rDNA) plays a key role in ribosome biogenesis, encoding genes for the structural RNA components of this important cellular organelle. These genes are vital for efficient functioning of the cellular protein synthesis machinery and as such are highly conserved and normally present in high copy numbers. In the baker's yeast Saccharomyces cerevisiae, there are more than 100 rDNA repeats located at a single locus on chromosome XII. Stability and sequence homogeneity of the rDNA array is essential for function, and this is achieved primarily by the mechanism of gene conversion. Detecting variation within these arrays is extremely problematic due to their large size and repetitive structure. In an attempt to address this, we have analyzed over 35 Mbp of rDNA sequence obtained from whole-genome shotgun sequencing (WGSS) of 34 strains of S. cerevisiae. Contrary to expectation, we find significant rDNA sequence variation exists within individual genomes. Many of the detected polymorphisms are not fully resolved. For this type of sequence variation, we introduce the term partial single nucleotide polymorphism, or pSNP. Comparative analysis of the complete data set reveals that different S. cerevisiae genomes possess different patterns of rDNA polymorphism, with much of the variation located within the rapidly evolving nontranscribed intergenic spacer (IGS) region. Furthermore, we find that strains known to have either structured or mosaic/hybrid genomes can be distinguished from one another based on rDNA pSNP number, indicating that pSNP dynamics may provide a reliable new measure of genome origin and stability.

  • In situ transcriptomic analysis of the globally important keystone N2-fixing taxon Crocosphaera watsonii.
    Ian Hewson, Rachel S Poretsky, Roxanne A Beinart, Angelicque E White, Tuo Shi, Shellie R Bench, Pia H Moisander, Ryan W Paerl, H James Tripp, Joseph P Montoya, Mary Ann Moran, Jonathan P Zehr.
    The ISME Journal, Advance online publication | doi:10.1038/ismej.2009.8 | PMID:19225552
    The diazotrophic cyanobacterium Crocosphaera watsonii supplies fixed nitrogen (N) to N-depleted surface waters of the tropical oceans, but the factors that determine its distribution and contribution to global N2 fixation are not well constrained for natural populations. Despite the heterogeneity of the marine environment, the genome of C. watsonii is highly conserved in nucleotide sequence in contrast to sympatric planktonic cyanobacteria. We applied a whole assemblage shotgun transcript sequencing approach to samples collected from a bloom of C. watsonii observed in the South Pacific to understand the genomic mechanisms that may lead to high population densities. We obtained 999 C. watsonii transcript reads from two metatranscriptomes prepared from mixed assemblage RNA collected in the day and at night. The C. watsonii population had unexpectedly high transcription of hypothetical protein genes (31% of protein-encoding genes) and transposases (12%). Furthermore, genes were expressed that are necessary for living in the oligotrophic ocean, including the nitrogenase cluster and the iron-stress-induced protein A (isiA) that functions to protect photosystem I from high-light-induced damage. C. watsonii transcripts retrieved from metatranscriptomes at other locations in the southwest Pacific Ocean, station ALOHA and the equatorial Atlantic Ocean were similar in composition to those recovered in the enriched population. Quantitative PCR and quantitative reverse transcriptase PCR were used to confirm the high expression of these genes within the bloom, but transcription patterns varied at shallower and deeper horizons. These data represent the first transcript study of a rare individual microorganism in situ and provide insight into the mechanisms of genome diversification and the ecophysiology of natural populations of keystone organisms that are important in global nitrogen cycling.

  • Comparative day/night metatranscriptomic analysis of microbial communities in the North Pacific subtropical gyre.
    Rachel S. Poretsky, Ian Hewson, Shulei Sun, Andrew E. Allen, Jonathan P. Zehr, Mary Ann Moran.
    Environmental Microbiology, Early view | doi:10.1111/j.1462-2920.2008.01863.x | PMID:19207571
    Metatranscriptomic analyses of microbial assemblages (<5 μm) from surface water at the Hawaiian Ocean Time-Series (HOT) revealed community-wide metabolic activities and day/night patterns of differential gene expression. Pyrosequencing produced 75 558 putative mRNA reads from a day transcriptome and 75 946 from a night transcriptome. Taxonomic binning of annotated mRNAs indicated that Cyanobacteria contributed a greater percentage of the transcripts (54% of annotated sequences) than expected based on abundance (35% of cell counts and 21% of 16S rRNA libraries), and may represent the most actively transcribing cells in this surface ocean community in both the day and night. Major heterotrophic taxa contributing to the community transcriptome included α- (19% of annotated sequences, most of which were SAR11-related) and γ-Proteobacteria (4%). The composition of transcript pools was consistent with models of prokaryotic gene expression, including operon-based transcription patterns and an abundance of genes predicted to be highly expressed. Metabolic activities that are shared by many microbial taxa (e.g. glycolysis, citric acid cycle, amino acid biosynthesis and transcription and translation machinery) were well represented among the community transcripts. There was an overabundance of transcripts for photosynthesis, C1 metabolism and oxidative phosphorylation in the day compared with night, and evidence that energy acquisition is coordinated with solar radiation levels for both autotrophic and heterotrophic microbes. In contrast, housekeeping activities such as amino acid biosynthesis, membrane synthesis and repair, and vitamin biosynthesis were overrepresented in the night transcriptome. Direct sequencing of these environmental transcripts has provided detailed information on metabolic and biogeochemical responses of a microbial community to solar forcing.

Papers of Note from In Sequence, Feb 2009 (6)

2009-03-11 20:41:26 | Science News
  • Ab initio construction of a eukaryotic transcriptome by massively parallel mRNA sequencing.
    Moran Yassour, Tommy Kaplan, Hunter B. Fraser, Joshua Z. Levin, Jenna Pfiffner, Xian Adiconis, Gary Schroth, Shujun Luo, Irina Khrebtukova, Andreas Gnirke, Chad Nusbaum, Dawn-Anne Thompson, Nir Friedman, Aviv Regev.
    PNAS 106, 3264-3269 (2009) | doi:10.1073/pnas.0812841106 | PMID:19208812
    Defining the transcriptome, the repertoire of transcribed regions encoded in the genome, is a challenging experimental task. Current approaches, relying on sequencing of ESTs or cDNA libraries, are expensive and labor-intensive. Here, we present a general approach for ab initio discovery of the complete transcriptome of the budding yeast, based only on the unannotated genome sequence and millions of short reads from a single massively parallel sequencing run. Using novel algorithms, we automatically construct a highly accurate transcript catalog. Our approach automatically and fully defines 86% of the genes expressed under the given conditions, and discovers 160 previously undescribed transcription units of 250 bp or longer. It correctly demarcates the 5′ and 3′ UTR boundaries of 86 and 77% of expressed genes, respectively. The method further identifies 83% of known splice junctions in expressed genes, and discovers 25 previously uncharacterized introns, including 2 cases of condition-dependent intron retention. Our framework is applicable to poorly understood organisms, and can lead to greater understanding of the transcribed elements in an explored genome.

  • Three genomes from the phylum Acidobacteria provide insight into their lifestyles in soils.
    Naomi L. Ward, Jean F. Challacombe, Peter H. Janssen, Bernard Henrissat, Pedro M. Coutinho, Martin Wu, Gary Xie, Daniel H. Haft, Michelle Sait, Jonathan Badger, Ravi D. Barabote, Brent Bradley, Thomas S. Brettin, Lauren M. Brinkac, David Bruce, Todd Creasy, Sean C. Daugherty, Tanja M Davidsen, Robert T. DeBoy, J. Chris Detter, Robert J. Dodson, A. Scott Durkin, Anuradha Ganapathy, Michelle Gwinn-Giglio, Cliff S. Han, Hoda Khouri, Hajnalka Kiss, Sagar P. Kothari, Ramana Madupu, Karen Nelson, William C. Nelson, Ian Paulsen, Kevin Penn, Qinghu Ren, M. J. Rosovitz, Jeremy D. Selengut, Susmita Shrivastava, Steven A. Sullivan, Roxanne Tapia, L. Sue Thompson, Kisha L. Watkins, Qi Yang, Chunhui Yu, Nikhat Zafar, Liwei Zhou, Cheryl R. Kuske.
    Appl. Environ. Microbiol., AEM accepts | doi:10.1128/AEM.02294-08 | PMID:19201974
    The complete genomes of three strains from the phylum Acidobacteria were compared. Phylogenetic analysis placed them as a unique phylum. They share genomic traits with members of the Proteobacteria, the Cyanobacteria and the Fungi. The three strains appear to be versatile heterotrophs. Genomic and culture traits indicate use of carbon sources that span simple sugars to more complex substrates such as hemicellulose, cellulose and chitin. The genomes encode low-specificity MFS transporters and high-affinity ABC transporters for sugars, suggesting they are best suited to low-nutrient conditions. They appear capable of nitrate and nitrite reduction but not N2 fixation or denitrification. The genomes contained numerous genes encoding siderophore receptors, but no evidence was found for siderophore production, suggesting they may obtain iron via interaction with other microorganisms. The presence of cellulose synthesis genes, and a large class of novel high molecular weight excreted proteins suggest potential traits for desiccation resistance, biofilm formation, and/or contribution to soil structure. Polyketide synthase and macrolide glycosylation genes suggest the production of novel antimicrobial compounds. Genes encoding a variety of novel proteins were also identified. The abundance of acidobacteria in soils worldwide, and the breadth of potential carbon use by the sequenced strains, suggest significant and previously unrecognized contributions to the terrestrial carbon cycle. Combining our genomic evidence with available culture traits, we postulate that cells of these isolates are long-lived, divide slowly, exhibit slow metabolic rates in low-nutrient conditions, and are well equipped to tolerate fluctuations in soil hydration.

  • Whole genome sequence analysis of Mycobacterium bovis bacillus Calmette–Guérin (BCG) Tokyo 172: A comparative study of BCG vaccine substrains.
    Masaaki Seki, Ikuro Honda, Isao Fujita, Ikuya Yano, Saburo Yamamoto, Akira Koyama.
    Vaccine 27, 1710-1716 (2009) | doi:10.1016/j.vaccine.2009.01.034 | PMID:19200449
    To investigate the molecular characteristics of bacillus Calmette–Guérin (BCG) vaccines, the complete genomic sequence of Mycobacterium bovis BCG Tokyo 172 was determined, and the results were compared with those for BCG Pasteur and other M. tuberculosis complex. The genome of BCG Tokyo had a length of 4,371,711 bp and contained 4033 genes, including 3950 genes coding for proteins (CDS). There were 18 regions of difference (showing differences of more than 20 bp), 20 insertion or deletion (ins/del) mutations of less than 20 bp, and 68 SNPs between the two BCG substrains. These findings are useful for better understanding of the genetic differences in BCG substrains due to in vitro evolution of BCG.

  • Bermanella marisrubri gen. nov., sp. nov., a genome-sequenced gammaproteobacterium from the Red Sea.
    Jarone Pinhassi, María J. Pujalte, Javier Pascual, José M. González, Itziar Lekunberri, Carlos Pedrós-Alió, David R. Arahal.
    Int J Syst Evol Microbiol 59, 373-377 (2009) | doi:10.1099/ijs.0.002113-0 | PMID:19196781
    A novel heterotrophic, marine, strictly aerobic, motile bacterium was isolated from the Red Sea at a depth of 1 m. Analysis of its 16S rRNA gene sequence, retrieved from the whole-genome sequence, showed that this bacterium was most closely related to the genera Oleispira, Oceanobacter and Thalassolituus, each of which contains a single species, within the class Gammaproteobacteria. Phenotypic, genotypic and phylogenetic analyses supported the creation of a novel genus and species to accommodate this bacterium, for which the name Bermanella marisrubri gen. nov., sp. nov. is proposed. The type strain of Bermanella marisrubri is RED65T (=CECT 7074T =CCUG 52064T).

  • Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing.
    Andreas Gnirke, Alexandre Melnikov, Jared Maguire, Peter Rogov, Emily M LeProust, William Brockman, Timothy Fennell, Georgia Giannoukos, Sheila Fisher, Carsten Russ, Stacey Gabriel, David B Jaffe, Eric S Lander, Chad Nusbaum.
    Nature Biotechnology 27, 182-189 (2009) | doi:10.1038/nbt.1523 | PMID:19182786
    Targeting genomic loci by massively parallel sequencing requires new methods to enrich templates to be sequenced. We developed a capture method that uses biotinylated RNA 'baits' to fish targets out of a 'pond' of DNA fragments. The RNA is transcribed from PCR-amplified oligodeoxynucleotides originally synthesized on a microarray, generating sufficient bait for multiple captures at concentrations high enough to drive the hybridization. We tested this method with 170-mer baits that target >15,000 coding exons (2.5 Mb) and four regions (1.7 Mb total) using Illumina sequencing as read-out. About 90% of uniquely aligning bases fell on or near bait sequence; up to 50% lay on exons proper. The uniformity was such that 60% of target bases in the exonic 'catch', and 80% in the regional catch, had at least half the mean coverage. One lane of Illumina sequence was sufficient to call high-confidence genotypes for 89% of the targeted exon space.
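The uniformity figure quoted in the hybrid selection abstract above (the share of target bases with at least half the mean coverage) is simple to compute. The following is only a minimal sketch of that arithmetic, not the authors' pipeline: the function name and the toy per-base coverage list are assumptions made for illustration.

```python
def fraction_at_least_half_mean(coverage):
    """Return the fraction of target bases whose depth is >= half the mean depth.

    `coverage` is a hypothetical list of per-base read depths over the
    targeted region (one number per target base).
    """
    if not coverage:
        raise ValueError("coverage must contain at least one base")
    mean_depth = sum(coverage) / len(coverage)
    threshold = mean_depth / 2
    well_covered = sum(1 for depth in coverage if depth >= threshold)
    return well_covered / len(coverage)


# Toy example: mean depth is 20, so the threshold is 10.
# Four of the five bases reach it.
print(fraction_at_least_half_mean([0, 10, 20, 30, 40]))  # 0.8
```

A value of 0.6 from this function would correspond to the "60% of target bases" statement for the exonic catch.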

Papers of Note from In Sequence, Feb 2009 (5)

2009-03-11 20:40:13 | Science News
  • PASS: a Program to Align Short Sequences.
    Davide Campagna, Alessandro Albiero, Alessandra Bilardi, Elisa Caniato, Claudio Forcato, Svetlin Manavski, Nicola Vitulo, Giorgio Valle.
    Bioinformatics, Advance Access | doi:10.1093/bioinformatics/btp087 | PMID:19218350
    Standard DNA alignment programs are inadequate to manage the data produced by new generation DNA sequencers. To answer this problem we developed PASS with the objective of improving execution time and sensitivity when compared with other available programs. PASS performs fast gapped and ungapped alignments of short DNA sequences onto a reference DNA, typically a genomic sequence. It is designed to handle a huge amount of reads such as those generated by Solexa, SOLiD or 454 technologies. The algorithm is based on a data structure that holds in RAM the index of the genomic positions of "seed" words (typically 11-12 bases) as well as an index of the precomputed scores of short words (typically 7-8 bases) aligned against each other. After building the genomic index, the program scans every query sequence performing 3 steps: 1) finds matching seed words in the genome; 2) for every match checks the precomputed alignment of the short flanking regions; 3) if passes step 2, then it performs an exact dynamic alignment of a narrow region around the match. The performance of the program is very striking both for sensitivity and speed. For instance, gap alignment is achieved hundreds of times faster than BLAST and several times faster than SOAP, especially when gaps are allowed. Furthermore, PASS has a higher sensitivity when compared with the other available programs.

  • Nanoelectromechanics of Methylated DNA in a Synthetic Nanopore.
    U. Mirsaidov, W. Timp, X. Zou, V. Dimitrov, K. Schulten, A.P. Feinberg, G. Timp.
    Biophysical Journal 96, L32-L34 (2009) | doi:10.1016/j.bpj.2008.12.3760 | PMID:19217843
    Methylation of cytosine is a covalent modification of DNA that can be used to silence genes, orchestrating a myriad of biological processes including cancer. We have discovered that a synthetic nanopore in a membrane comparable in thickness to a protein binding site can be used to detect methylation. We observe a voltage threshold for permeation of methylated DNA through a <2 nm diameter pore, which we attribute to the stretching transition; this can differ by >1 V/20 nm depending on the methylation level, but not the DNA sequence.

  • Massively parallel pyrosequencing highlights minority variants in the HIV-1 env quasispecies deriving from lymphomonocyte sub-populations.
    Gabriella Rozera, Isabella Abbate, Alessandro Bruselles, Chrysoula Vlassi, Gianpiero D'Offizi, Pasquale Narciso, Giovanni Chillemi, Mattia Prosperi, Giuseppe Ippolito, Maria R Capobianchi.
    Retrovirology 6, 15 (2009) | doi:10.1186/1742-4690-6-15 | PMID:19216757
    Background
    Virus-associated cell membrane proteins acquired by HIV-1 during budding may give information on the cellular source of circulating virions. In the present study, by applying immunosorting of the virus and of the cells with antibodies targeting monocyte (CD36) and lymphocyte (CD26) markers, it was possible to directly compare HIV-1 quasispecies archived in circulating monocytes and T lymphocytes with that present in plasma virions originated from the same cell types. Five chronically HIV-1 infected patients who underwent therapy interruption after prolonged HAART were enrolled in the study. The analysis was performed by the powerful technology of ultra-deep pyrosequencing after PCR amplification of part of the env gene, coding for the viral glycoprotein (gp) 120, encompassing the tropism-related V3 loop region. V3 aminoacid sequences were used to establish heterogeneity parameters, to build phylogenetic trees and to predict co-receptor usage.

    Results
    The heterogeneity of proviral and viral genomes derived from monocytes was higher than that of T-lymphocyte origin. Both monocytes and T lymphocytes might contribute to virus rebounding in the circulation after therapy interruptions, but other virus sources might also be involved. In addition, both proviral and circulating viral sequences from monocytes and T lymphocytes were predictive of a predominant R5 coreceptor usage, but minor variants, segregating from the most frequent quasispecies variants, were present. In particular, in proviral genomes harboured by monocytes, minority variant clusters, with predicted X4 phenotype, were found.

    Conclusions
    This study provided the first direct comparison between the HIV-1 quasispecies archived as provirus in circulating monocytes and T lymphocytes with that of plasma virions replicating in the same cell types. Ultra-deep pyrosequencing generated data of some order of magnitude higher than any previously obtained with conventional approaches. Next generation sequencing allowed the analysis of previously inaccessible aspects of HIV-1 quasispecies, such as coreceptor usage of minority variants present in archived proviral sequences and in actually replicating virions, that may have clinical and therapeutic relevance.

  • Sequencing and Analyses of All Known Human Rhinovirus Genomes Reveals Structure and Evolution.
    Ann C. Palmenberg, David Spiro, Ryan Kuzmickas, Shiliang Wang, Appolinaire Djikeng, Jennifer A. Rathe, Claire M. Fraser-Liggett, Stephen B. Liggett.
    Science, Science Express | doi:10.1126/science.1165557 | PMID:19213880
    Infection by human rhinoviruses (HRVs) is a major cause of upper and lower respiratory tract disease worldwide and displays significant phenotypic variation. We examined diversity by completing the genome sequences for all known serotypes (n = 99). Superimposition of capsid crystal structure and optimal-energy RNA configurations established alignments and phylogeny. These revealed conserved motifs, clade-specific diversity including a potential new species (HRV-D), mutations in field isolates, and recombination. In analogy with poliovirus, a hypervariable 5'UTR tract may affect virulence. A configuration consistent with nonscanning internal ribosome entry was found in all HRVs and may account for rapid translation. The data density from complete sequences of the reference HRVs provided high resolution for this degree of modeling and serves as a platform for full genome-based epidemiologic studies and antiviral or vaccine development.

  • ChIP-seq accurately predicts tissue-specific activity of enhancers.
    Axel Visel, Matthew J. Blow, Zirong Li, Tao Zhang, Jennifer A. Akiyama, Amy Holt, Ingrid Plajzer-Frick, Malak Shoukry, Crystal Wright, Feng Chen, Veena Afzal, Bing Ren, Edward M. Rubin, Len A. Pennacchio.
    Nature 457, 854-858 (2009) | doi:10.1038/nature07730 | PMID:19212405
    A major yet unresolved quest in decoding the human genome is the identification of the regulatory sequences that control the spatial and temporal expression of genes. Distant-acting transcriptional enhancers are particularly challenging to uncover because they are scattered among the vast non-coding portion of the genome. Evolutionary sequence constraint can facilitate the discovery of enhancers, but fails to predict when and where they are active in vivo. Here we present the results of chromatin immunoprecipitation with the enhancer-associated protein p300 followed by massively parallel sequencing, and map several thousand in vivo binding sites of p300 in mouse embryonic forebrain, midbrain and limb tissue. We tested 86 of these sequences in a transgenic mouse assay, which in nearly all cases demonstrated reproducible enhancer activity in the tissues that were predicted by p300 binding. Our results indicate that in vivo mapping of p300 binding is a highly accurate means for identifying enhancers and their associated activities, and suggest that such data sets will be useful to study the role of tissue-specific enhancers in human biology and disease on a genome-wide scale.

  • Papers of Note from In Sequence, Feb 2009 (4)

    2009-03-11 20:39:42 | Science News
  • Massive parallel bisulfite sequencing of CG-rich DNA fragments reveals that methylation of many X-chromosomal CpG islands in female blood DNA is incomplete.
    Michael Zeschnigk, Marcel Martin, Gisela Betzl, Andreas Kalbe, Caroline Sirsch, Karin Buiting, Stephanie Gross, Epameinondas Fritzilas, Bruno Frey, Sven Rahmann, Bernhard Horsthemke.
    Hum Mol Genet., Advance Access | doi:10.1093/hmg/ddp054 | PMID:19223391
    Methylation of CpG islands (CGIs) plays an important role in gene silencing. For genome-wide methylation analysis of CGIs in female white blood cells and in sperm, we used four restriction enzymes and a size selection step to prepare DNA libraries enriched with CGIs. The DNA libraries were treated with sodium bisulfite and subjected to a modified 454/Roche Genome Sequencer protocol. We obtained 163,034 and 129,620 reads from blood and sperm, respectively, with an average read length of 133 bp. Bioinformatic analysis revealed that 12,358 (7.6%) blood library reads and 10,216 (7.9%) sperm library reads map to 6,167 and 5,796 different CGIs, respectively. In blood and sperm DNA we identified 824 (13.7%) and 482 (8.5%) fully methylated autosomal CGIs, respectively. Differential methylation, which is characterized by the presence of methylated and unmethylated reads of the same CGI, was observed in 53 and 52 autosomal CGIs in blood and sperm DNA, respectively. Remarkably, methylation of X-chromosomal CGIs in female blood cells was most often incomplete (25-75%). Such incomplete methylation was mainly found on the X-chromosome, suggesting that it is linked to X-chromosome inactivation.
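
The read-level calls behind counts like these rest on a standard property of bisulfite sequencing: unmethylated cytosines are converted and read as T, while methylated cytosines still read as C. A minimal sketch of that calling step follows. It assumes an ungapped, same-strand alignment of one read to its reference; the function names are hypothetical and this is not the pipeline used in the paper.

```python
def cpg_methylation_calls(reference, read):
    """Call methylation at each CpG of an ungapped, same-strand alignment.

    Returns a list of (position, is_methylated) tuples.
    Assumption for this sketch: read and reference have equal length.
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = []
    for pos in range(len(reference) - 1):
        if reference[pos:pos + 2] == "CG":
            base = read[pos]
            if base == "C":        # C survived bisulfite treatment: methylated
                calls.append((pos, True))
            elif base == "T":      # C was converted and read as T: unmethylated
                calls.append((pos, False))
            # any other base is treated as a sequencing error and skipped
    return calls


def methylated_fraction(calls):
    """Fraction of called CpGs that are methylated, or None without calls."""
    if not calls:
        return None
    return sum(1 for _, methylated in calls if methylated) / len(calls)


# Toy example: three CpGs in the reference, the middle one reads as T.
example_calls = cpg_methylation_calls("ACGTTCGACGA", "ACGTTTGACGA")
example_fraction = methylated_fraction(example_calls)
```

Pooling such fractions over all reads that map to one CGI would give the kind of per-island methylation level (for example, the 25-75% range described as incomplete) that the abstract reports.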

  • Experimental discovery of sRNAs in Vibrio cholerae by direct cloning, 5S/tRNA depletion and parallel sequencing.
    Jane M. Liu, Jonathan Livny, Michael S. Lawrence, Marc D. Kimball, Matthew K. Waldor, Andrew Camilli.
    Nucleic Acids Research, Advance Access | doi:10.1093/nar/gkp080 | PMID:19223322
    Direct cloning and parallel sequencing, an extremely powerful method for microRNA (miRNA) discovery, has not yet been applied to bacterial transcriptomes. Here we present sRNA-Seq, an unbiased method that allows for interrogation of the entire small, non-coding RNA (sRNA) repertoire in any prokaryotic or eukaryotic organism. This method includes a novel treatment that depletes total RNA fractions of highly abundant tRNAs and small subunit rRNA, thereby enriching the starting pool for sRNA transcripts with novel functionality. As a proof-of-principle, we applied sRNA-Seq to the human pathogen Vibrio cholerae. Our results provide information, at unprecedented depth, on the complexity of the sRNA component of a bacterial transcriptome. From 407 039 sequence reads, all 20 known V. cholerae sRNAs, 500 new, putative intergenic sRNAs and 127 putative antisense sRNAs were identified in a limited number of growth conditions examined. In addition, characterization of a subset of the newly identified transcripts led to the identification of a novel sRNA regulator of carbon metabolism. Collectively, these results strongly suggest that the number of sRNAs in bacteria has been greatly underestimated and that future efforts to analyze bacterial transcriptomes will benefit from direct cloning and parallel sequencing experiments aided by 5S/tRNA depletion.

  • Genotypic detection of rifampicin and isoniazid resistant Mycobacterium tuberculosis strains by DNA sequencing: a randomized trial.
    Amina Abdelaal, Hassan Abd El-Ghaffar, Mohammad Hosam Eldeen Zaghloul, Noha El mashad, Ehab Badran, Amal Fathy.
    Annals of Clinical Microbiology and Antimicrobials 8, 4 (2009) | doi:10.1186/1476-0711-8-4 | PMID:19183459
    Annals of Clinical Microbiology and Antimicrobials 8, 5 (2009) | doi:10.1186/1476-0711-8-5 | PMID:19220891 (Correction)
    Background
    Tuberculosis is a growing international health concern. It is the biggest killer among the infectious diseases in the world today. Early detection of drug resistance allows starting of an appropriate treatment. Resistance to drugs is due to particular genomic mutations in specific genes of Mycobacterium tuberculosis (MTB). The aim of this study was to identify the presence of Isoniazid (INH) and Rifampicin (RIF) drug resistance in new and previously treated tuberculosis (TB) cases using DNA sequencing.

    Methods
    This study was carried out on 153 tuberculous patients with positive Bactec 460 culture for acid fast bacilli.

    Results
    Of the 153 patients, 105 (68.6%) were new cases and 48 (31.4%) were previously treated cases. Drug susceptibility testing on Bactec revealed 50 resistant cases for one or more of the first line antituberculous. Genotypic analysis was done only for rifampicin resistant specimens (23 cases) and INH resistant specimens (26 cases) to detect mutations responsible for drug resistance by PCR amplification of rpoB gene for rifampicin resistant cases and katG gene for isoniazid resistant cases. Finally, DNA sequencing was done for detection of mutation within rpoB and katG genes. Genotypic analysis of RIF resistant cases revealed that 20/23 cases (86.9%) of RIF resistance were having rpoB gene mutation versus 3 cases (13.1%) having no mutation with a high statistical significant difference between them (P<0.001). Direct sequencing of gene revealed point mutation in 24/26 (92.3%) and the remaining 2/26 (7.7%) had wild type katG i.e. no evidence of mutation with a high statistical significant difference between them (P<0.001).

    Conclusions
    We can conclude that rifampicin resistance could be used as a useful surrogate marker for estimation of multidrug resistance. In addition, the genotypic method was superior to that of the traditional phenotypic method which is time-consuming taking several weeks or longer.

  • Massively parallel sequencing identifies the gene Megf8 with ENU-induced mutation causing heterotaxy.
    Zhen Zhang, Deanne Alpert, Richard Francis, Bishwanath Chatterjee, Qing Yu, Terry Tansey, Steven L. Sabol, Cheng Cui, Yongli Bai, Maxim Koriabine, Yuko Yoshinaga, Jan-Fang Cheng, Feng Chen, Joel Martin, Wendy Schackwitz, Teresa M. Gunn, Kenneth L. Kramer, Pieter J. De Jong, Len A. Pennacchio, Cecilia W. Lo.
    PNAS 106, 3219-3224 (2009) | doi:10.1073/pnas.0813400106 | PMID:19218456
    Forward genetic screens with ENU (N-ethyl-N-nitrosourea) mutagenesis can facilitate gene discovery, but mutation identification is often difficult. We present the first study in which an ENU-induced mutation was identified by massively parallel DNA sequencing. This mutation causes heterotaxy and complex congenital heart defects and was mapped to a 2.2-Mb interval on mouse chromosome 7. Massively parallel sequencing of the entire 2.2-Mb interval identified 2 single-base substitutions, one in an intergenic region and a second causing replacement of a highly conserved cysteine with arginine (C193R) in the gene Megf8. Megf8 is evolutionarily conserved from human to fruit fly, and is observed to be ubiquitously expressed. Morpholino knockdown of Megf8 in zebrafish embryos resulted in a high incidence of heterotaxy, indicating a conserved role in laterality specification. Megf8C193R mouse mutants show normal breaking of symmetry at the node, but Nodal signaling failed to be propagated to the left lateral plate mesoderm. Videomicroscopy showed nodal cilia motility, which is required for left–right patterning, is unaffected. Although this protein is predicted to have receptor function based on its amino acid sequence, surprisingly confocal imaging showed it is translocated into the nucleus, where it is colocalized with Gfi1b and Baf60C, two proteins involved in chromatin remodeling. Overall, through the recovery of an ENU-induced mutation, we uncovered Megf8 as an essential regulator of left–right patterning.

  • Papers of Note from In Sequence, Feb 2009 (3)

    2009-03-11 20:38:39 | Science News
  • Quantitative measures for the management and comparison of annotated genomes.
    Karen Eilbeck, Barry Moore, Carson Holt, Mark Yandell.
    BMC Bioinformatics 10, 67 (2009) | doi:10.1186/1471-2105-10-67 | PMID:19236712
    Background
    The ever-increasing number of sequenced and annotated genomes has made management of their annotations a significant undertaking, especially for large eukaryotic genomes containing many thousands of genes. Typically, changes in gene and transcript numbers are used to summarize changes from release to release, but these measures say nothing about changes to individual annotations, nor do they provide any means to identify annotations in need of manual review.

    Results
    In response, we have developed a suite of quantitative measures to better characterize changes to a genome's annotations between releases, and to prioritize problematic annotations for manual review. We have applied these measures to the annotations of five eukaryotic genomes over multiple releases – H. sapiens, M. musculus, D. melanogaster, A. gambiae, and C. elegans.

    Conclusion
    Our results provide the first detailed, historical overview of how these genomes' annotations have changed over the years, and demonstrate the usefulness of these measures for genome annotation management.

  • Mapping the Burkholderia cenocepacia niche response via high-throughput sequencing.
    D. R. Yoder-Himes, P. S. G. Chain, Y. Zhu, O. Wurtzel, E. M. Rubin, James M. Tiedje, R. Sorek.
    PNAS 106, 3976-3981 (2009) | doi:10.1073/pnas.0813403106 | PMID:19234113
    Determining how an organism responds to its environment by altering gene expression is key to understanding its ecology. Here, we used RNA-seq to comprehensively and quantitatively assess the transcriptional response of the bacterial opportunistic cystic fibrosis (CF) pathogen and endemic soil dweller, Burkholderia cenocepacia, in conditions mimicking these 2 environments. By sequencing 762 million bases of cDNA from 2 closely related B. cenocepacia strains (one isolated from a CF patient and one from soil), we identified a number of potential virulence factors expressed under CF-like conditions, whereas genes whose protein products are involved in nitrogen scavenging and 2-component sensing were among those induced under soil-like conditions. Interestingly, 13 new putative noncoding RNAs were discovered using this technique, 12 of which are preferentially induced in the soil environment, suggesting that ncRNAs play an important role in survival in the soil. In addition, we detected a surprisingly large number of regulatory differences between the 2 strains, which may represent specific adaptations to the niches from which each strain was isolated, despite their high degree of DNA sequence similarity. Compared with the CF strain, the soil strain shows a stronger global gene expression response to its environment, which is consistent with the need for a more dynamic reaction to the heterogeneous conditions of soil.

  • Whole genome amplification of the rust Puccinia striiformis f. sp. tritici from single spores.
    Yanchun Wang, Mingqi Zhu, Rong Zhang, Hanli Yang, Yang Wang, Guangyu Sun, Shelin Jin, Tom Hsiang.
    Journal of Microbiological Methods, Article in Press | doi:10.1016/j.mimet.2009.02.007 | PMID:19233233
    Rust fungi are obligate parasites and cannot be routinely cultured to obtain sufficient biomass for DNA extractions. Multiple displacement amplification (MDA) was demonstrated in this study for whole genome amplification from single spores of the rust fungus, Puccinia striiformis. The genomic DNA coverage and fidelity of this method was evaluated by PCR amplification and sequencing of two genetic markers: portions of the multi-copy nuclear ribosomal DNA internal transcribed spacer region (ITS) and the single copy β-tubulin gene from two geographical diverse isolates. Our results show that MDA is a valuable tool for whole genome amplification from single spores, and we propose that MDA-amplified DNA can be used for molecular genetic analysis of the wheat yellow rust fungus.

  • Salmonella paratyphi C: Genetic Divergence from Salmonella choleraesuis and Pathogenic Convergence with Salmonella typhi.
    Wei-Qiao Liu, Ye Feng, Yan Wang, Qing-Hua Zou, Fang Chen, Ji-Tao Guo, Yi-Hong Peng, Yan Jin, Yong-Guo Li, Song-Nian Hu, Randal N. Johnston, Gui-Rong Liu, Shu-Lin Liu.
    PLoS ONE 4, e4510 (2009) | doi:10.1371/journal.pone.0004510 | PMID:19229335
    Background
    Although over 1400 Salmonella serovars cause usually self-limited gastroenteritis in humans, a few, e.g., Salmonella typhi and S. paratyphi C, cause typhoid, a potentially fatal systemic infection. It is not known whether the typhoid agents have evolved from a common ancestor (by divergent processes) or acquired similar pathogenic traits independently (by convergent processes). Comparison of different typhoid agents with non-typhoidal Salmonella lineages will provide excellent models for studies on how similar pathogens might have evolved.

    Methodologies/Principal Findings
    We sequenced a strain of S. paratyphi C, RKS4594, and compared it with previously sequenced Salmonella strains. RKS4594 contains a chromosome of 4,833,080 bp and a plasmid of 55,414 bp. We predicted 4,640 intact coding sequences (4,578 in the chromosome and 62 in the plasmid) and 152 pseudogenes (149 in the chromosome and 3 in the plasmid). RKS4594 shares as many as 4346 of the 4,640 genes with a strain of S. choleraesuis, which is primarily a swine pathogen, but only 4008 genes with another human-adapted typhoid agent, S. typhi. Comparison of 3691 genes shared by all six sequenced Salmonella strains placed S. paratyphi C and S. choleraesuis together at one end, and S. typhi at the opposite end, of the phylogenetic tree, demonstrating separate ancestries of the human-adapted typhoid agents. S. paratyphi C seemed to have suffered enormous selection pressures during its adaptation to man as suggested by the differential nucleotide substitutions and different sets of pseudogenes, between S. paratyphi C and S. choleraesuis.

    Conclusions
    S. paratyphi C does not share a common ancestor with other human-adapted typhoid agents, supporting the convergent evolution model of the typhoid agents. S. paratyphi C has diverged from a common ancestor with S. choleraesuis by accumulating genomic novelty during adaptation to man.

  • Analyzing Gene Expression from Marine Microbial Communities using Environmental Transcriptomics.
    Rachel S. Poretsky, Scott Gifford, Johanna Rinta-Kanto, Maria Vila-Costa, Mary Ann Moran.
    JoVE. 24 (2009) | doi:10.3791/1086 | PMID:19229184
    Analogous to metagenomics, environmental transcriptomics (metatranscriptomics) retrieves and sequences environmental mRNAs from a microbial assemblage without prior knowledge of what genes the community might be expressing. Thus it provides the most unbiased perspective on community gene expression in situ. Environmental transcriptomics protocols are technically difficult since prokaryotic mRNAs generally lack the poly(A) tails that make isolation of eukaryotic messages relatively straightforward and because of the relatively short half lives of mRNAs. In addition, mRNAs are much less abundant than rRNAs in total RNA extracts, thus an rRNA background often overwhelms mRNA signals. However, techniques for overcoming some of these difficulties have recently been developed. A procedure for analyzing environmental transcriptomes by creating clone libraries using random primers to reverse-transcribe and amplify environmental mRNAs was recently described and was successful in two different natural environments, but results were biased by selection of the random primers used to initiate cDNA synthesis. Advances in linear amplification of mRNA obviate the need for random primers in the amplification step and make it possible to use less starting material, decreasing the collection and processing time of samples and thereby minimizing RNA degradation. In vitro transcription methods for amplifying mRNA involve polyadenylating the mRNA and incorporating a T7 promoter onto the 3' end of the transcript. Amplified RNA (aRNA) can then be converted to double stranded cDNA using random hexamers and directly sequenced by pyrosequencing. A first use of this method at Station ALOHA demonstrated its utility for characterizing microbial community gene expression.

  • Papers of Note from In Sequence, Feb 2009 (2)

    2009-03-11 20:36:50 | Science News
  • Statistical Inferences for Isoform Expression in RNA-Seq.
    Hui Jiang, Wing Hung Wong.
    Bioinformatics, Advance Access | doi:10.1093/bioinformatics/btp113 | PMID:19244387
    The development of RNA sequencing (RNA-Seq) makes it possible for us to measure transcription at an unprecedented precision and throughput. However challenges remain in understanding the source and distribution of the reads, modeling the transcript abundance and developing efficient computational methods. In this paper, we develop a method to deal with the isoform expression estimation problem. The count of reads falling into a locus on the genome annotated with multiple isoforms is modeled as a Poisson variable. The expression of each individual isoform is estimated by solving a convex optimization problem and statistical inferences about the parameters are obtained from the posterior distribution by importance sampling. Our results show that isoform expression inference in RNA-Seq is possible by employing appropriate statistical methods.
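
To make the Poisson model concrete, here is a minimal sketch that estimates isoform abundances from per-region read counts. It assumes each region's count is Poisson with a rate equal to the sum, over the isoforms that contain the region, of an exposure (here simply region length) times the isoform's abundance. The exposure matrix, the function name, and the EM-style multiplicative update are assumptions made for illustration. The paper's own convex-optimization solver and its importance-sampling step for posterior inference are not reproduced.

```python
def estimate_isoform_expression(counts, exposure, n_iter=500):
    """Point estimate of theta in counts[j] ~ Poisson(sum_i exposure[j][i] * theta[i]).

    counts   : read count per region (for example, per exon)
    exposure : exposure[j][i] > 0 if isoform i contains region j, else 0
    Uses the classic EM update for a Poisson linear model. The
    log-likelihood is concave in theta, so the iteration climbs toward
    the maximum-likelihood estimate.
    Assumption: every region is covered by at least one isoform.
    """
    n_regions = len(counts)
    n_isoforms = len(exposure[0])
    col_sums = [sum(exposure[j][i] for j in range(n_regions))
                for i in range(n_isoforms)]
    theta = [1.0] * n_isoforms
    for _ in range(n_iter):
        # expected count per region under the current abundances
        rates = [sum(exposure[j][i] * theta[i] for i in range(n_isoforms))
                 for j in range(n_regions)]
        # reassign observed reads to isoforms in proportion, then renormalize
        theta = [theta[i] / col_sums[i]
                 * sum(exposure[j][i] * counts[j] / rates[j]
                       for j in range(n_regions))
                 for i in range(n_isoforms)]
    return theta


# Toy locus: three exons of 100, 50 and 100 bp.
# Isoform 0 uses all three exons; isoform 1 skips the middle exon.
toy_exposure = [[100, 100],
                [50, 0],
                [100, 100]]
toy_counts = [500, 100, 500]   # noiseless counts for abundances (2, 3)
toy_theta = estimate_isoform_expression(toy_counts, toy_exposure)
```

With these noiseless toy counts the estimate should settle close to abundances of 2 and 3, because only the middle exon distinguishes the two isoforms and its count pins down the first one.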

  • Electrophysiological Study of Single Gold Nanoparticle/α-Hemolysin Complex Formation: A Nanotool to Slow Down ssDNA Through the α-Hemolysin Nanopore.
    Yann Astier, Oktay Uzun, Francesco Stellacci.
    Small, Early View | doi:10.1002/smll.200801779 | PMID:19242940
    Single-monolayer-protected gold nanoparticles can be captured in the α-hemolysin nanopore. Single-nanopore ion conductance studies of the nanoparticle/nanopore complex are described. The effect of the nanoparticle size, charge, and surface coating on ssDNA threading speed through the nanopore/nanoparticle complex is discussed.

  • QSRA – a quality-value guided de novo short read assembler.
    Douglas W Bryant Jr, Weng-Keen Wong, Todd C Mockler.
    BMC Bioinformatics 10, 69 (2009) | doi:10.1186/1471-2105-10-69 | PMID:
    Background
    New rapid high-throughput sequencing technologies have sparked the creation of a new class of assembler. Since all high-throughput sequencing platforms incorporate errors in their output, short-read assemblers must be designed to account for this error while utilizing all available data.

    Results
    We have designed and implemented an assembler, Quality-value guided Short Read Assembler, created to take advantage of quality-value scores as a further method of dealing with error. Compared to previous published algorithms, our assembler shows significant improvements not only in speed but also in output quality.

    Conclusion
    QSRA generally produced the highest genomic coverage, while being faster than VCAKE. QSRA is extremely competitive in its longest contig and N50/N80 contig lengths, producing results of similar quality to those of EDENA and VELVET. QSRA provides a step closer to the goal of de novo assembly of complex genomes, improving upon the original VCAKE algorithm by not only drastically reducing runtimes but also increasing the viability of the assembly algorithm through further error handling capabilities.
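
For readers unfamiliar with the N50/N80 statistics used to compare assemblers, the sketch below computes them under the usual definition: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches the given percentage of all assembled bases. The function name `nx` is hypothetical and not taken from any of the assemblers named above.

```python
def nx(lengths, x=50):
    """Return the Nx contig length (N50 when x=50, N80 when x=80).

    Contigs are summed from longest to shortest; the answer is the length
    of the contig at which the running total first reaches x percent of
    the total assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length


# Toy assembly with nine contigs totalling 54 bases.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = nx(toy_contigs, 50)   # 10 + 9 + 8 = 27 reaches half of 54, so N50 is 8
toy_n80 = nx(toy_contigs, 80)   # running total reaches 43.2 at the contig of length 5
```

A higher Nx means more of the assembly sits in long contigs, but the statistic says nothing about whether those contigs are correct, which is why it is usually reported alongside coverage and error measures.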

  • Alterations in Genes of the EGFR Signaling Pathway and Their Relationship to EGFR Tyrosine Kinase Inhibitor Sensitivity in Lung Cancer Cell Lines.
    Jeet Gandhi, Jianling Zhang, Yang Xie, Junichi Soh, Hisayuki Shigematsu, Wei Zhang, Hiromasa Yamamoto, Michael Peyton, Luc Girard, William W. Lockwood, Wan L. Lam, Marileila Varella-Garcia, John D. Minna, Adi F. Gazdar.
    PLoS ONE 4, e4576 (2009) | doi:10.1371/journal.pone.0004576 | PMID:19238210
    Background
    Deregulation of EGFR signaling is common in non-small cell lung cancers (NSCLC) and this finding led to the development of tyrosine kinase inhibitors (TKIs) that are highly effective in a subset of NSCLC. Mutations of EGFR (mEGFR) and copy number gains (CNGs) of EGFR (gEGFR) and HER2 (gHER2) have been reported to predict for TKI response. Mutations in KRAS (mKRAS) are associated with primary resistance to TKIs.

    Methodology/Principal Findings
    We investigated the relationship between mutations, CNGs and response to TKIs in a large panel of NSCLC cell lines. Genes studied were EGFR, HER2, HER3, HER4, KRAS, BRAF and PIK3CA. Mutations were detected by sequencing, while CNGs were determined by quantitative PCR (qPCR), fluorescence in situ hybridization (FISH) and array comparative genomic hybridization (aCGH). IC50 values for the TKIs gefitinib (Iressa) and erlotinib (Tarceva) were determined by MTS assay. For any of the seven genes tested, mutations (39/77, 50.6%), copy number gains (50/77, 64.9%) or either (65/77, 84.4%) were frequent in NSCLC lines. Mutations of EGFR (13%) and KRAS (24.7%) were frequent, while they were less frequent for the other genes. The three techniques for determining CNG were well correlated, and qPCR data were used for further analyses. CNGs were relatively frequent for EGFR and KRAS in adenocarcinomas. While mutations were largely mutually exclusive, CNGs were not. EGFR and KRAS mutant lines frequently demonstrated mutant allele specific imbalance i.e. the mutant form was usually in great excess compared to the wild type form. On a molar basis, sensitivity to gefitinib and erlotinib were highly correlated. Multivariate analyses led to the following results:
    1. mEGFR and gEGFR and gHER2 were independent factors related to gefitinib sensitivity, in descending order of importance.
    2. mKRAS was associated with increased in vitro resistance to gefitinib.

    Conclusions/Significance
    Our in vitro studies confirm and extend clinical observations and demonstrate the relative importance of both EGFR mutations and CNGs and HER2 CNGs in the sensitivity to TKIs.

  • Massive transcriptional start site analysis of human genes in hypoxia cells.
    Katsuya Tsuchihara, Yutaka Suzuki, Hiroyuki Wakaguri, Takuma Irie, Kousuke Tanimoto, Shin-ichi Hashimoto, Kouji Matsushima, Junko Mizushima-Sugano, Riu Yamashita, Kenta Nakai, David Bentley, Hiroyasu Esumi, Sumio Sugano.
    Nucleic Acids Research, Advance Access | doi:10.1093/nar/gkp066 | PMID:19237398
    Combining our full-length cDNA method and the massively parallel sequencing technology, we developed a simple method to collect precise positional information of transcriptional start sites (TSSs) together with digital information of the gene-expression levels in a high throughput manner. We applied this method to observe gene-expression changes in a colon cancer cell line cultured in normoxic and hypoxic conditions. We generated more than 100 million 36-base TSS-tag sequences and revealed comprehensive features of hypoxia responsive alterations in the transcriptional landscape of the human genome. The features include presence of inducible ‘hot regions’ in 54 genomic regions, 220 novel hypoxia inducible promoters that may drive non-protein-coding transcripts, 191 hypoxia responsive alternative promoters and detailed views of 120 novel as well as known hypoxia responsive genes. We further analyzed hypoxic response of different cells using additional 60 million TSS-tags and found that the degree of the gene-expression changes were different among cell lines, possibly reflecting cellular robustness against hypoxia. The novel dynamic figure of the human gene transcriptome will deepen our understanding of the transcriptional program of the human genome as well as bringing new insights into the biology of cancer cells in hypoxia.

  • Novel sequencing strategy for repetitive DNA in a Drosophila BAC clone reveals that the centromeric region of the Y chromosome evolved from a telomere.
    María Méndez-Lago, Jadwiga Wild, Siobhan L. Whitehead, Alan Tracey, Beatriz de Pablos, Jane Rogers, Waclaw Szybalski, Alfredo Villasante.
    Nucleic Acids Research, Advance Access | doi:10.1093/nar/gkp085 | PMID:19237394
    The centromeric and telomeric heterochromatin of eukaryotic chromosomes is mainly composed of middle-repetitive elements, such as transposable elements and tandemly repeated DNA sequences. Because of this repetitive nature, Whole Genome Shotgun Projects have failed in sequencing these regions. We describe a novel kind of transposon-based approach for sequencing highly repetitive DNA sequences in BAC clones. The key to this strategy relies on physical mapping the precise position of the transposon insertion, which enables the correct assembly of the repeated DNA. We have applied this strategy to a clone from the centromeric region of the Y chromosome of Drosophila melanogaster. The analysis of the complete sequence of this clone has allowed us to prove that this centromeric region evolved from a telomere, possibly after a pericentric inversion of an ancestral telocentric chromosome. Our results confirm that the use of transposon-mediated sequencing, including positional mapping information, improves current finishing strategies. The strategy we describe could be a universal approach to resolving the heterochromatic regions of eukaryotic genomes.

  • Papers of Note from In Sequence, Feb 2009 (1)

    2009-03-11 20:35:36 | Science News
  • Chemically modified primers for improved multiplex PCR.
    Jonathan Shum, Natasha Paul.
    Analytical Biochemistry, In press | doi:10.1016/j.ab.2009.02.033 | PMID:19258004
    Multiplexed PCR, the amplification of multiple targets in a single reaction, presents a new set of challenges that further complicate more traditional PCR set-ups. These complications include a greater probability for non-specific amplicon formation and for imbalanced amplification of different targets, each of which can compromise quantification and detection of multiple targets. Despite these difficulties, multiplex PCR is frequently used in such applications as pathogen detection, RNA quantification, mutation analysis and now, next generation DNA sequencing. Herein, we investigate the utility of primers with one or two thermolabile 4-oxo-1-pentyl phosphotriester modifications in improving multiplex PCR performance. Initial endpoint and real-time analyses reveal a decrease in off-target amplification and subsequent increase in amplicon yield. Furthermore, the use of modified primers in multiplex set-ups revealed a greater limit of detection and more uniform amplification of each target as compared to unmodified primers. Overall, the thermolabile modified primers present a novel and exciting avenue in improving multiplex PCR performance.

  • Genome Sequences of Three Agrobacterium Biovars Help Elucidate the Evolution of Multi-Chromosome Genomes in Bacteria.
    Steven C. Slater, Barry S. Goldman, Brad Goodner, João C. Setubal, Stephen K. Farrand, Eugene W. Nester, Thomas J. Burr, Lois Banta, Allan W. Dickerman, Ian Paulsen, Leon Otten, Garret Suen, Roy Welch, Nalvo F. Almeida, Frank Arnold, Oliver T. Burton, Zijin Du, Adam Ewing, Eric Godsy, Sara Heisel, Kathryn L. Houmiel, Jinal Jhaveri, Jing Lu, Nancy M. Miller, Stacie Norton, Qiang Chen, Waranyoo Phoolcharoen, Victoria Ohlin, Dan Ondrusek, Nicole Pride, Shawn L. Stricklin, Jian Sun, Cathy Wheeler, Lindsey Wilson, Huijun Zhu, Derek W. Wood.
    J. Bacteriol. Accepts | doi:10.1128/JB.01779-08 | PMID:19251847
    The family Rhizobiaceae contains plant-associated bacteria with critical roles in ecology and agriculture. Within this family, many Rhizobium and Sinorhizobium strains are nitrogen-fixing plant mutualists, while many strains designated as Agrobacterium are plant pathogens. These contrasting lifestyles are primarily dependent on the transmissible plasmids each strain harbors. Members of Rhizobiaceae also have diverse genome architectures that include single chromosomes, multiple chromosomes, and plasmids of various sizes. Agrobacterium strains have been divided into three Biovars, based on physiological and biochemical properties. The genome of a Biovar I strain, A. tumefaciens C58, has been previously sequenced. In this study the genomes of the Biovar II strain A. radiobacter K84, a commercially available biological control strain that inhibits certain pathogenic agrobacteria, and the Biovar III strain A. vitis S4, a narrow host range strain that infects grapes and invokes a hypersensitive response on non-host plants, were fully sequenced and annotated. Comparison with other sequenced members of the α-proteobacteria provides new data on evolution of multi-partite bacterial genomes. Primary chromosomes show extensive conservation of both gene content and order. In contrast, secondary chromosomes share smaller percentages of genes, and conserved gene order is restricted to short blocks. We propose that secondary chromosomes originated from an ancestral plasmid to which genes have been transferred from a progenitor primary chromosome. Similar patterns are observed in select β- and γ-proteobacteria species. Together these results define the evolution of chromosome architecture and gene content among the Rhizobiaceae and support a generalized mechanism for second chromosome formation among bacteria.

  • ABySS: A parallel assembler for short read sequence data.
    Jared T Simpson, Kim Wong, Shaun D Jackman, Jacqueline E Schein, Steven JM Jones, Inanc Birol.
    Genome Res., Published in Advance | doi:10.1101/gr.089532.108 | PMID:19251739
    Widespread adoption of massively parallel DNA sequencing instruments has prompted the recent development of de novo short read assembly algorithms. A common shortcoming of the available tools is their inability to efficiently assemble vast amounts of data generated from large-scale sequencing projects, such as the sequencing of individual human genomes to catalog natural genetic variation. To address this limitation, we developed ABySS (Assembly By Short Sequences), a parallelized sequence assembler. As a demonstration of the capability of our software, we assembled 3.5 billion paired-end reads from the genome of an African male publicly released by Illumina Inc. Approximately 2.76 million contigs ≥100bp in length were created with an N50 size of 1499bp, representing 68% of the reference human genome. Analysis of these contigs identified polymorphic and novel sequences not present in the human reference assembly, which were validated by alignment to alternate human assemblies and to other primate genomes.
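
    Note on the N50 figure quoted in the abstract: N50 is commonly defined as the contig length L such that contigs of length L or longer account for at least half of the total assembled bases. The sketch below uses that common definition; the function name and the toy contig lengths are made up for illustration, and the paper may apply a slightly different convention (for example, counting only contigs of 100bp or more).

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Assumes the common definition: the length L such that contigs of
    length >= L together cover at least half of the total bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths (hypothetical, not data from the paper).
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))
```

    Sorting from longest to shortest means a few long contigs can dominate the value, so N50 is usually read together with the contig count and the fraction of the reference covered, as the abstract does.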

  • Rapid Annotation of Anonymous Sequences from Genome Projects Using Semantic Similarities and a Weighting Scheme in Gene Ontology.
    Paolo Fontana, Alessandro Cestaro, Riccardo Velasco, Elide Formentin, Stefano Toppo.
    PLoS ONE 4, e4619 (2009) | doi:10.1371/journal.pone.0004619 | PMID:19247487
    Background
    Large-scale sequencing projects have now become routine lab practice and this has led to the development of a new generation of tools involving function prediction methods, bringing the latter back to the fore. The advent of Gene Ontology, with its structured vocabulary and paradigm, has provided computational biologists with an appropriate means for this task.

    Methodology
    We present here a novel method called ARGOT (Annotation Retrieval of Gene Ontology Terms) that is able to process quickly thousands of sequences for functional inference. The tool exploits for the first time an integrated approach which combines clustering of GO terms, based on their semantic similarities, with a weighting scheme which assesses retrieved hits sharing a certain number of biological features with the sequence to be annotated. These hits may be obtained by different methods and in this work we have based ARGOT processing on BLAST results.

    Conclusions
    The extensive benchmark involved 10,000 protein sequences, the complete S. cerevisiae genome and a small subset of proteins for purposes of comparison with other available tools. The algorithm was proven to outperform existing methods and to be suitable for function prediction of single proteins due to its high degree of sensitivity, specificity and coverage.


  • Comparative analysis of H2A.Z nucleosome organization in human and yeast genome.
    Michael Y Tolstorukov, Peter V Kharchenko, Joseph A Goldman, Robert E Kingston, Peter J Park.
    Genome Res., Published in Advance | doi:10.1101/gr.084830.108 | PMID:19246569
    Eukaryotic DNA is wrapped around a histone protein core to constitute the fundamental repeating units of chromatin, the nucleosomes. The affinity of the histone core for DNA depends on the nucleotide sequence; however, it is unclear to what extent DNA sequence determines nucleosome positioning in vivo, and if the same rules of sequence-directed positioning apply to genomes of varying complexity. Using the data generated by high-throughput DNA sequencing combined with chromatin immunoprecipitation, we have identified positions of nucleosomes containing the H2A.Z histone variant and histone H3 tri-methylated at lysine 4 in human CD4+ T cells. We find that the 10-bp periodicity observed in nucleosomal sequences in yeast and other organisms is not pronounced in human nucleosomal sequences. This result was confirmed for a broader set of mononucleosomal fragments that were not selected for any specific histone variant or modification. We also find that human H2A.Z nucleosomes protect only about 120-bp of DNA from MNase digestion and exhibit specific sequence preferences, suggesting a novel mechanism of nucleosome organization for the H2A.Z variant.
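
    Note on the "10-bp periodicity": one simple way to look for such a signal is to count the distances between occurrences of selected dinucleotides and check for peaks at multiples of about 10 bp. The sketch below is only an illustration of that idea, not the authors' method; the function name, the choice of AA/TT, and the toy sequence are all assumptions.

```python
from collections import Counter


def dinucleotide_distance_counts(seq, dinucs=("AA", "TT"), max_dist=30):
    """Count distances between start positions of the given dinucleotides.

    Illustrative only: real analyses average over many aligned
    nucleosomal fragments and correct for base composition.
    """
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] in dinucs]
    counts = Counter()
    for idx, start in enumerate(positions):
        for other in positions[idx + 1:]:
            dist = other - start
            if dist > max_dist:
                break  # positions are sorted, so later ones are farther away
            counts[dist] += 1
    return counts


# Toy sequence (hypothetical): an AA dinucleotide every 10 bp.
toy_seq = ("AA" + "C" * 8) * 5
counts = dinucleotide_distance_counts(toy_seq)
for dist in sorted(counts):
    print(dist, counts[dist])
```

    In a sequence with strong periodicity the counts concentrate at distances near 10, 20 and 30; a flat distribution would correspond to the weak signal the abstract reports for human nucleosomal sequences.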

    Virus-free iPS cell induction paper

    2009-03-05 12:43:42 | Science News
  • Virus-free induction of pluripotency and subsequent excision of reprogramming factors.
    Keisuke Kaji, Katherine Norrby, Agnieszka Paca, Maria Mileikovsky, Paria Mohseni, Knut Woltjen.
    Nature, Advance online publication | doi:10.1038/nature07864 | PMID:19252477
    Reprogramming of somatic cells to pluripotency, thereby creating induced pluripotent stem (iPS) cells, promises to transform regenerative medicine. Most instances of direct reprogramming have been achieved by forced expression of defined factors using multiple viral vectors. However, such iPS cells contain a large number of viral vector integrations, any one of which could cause unpredictable genetic dysfunction. Whereas c-Myc is dispensable for reprogramming, complete elimination of the other exogenous factors is also desired because ectopic expression of either Oct4 (also known as Pou5f1) or Klf4 can induce dysplasia. Two transient transfection-reprogramming methods have been published to address this issue. However, the efficiency of both approaches is extremely low, and neither has been applied successfully to human cells so far. Here we show that non-viral transfection of a single multiprotein expression vector, which comprises the coding sequences of c-Myc, Klf4, Oct4 and Sox2 linked with 2A peptides, can reprogram both mouse and human fibroblasts. Moreover, the transgene can be removed once reprogramming has been achieved. iPS cells produced with this non-viral vector show robust expression of pluripotency markers, indicating a reprogrammed state confirmed functionally by in vitro differentiation assays and formation of adult chimaeric mice. When the single-vector reprogramming system was combined with a piggyBac transposon, we succeeded in establishing reprogrammed human cell lines from embryonic fibroblasts with robust expression of pluripotency markers. This system minimizes genome modification in iPS cells and enables complete elimination of exogenous reprogramming factors, efficiently providing iPS cells that are applicable to regenerative medicine, drug screening and the establishment of disease models.
    The key trick seems to be combining the plasmid vector with the piggyBac transposon.