Outline: Introduction to RNA-seq



SLIDE 1

Outline

Todos Santos 2018

§ Introduction to RNA-seq
§ Sample preparation
§ Quality control
§ Transcript assembly
§ Read alignment
§ Differential gene expression
§ Data visualization and plotting

SLIDE 2

Regulation of transcription:
§ Transcription factors
§ Histone modifications
§ DNA methylation
Regulation of RNA processing:
§ Polyadenylation
§ Splicing
§ Capping
§ RNA export
Regulation of translation:
§ mRNA decay
§ Translational repression
§ Sequestration
Posttranslational regulation:
§ Chemical modifications (e.g. phosphorylation)
§ Protein turnover (proteolysis)

Fu et al. (2014)

RNA-seq measures steady-state mRNA levels and RNA sequence composition

Regulation of gene expression

SLIDE 3

Reuter et al. (2015)

RNA-seq is the most common HTS application

SLIDE 4

§ Use high-quality RNA as starting material.
§ Minor differences between samples can have a substantial impact on gene expression.
§ Three biological replicates is the default but not ideal for every situation.
§ Some recommended kits for standard RNA-seq:
  § NEBNext Ultra II Directional RNA Library Prep Kit
  § Illumina kits

Sample preparation

SLIDE 5

§ Starting RNA
  § Typically 1-5 µg of high-quality total RNA is ideal.
§ Sequencing depth
  § Typically you want about 20 million high-quality reads per library.
§ Considerations
  § Strand-specific (default is yes)
  § Single-end or paired-end (single-end is sufficient for well-annotated transcriptomes)
  § Long reads vs. short reads (short Illumina reads, 50-150 nt, are usually sufficient)
  § rRNA depletion or oligo-dT
  § Low quantity/single cell

Sample preparation

SLIDE 6

rRNA depletion

illumina.com Zhernakova et al. (2009)

Oligo-dT selection

RNA-seq library preparation

SLIDE 7

Dual Index Library shown

Slide content courtesy of Illumina

HiSeq 2500

Metzker, M.L. (2010) NRG

Library composition

SLIDE 8

@D64TDFP1:248:C50DMACXX:5:1101:1241:2095 1:N:0:ATCACG
CACCGCCCGTCGCTATCCGGGACTGGAATTCTCGGGTGCCAAGGAACTCCA
+
CCCFFFFFHHHHHJIJGHJJJJIJJJJJGGGFFFFEABDHHHFHFF@@DD>
@D64TDFP1:248:C50DMACXX:5:1101:1371:2154 1:N:0:ATCACG
TCAATATTTGCATAGGGTATCTGGAATTCTCGGGTGCCAAGGAACTCCAGT
+
CCCFFFFFHHHHHJJJJGFHIJJJJJJJJJJJJJFHHIIJJHGHJFGHJJI
@D64TDFP1:248:C50DMACXX:5:1101:1461:2205 1:N:0:ATCACG
GAAAGACGTCTTCCTAGATTATGGAATTCTCGGGTGCCAAGGAACTCCAGT
+
CCCFFFFFHHHHHJJJJJJJJJJJIJJJJJJJJJHIJJJJJGIIJFGIJJJ

Line 1: sequence ID, description, and index; begins with @
Line 2: sequence; contains only A, C, T, G, and N
Line 3: optional sequence ID; begins with +
Line 4: signal quality of each base, cryptic code, Phred+33 or Phred+64

[Figure labels: index sequence; lines 1-4 marked for Read 1, Read 2, and Read 3]
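The four-line record layout described above can be sketched in Python. This is a minimal illustration, not a replacement for an established parser: the names `FastqRecord`, `read_fastq`, and `index_sequence` are made up for this example, and the index lookup assumes the header layout shown on the slide, where the index is the text after the last colon.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    header: str    # line 1, without the leading "@"
    sequence: str  # line 2
    quality: str   # line 4, one character per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield one record for every four-line block in the file."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a blank line) stops the reader
            return
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def index_sequence(record: FastqRecord) -> str:
    """Assumes the slide's header layout: the index follows the last colon."""
    return record.header.rsplit(":", 1)[-1]


# First read from the slide, used here as in-memory example data.
EXAMPLE = (
    "@D64TDFP1:248:C50DMACXX:5:1101:1241:2095 1:N:0:ATCACG\n"
    "CACCGCCCGTCGCTATCCGGGACTGGAATTCTCGGGTGCCAAGGAACTCCA\n"
    "+\n"
    "CCCFFFFFHHHHHJIJGHJJJJIJJJJJGGGFFFFEABDHHHFHFF@@DD>\n"
)

records = list(read_fastq(io.StringIO(EXAMPLE)))
for rec in records:
    print(rec.header, len(rec.sequence), index_sequence(rec))
```

Checking that the sequence and quality lines have equal length catches most truncated or interleaved files early, which is why the sketch raises an error instead of silently skipping a record.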

Metzker, M.L. (2010) NRG

FASTQ format

SLIDE 9

Demultiplexing and quality assessment
Quality control: filter low-quality data, trim adapters
Map sequences to reference or de novo assemble reference
Custom or standard data analysis
Data visualization and presentation
fastq files downloaded from server

Data analysis workflow

SLIDE 10

Assessing Read Quality
Phred quality score: a measure of the quality of base calling:

$Q = -10 \log_{10}(P)$

where $P$ is the error probability
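The relationship between Q and P can be checked numerically. The sketch below assumes the standard base-10 logarithm in the Phred definition; the function names are chosen for this example only.

```python
import math


def phred_from_error_prob(p: float) -> float:
    """Q = -10 * log10(P), for an error probability 0 < P <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(p)


def error_prob_from_phred(q: float) -> float:
    """Inverse of the formula above: P = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


# Print the error probability for a few round-number quality scores.
for q in (10, 20, 30, 40):
    print(f"Q{q}: P = {error_prob_from_phred(q):g}")
```

Each step of 10 in Q corresponds to a tenfold drop in the probability that the base call is wrong.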

Quality control

SLIDE 11

[Figure: two panels, each showing 10 reads of 100 bases]
Panel 1: P = 0.01, Q = 20 (Q20)
Panel 2: P = ?, Q = ?

$Q = -10 \log_{10}(P)$

Q30 is a common quality threshold or quality criterion
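To connect per-base scores to a threshold such as Q30, the sketch below decodes a FASTQ quality string and summarizes it. It assumes Phred+33 encoding by default (pass `offset=64` for the older encoding), and the helper names are illustrative.

```python
from typing import List


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode each quality character into its Phred score."""
    return [ord(ch) - offset for ch in quality]


def fraction_at_least(quality: str, threshold: int = 30, offset: int = 33) -> float:
    """Fraction of bases whose score meets the threshold (Q30 by default)."""
    scores = phred_scores(quality, offset)
    if not scores:
        return 0.0
    return sum(score >= threshold for score in scores) / len(scores)


def expected_errors(quality: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities, P = 10 ** (-Q / 10)."""
    return sum(10.0 ** (-q / 10.0) for q in phred_scores(quality, offset))


# A made-up read of 100 bases, all at Q20 ("5" encodes Q20 under Phred+33).
q20_read = "5" * 100
print(fraction_at_least(q20_read), expected_errors(q20_read))
```

Under these assumptions a 100-base read at uniform Q20 carries about one expected miscall, which matches the P = 0.01 case in the first panel.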

Quality control

SLIDE 12

FastQC: a GUI tool for assessing the quality of high-throughput sequencing data.
Trimmomatic: software for trimming adapter sequences and low-quality bases from sequencing reads.
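The idea behind trimming can be illustrated with a deliberately simplified sketch. This is not Trimmomatic's implementation: it only removes low-quality bases from the 3' end and cuts at the first exact adapter match, whereas real trimmers tolerate mismatches and partial adapters. The function names, the Q20 cutoff, and the adapter string in the usage lines are assumptions for illustration.

```python
from typing import Tuple


def trim_trailing(sequence: str, quality: str,
                  min_q: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Drop bases from the 3' end while their Phred score is below min_q."""
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    end = len(sequence)
    while end > 0 and ord(quality[end - 1]) - offset < min_q:
        end -= 1
    return sequence[:end], quality[:end]


def trim_adapter(sequence: str, quality: str, adapter: str) -> Tuple[str, str]:
    """Cut the read at the first exact occurrence of the adapter, if any."""
    pos = sequence.find(adapter)
    if pos == -1:
        return sequence, quality
    return sequence[:pos], quality[:pos]


# Made-up read: the last two bases have very low quality ("#" is Q2).
seq, qual = trim_trailing("ACGTAC", "IIII##")
print(seq, qual)

# Hypothetical adapter sequence, chosen only for this example.
seq, qual = trim_adapter("ACGTTTGGAA", "IIIIIIIIII", "TGGAA")
print(seq, qual)
```

Keeping the sequence and quality strings in step is the main point: every base removed from one must be removed from the other so the record stays a valid FASTQ entry.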

Quality control

SLIDE 13

Trapnell and Salzberg (2009)

Sequence mapping/alignment

SLIDE 14

Trapnell et al. (2009)

Aligning reads to mRNAs

SLIDE 15

Trapnell et al. (2010)

Differential gene expression

SLIDE 16

Other common DE software: DESeq2, edgeR, Cuffdiff
Other mRNA aligners: STAR, GSNAP, TopHat2

RNA-seq pipelines

No reference genome? Use Trinity to assemble transcripts
Other abundance estimators: RSEM, htseq-count
Various GUIs and R-based tools for drawing plots
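As a rough illustration of what happens after read counting, the sketch below scales raw counts to counts per million (CPM) and computes a log2 fold change between two samples. It is a teaching example only: DESeq2, edgeR, and Cuffdiff fit statistical models across replicates and do not work this way. The gene names, counts, and pseudocount are made up.

```python
import math
from typing import Dict


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no counted reads")
    return {gene: count * 1e6 / total for gene, count in counts.items()}


def log2_fold_change(control: Dict[str, float], treated: Dict[str, float],
                     pseudocount: float = 1.0) -> Dict[str, float]:
    """log2(treated / control) per gene; the pseudocount avoids division by zero."""
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in control
    }


# Made-up counts for two hypothetical genes in two samples.
control_cpm = counts_per_million({"geneA": 100, "geneB": 900})
treated_cpm = counts_per_million({"geneA": 300, "geneB": 700})
fold_changes = log2_fold_change(control_cpm, treated_cpm)
for gene, lfc in fold_changes.items():
    print(gene, round(lfc, 2))
```

A single pair of samples like this gives no estimate of variability, which is one reason biological replicates and dedicated DE software are used in practice.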

Pertea et al. (2016)

SLIDE 17

Integrative Genomics Viewer (IGV)

Genome browsers

SLIDE 18

UCSC Genome Browser

Genome browsers

SLIDE 19

Trinity workflow

Haas et al. (2013)

Bowtie2, RSEM, edgeR

1. Diauxic shift
2. Heat shock
3. Log phase
4. Plateau phase

S. pombe
