Overview of the DE analysis (Mary Piper, Bioinformatics Consultant)



SLIDE 1

DataCamp RNA-Seq Differential Expression Analysis

Overview of the DE analysis

RNA-SEQ DIFFERENTIAL EXPRESSION ANALYSIS

Mary Piper

Bioinformatics Consultant and Trainer

SLIDE 2

Review the dataset/question

SLIDE 3

Overview of the DE analysis

SLIDE 4

SLIDE 5

DESeq2 workflow: Model

# Create DESeq object
dds_wt <- DESeqDataSetFromMatrix(countData = wt_rawcounts,
                                 colData = reordered_wt_metadata,
                                 design = ~ condition)
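DESeqDataSetFromMatrix() expects the columns of the count matrix and the rows of the metadata to describe the same samples in the same order, which is presumably why the metadata object is named reordered_wt_metadata. The sketch below shows one way such a reordering could be done in base R. The objects toy_counts and toy_metadata are hypothetical and are not part of the course data.

```r
# Hypothetical toy data: 2 genes x 3 samples, metadata rows in a different order
toy_counts <- matrix(c(10, 0, 25, 3, 40, 7), nrow = 2,
                     dimnames = list(c("gene1", "gene2"),
                                     c("s1", "s2", "s3")))
toy_metadata <- data.frame(condition = c("fibrosis", "normal", "normal"),
                           row.names = c("s3", "s1", "s2"))

# Find, for each count column, the matching metadata row
idx <- match(colnames(toy_counts), rownames(toy_metadata))

# Reorder the metadata rows to follow the column order of the counts
reordered_toy_metadata <- toy_metadata[idx, , drop = FALSE]

# Sample names should now agree, position by position
all(rownames(reordered_toy_metadata) == colnames(toy_counts))
```

If the final check is not TRUE, the count columns and metadata rows are still misaligned and should be fixed before the DESeq object is created.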

SLIDE 6

DESeq2 workflow: Design formula

# Design formula
~ strain + sex + treatment

SLIDE 7

DESeq2 workflow: Design formula

# Design formula
~ strain + sex + treatment + sex:treatment
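To see what the sex:treatment term adds, it can help to compare the model matrices that base R builds from the two formulas. This is only an illustrative sketch. The metadata table toy_meta and its level names (A/B, F/M, ctrl/trt) are hypothetical.

```r
# Hypothetical metadata: every combination of strain, sex and treatment
toy_meta <- expand.grid(strain = c("A", "B"),
                        sex = c("F", "M"),
                        treatment = c("ctrl", "trt"))

# Additive design: intercept plus one coefficient per non-reference level
additive <- model.matrix(~ strain + sex + treatment, data = toy_meta)

# Same design with an extra term for the sex-by-treatment interaction
with_interaction <- model.matrix(~ strain + sex + treatment + sex:treatment,
                                 data = toy_meta)

colnames(additive)
colnames(with_interaction)
```

The second matrix has one additional column, which lets the treatment effect differ between the sexes instead of being assumed equal.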

SLIDE 8

DESeq2 workflow: Running

# Run analysis
dds_wt <- DESeq(dds_wt)

using pre-existing size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing

SLIDE 9

Let's practice!

RNA-SEQ DIFFERENTIAL EXPRESSION ANALYSIS

SLIDE 10

DESeq2 model

RNA-SEQ DIFFERENTIAL EXPRESSION ANALYSIS

Mary Piper

Bioinformatics Consultant and Trainer

SLIDE 11

DESeq2 model

SLIDE 12

DESeq2 model - mean-variance relationship

# Syntax for apply()
apply(data, rows/columns, function_to_apply)

# Calculating mean for each gene (each row)
mean_counts <- apply(wt_rawcounts[, 1:3], 1, mean)

# Calculating variance for each gene (each row)
variance_counts <- apply(wt_rawcounts[, 1:3], 1, var)

SLIDE 13

DESeq2 model - dispersion

Plotting relationship between mean and variance:

library(ggplot2)

# Creating data frame with mean and variance for every gene
df <- data.frame(mean_counts, variance_counts)

# Plotting mean against variance on log10 axes
ggplot(df) +
  geom_point(aes(x = mean_counts, y = variance_counts)) +
  scale_y_log10() +
  scale_x_log10() +
  xlab("Mean counts per gene") +
  ylab("Variance per gene")

SLIDE 14

DESeq2 model - dispersion

SLIDE 15

DESeq2 model - dispersion

Var: variance
μ: mean
α: dispersion

Dispersion formula: $\mathrm{Var} = \mu + \alpha \mu^2$

Relationship between mean, variance and dispersion:
↑ variance ⇒ ↑ dispersion
↑ mean ⇒ ↓ dispersion
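Rearranging the dispersion formula gives α = (Var − μ) / μ², which makes both relationships visible: a larger variance raises α, and a larger mean lowers it. The sketch below applies that rearrangement to hypothetical toy counts. It is a rough per-gene calculation for intuition only, not the estimator DESeq2 uses, which works on normalized counts and shrinks the gene-wise estimates toward a fitted trend.

```r
# Hypothetical counts: 3 genes (rows) across 3 replicates (columns)
toy <- matrix(c(5, 8, 11,
                100, 140, 90,
                1000, 1300, 900),
              nrow = 3, byrow = TRUE)

# Per-gene mean and variance, as in the apply() calls shown earlier
mu <- apply(toy, 1, mean)
v <- apply(toy, 1, var)

# Rearranging Var = mu + alpha * mu^2 gives alpha = (Var - mu) / mu^2
alpha_rough <- (v - mu) / mu^2

alpha_rough
```

With only three replicates per gene, estimates like these are very noisy, which is the motivation for sharing information across genes.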

SLIDE 16

DESeq2 model - dispersion

# Plot dispersion estimates
plotDispEsts(dds_wt)

SLIDE 17

DESeq2 model - dispersion

SLIDE 18

Let's practice!

RNA-SEQ DIFFERENTIAL EXPRESSION ANALYSIS

SLIDE 19

DESeq2 model - contrasts

RNA-SEQ DIFFERENTIAL EXPRESSION ANALYSIS

Mary Piper

Bioinformatics Consultant and Trainer

SLIDE 20

DESeq2 workflow

SLIDE 21

DESeq2 workflow

# Run analysis
dds_wt <- DESeq(dds_wt)

using pre-existing size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing

SLIDE 22

DESeq2 Negative Binomial Model

SLIDE 23

DESeq2 Negative Binomial Model

SLIDE 24

DESeq2 contrasts

results(dds_wt, alpha = 0.05)

SLIDE 25

DESeq2 contrasts

The syntax is:

results(dds,
        contrast = c("condition_factor", "level_to_compare", "base_level"),
        alpha = 0.05)

wt_res <- results(dds_wt,
                  contrast = c("condition", "fibrosis", "normal"),
                  alpha = 0.05)
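The base level matters because it fixes the direction of the reported log2 fold changes: here, positive values mean higher expression in fibrosis relative to normal. When no contrast is given, the comparison depends on the factor's level order, and R orders levels alphabetically by default. The sketch below uses a hypothetical condition vector to show that default and how relevel() changes it.

```r
# Hypothetical condition labels for four samples
condition <- factor(c("normal", "fibrosis", "normal", "fibrosis"))

# Default level order is alphabetical, so "fibrosis" is the reference level
default_levels <- levels(condition)

# Make "normal" the reference (base) level explicitly
condition <- relevel(condition, ref = "normal")

levels(condition)
```

Naming both levels in the contrast argument, as in the call above, avoids relying on the level order at all.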

SLIDE 26

DESeq2 contrasts

wt_res

SLIDE 27

DESeq2 LFC shrinkage

plotMA(wt_res, ylim=c(-8,8))

SLIDE 28

LFC shrinkage

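# type = "normal" is set explicitly: recent DESeq2 releases default to
# type = "apeglm", which does not accept the contrast argument
wt_res <- lfcShrink(dds_wt,
                    contrast = c("condition", "fibrosis", "normal"),
                    res = wt_res,
                    type = "normal")

plotMA(wt_res, ylim = c(-8, 8))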

SLIDE 29

LFC shrinkage

SLIDE 30

Let's practice!

RNA-SEQ DIFFERENTIAL EXPRESSION ANALYSIS

SLIDE 31

DESeq2 results

RNA-SEQ DIFFERENTIAL EXPRESSION ANALYSIS

Mary Piper

Bioinformatics Consultant and Trainer

SLIDE 32

DESeq2 results table

mcols(wt_res)

SLIDE 33

DESeq2 results table

head(wt_res, n=10)

log2 fold change (MAP): condition fibrosis vs normal
Wald test p-value: condition fibrosis vs normal
DataFrame with 6 rows and 6 columns
                           baseMean   log2FoldChange             lfcSE
                          <numeric>        <numeric>         <numeric> <n
ENSMUSG00000102693                0               NA                NA
ENSMUSG00000064842                0               NA                NA
ENSMUSG00000051951 19.5084656230804 3.55089043143673 0.648400500074659 4.6687184
ENSMUSG00000102851                0               NA                NA
ENSMUSG00000103377                0               NA                NA
ENSMUSG00000104017                0               NA                NA
                                 pvalue                 padj
                              <numeric>            <numeric>
ENSMUSG00000102693                   NA                   NA
ENSMUSG00000064842                   NA                   NA
ENSMUSG00000051951 3.03084428526558e-06 1.93776447202312e-05
ENSMUSG00000102851                   NA                   NA
ENSMUSG00000103377                   NA                   NA
ENSMUSG00000104017                   NA                   NA
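The padj column holds p-values adjusted for multiple testing, and genes with no counts carry NA throughout. As a rough illustration of how such an adjustment behaves, the sketch below applies base R's p.adjust() with the Benjamini-Hochberg method, which is assumed here to match the DESeq2 default, to a hypothetical vector of p-values. DESeq2 also applies independent filtering before adjusting, which this sketch does not reproduce.

```r
# Hypothetical raw p-values; NA stands in for a gene that was not tested
p <- c(0.001, 0.01, 0.02, 0.04, NA)

# Benjamini-Hochberg adjustment; NA entries are ignored and stay NA
padj <- p.adjust(p, method = "BH")

padj
```

Adjusted values are never smaller than the raw p-values, so fewer genes pass a 0.05 cutoff on padj than on pvalue.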

SLIDE 34

Significant DE genes - summary

summary(wt_res)

SLIDE 35

Significant DE genes - fold-change threshold

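wt_res <- results(dds_wt,
                  contrast = c("condition", "fibrosis", "normal"),
                  alpha = 0.05,
                  lfcThreshold = 0.32)

# type = "normal" is set explicitly: recent DESeq2 releases default to
# type = "apeglm", which does not accept the contrast argument
wt_res <- lfcShrink(dds_wt,
                    contrast = c("condition", "fibrosis", "normal"),
                    res = wt_res,
                    type = "normal")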
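The lfcThreshold argument is on the log2 scale, so 0.32 is easier to interpret after converting it back to a linear fold change. The short calculation below does that conversion; the variable names are illustrative only.

```r
# Threshold as passed to results(), on the log2 scale
lfc_threshold <- 0.32

# Convert to a linear fold change: about 1.25, i.e. roughly a 25% increase
fold_change <- 2^lfc_threshold

# Going the other way: a 1.25-fold change is about 0.32 on the log2 scale
back_to_log2 <- log2(1.25)

fold_change
back_to_log2
```

To require a different minimum fold change, take log2() of the desired value and pass the result as lfcThreshold.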

SLIDE 36

Significant DE genes - summary

summary(wt_res)

SLIDE 37

Results - annotate

library(annotables)
grcm38

SLIDE 38

Results - extract

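library(dplyr)
library(tibble)

# Convert results to a data frame, moving Ensembl IDs from row names into a column
wt_res_all <- data.frame(wt_res) %>%
  rownames_to_column(var = "ensgene")

# Add gene symbols and descriptions from the annotation table
wt_res_all <- left_join(x = wt_res_all,
                        y = grcm38[, c("ensgene", "symbol", "description")],
                        by = "ensgene")

View(wt_res_all)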

SLIDE 39

Significant DE genes - arrange

# Keep significant genes only
wt_res_sig <- subset(wt_res_all, padj < 0.05)

# Order by adjusted p-value, most significant first
wt_res_sig <- wt_res_sig %>%
  arrange(padj)

View(wt_res_sig)

SLIDE 40

SLIDE 41

Let's practice!

RNA-SEQ DIFFERENTIAL EXPRESSION ANALYSIS