Pathway Analysis Jenny Wu Outline Introduction to NGS data - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Introduction to Next Generation Sequencing (NGS) Data Analysis and Pathway Analysis Jenny Wu

slide-2
SLIDE 2

Outline

  • Introduction to NGS data analysis in Cancer Genomics
      – NGS applications in cancer research
      – Typical NGS workflows and pipeline
      – Open source software with GUI

  • Pathway Analysis and Software
  • Pathway Analysis goals and concepts
  • Commercial and open source pathway analysis software
  • Data analysis resources
  • Summary
slide-3
SLIDE 3

Next Generation Sequencing

Massively Parallel Sequencing: One can generate hundreds of millions of short sequences (up to 250bp) in a single run in a short period of time with low per base cost.

  • Illumina/Solexa GA II, HiSeq 2500, 3000, X
  • Roche/454 FLX, Titanium
  • Life Technologies/Applied Biosystems SOLiD

Reviews:

  • Metzker (2010) Nature Reviews Genetics 11:31
  • Quail et al. (2012) BMC Genomics 13:341

slide-4
SLIDE 4

NGS in Cancer Genomics

Shyr et al. 2013

slide-5
SLIDE 5

Data Analysis is the Bottleneck

(wall.hms.harvard.edu)

Informatics

slide-6
SLIDE 6

Basic NGS Workflow

Olson et al.

Workflow stages (figure labels):

  • Isolation of material
  • PCR amplification
  • End repair, size selection
  • Library QC
  • Cluster generation
  • Instrument operation
  • QC and pipeline analysis
  • Data interpretation

slide-7
SLIDE 7

High Throughput Data Analysis Overview

Olson et al.

slide-8
SLIDE 8

Typical Data Analysis Pipelines

Many Analysis Pipelines Start with Read Mapping

  • Genotyping (GATK): http://www.broadinstitute.org/gsa/wiki/images/7/7a/Overall_flow.jpg http://www.broadinstitute.org/gatk/guide/topic?name=intro
  • RNA-seq (Tuxedo): http://www.nature.com/nprot/journal/v7/n3/full/nprot.2012.016.html

slide-9
SLIDE 9

Cancer NGS Data Analysis Pipeline-Software

Data, Task, Software

  • Raw reads → Analysis-ready reads: FastQC, FASTX-toolkit, Trimmomatic
  • Analysis-ready reads → Mapped reads: BWA, STAR
  • Mapped reads: Visualization (IGV, IGB, UCSC GB ……)

……
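The first rows of this slide (raw reads, analysis-ready reads, mapped reads) can be sketched as a list of command lines. This is only an illustration under stated assumptions: the file names `sample.fastq.gz` and `hg19.fa`, the `results` directory, and the trimming settings are hypothetical; `trimmomatic` is assumed to be available as a wrapper script on the PATH (some installs use `java -jar trimmomatic.jar` instead); and the reference is assumed to be already indexed with `bwa index`. The function only builds the commands and does not run them.

```python
from pathlib import Path


def build_pipeline(fastq, reference, outdir="results"):
    """Return (task, argv, stdout_file) tuples for one single-end sample.

    Nothing is executed here; stdout_file is where the command's standard
    output should be redirected (None if the tool writes its own files).
    """
    out = Path(outdir)
    stem = Path(fastq).name.split(".")[0]          # "sample.fastq.gz" -> "sample"
    trimmed = out / f"{stem}.trimmed.fastq"        # analysis-ready reads
    sam = out / f"{stem}.sam"                      # mapped reads

    return [
        # Raw reads: quality report
        ("QC of raw reads", ["fastqc", "--outdir", str(out), fastq], None),
        # Raw reads -> analysis-ready reads (illustrative trimming settings)
        ("Trimming",
         ["trimmomatic", "SE", fastq, str(trimmed),
          "SLIDINGWINDOW:4:20", "MINLEN:36"],
         None),
        # Analysis-ready reads -> mapped reads (bwa mem writes SAM to stdout)
        ("Read mapping", ["bwa", "mem", reference, str(trimmed)], str(sam)),
    ]


if __name__ == "__main__":
    for task, argv, stdout_file in build_pipeline("sample.fastq.gz", "hg19.fa"):
        redirect = f" > {stdout_file}" if stdout_file else ""
        print(f"{task}: {' '.join(argv)}{redirect}")
```

Each `argv` list could be passed to `subprocess.run`, with the third element opened as the output file where one is given. The resulting SAM file would normally be sorted and indexed before loading it into a viewer such as IGV.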

slide-10
SLIDE 10

Cancer NGS Application Specific Software

Mapped reads

  • Cufflinks, MISO
  • DESeq2, GATK
  • MACS2, SISSRs
  • Bismark, BS Seeker
  • SomaticSniper, VarScan2, MuTect
  • FreeBayes, Pindel, CNVnator

……

slide-11
SLIDE 11

Open Source Software with GUI

Galaxy: web-based platform for analysis of large datasets

http://hpc-galaxy.oit.uci.edu/root https://main.g2.bx.psu.edu/ https://usegalaxy.org/

GENE-E: Java-based matrix visualization and analysis platform; includes heatmap, clustering, filtering, etc.

http://www.broadinstitute.org/cancer/software/GENE-E

slide-12
SLIDE 12

Commercial software for NGS analysis

  • Easy to use, no command line skills required
  • Usually platform independent
  • Little to no learning curve
  • Limited flexibility
  • Harder to publish
slide-13
SLIDE 13

Outline

  • Introduction to NGS data analysis in Cancer Genomics
      – NGS applications in cancer research
      – Typical NGS workflows and pipeline
      – Open source software with GUI

  • Pathway Analysis and Software
  • Pathway Analysis goals and concepts
  • Commercial and open source pathway analysis software
  • Data analysis resources
  • Summary
slide-14
SLIDE 14

Why Pathway Analysis

  • Logical next step in any high-throughput experiment
  • Goal: to characterize the biological meaning of joint changes in gene expression
  • Why? Groups of genes with related functions often change together

slide-15
SLIDE 15

Pathway and Network Analysis

Pathway Analysis Methods:

  • Functional category over-representation: discrete test for significance (BiNGO, DAVID, IPA, etc.)
  • Continuous test (GSEA, PAGE)
  • Signaling Pathway Impact Analysis (iPathwayGuide)

Network Analysis: WGCNA, Cytoscape, etc.
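To illustrate what a continuous test looks like, here is a minimal sketch of a PAGE-style parametric z-score. It uses every gene's fold change, with no differential-expression cutoff. The gene names and values in the usage lines are made up, and using the population standard deviation is an assumption of this sketch. GSEA uses a different, rank-based running-sum statistic that is not shown here.

```python
import math
from statistics import mean, pstdev


def page_z_score(fold_changes, gene_set):
    """PAGE-style z-score for one gene set.

    fold_changes: dict mapping gene -> log fold change for ALL measured genes.
    gene_set:     iterable of gene names in the pathway or category.
    Returns (z, two_sided_p) under a normal approximation.
    """
    values = list(fold_changes.values())
    mu = mean(values)          # mean fold change over all genes
    sigma = pstdev(values)     # standard deviation over all genes
    members = [fold_changes[g] for g in gene_set if g in fold_changes]
    m = len(members)
    if m == 0 or sigma == 0:
        raise ValueError("gene set has no measured genes, or no variance")

    # Z = (set mean - overall mean) * sqrt(set size) / overall sd
    z = (mean(members) - mu) * math.sqrt(m) / sigma
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


if __name__ == "__main__":
    # Hypothetical log fold changes for six genes
    fc = {"g1": 2.0, "g2": 1.5, "g3": 0.0, "g4": -0.5, "g5": -1.0, "g6": -2.0}
    print(page_z_score(fc, {"g1", "g2"}))
```

The normal approximation rests on averaging over the gene set, so very small sets (as in the toy example) would not be tested this way in practice.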

slide-16
SLIDE 16

Functional Category Enrichment

  • Discrete tests: enrichment for groups in gene lists
      – Select a gene list at some predefined cutoff
      – For each gene list and functional category, cross-tabulate to get a 2×2 contingency table
      – Test for significance using Fisher’s exact test
      – Apply FDR correction for multiple hypothesis testing

                      Differentially expressed   Not differentially expressed   Total
In the pathway        a                          b                              a+b
Not in the pathway    c                          d                              c+d
Total                 a+c                        b+d                            n
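The testing steps on this slide can be sketched in a few lines of standard-library Python, using the cell names a, b, c, d from the contingency table. Two choices here are assumptions of the sketch, not something the slide states: the test is one-sided (over-representation only), and the FDR correction is the Benjamini-Hochberg procedure. The counts in the usage lines are invented.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test for over-representation.

    a: differentially expressed (DE) genes in the pathway
    b: non-DE genes in the pathway
    c: DE genes outside the pathway
    d: non-DE genes outside the pathway
    Returns P(X >= a) under the hypergeometric null distribution.
    """
    n = a + b + c + d
    in_pathway = a + b
    de_total = a + c
    upper = min(in_pathway, de_total)
    tail = sum(
        comb(in_pathway, k) * comb(n - in_pathway, de_total - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, de_total)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical (a, b, c, d) counts for three functional categories
    tables = [(12, 38, 88, 862), (3, 47, 97, 853), (20, 30, 80, 870)]
    raw = [fisher_enrichment_p(*t) for t in tables]
    for t, p, q in zip(tables, raw, benjamini_hochberg(raw)):
        print(t, f"p={p:.3g}", f"FDR={q:.3g}")
```

One p-value is computed per functional category, and the correction is applied across all categories tested against the same gene list.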

slide-17
SLIDE 17

Functional Categories in Pathway Analysis

  • Gene Ontology
      – Biological Process
      – Molecular Function
      – Cellular Component

  • Pathway Databases
      – KEGG
      – BioCarta
      – Broad Institute (MSigDB)
      – Commercial knowledge bases such as IPA

  • Other
      – Transcription factor targets
      – Protein complexes
      – Self-defined

slide-18
SLIDE 18

Commercial and Open Source Pathway Analysis Software

slide-19
SLIDE 19

Ingenuity Pathway Analysis Tool

slide-20
SLIDE 20

IPA Input file

slide-21
SLIDE 21

IPA results page

slide-22
SLIDE 22

Resources in NGS data analysis

Public forums:

Computational resources available at UCI:

  • HPC: open source software
  • CLCbio, IPA, JMP Genomics…
slide-23
SLIDE 23

Summary

  • NGS technologies are transforming cancer research
  • Data analysis is a crucial part of NGS applications
  • Pathway analysis concepts and software
  • Data analysis resources

Thank you!