Unlocking the Power of Continuity in Single Cell RNA-Seq


Unlocking the Power of Continuity in Single Cell RNA-Seq: Differential Gene Expression Along Developmental Trajectories. Hector Roux de Bézieux, Group in Biostatistics, Sandrine Dudoit’s lab. Talk given at the Statistics and Genomics Seminar.


slide-1
SLIDE 1

Unlocking the Power of Continuity in Single Cell RNA-Seq: Differential Gene Expression Along Developmental Trajectories

Hector Roux de Bézieux
Group in Biostatistics, Sandrine Dudoit’s lab

GitHub: HectorRDB
Website: http://hectorrdb.github.io

Talk given at the Statistics and Genomics Seminar on 04/18

slide-2
SLIDE 2

Overview

1) Introduction to scRNA-Seq
2) Trajectory Inference with Slingshot
3) Differential Expression with tradeSeq
4) Clustering gene patterns with RSEC

slide-3
SLIDE 3

1. Introduction to scRNA-Seq

slide-4
SLIDE 4

Central Dogma of biology

https://translate.bio/rna-therapeutics/central-dogma-for-web-4-3/

slide-5
SLIDE 5

Bulk RNA-Seq vs. single-cell RNA-Seq

Single-cell RNA-Seq: unmixing the smoothie

slide-6
SLIDE 6

Recent explosion in scRNA-Seq

https://twitter.com/vallens/status/1113982015517282304

slide-7
SLIDE 7

Data structure

Count matrix with one row per gene and one column per cell:

Columns: Cell 1, Cell 2, Cell 3, …, Cell n
Gene 1: 28 25 … 2
Gene 2: 3 8 … 36
Gene 3: 5 …
…
Gene G: 12 8 … 11
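A minimal sketch of this data structure, assuming a small made-up matrix (the gene names, cell names and counts below are illustrative only, not the values from the slide). It stores genes as rows and cells as columns, and computes two summaries that matter later in the talk: the total count per cell, a simple proxy for sequencing depth, and the fraction of zero entries.

```python
# Toy genes-by-cells count matrix; all values are invented for illustration.
genes = ["Gene 1", "Gene 2", "Gene 3"]
cells = ["Cell 1", "Cell 2", "Cell 3", "Cell 4"]
counts = [
    [28, 25, 0, 2],   # Gene 1
    [3, 8, 0, 36],    # Gene 2
    [5, 0, 1, 0],     # Gene 3
]


def library_sizes(matrix):
    """Total count per cell (column sums), a simple proxy for sequencing depth."""
    n_cells = len(matrix[0])
    return [sum(row[j] for row in matrix) for j in range(n_cells)]


def fraction_zeros(matrix):
    """Share of entries equal to zero across the whole matrix."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / total


for name, size in zip(cells, library_sizes(counts)):
    print(name, size)
print("fraction of zeros:", round(fraction_zeros(counts), 3))
```

Real scRNA-Seq matrices have thousands of genes and cells and are usually stored in a sparse format; the nested list here is only meant to show the orientation of the matrix.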

slide-8
SLIDE 8

2. Trajectory Inference with Slingshot

slide-9
SLIDE 9

Dimensionality reduction

  • Bone-marrow stem cells from the monocle 3 vignette
  • 2660 cells and 3004 genes
  • Now 2660 cells in two dimensions using UMAP

Leland McInnes, John Healy, and James Melville. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv, 2018. URL http://arxiv.org/abs/1802.03426

slide-10
SLIDE 10

Olfactory Epithelium

  • Sustentacular cell (Sus)
  • Olfactory receptor neuron (ORN)
  • Immature olfactory neuron
  • Globose basal cell (GBC)
  • Horizontal basal cell (HBC)
  • Olfactory ensheathing glia
  • Bowman’s gland

slide-11
SLIDE 11

Fletcher RB, Das D, Gadye L, Street K, Baudhuin A, Risso D, Wagner A, Cole MB, Flores Q, Choi YG, Yosef N, Purdom E, Dudoit S, Ngai J. Deconstructing Olfactory Stem Cell Trajectories at Single-Cell Resolution. Cell Stem Cell. 2017; 20(6):817–30.

Olfactory Epithelium

slide-12
SLIDE 12

Input Data

  • Low dimensional: PCA, ICA, tSNE, UMAP, …
  • Clustered: SC3, Seurat, RSEC, …

slide-13
SLIDE 13

Minimum Spanning Tree

  • Identify global structure
  • Cluster shape sensitive
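As a rough illustration of this step, the sketch below builds a minimum spanning tree over cluster centers with Prim's algorithm. It is a simplification under stated assumptions: the cluster names and coordinates are hypothetical, and plain Euclidean distance between centers is used, whereas the slide describes the actual method as sensitive to cluster shape.

```python
import math


def minimum_spanning_tree(centers):
    """Prim's algorithm on the complete graph of cluster centers.

    centers: dict mapping a cluster name to its coordinates.
    Returns the tree as a sorted list of (name, name) edges.
    """
    names = sorted(centers)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        best = None
        # Find the shortest edge leaving the current tree.
        for a in in_tree:
            for b in names:
                if b in in_tree:
                    continue
                d = math.dist(centers[a], centers[b])  # Euclidean (assumption)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        edges.append(tuple(sorted((a, b))))
        in_tree.add(b)
    return sorted(edges)


# Hypothetical cluster centers in a two-dimensional embedding.
centers = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 1.0), "D": (2.0, -1.0)}
print(minimum_spanning_tree(centers))
```

In this toy layout the tree branches at cluster B, which is the kind of global structure (one root, two lineages) that the tree is meant to reveal.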

slide-14
SLIDE 14

Constrained MST

  • Identify global structure
  • Cluster shape sensitive
  • Incorporate prior knowledge

slide-15
SLIDE 15

Principal Curves

  • Highly stable
  • Uses cells, not clusters
  • Incongruent across branches

slide-16
SLIDE 16

Simultaneous Principal Curves

  • Highly stable
  • Uses cells, not clusters
  • Mostly congruent across branches

slide-17
SLIDE 17

Computing Pseudotime

  • Highly stable
  • Uses cells, not clusters
  • Mostly congruent across branches

Kelly Street, Davide Risso, Russell B. Fletcher, Diya Das, John Ngai, Nir Yosef, Elizabeth Purdom, and Sandrine Dudoit. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics, 19(1):477, 2018. ISSN 1471-2164. doi:10.1186/s12864-018-4772-0. URL https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-018-4772-0

slide-18
SLIDE 18

Fletcher RB, Das D, Gadye L, Street K, Baudhuin A, Risso D, Wagner A, Cole MB, Flores Q, Choi YG, Yosef N, Purdom E, Dudoit S, Ngai J. Deconstructing Olfactory Stem Cell Trajectories at Single-Cell Resolution. Cell Stem Cell. 2017; 20(6):817–30.

Trajectory inference

slide-19
SLIDE 19

Trajectory inference

➢ Finding developmental paths

Each cell has a pseudotime, which measures how far along it is in the developmental process.
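To make the notion of pseudotime concrete, here is a minimal sketch that projects each cell onto a path and reports the arc length from the start of the path to the projection. This is a simplification and not Slingshot itself: the path is a hypothetical polyline with distinct vertices rather than a fitted principal curve, and the cell coordinates are invented.

```python
import math


def project_onto_path(point, path):
    """Arc length along `path` of the point on the path closest to `point`.

    path: list of 2D vertices forming a polyline (consecutive vertices distinct).
    """
    best_dist, best_arc = float("inf"), 0.0
    arc_start = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        # Relative position of the orthogonal projection, clamped to the segment.
        t = ((point[0] - x0) * dx + (point[1] - y0) * dy) / seg_len ** 2
        t = max(0.0, min(1.0, t))
        px, py = x0 + t * dx, y0 + t * dy
        dist = math.hypot(point[0] - px, point[1] - py)
        if dist < best_dist:
            best_dist, best_arc = dist, arc_start + t * seg_len
        arc_start += seg_len
    return best_arc


# Hypothetical developmental path and cells in a two-dimensional embedding.
path = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
cells = {"cell_a": (0.5, 0.3), "cell_b": (1.9, 1.0), "cell_c": (2.4, 2.0)}
pseudotime = {name: project_onto_path(xy, path) for name, xy in cells.items()}
for name, value in pseudotime.items():
    print(name, round(value, 2))
```

Cells that project further along the path receive a larger pseudotime, so sorting cells by this value orders them along the developmental process.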

slide-20
SLIDE 20

Challenges of Slingshot

Slingshot can only tackle tree structures.

It cannot handle general connected graphs (including cyclic trajectories), nor disconnected trajectories.

Wouter Saelens, Robrecht Cannoodt, Helena Todorov, and Yvan Saeys. A comparison of single-cell trajectory inference methods. Nature Biotechnology, 2019. ISSN 1087-0156. doi:10.1038/s41587-019-0071-9. URL http://www.nature.com/articles/s41587-019-0071-9

Can be handled by Slingshot

slide-21
SLIDE 21

3. Differential Expression with tradeSeq

slide-22
SLIDE 22

Cluster-based DE is artificial

[Figure: log(count + 1) against pseudotime]

  • Since 2014, gene expression has been treated as continuous along pseudotime.
  • Differential expression is still cluster-based, i.e. discrete.

slide-23
SLIDE 23

Trajectory-based DE

We developed tradeSeq, an algorithm that leverages the continuous nature of scRNA-Seq.

  • Available as an R package on GitHub (statOmics/tradeSeq). Soon on Bioconductor.
  • Modular tool that works with any dimensionality reduction and trajectory inference method.

slide-24
SLIDE 24

Statistical model

$$
Y_{gi} \sim NB(\mu_{gi}, \phi_g)
$$

$$
\log(\mu_{gi}) = \sum_{l=1}^{L} s_{gl}(T_{li}) Z_{li} + U_i \alpha_g + \log(N_i)
$$

Can accommodate:

  • Design matrix
  • Different sequencing depth
  • Weights
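The sketch below evaluates the mean of such a model for one gene in one cell, to show how the pieces combine. It assumes a log link with the log of the sequencing depth as an offset; the function name, argument names and numbers are hypothetical, and the smoother values are passed in directly rather than fitted.

```python
import math


def expected_count(smoother_values, lineage_indicator, covariate_effect, depth):
    """Mean count under a log link: exp(sum_l s_l * Z_l + U * alpha + log(N)).

    smoother_values: value of each lineage's smoother at the cell's pseudotime.
    lineage_indicator: 1 for the lineage the cell belongs to, 0 otherwise.
    covariate_effect: the design-matrix term U * alpha, already multiplied out.
    depth: sequencing depth N of the cell, entering as an offset.
    """
    eta = sum(s * z for s, z in zip(smoother_values, lineage_indicator))
    eta += covariate_effect + math.log(depth)
    return math.exp(eta)


# A cell on lineage 1 of 2: only the first smoother contributes to the mean.
mu_shallow = expected_count([-6.0, -2.0], [1, 0], 0.0, depth=1000)
mu_deep = expected_count([-6.0, -2.0], [1, 0], 0.0, depth=2000)
print(round(mu_shallow, 3), round(mu_deep, 3))
```

Because the depth enters as an offset, doubling it doubles the expected count without changing the smoother, which is how cells with different sequencing depths are made comparable.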

slide-25
SLIDE 25

Statistical model

[Figure: two panels of log(count + 1) against pseudotime]

slide-26
SLIDE 26

An investigation tool

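$$
s_{gl}(t) = \sum_{k=1}^{K} b_k(t) \beta_{glk}
$$

Testing null hypotheses of the form:

$$
H_0: C^T \beta_g = 0
$$

Using Wald statistics of the form:

$$
W_g = \hat{\beta}_g^T C \left( C^T \hat{\Sigma}_{\hat{\beta}_g} C \right)^{-1} C^T \hat{\beta}_g
$$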
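A minimal sketch of a Wald statistic of this form, written in plain Python so that it is self-contained. The coefficient vector, covariance matrix and contrast matrix below are invented for illustration, and the small Gaussian-elimination solver stands in for a proper linear algebra library.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def wald_statistic(beta, sigma, contrast):
    """W = (C^T beta)^T (C^T Sigma C)^{-1} (C^T beta).

    beta: K estimated coefficients; sigma: their K x K covariance matrix;
    contrast: K x q matrix C whose columns are the contrasts under test.
    """
    k, q = len(beta), len(contrast[0])
    ctb = [sum(contrast[i][j] * beta[i] for i in range(k)) for j in range(q)]
    sc = [[sum(sigma[i][h] * contrast[h][j] for h in range(k)) for j in range(q)]
          for i in range(k)]
    ctsc = [[sum(contrast[i][a] * sc[i][b] for i in range(k)) for b in range(q)]
            for a in range(q)]
    return sum(v * w for v, w in zip(ctb, solve(ctsc, ctb)))


# Invented example: three coefficients, identity covariance, two contrasts.
beta_hat = [1.0, 2.0, 4.0]
sigma_hat = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
contrast = [[1, 0], [-1, 1], [0, -1]]
print(round(wald_statistic(beta_hat, sigma_hat, contrast), 4))
```

Under the null hypothesis, a statistic of this kind is asymptotically chi-squared with degrees of freedom equal to the rank of C, which is what turns it into a p-value.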

slide-27
SLIDE 27

An investigation tool

Differential Expression Tests

[Figure: six panels of count (log + 1 scale) against pseudotime for the orange and blue lineages]

Within the orange lineage: associationTest, startVsEndTest
Between the orange and blue lineages: diffEndTest, patternTest, earlyDETest

Panel  associationTest  startVsEndTest  diffEndTest  patternTest  earlyDETest
1      DE               DE              Not DE       Not DE       Not DE
2      Not DE           Not DE          DE           DE           DE
3      DE               Not DE          Not DE       Not DE       Not DE
4      DE               DE              DE           DE           Not DE
5      DE               DE              Not DE       DE           DE
6      DE               DE              Not DE       DE           Not DE

slide-28
SLIDE 28

Association test

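Contrast matrix

$$
H_0: \beta_{glk} = \beta_{glk'} \quad \text{for all } k \neq k'
$$

[Contrast matrix: one column per coefficient $\beta_{gl1}, \beta_{gl2}, \beta_{gl3}, \dots, \beta_{glK}$, with entries 1 and -1]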
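One way to encode the hypothesis that all coefficients of a lineage are equal is with contrasts between consecutive coefficients. The sketch below builds such a matrix; this layout is an assumption made for illustration and is not read off the slide, and the function name is hypothetical.

```python
def consecutive_contrasts(n_knots):
    """K x (K - 1) matrix whose column j encodes beta_j - beta_{j+1}."""
    c = [[0] * (n_knots - 1) for _ in range(n_knots)]
    for j in range(n_knots - 1):
        c[j][j] = 1
        c[j + 1][j] = -1
    return c


def apply_contrasts(contrast, beta):
    """Compute C^T beta, one value per contrast."""
    k, q = len(beta), len(contrast[0])
    return [sum(contrast[i][j] * beta[i] for i in range(k)) for j in range(q)]


C = consecutive_contrasts(4)
for row in C:
    print(row)

# A flat expression profile satisfies the null: every contrast is zero.
print(apply_contrasts(C, [0.7, 0.7, 0.7, 0.7]))
# A changing profile does not.
print(apply_contrasts(C, [0.1, 0.4, 0.9, 0.9]))
```

If every consecutive difference is zero then all K coefficients are equal, so K - 1 contrasts are enough to express "for all k ≠ k'".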

slide-29
SLIDE 29

startVsEndTest

[Figure: cells in two dimensions (dim1, dim2), colored by expression of Mpo]

slide-30
SLIDE 30

diffEndTest

[Figure: cells in two dimensions (dim1, dim2), colored by expression of Prtn3]

slide-31
SLIDE 31

Simulation framework: dynverse

Wouter Saelens, Robrecht Cannoodt, Helena Todorov, and Yvan Saeys. A comparison of single-cell trajectory inference methods. Nature Biotechnology, 2019. ISSN 1087-0156. doi:10.1038/s41587-019-0071-9. URL http://www.nature.com/articles/s41587-019-0071-9
slide-32
SLIDE 32

Outperforms existing methods

[Figure: panels a to c show the cyclic, bifurcating and multifurcating datasets in PC1 vs. PC2; panels d to f show TPR against FDR, with ticks at 0.01, 0.05 and 0.10]

Methods compared: tradeSeq_slingshot_end, tradeSeq_GPfates_end, tradeSeq_Monocle2_end, tradeSeq_slingshot_pattern, tradeSeq_GPfates_pattern, tradeSeq_Monocle2_pattern, tradeSeq_slingshot_assoc, Monocle3_assoc, BEAM, GPfates, edgeR
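For readers unfamiliar with these performance curves, the sketch below shows how a true positive rate (TPR) and a false discovery proportion (FDP) can be computed at a given cutoff when the truly differentially expressed genes are known, as they are in a simulation. The gene names, adjusted p-values and truth set are invented.

```python
def tpr_and_fdp(adjusted_pvalues, truly_de, cutoff):
    """Return (TPR, FDP) for genes called DE at the given cutoff.

    adjusted_pvalues: dict gene -> adjusted p-value.
    truly_de: set of genes simulated as differentially expressed.
    """
    called = {g for g, p in adjusted_pvalues.items() if p <= cutoff}
    true_pos = len(called & truly_de)
    tpr = true_pos / len(truly_de) if truly_de else 0.0
    fdp = (len(called) - true_pos) / len(called) if called else 0.0
    return tpr, fdp


# Invented results for five genes, three of which are truly DE.
adjusted = {"g1": 0.001, "g2": 0.03, "g3": 0.04, "g4": 0.2, "g5": 0.6}
truth = {"g1", "g2", "g4"}
for cutoff in (0.01, 0.05, 0.10):
    tpr, fdp = tpr_and_fdp(adjusted, truth, cutoff)
    print(cutoff, round(tpr, 3), round(fdp, 3))
```

A well-calibrated and powerful method keeps the FDP at or below the nominal cutoff while reaching a high TPR, which corresponds to curves toward the top left of such plots.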

slide-33
SLIDE 33

Outperforms existing methods

[Figure: TPR against FDR, with ticks at 0.01, 0.05 and 0.10, over all 10 datasets]

Methods compared: tradeSeq_slingshot_end, tradeSeq_GPfates_end, tradeSeq_Monocle2_end, tradeSeq_slingshot_pattern, tradeSeq_GPfates_pattern, tradeSeq_Monocle2_pattern, BEAM, GPfates, edgeR

slide-34
SLIDE 34

Provides unique insights

[Figure: expression + 1 (log scale) against pseudotime for gene Irf8 in the bone marrow dataset]

slide-35
SLIDE 35

Also works with multiple lineages

slide-36
SLIDE 36

Perspectives for tradeSeq

  • Possible to develop new tests, especially to look at the speed or acceleration of changes in gene expression.
  • Zero-inflation weights are estimated before the smoothers. Future improvements could focus on estimating them jointly.
  • Publish the paper and software.

slide-37
SLIDE 37

4. Clustering gene patterns with RSEC

slide-38
SLIDE 38

Single lineage clustering

Embryogenesis datasets

  • Cluster the genes with RSEC, using the !"#$ as features

slide-39
SLIDE 39

Multiple lineages clustering

Bone marrow dataset: Clusters for the top 500 genes

slide-40
SLIDE 40

Limitations

  • Clustering with more than one lineage is hard to interpret (and sometimes leads to aberrant results).
  • Most filtering or merging criteria used in RSEC are not applicable here. Filtering based only on cluster size might miss small but very strong signals.
  • Current work by Stephanie DeGraaf might be more promising.

slide-41
SLIDE 41

Sandrine Dudoit Kelly Street Koen Van den Berge Lieven Clement

slide-42
SLIDE 42

Thank you for listening. Any questions?