Cbio 16S analysis pipeline Katie Lennard Microbiome analysis - - PowerPoint PPT Presentation



SLIDE 1

Cbio 16S analysis pipeline

Katie Lennard

SLIDE 2

Microbiome analysis workflow

Data preprocessing (UCT High Performance Cluster)

SLIDE 3

Microbiome analysis workflow

correlations analyses unsupervised classification

Import data into R

SLIDE 4

Microbiome analysis workflow

correlations analyses unsupervised classification

Exploratory

Summary barplots

SLIDE 5

Microbiome analysis workflow

correlations analyses unsupervised classification

Exploratory

Beta diversity: NMDS/PCoA

SLIDE 6

Microbiome analysis workflow

correlations analyses unsupervised classification

Exploratory

Annotated heatmaps

SLIDE 7

Microbiome analysis workflow

correlations analyses unsupervised classification Differential abundance testing

Downstream analyses

SLIDE 8

Microbiome analysis workflow

correlations analyses

Downstream analyses

SLIDE 9

Microbiome analysis workflow

correlations analyses unsupervised classification

Downstream analyses

SLIDE 10

Microbiome analysis workflow

correlations analyses unsupervised classification Biomarker discovery: random forests

Downstream analyses

SLIDE 11

Customized .R script to make your life easier

  • Convert from phyloseq object to metagenomeSeq object
  • Get the lowest available taxonomic annotation for each OTU and merge counts at this level
  • Heatmap (using NMF package) customized for phyloseq objects
      • Can easily specify a subset of taxa and/or samples to plot
      • Select annotation colours
      • Select distance function for clustering
      • Choose to merge taxa at a given level (e.g. Genus) or plot individual OTUs
  • Generic barplot function built on phyloseq plot_bar()
      • Specify subset of samples
      • Filter OTUs so very rare ones (that just clog up the legend) are excluded
      • Merge at any taxonomic level (Family, Genus, etc.)
  • Differential abundance testing + heatmap of significant results
      • Built around metagenomeSeq’s fitZig() and MRfulltable() functions
      • NB: currently only set up for two-class categorical comparisons
  • Correlations testing + correlation plot of significant results
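The "lowest available taxonomic annotation" step in the list above can be illustrated with a minimal sketch. The script described on this slide is written in R and is not shown here, so this is plain Python for illustration only, not the author's implementation. The function names, the rank list, and the toy OTU data are all assumptions made up for the example.

```python
from collections import defaultdict

# Assumed rank order, from broadest to most specific.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]


def lowest_annotation(taxonomy):
    """Return (rank, name) for the most specific rank that is annotated."""
    for rank in reversed(RANKS):
        name = taxonomy.get(rank)
        if name:  # skips missing keys, None, and empty strings
            return rank, name
    return "Unclassified", "Unclassified"


def merge_by_lowest_annotation(counts, taxonomy):
    """Sum per-sample counts of OTUs that share the same lowest annotation."""
    merged = defaultdict(lambda: defaultdict(int))
    for otu, sample_counts in counts.items():
        rank, name = lowest_annotation(taxonomy.get(otu, {}))
        label = f"{rank}:{name}"
        for sample, n in sample_counts.items():
            merged[label][sample] += n
    return {label: dict(samples) for label, samples in merged.items()}


# Toy data invented for illustration (hypothetical OTUs and samples).
taxonomy = {
    "OTU1": {"Kingdom": "Bacteria", "Family": "Lachnospiraceae", "Genus": "Blautia"},
    "OTU2": {"Kingdom": "Bacteria", "Family": "Lachnospiraceae", "Genus": "Blautia"},
    "OTU3": {"Kingdom": "Bacteria", "Family": "Lachnospiraceae", "Genus": None},
}
counts = {
    "OTU1": {"S1": 10, "S2": 0},
    "OTU2": {"S1": 5, "S2": 7},
    "OTU3": {"S1": 2, "S2": 3},
}

merged = merge_by_lowest_annotation(counts, taxonomy)
for label, samples in merged.items():
    print(label, samples)
```

In this toy case OTU1 and OTU2 are merged at Genus level, while OTU3, which has no genus annotation, falls back to its Family label and stays separate.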
SLIDE 12

Customized .R script to make your life easier

  • For PICRUSt data: takes the output from PICRUSt's metagenome_contributions.py, together with taxonomic annotation for the OTUs included in this table, and summarises the contribution of each taxon (Family, Genus, etc.) to ONE SPECIFIC KEGG gene, e.g. K02030
  • Random forests analysis on the OTU table of a supplied phyloseq object
      • The data is randomly divided into a training set (two thirds of the data) and a test set (the remaining one third, not used for training)
      • Results printed to screen and written to file, including:
          • most important taxa, AUC, PPV, NPV, OOB errors, class errors
      • Option to specify the top N taxa to see how they perform
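The two-thirds / one-third split and two of the reported metrics (PPV and NPV) can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the R script described on this slide: it does not fit a random forest, and the function names, sample IDs, and class labels ("case", "control") are hypothetical.

```python
import random


def split_train_test(sample_ids, seed=0):
    """Shuffle the samples and put two thirds in training, the rest in test."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_train = (2 * len(shuffled)) // 3
    return shuffled[:n_train], shuffled[n_train:]


def ppv_npv(true_labels, predicted_labels, positive):
    """Positive and negative predictive value from paired label lists."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# 15 hypothetical sample IDs.
samples = [f"S{i}" for i in range(1, 16)]
train, test = split_train_test(samples)
print(len(train), "training samples;", len(test), "test samples")

# Hypothetical true and predicted classes for five held-out samples.
truth = ["case", "case", "control", "control", "case"]
predicted = ["case", "control", "control", "control", "case"]
ppv, npv = ppv_npv(truth, predicted, positive="case")
print("PPV:", ppv, "NPV:", npv)
```

Because the test set is never seen during training, metrics computed on it give a less optimistic estimate of performance than metrics computed on the training data.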
SLIDE 13

Random Forests output example

SLIDE 14

Random Forests output example

SLIDE 15

The 16S accreditation dataset: first look

  • Number of OTUs: 181 (140 retained after filtering)
  • Number of samples: 15
  • Sample data summary (columns=Treatment; rows=Dog):

Dog   0  1  2  3  4
B     1  1  1  1  1
G     1  1  1  1  1
K     1  1  1  1  1

SLIDE 16

The 16S accreditation dataset: first look

SLIDE 17

The 16S accreditation dataset: first look

SLIDE 18

The 16S accreditation dataset: first look