GNU Guix, by R. Wurmus, B. Uyar, B. Osberg, et al.

Reproducible genomics analysis pipelines with GNU Guix. R. Wurmus, B. Uyar, B. Osberg, V. Franke, A. Gosdschan, K. Wreczycka, J. Ronen, A. Akalin. https://doi.org/10.1093/gigascience/giy123


slide-1
SLIDE 1

Reproducible genomics analysis pipelines with GNU Guix

  • R. Wurmus, B. Uyar, B. Osberg, V. Franke,
  • A. Gosdschan, K. Wreczycka, J. Ronen, A. Akalin

https://doi.org/10.1093/gigascience/giy123

slide-2
SLIDE 2

a b notebook

a = 10 ml
b = 30 ml
Supplier: ACME
Temp: 22 °C

slide-3
SLIDE 3

To repeat an experiment we first need to reproduce its environment

slide-4
SLIDE 4
slide-5
SLIDE 5

How hard could this possibly be?

slide-6
SLIDE 6

[Figure: graph of package nodes. Node labels: coreutils-8.24, perl-5.22.1, tar-1.28, gzip-1.6, bzip2-1.0.6, xz-5.2.2, file-5.25, diffutils-3.3, patch-2.7.5, sed-4.2.2, findutils-4.6.0, gawk-4.1.3, grep-2.22, make-4.1, bash-4.3.42, ld-wrapper-0, binutils-2.25.1, gcc-4.9.3, glibc-2.22, glibc-utf8-locales-2.22, acl-2.2.52, gmp-6.1.0, libcap-2.24, ld-wrapper-boot3-0, binutils-cross-boot0-2.25.1, make-boot0-4.1, diffutils-boot0-3.3, findutils-boot0-4.6.0, file-boot0-5.25, bootstrap-binaries-0, ed-1.12, libsigsegv-2.10, perl-boot0-5.22.1, pkg-config-0.29, guile-2.0.11, bison-3.0.4, readline-6.3, ncurses-6.0, gcc-cross-boot0-wrapped-4.9.3, texinfo-6.0, bash-static-4.3.42, libstdc++-4.9.3, zlib-1.2.8, gettext-boot0-0.19.7, gcc-cross-boot0-4.9.3, glibc-bootstrap-0, gcc-bootstrap-0, linux-libre-headers-3.14.37, gettext-0.19.7, attr-2.4.47, m4-1.4.17, guile-bootstrap-2.0, binutils-bootstrap-0, glibc-intermediate-2.22, expat-2.1.0, lzip-1.16, libffi-3.2.1, libunistring-0.9.6, libltdl-2.4.6, libgc-7.4.2, libatomic-ops-7.4.2]

Very.
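
The package names on this slide are nodes of a dependency graph. To reproduce one tool's environment you need its whole transitive closure, not just the tool itself. A minimal Python sketch of that closure follows. The edges in `DEPENDS_ON` are assumptions made for illustration (the slide gives node names, not edges), and `closure` is a hypothetical helper.

```python
from collections import deque

# Hypothetical edges for illustration only; the slide shows node names, not edges.
DEPENDS_ON = {
    "coreutils-8.24": ["acl-2.2.52", "gmp-6.1.0", "libcap-2.24", "glibc-2.22"],
    "acl-2.2.52": ["attr-2.4.47", "glibc-2.22"],
    "libcap-2.24": ["attr-2.4.47", "glibc-2.22"],
    "gmp-6.1.0": ["glibc-2.22"],
    "attr-2.4.47": ["glibc-2.22"],
    "glibc-2.22": [],
}


def closure(package, graph):
    """Return the package plus everything it depends on, directly or indirectly."""
    seen = {package}
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for dep in graph.get(current, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


if __name__ == "__main__":
    needed = closure("coreutils-8.24", DEPENDS_ON)
    print(f"{len(needed)} packages needed:", ", ".join(sorted(needed)))
```

Even in this toy graph, asking for one package pulls in several others, and each of them is a software variable that has to be pinned.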

slide-7
SLIDE 7

Containers to the rescue?

slide-8
SLIDE 8

Containers lack transparency

strawberry? whale oil?

slide-9
SLIDE 9

Design goals

1. Automate genomics analyses

RNAseq, ChIPseq, single cell, BSseq

[Figure: nucleotide sequence "U C G G A C A C C C G U A A A"]

slide-10
SLIDE 10

PiGx ChIPseq

  • Improve read quality: Trim-Galore
  • Align reads: Bowtie2
  • Call peaks: MACS2
  • ChIP QC & reproducibility: ChIPQC + IDR
  • Peak annotation: genomation
  • Compute read coverage: R Scripts
  • Check sequencing quality: FastQC
  • Pan-sample quality check: MultiQC
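
A pipeline like this is a dependency graph of steps: each step can run once the steps it depends on have finished. The sketch below orders the ChIPseq steps with Python's standard `graphlib` module. The wiring in `STEP_INPUTS` is an assumption made for illustration (the slide names steps and tools, not their exact connections), `run_order` is a hypothetical helper, and none of this is PiGx's actual implementation.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Step -> tool, as named on the slide.
TOOLS = {
    "Improve read quality": "Trim-Galore",
    "Align reads": "Bowtie2",
    "Call peaks": "MACS2",
    "ChIP QC & reproducibility": "ChIPQC + IDR",
    "Peak annotation": "genomation",
    "Compute read coverage": "R Scripts",
    "Check sequencing quality": "FastQC",
    "Pan-sample quality check": "MultiQC",
}

# Step -> steps that must finish first. ASSUMED wiring, for illustration only.
STEP_INPUTS = {
    "Improve read quality": [],
    "Check sequencing quality": [],
    "Align reads": ["Improve read quality"],
    "Call peaks": ["Align reads"],
    "Compute read coverage": ["Align reads"],
    "ChIP QC & reproducibility": ["Call peaks"],
    "Peak annotation": ["Call peaks"],
    "Pan-sample quality check": ["Check sequencing quality", "Align reads"],
}


def run_order():
    """Return one valid execution order in which every step follows its inputs."""
    return list(TopologicalSorter(STEP_INPUTS).static_order())


if __name__ == "__main__":
    for step in run_order():
        print(f"{step}: {TOOLS[step]}")
```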

slide-11
SLIDE 11

Design goals

2. Simple user interface

Inputs: settings, sample sheet

Outputs: interactive reports, browser tracks, alignments, QC reports, sample clustering

slide-12
SLIDE 12

Design goals

3. Easy to install reproducibly

guix package --install pigx

slide-13
SLIDE 13

  • Reproducible package manager
  • Full environment declarations
  • Builds software in isolation
  • Source / binary transparency

slide-14
SLIDE 14

Pack an application bundle

From a higher-order source description to lower-level binary application bundles

slide-15
SLIDE 15

Status

[Figure: chart with the categories "not reproducible", "minor problems", and "reproducible", shown for all pipelines, PiGx BSseq, PiGx ChIPseq, PiGx RNAseq, and PiGx scRNAseq. Labelled values: 90% and ~98%.]
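
One way to arrive at a reproducibility percentage is to build each package twice and compare the two results bit for bit. The Python sketch below shows that comparison under stated assumptions: `tree_digest` and `reproducible_share` are hypothetical helpers, build outputs are assumed to be plain directories, and this does not reproduce how the authors measured the numbers on the slide.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Hash every regular file's relative path and contents in a stable order.

    Simplification: permissions, timestamps, and symlink targets are ignored.
    """
    root = Path(root)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def reproducible_share(build_pairs):
    """Given {name: (first_build_dir, second_build_dir)}, return the matching fraction."""
    if not build_pairs:
        return 0.0
    same = sum(tree_digest(a) == tree_digest(b) for a, b in build_pairs.values())
    return same / len(build_pairs)
```

A package counts as reproducible here only if both builds yield identical files; a single embedded timestamp is enough to make the digests differ.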

slide-16
SLIDE 16

1. Constrain software variables
2. Containers are not transparent (smoothies)
3. Guix builds software reproducibly and transparently
4. PiGx shows that Guix makes reproducibility easy
5. PiGx brings analysis to non-bioinformaticians

slide-17
SLIDE 17

Learn more

  • http://bioinformatics.mdc-berlin.de/pigx/
  • https://hpc.guixsd.org
  • https://gnu.org/s/guix

Let’s talk!

  • #guix on irc.freenode.net
  • ricardo.wurmus@mdc-berlin.de

slide-18
SLIDE 18
slide-19
SLIDE 19

PiGx RNAseq

  • Improve read quality: Trim-Galore
  • Align reads: STAR
  • Quantify expression: STAR / Salmon
  • Analyze differential expression: DESeq2
  • Find enriched GO terms: g:ProfileR
  • Compute read coverage: Bedtools
  • Check sequencing quality: FastQC
  • Pan-sample quality check: MultiQC

slide-20
SLIDE 20

PiGx BSseq

  • Improve read quality: Trim-Galore
  • Align reads: Bismark
  • Call methylation: methylKit
  • Differential methylation: methylKit
  • Annotate DMRs and segments: genomation
  • Check sequencing quality: FastQC
  • Pan-sample quality check: MultiQC
  • Methylation segmentation: methylKit

slide-21
SLIDE 21

PiGx single cell RNAseq

  • Improve read quality: Trim-Galore
  • Align reads: STAR
  • Determine cell number: Dropbead
  • Dropout rate and QC: Scater
  • Dimension reduction: tSNE + PCA
  • Compute read coverage: Bedtools
  • Check sequencing quality: FastQC
  • Pan-sample quality check: MultiQC

slide-22
SLIDE 22
slide-23
SLIDE 23

headers, sources, build tools, libraries, ...

slide-24
SLIDE 24

headers, sources, build tools, libraries, ...

cabba9e-samtools-1.7/
  bin/
    samtools
  lib/
    ...
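
The directory name on this slide starts with a hash prefix. The idea, sketched below in Python, is that the prefix is derived from everything that went into the build, so changing any input yields a different directory. This is a simplification for illustration: `store_name` is a hypothetical helper, the input strings are made up, and Guix's real hashing scheme differs in its details.

```python
import hashlib
import json


def store_name(name, version, inputs):
    """Derive a directory name whose prefix depends on all declared inputs."""
    # Serialize deterministically so equal declarations give equal bytes.
    payload = json.dumps(
        {"name": name, "version": version, "inputs": sorted(inputs)},
        sort_keys=True,
    ).encode()
    prefix = hashlib.sha256(payload).hexdigest()[:7]
    return f"{prefix}-{name}-{version}"


if __name__ == "__main__":
    # Made-up input lists, for illustration only.
    a = store_name("samtools", "1.7", ["gcc-4.9.3", "zlib-1.2.8"])
    b = store_name("samtools", "1.7", ["zlib-1.2.8", "gcc-4.9.3"])
    c = store_name("samtools", "1.7", ["gcc-4.9.3", "zlib-1.2.11"])
    print("same inputs, different order:", a == b)
    print("one input changed:", a == c)
```

Because the name encodes the inputs, two builds of the same tool against different libraries can sit side by side without overwriting each other.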