Genomics Virtual Laboratory: Mike Pheasant (UQ) and Andrew Lonie (VLSCI)



SLIDE 1

Genomics Virtual Laboratory

Mike Pheasant (UQ) Andrew Lonie (VLSCI)

SLIDE 2

What is the Genomics Virtual Laboratory? A NeCTAR-funded, nationally distributed platform for genomics, built on the Research Cloud and RDSI

SLIDE 3

NeCTAR? Research Cloud? RDSI?

NeCTAR = National eResearch Collaboration Tools and Resources

http://www.nectar.org.au

NeCTAR Research Cloud

http://www.nectar.org.au/research-cloud

RDSI = Research Data Storage Initiative

http://www.rdsi.uq.edu.au/

SLIDE 4

What is the Genomics Virtual Laboratory?

A NeCTAR-funded, nationally distributed platform for genomic analyses:

Infrastructure

  • Workflow management system
  • Bioinformatics toolkit (for command-line users)
  • Visualisation services
  • Scalable compute infrastructure

Resources

  • Tutorials and exemplar workflows targeted at common high-throughput genomics tasks
  • Data catalogues and coordination centres
  • Subscription-based support

SLIDE 5

What is the Genomics Virtual Lab?

SLIDE 6

Workflow platforms

SLIDE 7

Workflow platforms

Interactive platforms for developing genomics workflows and interactive data analysis

  • Galaxy
  • GenePattern; others possible (Bioflow, ...)

What's Galaxy? "an open, web-based platform for performing accessible, reproducible, and transparent genomic science."

http://galaxyproject.org

  • Accessible: Users without programming experience can easily specify parameters and run tools and workflows
  • Reproducible: Galaxy captures information so that any user can repeat and understand a complete computational analysis
  • Transparent: Users share and publish analyses via the web

SLIDE 8

Visualisation platforms

SLIDE 9

Cluster-on-the-cloud

SLIDE 10

Cluster-on-the-cloud

CloudBioLinux: Linux with a comprehensive, actively maintained suite of bioinformatics tools

http://cloudbiolinux.org/

CloudMan: platform for launching and scaling CloudBioLinux clusters and Galaxy clusters on the cloud

http://usecloudman.org

Research Cloud: ~25000 CPUs to be spread across 6-10 research centres around Australia, to host research activities 'on demand'

http://www.nectar.org.au/research-cloud
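
To give a feel for the scale quoted above, here is a rough sketch of the average capacity per research centre. It assumes the CPUs are split evenly across centres, which the slide does not state; the function name `cpus_per_centre` is hypothetical and used only for illustration.

```python
# Rough per-centre share of the Research Cloud.
# ASSUMPTION: CPUs are divided evenly between centres (not stated on the slide).
TOTAL_CPUS = 25_000                 # "~25000 CPUs" from the slide (approximate)
MIN_CENTRES, MAX_CENTRES = 6, 10    # "6-10 research centres"


def cpus_per_centre(total: int, centres: int) -> int:
    """Average whole CPUs per centre under a hypothetical even split."""
    return total // centres


for n in range(MIN_CENTRES, MAX_CENTRES + 1):
    print(f"{n} centres: about {cpus_per_centre(TOTAL_CPUS, n)} CPUs each")
```

Under that assumption each centre would host somewhere between roughly 2,500 and 4,200 CPUs; actual allocations per node may differ.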

SLIDE 11

Data catalogues

SLIDE 12

Data catalogues

  • UCSC databases
  • Ensembl databases
  • ENCODE
  • dbSNP, HapMap
  • ICGC, COSMIC
  • BPA Framework Datasets
      ○ sarcoma
      ○ wheat
      ○ soil diversity

SLIDE 13

Tutorials and workshops

SLIDE 14

Tutorials and education resources

NGS School: summer schools, 2-day workshops

Galaxy-based online tutorials:

  • Intro to NGS
  • Genome Browsers
  • Common analyses
      ○ Differential gene expression
      ○ Variant calling
      ○ ChIP-seq
      ○ ...

SLIDE 15

Exemplar best practice workflows

SLIDE 16

Exemplar workflows

  • Variant calling:
      ○ GATK best-practice
      ○ microbial
      ○ cancer-optimised

  • RNA-seq differential expression
  • Fusion gene discovery from RNA-seq
  • MicroRNA analysis
  • De novo genome and transcriptome assembly
  • Metagenomics
  • ChIP-seq
  • Variant annotation
  • Pathway analysis
  • Methylation
SLIDE 17

Support

SLIDE 18

Genomics Informatics Network

Institutional subscriptions:

  • genomics support (% of FTE)
  • large compute and data resources
  • managed instances of GVL
  • new GVL tool development
  • advocacy to funding bodies for resources
  • communities of best practice
SLIDE 19

Or...roll your own GVL

SLIDE 20

Progress and timelines

Dec 2012 Prototype at Qld (UQ) and Vic (UoM)

  • Galaxy
  • UCSC browser + databases
  • Bioinformatics cluster-on-the-cloud
  • Initial tutorials and exemplars

Jun 2013 Production at Qld (UQ) and Vic (UoM), prototype at other Research Cloud nodes

  • Data coordination centres, data catalogues

Dec 2013 Additional workflows and tutorials

  • Additional nodes

Jun 2014 Operations (support centres - subscriptions)