HISTORY CASE STUDIES FUTURE PERSPECTIVES MARIA KUZMINA UNIVERSITY - - PowerPoint PPT Presentation



slide-1
SLIDE 1

PLANT DNA BARCODING: HISTORY CASE STUDIES FUTURE PERSPECTIVES

MARIA KUZMINA UNIVERSITY OF GUELPH, CANADA

slide-2
SLIDE 2

Building the DNA barcode library for the flora of Canada using herbarium specimens

slide-3
SLIDE 3
  • COI is a successful barcode for animals but fails in plants for several reasons:
      • intron presence is variable across plants
      • exceptionally low rates of evolution

COI

Encouraging start ...

slide-4
SLIDE 4

Ideal barcode should be:

  • Universal primers
  • Bidirectional
  • Maximum discrimination among species

Resources:

  • 190 species of the land plants
  • 7 plastid DNA regions

Decision:

  • rbcL (61% species discrimination)
  • matK (69% species discrimination)

psbA-trnH matK rbcL

2009 Chloroplast DNA markers
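One way to read a "species discrimination" percentage is as the share of sampled species whose barcode sequences are not shared with any other species. The slide does not state the exact criterion used, so the sketch below is only an illustration under that assumption; the function name and the toy records are hypothetical.

```python
from collections import defaultdict


def species_discrimination(records):
    """Return the percentage of species with no barcode shared with another species.

    records: iterable of (species_name, barcode_sequence) pairs.
    Assumption (not from the slides): a species counts as discriminated
    when none of its sequences is identical to a sequence of another species.
    """
    species_by_sequence = defaultdict(set)
    all_species = set()
    for species, sequence in records:
        species_by_sequence[sequence.upper()].add(species)
        all_species.add(species)

    # Every species that shares at least one sequence is treated as ambiguous.
    ambiguous = set()
    for owners in species_by_sequence.values():
        if len(owners) > 1:
            ambiguous.update(owners)

    if not all_species:
        return 0.0
    resolved = len(all_species) - len(ambiguous)
    return 100.0 * resolved / len(all_species)


# Toy data (hypothetical): Species B and Species C share one sequence.
toy_records = [
    ("Species A", "ACGTAC"),
    ("Species B", "ACGTTT"),
    ("Species C", "ACGTTT"),
    ("Species D", "TTGTAC"),
]
print(species_discrimination(toy_records))
```

Published studies often use distance-based or tree-based criteria instead of exact identity, so reported percentages depend on the method as well as on the marker.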

slide-5
SLIDE 5

Topological correspondence of the DNA barcode phylogeny and the Angiosperm Phylogeny Group (APG)

The central role of the plastid gene rbcL in our overall understanding of the evolution of the angiosperms

(Soltis et al., 2005) (Kress et al., 2009) (Stevens, 2001 onward) Angiosperm phylogeny website

slide-6
SLIDE 6

Nuclear Ribosomal DNA

(Chen et al., 2010)

18S 5.8S 26S

ITS1 ITS2

  • ITS was originally proposed but rejected
  • ITS2 was proposed later
  • variable short region
  • can be easily amplified across a diverse sample of plants
  • used previously to discriminate species
  • good length for NGS

Adding a nuclear marker ...

slide-7
SLIDE 7

Comparing “Apples with Oranges and Kiwi”

Columns: Publication | Geographic area | No. of species | Reported species resolution (%) for rbcL, matK, ITS2, rbcL+matK, rbcL+ITS2, All 3

  • A. Fazekas et al., 2008 | North America | 92 | 48, 56
  • K. Burgess et al., 2011 | Koffler Scientific Reserve (KSR), Ontario | 436 | 80, 89, 93
  • M. Kuzmina et al., 2012 | Churchill, Manitoba | 312 | 54, 63, 69
  • D. Percy et al., 2014 | North America | 71 | Incomplete lineage sorting OR plastid capture with selective sweep
  • M. Zarrei et al., 2015 | North America | 83 | Polyploidy and hybridization
  • T. Elliott et al., 2015 | Mont St. Hilaire, Quebec | 582 | Focusing on quality control of collected material and data

slide-8
SLIDE 8

Gene: Pros / Cons

rbcL (550 bp)
  • Pros: easily amplified; good length for NGS
  • Cons: poor taxonomic resolution

matK (800 bp)
  • Pros: good taxonomic resolution
  • Cons: often difficult to amplify; too long for most NGS platforms

ITS2 (350 bp)
  • Pros: good taxonomic resolution; good length for NGS
  • Cons: paralogous copies; not easy to align across a diverse set of taxa

What it boils down to ...

slide-9
SLIDE 9
  • Agriculture and Agri-Food Canada (DAO)
  • Canadian Museum of Nature, National Herbarium of Canada (CAN)
  • McGill University Herbarium, Macdonald Campus (MTMG)
  • Ontario Agricultural College Herbarium (OAC)
  • Private Herbarium of Bruce Bennett, Whitehorse, Yukon (BABY)
  • Royal Ontario Museum, Green Herbarium (TRT)
  • The Manitoba Museum (MMMN)
  • Universite de Montreal, L'Herbier Marie-Victorin (MT)
  • University of Alberta Herbarium (ALTA)
  • University of British Columbia Herbarium (UBC)
  • University of Manitoba Herbarium (WIN)

Resources

Plant DNA Barcode Library for All Canada

[Chart: time axis 2008 to 2016; label 5K]

slide-10
SLIDE 10

Sampling

~18,000 specimens

slide-11
SLIDE 11

Sequencing

slide-12
SLIDE 12

Sequencing

slide-13
SLIDE 13

Taxonomic Bias

slide-14
SLIDE 14

Taxonomic Bias

slide-15
SLIDE 15

Taxonomic Bias

slide-16
SLIDE 16

Taxonomic Bias

slide-17
SLIDE 17

[Chart: species resolution (%) by gene (rbcL, matK, ITS2); series: BLAST, mothur]

Library resolution

Plant checklists from 28 national parks and reserves

slide-18
SLIDE 18

[Charts: species resolution (%) by region (Arctic, Boreal, Pacific, Prairie, Woodland, Atlantic); panels: matK, rbcL, ITS2]

Library resolution

Plant checklists from 28 national parks and reserves combined in 6 biogeographic regions

slide-19
SLIDE 19

The DNA barcode reference library for mosses: rbcL and trnL-F for 775 species of Canadian Bryophyta

Canadian Museum of Nature Center for Biodiversity Genomics

Maria Kuzmina, Jennifer Doubt, Catherine La Farge, Juan Carlos Villarreal & Paul Hebert

slide-20
SLIDE 20

Step one: sampling, imaging, databasing

slide-21
SLIDE 21

Source location of the specimens included in the DNA barcode reference library for Canadian mosses

  • ~2000 specimens
  • 775 species
  • ~3 records per species

slide-22
SLIDE 22

Number of moss specimens analyzed by province

slide-23
SLIDE 23

The most often used phylogenetic markers for mosses

(Stech & Quandt, 2010)

rbcL trnL-F ITS

slide-24
SLIDE 24

Time

Relationship between specimen age and sequence recovery

Overall sequencing success:

Marker: Specimens / Species
rbcL: 84% / 94%
trnL-F: 85% / 98%

slide-25
SLIDE 25

The Maximum Likelihood best rbcL tree

Monophyletic Polyphyletic

1665 specimens

UNUSUAL: Many orders are polyphyletic!

slide-26
SLIDE 26

Bootstrap consensus rbcL tree

Bootstrap >80%

1665 specimens

SURPRISINGLY: rbcL poorly supports beta taxonomy but is good at resolving genera!

slide-27
SLIDE 27

Species resolution with rbcL and trnL-F for species-rich orders of mosses

slide-28
SLIDE 28

Reexamination of taxonomy (red bars) provoked by rbcL results

slide-29
SLIDE 29

Acknowledgements

  • Anuar Rodrigues
  • Stephanie deWaard
  • Jesse Sills
  • Sean Graham
  • Aaron Fazekas
  • Bruce Bennett
  • Timothy Dickinson
  • Jeffrey Saarela
  • Paul Catling
  • Steven Newmaster
  • Diana Percy
  • Erin Fenneman
  • Aurelien Lauron-Moreau
  • Bruce Ford
  • Lynn Gillespie
  • Ragupathy Subramanyam
  • Jeannette Whitton
  • Linda Jennings
  • Deborah Metsger
  • Connor Warne
  • Allison Brown
  • Elizabeth Sears
  • Jeremy De Waard
slide-30
SLIDE 30

Reference Library for Targeted SNP-based Identification of Cibotium barometz Using NGS

Natalia Ivanova Maria Kuzmina Evgeny Zakharov

slide-31
SLIDE 31

Cibotium barometz

Plant growing in the Botanischer Garten München-Nymphenburg, Munich, Germany. Photograph by: Daderot, Public domain

http://tropical.theferns.info/image.php?id=Cibotium+barometz

The golden brown hairs at the base of the frond. Photograph by: Mokkie, Creative Commons Attribution-Share Alike 4.0

slide-32
SLIDE 32

Medicinal Use

  • Anti-inflammatory
  • Anti-rheumatic
  • Anti-osteoporotic
  • Tonic
  • Styptic
  • Antibacterial
  • Antioxidant

slide-33
SLIDE 33

Cibotium Phylogeny

Geiger JMO, Korall P, Ranker TA, Kleist AC, Nelson CL (2013) Molecular Phylogenetic Relationships of Cibotium and Origin of the Hawaiian Endemics. Am Fern J, 103: 141–152, doi:10.1640/0002-8444-103.3.141

slide-34
SLIDE 34

Cibotium barometz ID

  • rps4 (ribosomal protein S4) – 94 bp
  • atpA (ATP synthase alpha chain) – 86 bp
  • trnG-trnR intergenic spacer – 95 and 102 bp
  • rps4-trnS intergenic spacer – 84 and 87 bp
  • atpB-rbcL intergenic spacer – 79 and 110 bp

slide-35
SLIDE 35

Cibotium Reference Library – BOLD

slide-36
SLIDE 36

Cibotium Reference Library – BOLD

Average age of UBC Cibotium herbarium material – 58 years

slide-37
SLIDE 37

Cibotium Reference Library – BOLD

slide-38
SLIDE 38

rps4

atpA

rps4-trnS 87-200
rps4-trnS 163-300
trnG-trnR 565-705
trnG-trnR 681-830

atpB-rbcL

SNP Summary

Confirmed C. barometz SNPs
Shared C. barometz/cumingii SNPs
Signature SNPs
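The slide does not define its SNP categories, so the sketch below is only one plausible reading of "signature SNPs": alignment positions where every target-species sequence carries the same base and no sequence from another species carries it. The function name, the toy sequences, and the 0-based positions are all assumptions for illustration.

```python
def signature_snps(target_seqs, other_seqs):
    """Find positions where the target species has a fixed, unique base.

    target_seqs, other_seqs: lists of aligned sequences of equal length.
    Returns a list of (position, base) tuples, positions counted from 0.
    Assumption: gaps and ambiguity codes are not treated specially here.
    """
    if not target_seqs or not other_seqs:
        raise ValueError("both groups need at least one sequence")
    length = len(target_seqs[0])
    if any(len(s) != length for s in target_seqs + other_seqs):
        raise ValueError("all sequences must be aligned to the same length")

    sites = []
    for pos in range(length):
        target_bases = {s[pos] for s in target_seqs}
        other_bases = {s[pos] for s in other_seqs}
        # Fixed within the target group and absent from every other sequence.
        if len(target_bases) == 1 and not (target_bases & other_bases):
            sites.append((pos, next(iter(target_bases))))
    return sites


# Toy alignment (hypothetical sequences, not real Cibotium data).
barometz_like = ["ACGTAC", "ACGTAC"]
other_species = ["ACCTAC", "ACCTAT"]
print(signature_snps(barometz_like, other_species))
```

In the toy alignment only position 2 qualifies; position 5 varies among the other species but one of them shares the target base, so it is not a signature site.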

slide-39
SLIDE 39

Summary

  • Assembled reference library with voucher specimens
  • Confirmed available GenBank data for C. barometz and C. cumingii
  • Increased coverage for the regions of interest
  • Resulting reference library can be used for regulatory purposes

slide-40
SLIDE 40

Thank you!

slide-41
SLIDE 41

Genome2-ID

unbiased and rapid species identification using NGS data

David L. Erickson

slide-42
SLIDE 42

Genome2-ID: Format reference database

DNA4 Technologies LLC

Example: plant chloroplast genomes as our reference, ~150,000 bases in size

AATCGATCGGATCTAGATCTCGATATA

DECOMPOSE EACH SEQUENCE INTO LIST OF OVERLAPPING “WORDS”

AATCGA  • E. purpurea_NC1234
ATCGAT  • E. purpurea_NC1234
TCGATC  • E. purpurea_NC1234
CGATCG  • E. purpurea_NC1234
GATCGG  • E. purpurea_NC1234
ATCGGA  • E. purpurea_NC1234
TCGGAT  • E. purpurea_NC1234
. .
GATATA  • E. purpurea_NC1234
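The decomposition into overlapping words can be sketched in a few lines. The word length of 6 matches the example on this slide; the function name is hypothetical, and nothing here is taken from the Genome2-ID implementation itself.

```python
def overlapping_words(sequence, word_length=6):
    """Split a sequence into all overlapping words, sliding one base at a time."""
    # Drop any whitespace so that spaced-out slide text such as "A A T C" also works.
    seq = "".join(sequence.split()).upper()
    return [seq[i:i + word_length] for i in range(len(seq) - word_length + 1)]


# The example sequence shown on the slide.
example = "AATCGATCGGATCTAGATCTCGATATA"
words = overlapping_words(example)
print(words[:3], words[-1], len(words))
```

A sequence of length n yields n - k + 1 words of length k, so the 27-base example gives 22 words, starting with AATCGA and ending with GATATA.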
slide-43
SLIDE 43

Genome2-ID: Format reference database

DNA4 Technologies LLC

[Diagram: index of words, each followed by reference genome labels]

Words: [AATCGA, [ATCGAT, [TCGATC, [CGATCG, [GATCGG, [ATCGGA, [TCGGAT, . . , [GATATA, [GATATT, [GATATC

Labels: ;E. purpurea_NC1234, ;E. angustifolia_NC1235, ;Hypericum perforatum_NC22871

Reference species: Echinacea purpurea, Echinacea angustifolia, Hypericum perforatum
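A reference database of this kind can be pictured as a lookup from each word to the set of reference genomes that contain it. The sketch below assumes that structure; only the E. purpurea sequence comes from the slides, while the other two sequences and the function name are made up for illustration.

```python
from collections import defaultdict


def build_word_index(references, word_length=6):
    """Map every overlapping word to the set of reference labels containing it."""
    index = defaultdict(set)
    for label, sequence in references.items():
        seq = sequence.upper()
        for i in range(len(seq) - word_length + 1):
            index[seq[i:i + word_length]].add(label)
    return index


# Toy references: the second and third sequences are hypothetical.
toy_references = {
    "E. purpurea_NC1234": "AATCGATCGGATCTAGATCTCGATATA",
    "E. angustifolia_NC1235": "AATCGATCGGATCTAGATCTCGATATT",
    "Hypericum perforatum_NC22871": "CCGATATTGATATC",
}
index = build_word_index(toy_references)
for word in ("AATCGA", "GATATA", "GATATT", "GATATC"):
    print(word, sorted(index[word]))
```

Words shared by several genomes carry several labels, while words found in a single genome point to that genome alone.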

slide-44
SLIDE 44

Genome2-ID: Format Forensic Sequence

DNA4 Technologies LLC

DNA from Sample

GGATACTAGCTCGCCTACTTCATAGCCTTAGTGTTTACATACATACGCTTA

Sequence from Sample (WGS)

GGATAC GATACT ATACTA TACTAG ACTAGC TAGCTC AGCTCG CCTACT . . . . . CGCTTA

Input Data

slide-45
SLIDE 45

Genome2-ID: Search Database

[Table: words (rows) against reference species 1 to N (columns); check marks show matches]

Words

ATCATCATAC
ATCATCATAG
ATCATCTTAC
ATCATTTTAC
ATCCCTTTAC
ATCCCATTAC
. .

Reference species: 1 2 3 4 5 6 7 8 9 10 11 N
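The search step can be sketched as looking up every word of the sample in the word index and tallying hits per reference species. The slides do not say how Genome2-ID scores matches, so the simple hit count below is an assumption, and all names and toy sequences are hypothetical.

```python
from collections import Counter, defaultdict


def words_of(sequence, word_length):
    """Return all overlapping words of the given length."""
    seq = sequence.upper()
    return [seq[i:i + word_length] for i in range(len(seq) - word_length + 1)]


def build_index(references, word_length):
    """Map each word to the set of reference labels that contain it."""
    index = defaultdict(set)
    for label, sequence in references.items():
        for word in words_of(sequence, word_length):
            index[word].add(label)
    return index


def score_sample(sample_reads, index, word_length):
    """Count, per reference, how many distinct words of each read it contains."""
    hits = Counter()
    for read in sample_reads:
        for word in set(words_of(read, word_length)):
            # .get avoids adding unseen words to the index.
            for label in index.get(word, ()):
                hits[label] += 1
    return hits


# Toy data (hypothetical): 10-base words, as in the word list above.
toy_references = {
    "reference_1": "ATCATCATACGGA",
    "reference_2": "ATCATCTTACGGA",
}
toy_reads = ["ATCATCATACG"]
index = build_index(toy_references, word_length=10)
scores = score_sample(toy_reads, index, word_length=10)
print(scores.most_common())
```

The reference with the most matched words is the best candidate under this scheme; a real tool would also need to handle sequencing errors, reverse complements, and words shared by many references.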