Jeff Newman, Lycoming College November 5, 2011 Newman Lab Members - - PowerPoint PPT Presentation
Incorporating research into the curriculum: NextGen sequence data in the General Biology (Freshman!) and Molecular Biology laboratories. Jeff Newman, Lycoming College, November 5, 2011
Newman Lab Members
Lab Members
- Rhizobium etli, E. coli FGARAT: Diana Burley’97, Kathy Roberts’98, Kevin Ferguson’98, Jon Cook’98, Kim Mistiszyn’99, David Wilson’99, Lori Schultz’99, Rachel Lawton’99, Laura Singer’99, Julie Wagner’00
- Human, Staphylococcus aureus FGARAT: Anna Bucher’00, Missy Stokes’01, Melissa Fogg’01, Amy Mayhew’02, Andy Cardillo’02, Chris Brennan’02, Kristen Skvorak’02, Chris Robbins’03, Stefanie Mensch’03, Christy Boob’04, John Mazzulo’04, Jason Catanzaro’04, Deanne Greene’04, Justin Jay’05, Breann Wolfe’05, Jenny Kinne’05, Denise Greene’05, Marla Yates’05, Jennifer Leader’04, Michael Powell’05, Erica Walsh’05, Bethany Mingle’05, Kim McDowell’06, Andy Lutzkanin’06
- Heterodimer FGARAT & Bioinformatics: Kevin Frederick’01, Mitch Marzo’01, Liz Sehi’06, Jessica Bennett’07, Matt Wright’08
- Other work: Mark McCleland’99, Matthew Georgy’99, Josh Stutzman’00, Tyler Hoffman’12
- Novel Microbes: Will Tumbusch’06, Kellie Cicconi’07, Tyler Marcinko’08, Pat Hayes’09, Brittane Strahan’09, Allison Batties’10, Damian Mariano’10, Samantha McKenna’10, Alicia Schueck’10, Stephanie Woodhouse’10, KC Failor’11, Katherine Smith’11, Kristen Collins’11, Karen Kirk-V, Melissa Cashner’11, Krissy Harstead’12, Jordan Krebs’13, Trisha Duncan’13, Logan Mariano’13, Clark Thompson’13
- Color Key: BS-level jobs, Health Care/Vet, Graduate School, Still at Lyco
The Context: Teaching Biology @ Lycoming College
- Lycoming College is a National Liberal Arts College with ~1400 students
– Member of the “Carnegie 44”
– Graduates 35-40 Bio majors/year
- 50% Pre-health (Anat & Physiol): MD, DO, PA, PT, DMD, VMD
- 20% Ecology
- 20% General (“Comprehensive”)
- 10% Cell & Molecular
– 7 Bio faculty + 1 part-time adjunct
- Standard teaching load is 12 “contact hours” (2 contact hours for a 3 hr lab)
- No credit for IS, Honors, or Chair
- Fall
– Intro Bio with 2 or 3 labs
– Upper level with 1 lab (Molecular Biology or Genome Analysis or Research Methods)
- Spring
– Microbiology with 2 labs (2 x 2 hr/wk)
– Biochemistry with lab
- No Specific Research Funding
My Institution’s Standard Teaching Load (in the sciences) is
- 1. < 10 real hours/week in class & lab
- 2. 10-12 real hours/week in class & lab
- 3. Exactly 12 real hours/week in class & lab
- 4. 12-15 real hours/week in class & lab
- 5. >15 real hours/week in class & lab
Is research/scholarly activity required for tenure and/or promotion?
- 1. We do not have the opportunity to do research
- 2. Research is NOT required but is encouraged.
- 3. Modest research activity is required for tenure, publication is required for promotion
- 4. Publication is required
- 5. Extramural funding is required
Do you ID True Unknowns in Micro Lab?
- A. No, not in any course
- B. Yes, in Gen Micro but no 16S rDNA
- C. Yes, in Gen Micro with 16S rDNA seq
- D. Yes, in Upper Level Micro but no 16S rDNA
- E. Yes, in Upper Level Micro with 16S rDNA seq
Solution: Do Research in Class/Lab with students as your research assistants.
10 week Unknown Microbe Lab
- Aseptic technique, inoculation
- Staining & Microscopy
- Temp, O2 requirements, antibiotics
- Carb metabolism/fermentation
- Nitrogen/amino acid metab
- Differential & Selective media
- Bergey’s Manual
- Biolog GenIII plates
- Fatty Acid Methyl Ester (FAME) Analysis
- 16S rDNA PCR & sequencing
- Search sequences at ________________
- MEGA5 for multiple sequence alignment/ construction of NJ Phylo tree
- API test $trip$ with research organisms
Lycoming Creek Loyalsock Creek
If you have a 16S rDNA sequence, what database/tool do you use?
- A. Don’t know, have never done it.
- B. BLAST vs non-redundant (nr) GenBank at NCBI
- C. Classifier at the Ribosomal Database Project (RDP)
- D. Seq Match at RDP
- E. Identify at EzTaxon.org
- F. other
Sequence Analysis
- EzTaxon.org: curated database, can search type strains that define species
– Species must be cultured, “officially” named & published
– <98.5% identity to type strains is indicative of new species
- NCBI: contains many more sequences, including those from metagenomic, non-culture-based approaches
– Little information associated with many sequences, other than source of the sample
– Phylogeny of organism can often be determined
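To make the 98.5% cutoff concrete, here is a minimal Python sketch of percent identity between two sequences that are already aligned. The function names, the rule of skipping gapped columns, and the idea of applying the cutoff this way are illustrative assumptions; the slides do not describe how EzTaxon computes identity.

```python
NEW_SPECIES_CUTOFF = 98.5  # from the slide: <98.5% identity to type strains


def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: columns containing a gap are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def possibly_novel(identity_to_closest_type_strain):
    """True when identity to the closest type strain falls below the cutoff."""
    return identity_to_closest_type_strain < NEW_SPECIES_CUTOFF
```

An isolate flagged by `possibly_novel` is only a candidate; the later slides describe the phenotypic and genomic work needed before a new species can be published.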
Method Background – Sequencing
- PCR products from pure cultures and cloned PCR products from uncultured organisms can be sent for sequencing.
- We use Agencourt/Beckman/Coulter
– Slower (1-2 week turnaround) but less expensive ($2.50/Rxn) than other companies
– Higher quality service: they purify PCR samples
Documenting Discoveries of Diversity http://www.lycoming.edu/~newman
Participant Bioinformatics Activity
- 16S rRNA sequence analysis in Microbiology
– One read with EzTaxon
– View trace with MEGA
– 16S rRNA MSA and tree with MEGA5
[Figure: 16S rRNA tree. Tip labels: Bacillus subtilis, Trichococcus patagoniensis, EMW, Staphylococcus aureus, Exiguobacterium undae, Lactococcus lactis, Streptococcus pyogenes, Corynebacterium callunae, Streptomyces coelicolor, Oerskovia jenensis, Arthrobacter aurescens, Prochlorococcus marinus, Geovibrio ferrireducens, Cytophaga hutchinsonii, Chryseobacterium indologenes, Blastopirellula marina, Helicobacter pylori, Bdellovibrio bacteriovorus, Neisseria gonorrhoeae, Aquaspirillum sinuosum, Pseudomonas aeruginosa, Escherichia coli, Acinetobacter johnsonii, Psychrobacter maritimus, MC1, Nitrospira moscoviensis, Chloroflexus aurantiacus, Thermomicrobium roseum, Aquifex pyrophilus. Scale bar: 0.05]
Microbiology Course Feeds into Research
- Environmental unknowns cultured & characterized in Microbiology course
- Colony PCR of 16S rDNA with primers 27f & 1492r, 1 Sanger sequencing rxn
- Compare sequence to validly published type strains (Eztaxon.org)
- If >99.0% identical: species identification assigned
- If <99.0% identical: continue with the steps that follow
- Fully sequence both strands of nearly complete 16S rDNA
- Submit sequence to GenBank
- ClustalW alignment, Neighbor Joining tree to infer phylogenetic relationships
- Obtain closest relatives as reference strains
- Morphological/Metabolic characterization w/standard tests
- Colony morphology, color, Gram stain, wet mount
- Temperature, O2 , pH, NaCl requirements
- Carbohydrate & Nitrogen metabolism
- Exoenzymes, differential and selective medium
- API Test Strips (50 CH, 20E/NE, ZYM)
- Fatty Acid Methyl Ester (FAME) Analysis (MIDI)
- Biolog GenIII metabolic profile
- Family-specific tests such as Multi Locus Sequence Analysis, pigment analysis, respiratory quinones, polar lipids, cell wall amino acids.
- Deposit Strains in Culture Collections (e.g. ATCC, DSMZ, CCUG, JCM, ARS/NRRL)
Publish new species in IJSEM
My exposure to new species descriptions in IJSEM, Taxonomy & Systematics is
- 1. Never use it, avoid topic in class
- 2. Occasionally look up original species descriptions
- 3. Use it all the time, check IJSEM every month
- 4. Damn splitters & lumpers keep changing names
Obtain Complete 16S rRNA sequence
Figure 2. 16S rDNA Sequence Analysis – Top schematic shows the location of PCR and sequencing primers. Lower diagram illustrates the reads assembled into the final consensus sequence.
[Figure 2 schematic: 16S rRNA gene, ~1500 bp, with primer labels 27f, rRNA1, 1492r, 530f, rRNA2, 785f, 1114f, 1100r]
How are bacterial species defined?
- The Gold Standard
– Less than 70% DNA-DNA hybridization
- % Identity with 16S rDNA sequence
– 97% vs. 98.5%
- Fig. from Stackebrandt & Ebers, 2006, Microbiol. Today 33:152
2nd Measure of Genomic Uniqueness
- DNA-DNA hybridization is “best”, but not archivable
- Multi Locus Sequence Analysis with protein coding genes provides better resolution than 16S rDNA
– gyrB, rpoB, groEL/hsp60/cpn60, glnA, recA
- 95% Average Nucleotide Identity (ANI) across the genome
MLSA Lab in Molecular Biology
- Problem: Several “Micrococcus” strains have pigmentation and other traits different from type strains of M. luteus & M. yunnanensis, so they are possibly novel.
- All have similar 16S rRNA sequences. What about protein coding genes?
Figure M1. Variation in pigments among Micrococcus strains.
M = Micrococcus luteus B-287T
i = Micrococcus sp. AF
c = Micrococcus sp. SSX
r = Micrococcus sp. TMG
o = Micrococcus sp. LYL1
c = Micrococcus sp. LYG1
o = Micrococcus sp. LYO1
c = Micrococcus sp. LYE1
c = Micrococcus sp. D7
us = Micrococcus yunnanensis DSM 21948T
MLSA Lab in Molecular Biology
1. Isolated gDNA from several organisms
2. Chose a housekeeping gene
3. Retrieved protein sequence from M. luteus genome site at NCBI
4. Performed BLAST search
5. Chose subset of organisms for multiple sequence alignment
6. Designed degenerate primers for conserved sequences, ordered primers
7. Performed/optimized PCR to amplify gene fragment
8. Sequenced PCR product, assembled contig with CAP3
9. Performed multiple sequence alignment
10. Constructed tree and matrix
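The "matrix" in the last step is a table of pairwise differences between aligned gene sequences. A minimal Python sketch of that table follows, assuming the sequences are already aligned to equal length. The function names and the gap-skipping rule are assumptions for illustration; the lab itself used dedicated alignment and tree software, and this sketch does not build the tree.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(seq_a.upper(), seq_b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(x != y for x, y in pairs) / len(pairs)


def difference_matrix(alignment):
    """Map each pair of sequence names to its p-distance.

    `alignment` is a dict of name -> aligned sequence (all the same length).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    return {
        (name_a, name_b): p_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(sorted(alignment), 2)
    }
```

With closely related strains, such as those in this lab, most entries will be very small, which is exactly why a faster-evolving protein-coding gene is more informative than 16S rDNA.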
MLSA Lab in Molecular Biology
[Figure: gyrB tree. Tip labels: gb|CP001628.1|:5819-7981 Micrococcus luteus NCTC 2665 complete genome, M. luteus B287, M. luteus AF, M. yunnanensis DSM 21948T, M. luteus LYL1 (contig), M. luteus LYG1 (contig), M. yunnanensis D7, M. luteus LYE1, M. luteus TMG, M. luteus SSX, M. luteus LYO1. Node values: 99, 94, 60, 51, 33, 34, 41. Scale bar: 0.002]
MLSA Lab in Molecular Biology
Conclusions: Some strains are probably novel species, but will be difficult to conclusively demonstrate due to overall high sequence conservation among existing named species
Participant Bioinformatics Activity
- Multi Locus Sequence Analysis in Molecular Biology
– Protein seq from genome
– MSA w/NCBI
– Assemble two reads with CAP3
– MSA with nucleic acid
– Construct Tree & Difference table
95% ANI ~ 70% DDH ~ 69% Conserved (>90%)
Why do we want to sequence genomes?
- Easier than DDH to distinguish species?
- More informative than DDH!
- Data can be archived for reuse with other organisms.
- Can correlate traditional tests/traits with genes
- Will be required soon for new species descriptions
- 5 year goal: ID unknown organisms by genome sequencing when <$50 ($1?) each, develop hypotheses to test
How will we sequence genomes?
- Now: GCAT-SEEK, PSU Genomics Core
– Our 1st run as part of pilot in progress?
- Future: Ion Torrent
– 2 or 3 genomes per 316 chip (~200 Mb) for $1500
– 20 or 30 genomes per 318 chip (>1 Gb) for $2000
- Soon? Commercial services, $250/genome (what we pay for one 96-well plate of Sanger sequences)
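The per-genome cost implied by the chip prices is simple division. The short Python sketch below works it out from the figures on the slide; the function name and the dictionary layout are illustrative choices, and the calculation ignores library prep and labor, which the slide does not quantify.

```python
def cost_per_genome(run_cost_usd, genomes_per_run):
    """Cost of one genome when a run's cost is split evenly across its genomes."""
    if genomes_per_run <= 0:
        raise ValueError("genomes_per_run must be positive")
    return run_cost_usd / genomes_per_run


# Figures taken from the slide: (run cost in USD, (fewest, most) genomes per chip)
options = {
    "Ion Torrent 316 chip": (1500, (2, 3)),
    "Ion Torrent 318 chip": (2000, (20, 30)),
}

for name, (cost, (fewest, most)) in options.items():
    low = cost_per_genome(cost, most)
    high = cost_per_genome(cost, fewest)
    print(f"{name}: ${low:.0f} to ${high:.0f} per genome")
```

On these numbers the 316 chip works out to $500 to $750 per genome and the 318 chip to roughly $67 to $100, which brackets the $250/genome quoted for commercial services.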
What to do with the data?
- Individual reads must be assembled into contigs
– 454 FLX+ ~700 b
– Ion Torrent ~200 b
– SOLiD ~35 b
- Assembly done with NextGENe software on Juniata Server; can download contigs & stats to review with viewer
Why do we want to sequence genomes?
- Easier than DDH to distinguish species?
- Sequence of random 20% of 2 genomes allows calculation of Average Nucleotide Identity (ANI)
- JSpecies runs on Java, requires BLAST and MUMmer (Linux or Mac) to be installed locally.
- DDH cannot be archived or reused, and provides no other useful information
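As a rough illustration of what an ANI value summarizes, the Python sketch below averages the percent identities of genome fragments that found a match in the other genome and compares the result with the 95% figure from these slides. This is a simplified assumption, not the JSpecies procedure: the fragmenting, BLAST or MUMmer alignment, and hit filtering that JSpecies performs are not reproduced, and the function names are hypothetical.

```python
from statistics import mean

SPECIES_ANI_CUTOFF = 95.0  # from the slides: ~95% ANI corresponds to ~70% DDH


def average_nucleotide_identity(fragment_identities):
    """Mean percent identity across matched fragments (simplified assumption)."""
    if not fragment_identities:
        raise ValueError("at least one matched fragment is required")
    return mean(fragment_identities)


def same_species_by_ani(ani):
    """True when ANI meets or exceeds the species-level cutoff."""
    return ani >= SPECIES_ANI_CUTOFF
```

Because the inputs are sequence files rather than a wet-lab hybridization, the same genome can be compared against any future relative without repeating an experiment.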
What else will we do with sequence?
- Bio110 Introductory Biology I
– Students get overlapping 30 kb segments of genome (0-30 kb, 25-55 kb, 50-80 kb, etc.)
– Use NCBI ORF finder to ID ORFs
– Create a map of their DNA fragment to show genes & orientation
– Create a table of the genes with location and name of each gene
– Choose 1 gene to research in more detail
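The segment assignments (0-30 kb, 25-55 kb, 50-80 kb, ...) follow a fixed window of 30 kb advanced in 25 kb steps, so neighbors overlap by 5 kb. A small Python sketch of generating those coordinates is shown here; the function name and the choice to truncate the final segment at the end of the sequence are assumptions.

```python
def overlapping_segments(genome_length, segment_size=30_000, step=25_000):
    """Return (start, end) coordinates of overlapping segments covering the genome."""
    if segment_size <= 0 or step <= 0:
        raise ValueError("segment_size and step must be positive")
    segments = []
    start = 0
    while start < genome_length:
        end = min(start + segment_size, genome_length)  # last segment may be shorter
        segments.append((start, end))
        if end == genome_length:
            break
        start += step
    return segments
```

For example, `overlapping_segments(80_000)` returns `[(0, 30000), (25000, 55000), (50000, 80000)]`, matching the ranges on the slide. The overlap means a gene that straddles a boundary still appears whole in at least one student's segment, provided it is shorter than 5 kb.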
Do you have a Bioinformatics activity in your Intro Bio?
- 1. Protein/molecular structure
- 2. DNA sequence analysis/Translation
- 3. Protein seq. analysis, alignment, phylo tree
- 4. All of the above
- 5. None of the above
Participant Bioinformatics Activity
- ORF finder in Bio110
– Find ORFs with 30 kb of C. gleum sequence
– Begin table
– Begin map
– Retrieve nucleotide sequence, perform BLAST-N search
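For readers who want to see what an ORF search does under the hood, here is a minimal Python sketch that scans all six reading frames for ATG-to-stop stretches. It is a teaching illustration only: the function names, the ATG-only start rule, and the `min_codons` parameter are assumptions, and it does not reproduce the options or output of NCBI's ORF finder.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons):
    """Return (start, end) 0-based half-open ORF coordinates on one strand.

    An ORF runs from the first ATG in a frame through the next in-frame stop
    codon; its codon count includes both the start and the stop codon.
    """
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    found.append((start, i + 3))
                start = None
    return found


def find_orfs(seq, min_codons=3):
    """Return (strand, start, end) for ORFs on both strands, in forward coordinates."""
    seq = seq.upper()
    n = len(seq)
    results = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    for s, e in orfs_in_strand(reverse_complement(seq), min_codons):
        results.append(("-", n - e, n - s))  # map back to forward-strand coordinates
    return results
```

The strand and coordinates returned for each ORF are the same pieces of information students record in their gene table and map.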
What else will we do with the sequence?
- Bio437 Molecular Biology
- Annotate genome with…
– Glimmer
– RAST (Rapid Annotation using Subsystem Technology)
Basys.ca for annotation?
- Looks good, but slow/non-functional (long queue)?
Genome Annotation with RAST Server
The SEED for Annotation based on Subsystems
Creation of Metabolic Models
Biolog Phenotype Microarrays
Newman Lab Members
Naming organisms for …… ?!
- http://www.colbertnation.com/the-colbert-report-videos/168483/may-14-2008/who-s-not-honoring-me-now----science
- http://www.colbertnation.com/the-colbert-report-videos/178730/august-06-2008/spida-of-love---jason-bond
References
- Felsenstein J. (1985). Confidence limits on phylogenies: An approach using the bootstrap. Evolution. 39: 783-791.
- “Genus I. Comamonas” in “Bergey’s Manual of Systematic Bacteriology” 2nd Edition. 2C: 689-696. G.M. Garrity, D.J. Brenner, N.R. Krieg, J.T. Staley, eds. Springer, East Lansing, MI, 2001.
- Ibekwe A.M., Kennedy A.C. (1999). Fatty acid methyl ester (FAME) profiles as a tool to investigate community structure of two agricultural soils. Plant and Soil. 206: 151-161.
- Ma Y.F., Zhang Y., Zhang J.Y., Chen D.W., Zhu Y., Zheng H., Wang S.Y., Jiang C.Y., Zhao G.P., Liu S.J. (2009). The complete genome of Comamonas testosteroni reveals its genetic adaptations to changing environments. Applied and Environmental Microbiology. 75: 6812-6819.
- Saitou N., Nei M. (1987). The neighbor-joining method: A new method for reconstructing phylogenetic trees. Molecular Biology and Evolution. 4: 406-425.
- Stackebrandt E., Ebers J. (2006). Taxonomic parameters revisited: tarnished gold standards. Microbiology Today. 33: 152-155.
- Tamura K., Dudley J., Nei M., Kumar S. (2007). MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Molecular Biology and Evolution. 24: 1596-1599.
- Tamura K., Nei M., Kumar S. (2004). Prospects for inferring very large phylogenies by using the neighbor-joining method. Proceedings of the National Academy of Sciences (USA). 101: 11030-11035.
- Tindall B.J., Rosselló-Móra R., Busse H.-J., Ludwig W., Kämpfer P. (2010). Notes on the characterization of prokaryote strains for taxonomic purposes. International Journal of Systematic and Evolutionary Microbiology. 60: 249-266.
- Wauters G., De Baere T., Willems A., Falsen E., Vaneechoutte M. (2003). Description of Comamonas aquatica comb. nov. and Comamonas kerstersii sp. nov. for two subgroups of Comamonas terrigena and emended description of Comamonas terrigena. International Journal of Systematic and Evolutionary Microbiology. 53: 859-862.
Align sequences, construct trees
- Should be done using “expertly maintained” alignment at RDP or other specialty sites.
How Many Bacterial Species Are There?
- FIG. 3. Collector's curve of the Chao1 nonparametric richness estimator for sequences in the RDP-II. Accession numbers were used to determine the order in which sequences have been sampled. OTUs defined by a collection of identical sequences reached an estimate of 325,040 different OTUs.
- From Patrick D. Schloss and Jo Handelsman. Status of the Microbial Census. Microbiology and Molecular Biology Reviews, December 2004, p. 686-691, Vol. 68, No. 4
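The Chao1 estimator named in the caption extrapolates total richness from how many OTUs were seen exactly once or exactly twice. A minimal Python sketch of the commonly used bias-corrected form, S_obs + F1(F1 - 1) / (2(F2 + 1)), is given below. Which variant the cited paper used is not stated on the slide, so treat the choice of formula and the function name as assumptions.

```python
def chao1(otu_counts):
    """Bias-corrected Chao1 richness estimate.

    otu_counts: number of sequences observed for each OTU.
    S_obs is the number of OTUs seen, F1 the singletons, F2 the doubletons.
    """
    counts = [c for c in otu_counts if c > 0]
    s_obs = len(counts)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
```

Many singletons relative to doubletons push the estimate well above the observed count, which is the signal that sampling is far from complete.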
How Many Species Have Been Named?
Who is discovering & publishing new bacterial species?
Table 1. National affiliations associated with articles published in the International Journal of Systematic and Evolutionary Microbiology (IJSEM) during 2007 & 2008. For each article, the national affiliation of the authors was noted. A nation with multiple authors on a single article was awarded only one “point” for that article. Only nations with >30 articles are shown below. Data compiled by undergraduate students working in the PI’s lab.
Rank  Country  Q1 2007  Q2 2007  Q3 2007  Q4 2007  Q1 2008  Q2 2008  Q3 2008  Q4 2008  Total
1     Korea    31       24       45       38       39       25       25       27       254
2     Japan    16       26       32       19       19       18       19       20       169
3     China    17       14       16       14       17       21       27       24       150
4     Germany  18       23       21       25       17       15       10       18       147
5     USA      10       8        17       13       9        14       6        12       89
6     Spain    4        10       3        8        6        7        7        10       55
7     France   3        8        5        8        6        8        5        10       53
8     UK       6        5        5        9        5        3        5        5        43
9     Belgium  4        6        7        4        6        2        4        8        41
10    India    5        4        5        5        3        6        6        6        40
11    Russia   4        2        4        7        4        6        3        6        36
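The counting rule in the Table 1 caption (one point per nation per article, no matter how many authors share that nation) can be sketched in a few lines of Python. The function name and the input layout, a list of author-country lists with one inner list per article, are assumptions for illustration; the students' actual tallying method is not described.

```python
from collections import Counter


def tally_nations(articles):
    """Count articles per nation, awarding one point per nation per article.

    articles: iterable of lists, each holding the countries of one article's authors.
    """
    tally = Counter()
    for author_countries in articles:
        tally.update(set(author_countries))  # set() collapses repeated nations
    return tally
```

Under this rule an article with two Korean authors and one Japanese author adds one point to Korea and one to Japan, so column totals can exceed the number of articles published.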
Obtain Closest Relatives
- Can no longer compare just to published results, must test in parallel.
- Write to Authors
- Culture Collections
– NRRL free
– KCTC $50
– BCCM/LMG 42 Euro (~$60)
– DSMZ 50 Euro (~$70)
– JCM 5250 Yen (~$65)
– CCUG 640 Kronor (~$100)
– ATCC $205
Analyze Phenotypes in Parallel
Biolog GenIII Plate Results
Properly Format FAME Results
- FAME analysis must be done using “standard” methods, not “instant” or “quick”.
Compile Differences
- Traits that can distinguish among strains are highlighted.
Figure 5. Growth of Chryseobacterium strains on Endo, R2A and Cetrimide agars.
Deposit Strains in Culture Collections
- Need to submit Deposit Certificates from Culture Collections in 2 different countries
Novel species currently “in the pipeline”
- Discovered at Lycoming
– Chryseobacterium piperi CTM from the Loyalsock Creek
– Chryseobacterium angstadti KM from a newt tank
– Chryseobacterium diehli BLS-98 from a wastewater treatment sequencing batch reactor
– Kaistella zaccaria JJC from the Loyalsock Creek
– Comamonas franzi SJM from the Loyalsock Creek
– Several Micrococcus luteus/yunnanensis related strains
- Collaborations
– Meiothermus centralia Pnk1 from Centralia soils with Wade Johnson, Susquehanna Univ.
– Microbispora sp. from Centralia soils with Tammy Tobin, Susquehanna Univ.
– Kaistella sp. 3519-10 from Vostok Ice Core with Brent Christner, Louisiana State University (its genome has already been sequenced!)
Science @ Lycoming College
- Lycoming College is a National Liberal Arts College with ~1400 students
– Largest Majors are Biology, Business & Psychology
– Member of the “Carnegie 44”
- Biology major has 4 tracks
– Cell & Molecular Biology
- Lab tech jobs
- Graduate School
– Ecology
- Clean Water Institute
- DEP, DCNR, other environmental jobs
- Graduate School
– Anatomy & Physiology
- Cadaver-based Human Anatomy
- Health Professions
- MD, DO, DVM, DMD, Optom, PA, PT
– Comprehensive
- High School Teaching
– Tell your younger siblings and cousins to check us out!
The Unknown Microbe Lab
Source of Organisms
Lycoming Creek Loyalsock Creek
The Unknown Microbe Lab
Biolog GenIII plates
The Unknown Microbe Lab
16S Ribosomal RNA (rRNA) as an identification tool
Why is rRNA useful?
- present in all organisms
- Conserved sequences used to amplify rRNA gene from “any” organism
- rRNAs of related organisms will be more similar than those of distantly related organisms, due to accumulation of mutations
- Ribosomal Database Project (RDP) has >1.4 million sequences
– <98.5% identity = different species
– <96% identity = different genus
– <92% identity = different family
- Different strains within a species may have very different properties.
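The three identity cutoffs above can be read as a simple decision ladder. The Python sketch below encodes them as stated on the slide; the function name and the wording of the returned labels are illustrative assumptions, and real taxonomic decisions rest on much more than a single 16S comparison.

```python
def rank_from_16s_identity(identity_percent):
    """Interpret 16S rRNA identity to a reference using the slide's cutoffs."""
    if not 0 <= identity_percent <= 100:
        raise ValueError("identity must be between 0 and 100")
    if identity_percent < 92:
        return "different family"
    if identity_percent < 96:
        return "different genus"
    if identity_percent < 98.5:
        return "different species"
    return "possibly same species"
```

The top rung is labeled "possibly" on purpose: as the last bullet notes, strains that pass the 16S test can still differ substantially in their properties.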
Polymerase Chain Reaction (PCR)
- Polymerase Chain Reaction (PCR) uses a heat-stable DNA polymerase (Taq from Thermus aquaticus) to repeatedly copy DNA, resulting in exponential amplification of the DNA fragment defined by 2 oligonucleotide primers.
- Two known sequences are required for primer binding sites, one for each strand.
- Conserved sequences are used to amplify variable fragments.
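"Exponential amplification" means the copy number ideally doubles every cycle, giving N = N0 x 2^cycles. A small Python sketch of that arithmetic follows; the `efficiency` parameter (1.0 for perfect doubling) is an added assumption to show that real reactions fall short of the ideal, and the function name is hypothetical.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies after a number of PCR cycles: N0 * (1 + efficiency) ** cycles."""
    if cycles < 0 or not 0 <= efficiency <= 1:
        raise ValueError("cycles must be >= 0 and efficiency between 0 and 1")
    return initial_copies * (1 + efficiency) ** cycles
```

At perfect efficiency a single template molecule yields 2^30, a little over a billion copies, after 30 cycles, which is why PCR directly from a small amount of colony material gives enough product to sequence.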
DNA Sequencing
- Dideoxynucleotides (ddNTPs) lack the 3’ OH required for addition of the next nucleotide in a growing strand.
- Incorporation of a fluorescently labeled ddNTP into a strand during DNA synthesis causes chain termination and tags the newly synthesized strand with a dye molecule.
- By starting at the same point (primer) and terminating at different points, different sized fragments are generated, separated by electrophoresis, read by a detector and used to assemble the sequence.
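The logic of reading a sequence from chain-terminated fragments can be mimicked in a few lines of Python. In this toy model every possible termination product is represented by its length and its terminal (dye-labeled) base; sorting by length stands in for electrophoresis. The function names and the simplification that exactly one fragment exists per position are assumptions for illustration.

```python
def sanger_fragments(primer, extension):
    """Return (fragment_length, terminal_base) for every chain-terminated product.

    `extension` is the newly synthesized strand beyond the primer; a fragment
    ending at position i consists of the primer plus the first i + 1 bases.
    """
    return [(len(primer) + i + 1, base) for i, base in enumerate(extension)]


def read_sequence(fragments):
    """Order fragments by size (as electrophoresis does) and read the terminal bases."""
    return "".join(base for _, base in sorted(fragments))
```

Even if the fragments are supplied in scrambled order, `read_sequence` recovers the extension, because size alone determines the order in which the dyes pass the detector.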
The Unknown Microbe Lab
Week 9 – MIDI Fatty Acid Methyl Ester (FAME) Analysis; repeat other tests as necessary
Week 11 (Fri) – Lab Report Drafts due
Week 13 (Mon) – Peer Reviews (2) due
Week 14 (Fri) – Final Lab Reports due
FAMEs separated by Gas Chromatography
[Figure: gas chromatogram, FID1 A (E10910.724\A0071372.D). Axes: retention time (min, 0.5 to 4) vs. detector response (pA, 20 to 100). Labeled peaks from 0.709 to 4.225 min.]