Phenotype Genotype Host Vector Pathogen 1 Bacterial Viral - - PDF document



SLIDE 1

What is a bioinformatics experiment? It is like any other experiment!

  • You need to know your data/input sources
  • You need to understand your methods and their assumptions
  • You need a plan to get from point A to point B
  • You need to understand your equipment
  • You need to be critical and understand potential sources of error
  • You need to interpret your results
  • Your results need to be reproducible
  • Your results should be testable
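One way to act on the reproducibility point is to record, next to every result, what went into it. The sketch below is only an illustration: the function names `run_record` and `subsample` and the recorded fields are assumptions made for this example, not part of the original slides.

```python
import hashlib
import json
import platform
import random


def run_record(input_bytes: bytes, seed: int, parameters: dict) -> dict:
    """Collect the facts someone would need to repeat this analysis."""
    return {
        # A checksum pins down exactly which input file was analysed.
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "seed": seed,
        "parameters": parameters,
        "python_version": platform.python_version(),
    }


def subsample(items: list, k: int, seed: int) -> list:
    """Draw k items with a private, seeded generator so the draw is repeatable."""
    rng = random.Random(seed)
    return rng.sample(items, k)


if __name__ == "__main__":
    data = b">seq1\nACGTACGT\n"  # hypothetical input, for illustration only
    record = run_record(data, seed=42, parameters={"k": 3})
    print(json.dumps(record, indent=2))
    print(subsample(list(range(10)), 3, seed=42))
```

Running the same script twice on the same input should give the same checksum and the same subsample; if it does not, something in the pipeline has changed.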

Remember the Goal

Phenotype, Genotype

Infectious Disease Paradigm

Experimental systems: Pathogen, Vector, Host

SLIDE 2

Know your data

Bacterial, Viral, Parasitic, Genetics, Immunology, Proteomic, Microarray, Genomics, Host

>Pfa3D7|chr1_000035|2001.01.03|GENOMIC|Sanger
CCAGTTTGTGATGTTTCTCCTTGTTGCACATAGCCACTTCCTTTCTTCCTCTGTTAAAAAAGCTCCTCTCCCCAGAATACTCCCTTGTAGCACCATGATTGAATGACCACTCTTTCTTAAGTGAACTTCTTGAAAAC
[The slide continues with a full screen of raw nucleotide sequence; the remainder is omitted here.]
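A first step in knowing your data is simply to look at what a sequence record contains. The following minimal sketch assumes FASTA-style input in which each header starts with `>` on its own line and fields are separated by `|`, as in the record above. It does not assume what the individual fields mean, and the function names are chosen for this example.

```python
from collections import Counter


def parse_fasta(text: str) -> list[tuple[str, str]]:
    """Return (header, sequence) pairs; whitespace inside sequences is dropped."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append("".join(line.split()).upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def summarize(header: str, sequence: str) -> dict:
    """Basic sanity checks: header fields, length, GC fraction, odd symbols."""
    counts = Counter(sequence)
    unexpected = {base: n for base, n in counts.items() if base not in "ACGT"}
    gc = counts["G"] + counts["C"]
    return {
        "fields": header.split("|"),
        "length": len(sequence),
        "gc_fraction": gc / len(sequence) if sequence else 0.0,
        "unexpected_symbols": unexpected,
    }
```

A summary like this will not say whether a sequence is what its header claims, but unexpected symbols, an implausible length or an odd base composition are cheap early warnings.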

Typical 2-D gel

SLIDE 3

Know your method

Can you tell when something has gone wrong? Can you determine why?

SLIDE 4

SLIDE 5

Know your equipment

Do you know how it works? Do you know how to fix it?

SLIDE 6

Know your procedure

How do you get from point A to point B?

SLIDE 7

SLIDE 8

Know your technique

How were the results/data generated? What sources of error do these techniques produce?

Microarrays: A snapshot of all detectable RNAs present in a given cell type, tissue, disease state, experimental condition or developmental stage, relative to a control.

Proteomics: A snapshot of all detectable proteins in a given cell type, tissue, disease state, experimental condition or developmental stage.

ESTs: A snapshot of all detectable RNAs [usually poly(A)+] present in a given cell type, tissue, disease state, experimental condition or developmental stage.
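Because a microarray measurement is relative to a control, the number usually examined is a ratio rather than a raw intensity. The sketch below shows one common convention, the log2 ratio of sample to control. The gene names, the intensity values in the usage lines and the `floor` parameter (which clamps very low intensities so the ratio stays defined) are assumptions made for this example.

```python
import math


def log2_ratios(sample: dict[str, float], control: dict[str, float],
                floor: float = 1.0) -> dict[str, float]:
    """log2(sample / control) per gene, for genes measured in both channels."""
    ratios = {}
    for gene in sample.keys() & control.keys():
        # Clamp to the floor so a zero or negative background-corrected
        # intensity cannot cause a division by zero or a log of zero.
        s = max(sample[gene], floor)
        c = max(control[gene], floor)
        ratios[gene] = math.log2(s / c)
    return ratios


if __name__ == "__main__":
    sample = {"geneA": 400.0, "geneB": 100.0, "geneC": 50.0}
    control = {"geneA": 100.0, "geneB": 100.0, "geneC": 200.0}
    for gene, value in sorted(log2_ratios(sample, control).items()):
        print(gene, value)
```

A value of 0 means no change relative to the control, +2 a fourfold increase and -2 a fourfold decrease. A gene that is absent from either channel gets no ratio at all, which reflects the "detectable" caveat in the definitions above.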

Remember the assumptions

Microarrays

  • cDNA microarrays
  • “GeneChip” in situ synthesized oligonucleotide arrays
  • Oligomer (~70mer) arrays

cDNA Microarrays

Robotic microarrayer

SLIDE 9

Chip Oligo Array Hybridization

General Scanning ScanArray 3000

What about protein expression?

SEQUEST Database Search

Mass Spectrometer → Tandem Mass Spectrum

Protein Database, Nucleic Acid Database, EST Database → Theoretical Mass Spectrum

Tandem Mass Spectrum + Theoretical Mass Spectrum → Correlation Analysis → Ranked Score of Matched Peptides
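The idea behind the database search can be sketched in a few lines: predict fragment masses for each candidate peptide, compare them with the observed spectrum, and rank the candidates by score. The code below is a deliberately simplified stand-in, not SEQUEST's actual scoring. It counts shared 1-Da bins between observed peaks and predicted singly charged b and y ions, ignores intensities and modifications, and uses hypothetical peptides.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_mz(peptide: str) -> list[float]:
    """Singly charged b- and y-ion m/z values for an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)          # b ion: N-terminal piece
        ions.append(sum(masses[i:]) + WATER + PROTON)  # y ion: C-terminal piece
    return ions


def binned(mz_values, width: float = 1.0) -> set[int]:
    """Reduce a peak list to the set of occupied m/z bins."""
    return {int(mz / width) for mz in mz_values}


def rank_candidates(observed_mz, candidates, width: float = 1.0):
    """Score each candidate by the fraction of its predicted bins observed."""
    observed_bins = binned(observed_mz, width)
    scored = []
    for peptide in candidates:
        theoretical_bins = binned(fragment_mz(peptide), width)
        if not theoretical_bins:
            continue  # peptides shorter than two residues yield no fragments
        shared = len(observed_bins & theoretical_bins)
        scored.append((shared / len(theoretical_bins), peptide))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Pretend the spectrum came from SAMPLER, then search three candidates.
    spectrum = fragment_mz("SAMPLER")
    for score, peptide in rank_candidates(spectrum, ["PEPTIDEK", "SAMPLER", "ELVISK"]):
        print(f"{peptide}\t{score:.2f}")
```

The ranking always has a top entry, even when the true peptide is missing from the database, so the score of the best match needs to be judged as well as its rank.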

SLIDE 10

ENNPCKLQYDYNTNVTHGFGQEYPCETDIVERFSDTEGAQCDKKKIKDNSEGACAPYRRL HVCVRNLENINDYSKINNKHNLLVEVCLAAKYEGESITGRYPQHQETNPDTKSQLCTVLA RSFADIGDIIRGKDLYRGGNTKEKKKRKKLEENLKTIFGHIYDELKNGKTNGEEELQKRY RGDKDNDFYQLREDWWDANRETVWKAITCNAGSYQYSQPTCGRGEIPYVTLSKCQCIAGE VPTYFDYVPQYLRWFEEWAEDFCRKKKKKIPNVKTNCRQVQRGKEKYCDRDGYNCDGTIR KQYIYRLDTDCTKCSLACKTFAEWIDNQKEQFDKQKQKYQNEISGGGGRRQKRSTHSTKE YEGYEKHFNEELRNEGKDVRSFLQLLSKEKICKERIQVGEETANYGNFENESNTFSHTEY CDRCPLCGVDCSSDNCRKKPDKSCDEQITDKEYPPENTTKIPKLTAEKRKTGILKKYEKF CKNSDGNNGGQIKKWECHYEKNDKDDGNGDINNCIQGDWKTSKNVYYPISYYSFFYGSII DMLNESIEWRERLKSCINDAKLGKCRKGCKNPCECYKRWVEKKKDEWDKIKEFFRKQKDL LKDIAGMDAGELLEFYLENIFLEDMKNANGDPKVIEKFKEILGKENEEVQDPLKTKKTID DFLEKELNEAKNCVEKNPDNECPKQKAPGDGAAPSDPPREDITHHDGEHSSDEDEEEEEE EEQQPPAEGTEQGEEKSESKEVVEQQETPQKDTEKTVPTTTPTVDVCDTVKTALADTGSL NAACSLKYVTGKNYGWRCIAPSGTTSGKDGAICVPPRTQELCLYYLKELSDTTQKGLREA FIKTAAQETYLLWQKYKEDKQNETASTELDIDDPQTQLNGGEIPEDFKRQMFYTFGDYRD LFLGRYIGNDLDKVNNNITAVFQNGDHIPNGQKTDRQRQEFWGTYGKDIWKGMLCALQEA GGKKTLTETYNYSNVTFNGHLTGTKLNEFASRPSFLRWMTEWGDQFCRERITQLQILKER CMVYQYNGDKGKDDKKEKCTEACTYYKEWLTNWQDNYKKQNQRYTEVKGTSPYKEDSDVK ESKYAHGYLRKILKNIICTSGTDIAYCNCMEGTSTTDSSNNDNIPESLKYPPIEIEEGCT CKDPSPGEVIPEKKVPEPKVLPKPPKLPKRQPKERDFPTPALKNAMLSSTIMWSIGIGFA TFTYFYLKKKTKSTIDLLRVINIPKSDYDIPTKLSPNRYIPYTSGKYRGKRYIYLEGDSG TDSGYTDHYSDITSSSESEYEELDINDIYAPRAPKYKTLIEVVLEPSGNNTTASGNNTPS DTQNDIQNDGIPSSKITDNEWNTLKDEFISQYLQSEQPNDVPNDYSSGDIPLNTQPNTLY FDNPDEKPFITSIHDRDLYSGEEYSYNVNMVNTNNDIPISGKNGTYSGIDLINDSLNSNN

Peptide database

Interpret your results

Are they what you expected? Might your data have violated some assumptions?

SLIDE 11

You can’t break it

  • What is the worst that can happen?
      – Crash the program?
      – Restart the computer?
      – Waste three hours of compute time?
      – Learn something?
  • Bioinformatics is a technique; it must be learned, and learning involves exploration and mistakes. The important thing is to learn from your mistakes.

You can misinterpret it!

  • Biological data are not random
  • Molecules or parts of molecules do not evolve/behave independently
  • Data and computer programs were generated by humans
  • The computer ALWAYS has an answer; our job is to be sure it is the best possible answer given what we know.
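The last point can be made concrete with a toy search. The sketch below uses hypothetical sequences and a naive position-by-position identity score; none of it comes from a real tool. Its best-hit function returns a top hit for any query, and the top hit is then compared with the scores obtained from shuffled copies of the query. Shuffling treats positions as independent, which the bullets above warn is not true of real molecules, so this null is only a crude baseline.

```python
import random


def identity(a: str, b: str) -> float:
    """Fraction of matching positions over the shorter of the two sequences."""
    n = min(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / n if n else 0.0


def best_hit(query: str, database: dict[str, str]) -> tuple[str, float]:
    """Always returns the top-scoring entry, however poor the score is."""
    name = max(database, key=lambda k: identity(query, database[k]))
    return name, identity(query, database[name])


def null_scores(query: str, database: dict[str, str],
                trials: int = 200, seed: int = 0) -> list[float]:
    """Best-hit scores for shuffled copies of the query (same composition)."""
    rng = random.Random(seed)
    letters = list(query)
    scores = []
    for _ in range(trials):
        rng.shuffle(letters)
        scores.append(best_hit("".join(letters), database)[1])
    return scores


if __name__ == "__main__":
    database = {"a": "ACGTACGT", "b": "TTTTTTTT"}  # hypothetical entries
    name, score = best_hit("GGGGGGGG", database)
    null = null_scores("GGGGGGGG", database)
    as_good = sum(s >= score for s in null) / len(null)
    print(f"best hit: {name} (identity {score:.2f}); "
          f"{as_good:.0%} of shuffled queries score at least as well")
```

A hit that shuffled queries match just as often is an answer, but not evidence of anything.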