Course contents (18.9.) Biological background (book chapter 1) - PowerPoint PPT Presentation


  1. Course contents (18.9.)
     • Biological background (book chapter 1)
     • Probability calculus (chapters 2 and 3)
     • Sequence alignment (chapter 6)
       – This week (18.9. and 21.9.)
     • Rapid alignment methods: FASTA and BLAST (chapter 7)
       – Next week (25.9. and 28.9.)
     • Phylogenetic trees (chapter 12)
     • Expression data analysis (chapter 11)
     Introduction to bioinformatics, Autumn 2007

  2. Sequence Alignment (chapter 6)
     • The biological problem
     • Global alignment
     • Local alignment
     • Multiple alignment

  3. Background: comparative genomics
     • Basic question in biology: what properties are shared among organisms?
     • Genome sequencing allows comparison of organisms at DNA and protein levels
     • Comparisons can be used to
       – Find evolutionary relationships between organisms
       – Identify functionally conserved sequences
       – Identify corresponding genes in human and model organisms: develop models for human diseases

  4. Homologs
     • Two genes g B and g C evolved from the same ancestor gene g A are called homologs
     • Homologs usually exhibit conserved functions
     • Close evolutionary relationship => expect a high number of homologs
     g A = agtgtccgttaagtgcgttc
     g B = agtgccgttaaagttgtacgtc
     g C = ctgactgtttgtggttc

  5. Sequence similarity
     • Intuitively, similarity of two sequences refers to the degree of match between corresponding positions in sequence
       agtgccgttaaagttgtacgtc
       ctgactgtttgtggttc
     • What about sequences that differ in length?
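The intuitive notion of similarity on this slide can be sketched as a position-by-position match count. This is only an illustration, not the course's definition: the function name `positional_identity` is hypothetical, and comparing only over the shorter sequence's length is an assumption made here to sidestep the length question the slide raises (alignment is the proper answer).

```python
def positional_identity(a: str, b: str) -> float:
    """Fraction of matching positions, compared over the shorter length.

    Assumption for this sketch: positions are compared without gaps,
    so any extra tail of the longer sequence is ignored.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    # zip stops at the shorter sequence, so exactly n positions are compared
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / n


if __name__ == "__main__":
    # The two example sequences from the slide (g B and g C)
    g_b = "agtgccgttaaagttgtacgtc"
    g_c = "ctgactgtttgtggttc"
    print(f"ungapped identity: {positional_identity(g_b, g_c):.2f}")
```

Because no gaps are allowed, a single insertion or deletion shifts every later position and can hide real similarity, which is why sequences of different length call for alignment.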

  6. Similarity vs homology
     • Sequence similarity is not sequence homology
       – If the two sequences g B and g C have accumulated enough mutations, the similarity between them is likely to be low

     #mutations  sequence
     0           agtgtccgttaagtgcgttc
     1           agtgtccgttatagtgcgttc
     2           agtgtccgcttatagtgcgttc
     4           agtgtccgcttaagggcgttc
     8           agtgtccgcttcaaggggcgt
     16          gggccgttcatgggggt
     32          gcagggcgtcactgagggct
     64          acagtccgttcgggctattg
     128         cagagcactaccgc
     256         cacgagtaagatatagct
     512         taatcgtgata
     1024        acccttatctacttcctggagtt
     2048        agcgacctgcccaa
     4096        caaac

     Homology is more difficult to detect over greater evolutionary distances.
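The loss of similarity with accumulated mutations can be reproduced with a small simulation. This is a simplified sketch, not the procedure used to generate the slide's table: it applies random substitutions only (the slide's sequences also change length, so they evidently include insertions and deletions), and the names `mutate` and `identity` are hypothetical.

```python
import random

BASES = "acgt"


def mutate(seq: str, n_mutations: int, rng: random.Random) -> str:
    """Apply n_mutations random substitutions (assumption: no indels)."""
    s = list(seq)
    if not s:
        return seq
    for _ in range(n_mutations):
        i = rng.randrange(len(s))
        # Always substitute a different base, so every mutation changes the site
        s[i] = rng.choice([b for b in BASES if b != s[i]])
    return "".join(s)


def identity(a: str, b: str) -> float:
    """Fraction of identical positions in two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs print the same values
    ancestor = "agtgtccgttaagtgcgttc"  # the 0-mutation sequence from the slide
    for n in [0, 1, 2, 4, 8, 16, 32, 64, 128]:
        descendant = mutate(ancestor, n, rng)
        print(n, descendant, f"{identity(ancestor, descendant):.2f}")
```

Under this substitution-only model, later mutations can hit sites that already changed, so identity does not fall to zero but drifts toward the level unrelated sequences would show.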

  7. Similarity vs homology (2)
     • Sequence similarity can occur by chance
       – Similarity does not imply homology
     • Consider comparing two short sequences against each other
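How easily short sequences match by chance can be estimated with a binomial calculation. The sketch below assumes something the slide does not state: that the two sequences are unrelated, with independent bases each drawn uniformly from a, c, g, t, so any position matches with probability 1/4. The function name `prob_at_least_k_matches` is hypothetical.

```python
from math import comb


def prob_at_least_k_matches(n: int, k: int, p: float = 0.25) -> float:
    """P(at least k of n ungapped positions match) under a binomial model.

    Assumption: positions match independently with probability p
    (p = 1/4 for uniformly random DNA bases).
    """
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


if __name__ == "__main__":
    # Chance of at least 50% identity for increasingly long random sequences
    for n in (4, 8, 16, 32, 64):
        print(n, prob_at_least_k_matches(n, n // 2))
```

Under this model the chance of a high match fraction shrinks as the sequences get longer, which is why similarity between two short sequences is weak evidence of homology.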

  8. Orthologs and paralogs
     • We distinguish between two types of homology
       – Orthologs: homologs from two different species, separated by a speciation event
       – Paralogs: homologs within a species, separated by a gene duplication event
     [Figure: gene g A in Organism A; a gene duplication event gives g A and g A'; g B in Organism B and g C in Organism C; labels "Paralogs" and "Orthologs"]

  9. Orthologs and paralogs (2)
     • Orthologs typically retain the original function
     • In paralogs, one copy is free to mutate and acquire new function (no selective pressure)
     [Figure: Organism A with g A and g A'; g B in Organism B and g C in Organism C]

  10. Paralogy example: hemoglobin
     • Hemoglobin is a protein complex which transports oxygen
     • In humans, hemoglobin consists of four protein subunits and four non-protein heme groups
     • Sickle cell diseases are caused by mutations in hemoglobin genes
     [Figure: Hemoglobin A, www.rcsb.org/pdb/explore.do?structureId=1GZX]
     [Figure: sickle cells, http://en.wikipedia.org/wiki/Image:Sicklecells.jpg]

  11. Paralogy example: hemoglobin
     • In adults, three types are normally present
       – Hemoglobin A: 2 alpha and 2 beta subunits
       – Hemoglobin A2: 2 alpha and 2 delta subunits
       – Hemoglobin F: 2 alpha and 2 gamma subunits
     • Each type of subunit (alpha, beta, gamma, delta) is encoded by a separate gene
     [Figure: Hemoglobin A, www.rcsb.org/pdb/explore.do?structureId=1GZX]

  12. Paralogy example: hemoglobin
     • The subunit genes are paralogs of each other, i.e., they have a common ancestor gene
     • Demonstration in lecture: human hemoglobin paralogs in NCBI sequence databases, http://www.ncbi.nlm.nih.gov/sites/entrez?db=Nucleotide
       – Find human hemoglobin alpha, beta, gamma and delta
       – Compare sequences
     [Figure: Hemoglobin A, www.rcsb.org/pdb/explore.do?structureId=1GZX]

  13. Orthology example: insulin
     • The genes coding for insulin in human (Homo sapiens) and mouse (Mus musculus) are orthologs:
       – They have a common ancestor gene in the ancestor species of human and mouse
       – Demonstration in lecture: find insulin orthologs from human and mouse in NCBI sequence databases
