
18.417 Introduction to Computational Molecular Biology: Foundations of Structural Bioinformatics



  1. 18.417 Introduction to Computational Molecular Biology: Foundations of Structural Bioinformatics
     Sebastian Will, MIT, Math Department, Fall 2011
     S.Will, 18.417, Fall 2011
     Credits: Slides borrow from slides of Jérôme Waldispühl and Dominic Rose/Rolf Backofen

  2. Before we start
     Instructor: Sebastian Will
     Contact: wills@mit.edu
     Office hours: by appointment, Office: 2-155
     Lecture: Tuesday, Thursday, 9:30-11:00 am, Room: 8-205
     Web: http://math.mit.edu/classes/18.417/ (slides, further information)
     Credits/Evaluation: no assignments, no exam, but Final Project
     Final Project:
     • study a paper in depth, implement/extend an algorithm, or give a theoretical proof
     • project report (2-4 pages), talk (20 min)
     • find a topic during the term

  3. What is Computational Molecular Biology (a.k.a. Bioinformatics)?
     Short answer: the study of computational approaches to the study of biological systems (at the molecular level)
     Today: a somewhat longer answer, including
     • What are the components of biological systems?
     • How do they work together?
     • What is their chemistry and structure?
     • Which aspects do we want to study in Computational Biology?
     • What is Structural Bioinformatics?
     • What can you learn in this course?

  4. Components of Biological Systems
     • Three classes of biological macromolecules:
       • DNA (= deoxyribonucleic acid)
       • RNA (= ribonucleic acid)
       • Protein
     • Single molecules are linear chains of building blocks, specified by the sequence of their building blocks, e.g. ACTGGAGCGTC.
     • Molecules form 3D structures. Folding is a physical process (minimize energy).
     • "Levinthal Paradox": fast folding but huge conformation space
     • Structure allows macromolecules to interact. Structure = Function, e.g. 'lock & key'
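The Levinthal paradox named on this slide can be made concrete with back-of-the-envelope arithmetic. The sketch below is only an illustration: the three constants (3 conformations per residue, a chain of 100 residues, 10^13 conformations sampled per second) are assumptions chosen for the example and do not come from the slides.

```python
# Back-of-the-envelope Levinthal estimate.
# All three constants are illustrative assumptions, not values from the slides.
CONFORMATIONS_PER_RESIDUE = 3
CHAIN_LENGTH = 100
SAMPLES_PER_SECOND = 1e13
SECONDS_PER_YEAR = 3600 * 24 * 365


def conformation_count(per_residue: int, length: int) -> int:
    """Size of the conformation space if every residue varies independently."""
    return per_residue ** length


def exhaustive_search_years(per_residue: int, length: int, rate: float) -> float:
    """Years needed to visit every conformation once at the given sampling rate."""
    return conformation_count(per_residue, length) / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    n = conformation_count(CONFORMATIONS_PER_RESIDUE, CHAIN_LENGTH)
    years = exhaustive_search_years(
        CONFORMATIONS_PER_RESIDUE, CHAIN_LENGTH, SAMPLES_PER_SECOND
    )
    print(f"conformations: {n:.2e}")
    print(f"years for exhaustive search: {years:.2e}")
```

Under these assumed numbers an exhaustive search would take vastly longer than the age of the universe, which is why fast folding cannot be a random walk through the whole conformation space.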

  5. Information Flow: Central Dogma
     [Diagram: DNA (replication) → transcription → RNA → translation → Protein]
     DNA: stores genetic information (e.g. in the genome); regular double helix structure
       building blocks: 4 nucleotides A, C, G, and T (Adenine, Cytosine, Guanine, Thymine)
     RNA: intermediate for protein synthesis (messenger RNA), catalytic and regulatory function (non-coding RNA)
       building blocks: 4 nucleotides A, C, G, and U (U = Uracil) and some rare other nucleotides
     Protein: catalytic and regulatory function ('enzymes')
       building blocks: 20 amino acids + 1 rare aa

  6. Genetic code
     • Transcription: A,C,G,T ↦ A,C,G,U
     • Translation: Triplets from the alphabet {A,C,G,U} (= codons) redundantly code for amino acids
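The two mappings on this slide can be sketched in a few lines of Python. This is a minimal illustration, not course material: the function names `transcribe` and `translate` are made up for the example, the table is the standard genetic code in one-letter amino acid notation (`*` marks a stop codon), and transcription is simplified to the letter substitution T ↦ U on the coding strand, as on the slide.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Map a DNA coding-strand sequence to RNA letters (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate codon by codon from position 0 until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")
    print(mrna, translate(mrna))
```

The redundancy mentioned on the slide is visible in the table itself: 64 codons map to 20 amino acids plus stop, so for example six different codons code for leucine (L).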

  7. Information Flow (Cell Compartments)

  8. Protein Bio-Synthesis
     Important for molecular mechanism: complementarity of nucleotides G-C, A-T, A-U
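Complementarity is easy to express as a character mapping. The following sketch computes the reverse complement of a strand using the pairs G-C, A-T (DNA) and A-U (RNA) from this slide; the function name `reverse_complement` and its `rna` flag are hypothetical names chosen for this example.

```python
# Complement tables built from the pairs on the slide: G-C, A-T (DNA), A-U (RNA).
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(seq: str, rna: bool = False) -> str:
    """Return the complementary strand, read in its own 5' to 3' direction."""
    table = RNA_COMPLEMENT if rna else DNA_COMPLEMENT
    # Complement every base, then reverse because the strands are antiparallel.
    return seq.upper().translate(table)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ACTGGAGCGTC"))
```

Applying the function twice returns the original sequence, which is a quick sanity check for any complement table.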

  9. Evolution
     [Figure: phylogenetic tree of life (bacterial, archaeal, and eukaryotic groups), annotated with example sequence variants such as ACCGA, ACCTA, ACCCGA, TCCTA, ACTA]
     • variation (imperfect replication: point mutation, deletion, insertion, ...)
     • selection
     • homologous sequences

  10. What can we study (computationally)?

  11. What can we study (computationally)?
      • Evolutionary relation between homologous molecules/fragments of molecules
      • Structural relation between molecules
      • Relation between sequence and structure
      • Interaction between molecules
      • Interaction networks, Regulatory networks, Metabolic networks
      • Structure of genomes, Relation between genomes
      • ...

  12. Areas of Bioinformatics
      1. Genomics: Study of entire genomes. Huge amount of data, fast algorithms, limited to sequence.
      2. Systems Biology: Study of complex interactions in biological systems. High level of representation.
      3. Structural Bioinformatics: Study of the folding process of bio-molecules. Less structural data than sequence data available, step toward function, fills gap between genomics and systems biology.

  13. Some Organic Chemistry
      Biological macromolecules (and most organic compounds) are built from only a few different types of atoms:
      • C: Carbon
      • H: Hydrogen
      • O: Oxygen
      • N: Nitrogen
      • P: Phosphorus
      • S: Sulfur
      CHNO: 99% of cell mass
      Organic Chemistry = Chemistry of Carbon
      Special properties of Carbon:
      • binds up to 4 other atoms, e.g. Methane (tetrahedral conformation)
      • small size
      • strong covalent bonds
      • chains and rings ⇒ large, stable, complex molecules
      [Figure: covalent bond. Two H atoms (nuclear charge +1, one electron each) share their two electrons to form H–H]

  14. Non-covalent bonds
      • Covalent: [Figure: two H atoms (nuclear charge +1, one electron each) share two electrons in H–H]
      • Non-covalent
        • Van der Waals (sum of the attractive or repulsive forces between molecules, caused by correlations in the fluctuating polarizations of nearby particles)
        • hydrogen bonds (attractive interaction of a hydrogen atom with an electronegative atom)
        • ionic bonds (electrostatic attraction between two oppositely charged ions, e.g. Na+ Cl−)
      [Figure: energy scale in kcal/mol, logarithmic axis from 0.1 to 1000, marking thermal movement, non-covalent bond, C–C bond, and complete glucose oxidation]

  15. Functional groups
      organic molecules: carbon skeleton + functional groups
      functional groups are involved in specific chemical reactions

      Compound class      Functional group    Structure
      Alcohol             hydroxyl group      C–O–H
      Ketone/Aldehyde     carbonyl group      C=O
      Carboxylic Acid     carboxyl group      C(=O)–O–H
      Amine               amino group         C–NH2

  16. Small organic molecules
      Small: ≤ 30 atoms
      4 families:
      • sugars ⇒ component of building blocks, main energy source
      • fats / fatty acids ⇒ cell membrane, energy source
      • amino acids ⇒ proteins
      • nucleotides ⇒ DNA + RNA, energy currency

  17. Sugars
      ⇒ component of building blocks, main energy source
      • general formula (CH2O)n, different lengths (e.g. n=5, n=6)
      • linear, cyclic
      For example, saccharose (glucose + fructose):
      [Figure: structural formula of saccharose]

  18. Fats
      Fat = Triglyceride of fatty acids
      ⇒ cell membrane (lipid bilayer), energy source

  19. Amino Acids
      • all aa have the same basic build
      • aa differ in their side chains R:
        • size
        • charge: positive (basic) / negative (acidic)
        • hydrophobicity: hydrophobic/hydrophilic
      • in naturally occurring proteins: 21 different amino acids

  20. Amino Acids

  21. Nucleotides
      • Bases: Purines (Adenine, Guanine), Pyrimidines (Cytosine, Uracil, Thymine)
      • nucleoside = base + pentose, linked by a glycosidic bond
      • pentose: R = OH is ribose, R = H is deoxyribose
      • nucleotide = nucleoside + phosphate(s): nucleotide monophosphate, diphosphate, triphosphate
      Nucleotides work as the energy currency of metabolism:
      NTP → P + NDP + E (splitting a nucleoside triphosphate into phosphate + nucleoside diphosphate releases energy)

  22. Complementarity of Organic Bases
      [Figure: hydrogen-bonded base pairs Guanine-Cytosine and Adenine-Thymine]

  23. DNA structure
      Primary structure: chain of nucleotides
      Tertiary structure: antiparallel double helix
      [Figure: chemical structure of a DNA double strand with phosphate-deoxyribose backbone, base pairs Thymine-Adenine and Cytosine-Guanine, and the 5' and 3' ends of the two antiparallel strands labeled]
      RNA primary structure is similar, but
      • ribose, not deoxyribose
      • U, not T
      • single stranded

  24. RNA structure
      [Figures: tRNA, Hammerhead Ribozyme]
      mainly stabilized by contacts between complementary bases (H-bonds)
      ⇒ RNA secondary structure = set of base pairs

  25. RNA secondary structure
      • set of pairs of (complementary) bases that form H-bonds
      • 2D representation (typical tRNA clover-leaf)
        [Figure: clover-leaf drawing of the tRNA sequence below]
      • linear representation (dot-bracket notation: matching parentheses mark paired bases, dots mark unpaired bases)
        GGGCGUGUGGCGUAGUCGGUAGCGCGCUCCCUUAGCAUGGAGAGGUCUCCGGUUCGAUUCCGGACACGCCCACCA
        (((((((..((((........)))).(((((.......)).)))...(((((.......))))))))))))....
      • note: example is pseudoknot-free
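Because a pseudoknot-free structure is a properly nested set of base pairs, the linear representation can be decoded with a single stack. The sketch below shows one way to recover the set of base pairs from such a string; `parse_dot_bracket` is a hypothetical helper name, and positions are 0-based.

```python
def parse_dot_bracket(structure: str) -> list[tuple[int, int]]:
    """Return the base pairs (i, j), i < j, encoded by a dot-bracket string.

    '(' opens a pair, ')' closes the most recently opened one, '.' is unpaired.
    Only pseudoknot-free (properly nested) structures can be written this way.
    """
    open_positions = []  # stack of indices of '(' still waiting for a partner
    pairs = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(j)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((open_positions.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


if __name__ == "__main__":
    # Small hairpin: two stacked base pairs enclosing a loop of three bases.
    print(parse_dot_bracket("((...))"))
```

The same function can be applied to the tRNA structure string from the slide; pairing each returned index pair with the sequence then shows which bases are bonded.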

  26. Protein Primary Structure
      • Protein = chain of amino acids (aa)
      • aa connected by peptide bonds
      and so on ...

  27. Protein Structure Formation / Folding
      • minimization of free energy
      • Forces between amino acid side chains
        • hydrophobic interaction
        • H-bonds
        • electrostatic force
        • van der Waals force
        • disulfide bonds

  28. Protein secondary structure: α-helix
      Features:
      • 3.6 amino acids per turn
      • hydrogen bond between residues n and n + 4
      • local motif
      • approximately 40% of the structure
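Two of the α-helix features are simple arithmetic and can be sketched directly: 3.6 residues per turn implies a rotation of 360/3.6 = 100 degrees per residue, and the n to n + 4 rule fixes which residues are hydrogen-bonded. The helper names below are hypothetical, and the sketch assumes an ideal helix in which every residue pair (n, n + 4) inside the helix forms a bond.

```python
RESIDUES_PER_TURN = 3.6  # from the slide


def rotation_per_residue(residues_per_turn: float = RESIDUES_PER_TURN) -> float:
    """Angle in degrees by which each residue is rotated around the helix axis."""
    return 360.0 / residues_per_turn


def helix_hydrogen_bonds(first: int, last: int) -> list[tuple[int, int]]:
    """Pairs (n, n + 4) for an ideal helix spanning residues first..last inclusive."""
    return [(n, n + 4) for n in range(first, last - 3)]


if __name__ == "__main__":
    print(f"rotation per residue: {rotation_per_residue():.1f} degrees")
    print(helix_hydrogen_bonds(1, 10))
```

The bond partners are only four positions apart in the sequence, which is what makes the helix a local motif.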

  29. Protein secondary structure: β-sheets
      Features:
      • 2 amino acids per turn
      • hydrogen bond between residues of different strands
      • involve long-range interactions
      • approximately 20% of the structure
