18.417 Introduction to Computational Molecular Biology Foundations

18.417 Introduction to Computational Molecular Biology: Foundations of Structural Bioinformatics
Sebastian Will, MIT, Math Department, Fall 2011
S.Will, 18.417, Fall 2011
Credits: Slides borrow from slides of Jérôme Waldispühl and Dominic Rose/Rolf Backofen

Before we start
Instructor: Sebastian Will
Contact: wills@mit.edu
Office hours: by appointment, Office: 2-155
Lecture: Tuesday, Thursday, 9:30-11:00 am, Room: 8-205
Web: http://math.mit.edu/classes/18.417/ (slides, further information)
Credits/Evaluation: no assignments, no exam, but Final Project
Final Project:
• study a paper in depth, implement/extend an algorithm, or give a theoretical proof
• project report (2-4 pages), talk (20 min)
• find a topic during the term

What is Computational Molecular Biology (a.k.a. Bioinformatics)?
Short answer: the study of computational approaches to studying biological systems (at the molecular level)
Today: somewhat longer answer, including
• What are the components of biological systems?
• How do they work together?
• What is their chemistry and structure?
• Which aspects do we want to study in Computational Biology?
• What is Structural Bioinformatics?
• What can you learn in this course?

Components of Biological Systems
• Three classes of biological macromolecules:
  • DNA (= deoxyribonucleic acid)
  • RNA (= ribonucleic acid)
  • Protein
• Single molecules are linear chains of building blocks, specified by the sequence of their building blocks, e.g. ACTGGAGCGTC.
• Molecules form 3D structures. Folding is a physical process (minimize energy)
• "Levinthal Paradox": fast folding but huge conformation space
• Structure allows macromolecules to interact. Structure=Function, e.g. 'lock&key'
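The Levinthal paradox can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the chain length (100 residues), the number of conformations per residue (3), and the sampling rate (10^13 conformations per second) are assumptions chosen for the example, not values from the slides.

```python
def levinthal_years(residues, conformations_per_residue, samples_per_second):
    """Years needed to visit every conformation once by exhaustive search.

    All three parameters are assumptions of this toy model.
    """
    total_conformations = conformations_per_residue ** residues
    seconds = total_conformations / samples_per_second
    return seconds / (365 * 24 * 3600)


if __name__ == "__main__":
    # Hypothetical numbers: 100 residues, 3 states each, 1e13 samples per second.
    years = levinthal_years(100, 3, 1e13)
    print(f"exhaustive search would take about {years:.1e} years")
```

Under these assumptions the search time exceeds the age of the universe by many orders of magnitude, although real proteins fold quickly. That gap is the paradox.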

Information Flow: Central Dogma
Replication: DNA → DNA; Transcription: DNA → RNA; Translation: RNA → Protein
DNA: stores genetic information (e.g. in the genome); regular double helix structure
  building blocks: 4 nucleotides A, C, G, and T (Adenine, Cytosine, Guanine, Thymine)
RNA: intermediate for protein synthesis (messenger RNA), catalytic and regulatory function (non-coding RNA)
  building blocks: 4 nucleotides A, C, G, and U (U = Uracil) and some rare other nucleotides
Protein: catalytic and regulatory function ('enzymes')
  building blocks: 20 amino acids + 1 rare aa

Genetic code
• Transcription: A,C,G,T → A,C,G,U
• Translation: Triplets from alphabet {A,C,G,U} (= codons) redundantly code for amino acids
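The two mappings above can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: it assumes the standard genetic code, treats the input as the coding strand (so transcription reduces to replacing T by U), reads from the first base, and stops at the first stop codon. The function names are made up for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Map the alphabet A,C,G,T to A,C,G,U (coding-strand simplification)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate codon by codon until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")
    print(mrna, translate(mrna))
```

The redundancy of the code shows up directly in the table: 64 codons map to only 20 amino acids plus the stop signal, so for example CUU and UUA both encode leucine (L).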

Information Flow (Cell Compartments)

Protein Bio-Synthesis
Important for molecular mechanism: complementarity of nucleotides G-C, A-T, A-U
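Complementarity is easy to express in code. The following sketch computes the reverse complement of a DNA strand using the G-C and A-T pairs; the function name and the choice to return the strand in reverse order (so that it reads 5' to 3') are conventions assumed for this example.

```python
# Watson-Crick partners for DNA: A pairs with T, C pairs with G.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the complementary strand, read in the opposite direction."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # ACTGGAGCGTC is the example sequence used earlier in the slides.
    print(reverse_complement("ACTGGAGCGTC"))
```

Applying the function twice returns the original sequence, which is a quick sanity check for any complement table.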

Evolution
[Figure: phylogenetic tree with labels Gram-positives, Chlamydiae, Green nonsulfur bacteria, Actinobacteria, Planctomycetes, Spirochaetes, Fusobacteria, Cyanobacteria (blue-green algae), Thermophilic sulfate-reducers, Acidobacteria, Proteobacteria, Crenarchaeota, Nanoarchaeota, Euryarchaeota, Fungi, Animals, Slime moulds, Plants, Algae, Protozoa; example sequences ACCGA, ACCTA, ACCCGA, TCCTA, ACTA]
• variation (imperfect replication: point mutation, deletion, insertion, ...)
• selection
• homologous sequences
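One common way to quantify how far two homologous sequences have diverged is the edit distance, which counts the minimum number of point mutations, insertions, and deletions that turn one sequence into the other. The slides do not name a specific measure, so the unit-cost Levenshtein distance below is an assumption made for illustration.

```python
def edit_distance(a, b):
    """Unit-cost Levenshtein distance via row-by-row dynamic programming."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion from a
                current[j - 1] + 1,                    # insertion into a
                previous[j - 1] + (char_a != char_b),  # match or point mutation
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Sequences taken from the evolution figure.
    for other in ["ACCTA", "ACCCGA", "ACTA"]:
        print("ACCGA", other, edit_distance("ACCGA", other))
```

Each neighbouring pair of example sequences in the figure differs by a single event, for instance ACCGA and ACCTA by one point mutation, and ACCTA and ACTA by one deletion.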

What can we study (computationally)?
• Evolutionary relation between homologous molecules/fragments of molecules
• Structural relation between molecules
• Relation between sequence and structure
• Interaction between molecules
• Interaction networks, Regulatory networks, Metabolic networks
• Structure of genomes, Relation between genomes
• ...

Areas of Bioinformatics
1. Genomics: Study of entire genomes. Huge amount of data, fast algorithms, limited to sequence.
2. Systems Biology: Study of complex interactions in biological systems. High level of representation.
3. Structural Bioinformatics: Study of the folding process of bio-molecules. Less structural data than sequence data available, step toward function, fills gap between genomics and systems biology.

Some Organic Chemistry
Biological macromolecules (and most organic compounds) are built from only a few different types of atoms
• C: Carbon
• H: Hydrogen
• O: Oxygen
• N: Nitrogen
• P: Phosphorus
• S: Sulfur
CHNO: 99% of cell mass
Organic Chemistry = Chemistry of Carbon
Special properties of Carbon
• binds up to 4 other atoms, e.g. Methane (tetrahedron conformation)
• small size
• strong covalent bonds
  [Figure: covalent bond, two H atoms (nucleus +1, 1 electron each) share 2 electrons: H + H → H–H]
• chains and rings ⇒ large, stable, complex molecules

Non-covalent bonds
• Covalent
  [Figure: two H atoms (nucleus +1, 1 electron each) share 2 electrons: H + H → H–H]
• Non-covalent
  • Van der Waals (sum of the attractive or repulsive forces between molecules, caused by correlations in the fluctuating polarizations of nearby particles)
  • hydrogen bonds (attractive interaction of a hydrogen atom with an electronegative atom)
  • ionic bonds (electrostatic attraction between two oppositely charged ions, e.g. Na+ Cl−)
[Figure: bond energy scale in kcal/mol, logarithmic from 0.1 to 1000, marking thermal movement, non-covalent bond, C−C bond, and complete glucose oxidation]

Functional groups
organic molecules: carbon skeleton + functional groups
functional groups are involved in specific chemical reactions
• Alcohol: hydroxyl group (C–O–H)
• Ketone/Aldehyde: carbonyl group (C=O)
• Carboxylic Acid: carboxyl group (C(=O)–O–H)
• Amine: amino group (C–NH2)

Small organic molecules
Small: ≤ 30 atoms
4 families:
• sugars ⇒ component of building blocks, main energy source
• fats / fatty acids ⇒ cell membrane, energy source
• amino acids ⇒ proteins
• nucleotides ⇒ DNA + RNA, energy currency

Sugars
⇒ component of building blocks, main energy source
• general formula (CH2O)n, different lengths (e.g. n=5, n=6)
• linear, cyclic
For example, saccharose (glucose+fructose):
[Figure: structural formula of saccharose]

Fats
Fat = Triglyceride of fatty acids
⇒ cell membrane (lipid bilayer), energy source

Amino Acids
• all amino acids share the same basic structure
• amino acids differ in their side chains R
  • size
  • charge: positive/negative (basic/acidic)
  • hydrophobicity: hydrophobic/hydrophilic
• in naturally occurring proteins: 21 different amino acids

Nucleotides
• Purines: Adenine, Guanine
• Pyrimidines: Cytosine, Uracil, Thymine
• pentose: R = OH: ribose; R = H: deoxyribose
• base + pentose (joined by a glycosidic bond) = nucleoside
• nucleoside + 1, 2, or 3 phosphates = nucleotide monophosphate, diphosphate, triphosphate
Nucleotides work as energy currency of metabolism
NTP → P + NDP + E (split of nucleoside triphosphate into phosphate + nucleoside diphosphate releases energy)

Complementarity of Organic Bases
[Figure: hydrogen bonds between the complementary bases Guanine-Cytosine and Adenine-Thymine]

DNA structure
Primary structure: chain of nucleotides
Tertiary Structure: antiparallel double helix
[Figure: two antiparallel strands with phosphate-deoxyribose backbone, 5' and 3' ends marked, and the base pairs Adenine-Thymine and Guanine-Cytosine]
RNA primary structure similar, but
• ribose not deoxyribose,
• U not T,
• single stranded

RNA structure
[Figures: tRNA, Hammerhead Ribozyme]
mainly stabilized by contacts between complementary bases (H-bonds)
⇒ RNA secondary structure = set of base pairs

RNA secondary structure
• set of pairs of (complementary) bases that form H-bonds
• 2D representation (typical tRNA clover-leaf)
  [Figure: clover-leaf drawing of the tRNA sequence given below]
• linear representation
  GGGCGUGUGGCGUAGUCGGUAGCGCGCUCCCUUAGCAUGGAGAGGUCUCCGGUUCGAUUCCGGACACGCCCACCA
  (((((((..((((........)))).(((((.......)).)))...(((((.......))))))))))))....
• note: example is pseudoknot-free
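The linear (dot-bracket) representation can be converted back into the set of base pairs with a simple stack: each "(" opens a pair, each ")" closes the most recently opened one, and "." marks an unpaired base. The sketch below is a minimal illustration with a hypothetical function name and a short made-up hairpin as input. It relies on the structure being pseudoknot-free, since crossing pairs cannot be written with a single bracket type.

```python
def parse_dot_bracket(structure):
    """Return the sorted list of base pairs (i, j), 0-based, with i < j."""
    open_positions = []  # stack of indices of unmatched '('
    pairs = []
    for position, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(position)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {position}")
            pairs.append((open_positions.pop(), position))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {position}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


if __name__ == "__main__":
    # Toy hairpin (not from the slides): three G-C pairs closing a 4-base loop.
    sequence = "GGGAAAUCCC"
    structure = "(((....)))"
    for i, j in parse_dot_bracket(structure):
        print(i, j, sequence[i], sequence[j])
```

The same function should apply unchanged to the tRNA string above, where each of the four stems of the clover-leaf appears as a run of nested brackets.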

Protein Primary Structure
• Protein = chain of amino acids (AA)
• amino acids connected by peptide bonds
[Figure: amino acids joined by peptide bonds, and so on ...]

Protein Structure Formation / Folding
• minimization of free energy
• Forces between amino acid side chains
  • hydrophobic interaction
  • H-bonds
  • electrostatic force
  • van der Waals force
  • disulfide bonds

Protein secondary structure: α-helix
Features:
• 3.6 amino acids per turn
• hydrogen bond between residues n and n + 4
• local motif
• approximately 40% of the structure
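The two numerical features of the α-helix translate directly into a small calculation. The sketch below assumes an idealized helix spanning the whole chain and 0-based residue indices; the function names are hypothetical.

```python
RESIDUES_PER_TURN = 3.6  # value from the slide


def rotation_per_residue():
    """Angle in degrees by which each residue advances around the helix axis."""
    return 360.0 / RESIDUES_PER_TURN


def helix_hydrogen_bonds(length):
    """Backbone H-bond partners (n, n + 4) in an ideal helix of the given length."""
    return [(n, n + 4) for n in range(length - 4)]


if __name__ == "__main__":
    print(rotation_per_residue())
    print(helix_hydrogen_bonds(10))
```

Because every bond connects residues only four positions apart, the helix is a local motif, in contrast to the long-range contacts of β-sheets.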

Protein secondary structure: β-sheets
Features:
• 2 amino acids per turn
• hydrogen bond between residues of different strands
• involve long-range interactions
• approximately 20% of the structure
