
RNA Secondary Structure Prediction (CSE 527, Autumn 2007, Lectures 17-18) - PowerPoint PPT Presentation



  1. CSE 527, Autumn 2007, Lectures 17-18: RNA Secondary Structure Prediction
     RNA makes helices too (figure: an RNA hairpin with its base pairs marked)
     Fastest Human Gene?
     Origin of Life? Life needs:
       an information carrier: DNA
       molecular machines, like enzymes: protein
       making proteins needs DNA + RNA + proteins
       making (duplicating) DNA needs proteins
     Horrible circularities! How could it have arisen in an abiotic environment?

  2. Origin of Life?
       RNA can carry information too (RNA double helix)
       RNA can form complex structures
       RNA enzymes exist (ribozymes)
       The "RNA world" hypothesis: 1st life was RNA-based
     Outline
       Biological roles for RNA
       What is "secondary structure"? How is it represented? Why is it important?
       Examples
       Approaches
     RNA Structure
       Primary Structure: Sequence
       Secondary Structure: Pairing
       Tertiary Structure: 3D shape
     RNA Pairing
       Watson-Crick Pairing: C - G ~3 kcal/mole; A - U ~2 kcal/mole
       "Wobble Pair": G - U ~1 kcal/mole
       Non-canonical Pairs (esp. if modified)
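
The pairing rules on the RNA Pairing slide can be treated as a small lookup table. The sketch below is only an illustration: the names `PAIR_STRENGTH` and `pair_strength` are hypothetical, and the numbers are the slide's rough "~3 / ~2 / ~1 kcal/mole" figures, not measured parameters. Non-canonical pairs are deliberately left out and score 0.

```python
# Approximate pairing strengths from the slide, in kcal/mole (rough values only).
# Keys are unordered, so C-G and G-C share one entry.
PAIR_STRENGTH = {
    frozenset("CG"): 3.0,  # Watson-Crick
    frozenset("AU"): 2.0,  # Watson-Crick
    frozenset("GU"): 1.0,  # "wobble" pair
}


def pair_strength(a: str, b: str) -> float:
    """Return the rough strength of pairing bases a and b, or 0.0 if they
    are not one of the three pair types listed on the slide."""
    return PAIR_STRENGTH.get(frozenset((a, b)), 0.0)


def may_pair(a: str, b: str) -> bool:
    """True if a and b form a Watson-Crick or wobble pair."""
    return pair_strength(a, b) > 0.0
```

Keeping the table keyed by unordered pairs avoids listing each pair in both orientations.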

  3. tRNA 3d Structure (figure)
     Ribosomes (figure: Watson, Gilman, Witkowski, & Zoller, 1992)
     tRNA - Alt. Representations (two slides; figures labeled with the 5' and 3' ends and the anticodon loop)

  4. "Classical" RNAs
       tRNA - transfer RNA (~61 kinds, ~75 nt)
       rRNA - ribosomal RNA (~4 kinds, 120-5k nt)
       snRNA - small nuclear RNA (splicing: U1, etc, 60-300 nt)
       RNaseP - tRNA processing (~300 nt)
       RNase MRP - rRNA processing; mito. rep. (~225 nt)
       SRP - signal recognition particle; membrane targeting (~100-300 nt)
       SECIS - selenocysteine insertion element (~65 nt)
       6S - ? (~175 nt)
     Semi-classical RNAs (discovery in mid 90's)
       tmRNA - resetting stalled ribosomes
       Telomerase - (200-400 nt)
       snoRNA - small nucleolar RNA (many varieties; 80-200 nt)
     Recent discoveries
       microRNAs and RNA interference (Nobel prize 2006, Fire & Mello, awarded for RNAi)
       riboswitches
       many ribozymes
       regulatory elements
       ...
       Hundreds of families:
         Rfam release 1, 1/2003: 25 families, 55k instances
         Rfam release 7, 3/2005: 503 families, 300k instances
     Why?
       RNAs fold, and function
       Nature uses what works

  5. Noncoding RNAs
       Dramatic discoveries in last 5 years
       100s of new families
       Many roles: regulation, transport, stability, catalysis, ...
       1% of DNA codes for protein, but 90% of the DNA is copied into RNA, i.e. ncRNA >> mRNA
       Significance unclear, controversial
     Breakthrough of the Year (figure)
     Example: Glycine Regulation
       How is glycine level regulated?
       Plausible answer: transcription factors (proteins) bind to DNA to turn nearby genes on or off
       (figure labels: g, TF, gce protein, DNA, glycine cleavage enzyme gene)
     The Glycine Riboswitch
       Actual answer (in many bacteria):
       (figure labels: g, gce protein, gce mRNA 5' to 3', DNA, glycine cleavage enzyme gene)
       Mandal et al. Science 2004

  6. Gene Regulation: The MET Repressor (figure labels: SAM, DNA, Protein; Alberts, et al, 3e.)
     The protein way vs. riboswitch alternatives (figures: Alberts, et al, 3e.; Corbino et al., Genome Biol. 2005)
     The Hammerhead Ribozyme: involved in "rolling circle replication" of viruses
     6S mimics an open promoter (figure panels: Bacillus/Clostridium, Actinobacteria, E. coli)
       Barrick et al. RNA 2005
       Trotochaud et al. NSMB 2005
       Willkomm et al. NAR 2005

  7. Why is RNA hard to deal with?
       (figure: a raw RNA sequence)
       A: Structure often more important than sequence
     Wanted
       Good structure prediction tools
       Good motif descriptions/models
       Good, fast search tools ("RNA BLAST", etc.)
       Good, fast motif discovery tools ("RNA MEME", etc.)
       Importance of structure makes last 3 hard
     Task 1: Structure Prediction
     RNA Pairing
       Watson-Crick Pairing: C - G ~3 kcal/mole; A - U ~2 kcal/mole
       "Wobble Pair": G - U ~1 kcal/mole
       Non-canonical Pairs (esp. if modified)

  8. Definitions
       Sequence 5' r_1 r_2 r_3 ... r_n 3' in {A, C, G, U}
       A Secondary Structure is a set of pairs i•j s.t.
         i < j-4 (no sharp turns), and
         if i•j & i'•j' are two different pairs with i ≤ i', then either
           j < i' (2nd pair follows 1st), or
           i < i' < j' < j (2nd pair is nested within 1st)
         (no "pseudoknots")
       (figure panels: Nested, Precedes, Pseudoknot)
     A Pseudoknot
       (figure: ASCII drawing of a pseudoknot; the layout was lost in extraction. Recoverable strands: 3'-A-G-G-C-U, U-C-C-G-A-G-G-G, C-C-C-5', with loops A-C and U-C-U-C)
     Approaches to Structure Prediction
       Maximum Pairing
         + works on single sequences
         + simple
         - too inaccurate
       Minimum Energy
         + works on single sequences
         - ignores pseudoknots
         - only finds "optimal" fold
       Partition Function
         + finds all folds
         - ignores pseudoknots
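
The definition of a secondary structure can be checked mechanically. The following is a minimal sketch, assuming pairs are given as 1-based `(i, j)` tuples as on the slide. The function name `is_secondary_structure` is made up for illustration. The check that no position appears in two pairs is implied by the strict inequalities in the definition and is spelled out here for clarity.

```python
def is_secondary_structure(pairs, n, min_loop=4):
    """Check the slide's definition for a set of 1-based pairs (i, j)
    over a sequence of length n."""
    used = set()
    for i, j in pairs:
        if not (1 <= i < j <= n):
            return False
        if i >= j - min_loop:          # violates i < j-4: sharp turn
            return False
        if i in used or j in used:     # a base may pair at most once
            return False
        used.update((i, j))

    ordered = sorted(pairs)            # now i <= i' for every later pair
    for a, (i, j) in enumerate(ordered):
        for i2, j2 in ordered[a + 1:]:
            follows = j < i2           # 2nd pair follows 1st
            nested = i < i2 < j2 < j   # 2nd pair nested within 1st
            if not (follows or nested):
                return False           # crossing pairs: a pseudoknot
    return True
```

The pairwise loop is quadratic in the number of pairs, which is fine for a definition check. A stack-based scan would do the same in linear time.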

  9. Nussinov: Max Pairing
       B(i,j) = # pairs in optimal pairing of r_i ... r_j
       B(i,j) = 0 for all i, j with i ≥ j-4; otherwise B(i,j) = max of:
         B(i,j-1)
         max { B(i,k-1) + 1 + B(k+1,j-1) | i ≤ k < j-4 and r_k-r_j may pair }
       Time: O(n^3)
     "Optimal pairing of r_i ... r_j": two possibilities
       j unpaired: find best pairing of r_i ... r_{j-1}
       j paired (with some k): find best pairing of r_i ... r_{k-1} + best pairing of r_{k+1} ... r_{j-1}, plus 1
       Why is it slow? Why do pseudoknots matter?
     Pair-based Energy Minimization
       E(i,j) = energy of pairs in optimal pairing of r_i ... r_j
       E(i,j) = 0 for all i, j with i ≥ j-4 (no pairs are possible); otherwise E(i,j) = min of:
         E(i,j-1)
         min { E(i,k-1) + e(r_k, r_j) + E(k+1,j-1) | i ≤ k < j-4 }, where e(r_k, r_j) is the energy of the k-j pair
       Time: O(n^3)
     Loop-based Energy Minimization
       Detailed experiments show it's more accurate to model based on loops, rather than just pairs
       Loop types:
         1. Hairpin loop
         2. Stack
         3. Bulge
         4. Interior loop
         5. Multiloop
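
The Nussinov recurrence translates almost line for line into a table-filling routine. The sketch below assumes 0-based indices, treats Watson-Crick and G-U wobble pairs as the pairs that "may pair", and adds a traceback to recover one optimal set of pairs. The names `nussinov`, `traceback` and `can_pair` are illustrative, not from the lecture.

```python
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    return (a, b) in PAIRS


def nussinov(seq, min_loop=4):
    """Fill B[i][j] = max number of pairs in r_i..r_j (0-based, inclusive)."""
    n = len(seq)
    B = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):       # shorter spans stay 0
        for i in range(n - span):
            j = i + span
            best = B[i][j - 1]                # case 1: j unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if can_pair(seq[k], seq[j]):
                    left = B[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + B[k + 1][j - 1])
            B[i][j] = best
    return B


def traceback(seq, B, min_loop=4):
    """Recover one optimal set of pairs (k, j) from the filled table."""
    pairs, todo = [], [(0, len(seq) - 1)]
    while todo:
        i, j = todo.pop()
        if i >= j - min_loop:
            continue
        if B[i][j] == B[i][j - 1]:            # j unpaired is optimal here
            todo.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if can_pair(seq[k], seq[j]):
                left = B[i][k - 1] if k > i else 0
                if left + 1 + B[k + 1][j - 1] == B[i][j]:
                    pairs.append((k, j))
                    todo.append((i, k - 1))
                    todo.append((k + 1, j - 1))
                    break
    return sorted(pairs)
```

The three nested loops (span, i, k) are where the O(n^3) running time comes from. Replacing `+ 1` with a pair energy and `max` with `min` gives the pair-based energy minimization variant.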

  10. Base Pairs and Stacking (figure)
      The Double Helix (figure labels: cytosine, uracil, thymine, guanine, adenine)
      Loop Examples (figure)
      Zuker: Loop-based Energy, I
        W(i,j) = energy of optimal pairing of r_i ... r_j
        V(i,j) = as above, but forcing pair i•j
        W(i,j) = 0 and V(i,j) = ∞ for all i, j with i ≥ j-4 (no pair fits, so i•j cannot be forced)
        W(i,j) = min( W(i,j-1), min { W(i,k-1) + V(k,j) | i ≤ k < j-4 } )
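
To see how W and V interact, here is a deliberately stripped-down sketch of the W recurrence. It makes several assumptions beyond the slide. V only considers a hairpin closed by i•j or a stack on top of i+1•j-1, with no bulge, interior or multiloop terms. The energies `hairpin=4.0` and `stack=-2.0` are invented toy values, not measured parameters. The energy of a stretch too short to contain a pair is taken as 0 so that the unfolded state is the baseline. `zuker_toy` is a hypothetical name.

```python
import math
from functools import lru_cache

PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def zuker_toy(seq, hairpin=4.0, stack=-2.0, min_loop=4):
    """Toy loop-based minimum energy of seq (0-based indices).

    hairpin: assumed cost of closing any hairpin loop (toy value)
    stack:   assumed bonus for stacking one pair on another (toy value)
    """

    @lru_cache(maxsize=None)
    def V(i, j):
        # Energy of the best fold of r_i..r_j given that i and j pair.
        if i >= j - min_loop or (seq[i], seq[j]) not in PAIRS:
            return math.inf
        return min(hairpin,                  # i.j closes a hairpin
                   stack + V(i + 1, j - 1))  # i.j stacks on i+1.j-1

    @lru_cache(maxsize=None)
    def W(i, j):
        # Energy of the best fold of r_i..r_j with no constraint.
        if i >= j - min_loop:
            return 0.0                       # too short to pair: unfolded
        best = W(i, j - 1)                   # j unpaired
        for k in range(i, j - min_loop):     # j pairs with k
            best = min(best, W(i, k - 1) + V(k, j))
        return best

    return W(0, len(seq) - 1) if seq else 0.0
```

With these toy numbers a helix only pays off once it has enough stacked pairs to outweigh the hairpin cost. This is the qualitative behavior that loop-based models capture and pure pair counting does not.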

  11. Zuker: Loop-based Energy, II
        V(i,j) = min( eh(i,j), es(i,j) + V(i+1,j-1), VBI(i,j), VM(i,j) )
          (the four terms are: hairpin, stack, bulge/interior, multi-loop)
        VM(i,j) = min { W(i+1,k) + W(k+1,j-1) | i < k < j-1 }
        VBI(i,j) = min { ebi(i,j,i',j') + V(i',j') | i < i' < j' < j & i'-i+j-j' > 2 }
        Time: O(n^4), dominated by the bulge/interior term; O(n^3) possible if ebi(.) is "nice"
      Suboptimal Energy
        There are always alternate folds with near-optimal energies
        Thermodynamics: populations of identical molecules will exist in different folds; individual molecules even flicker among different folds
        Mod to Zuker's algorithm finds subopt folds
        McCaskill: more elaborate dyn. prog. algorithm calculates the "partition function," which defines the probability distribution over all these states. (Key addition: recurrence must count each possibility exactly once.)
      Example of suboptimal folding (figure)
        Black dots: pairs in opt fold
        Colored dots: pairs in folds 2-5% worse than optimal fold
      (figure: Two competing secondary structures for the Leptomonas collosoma spliced leader mRNA)
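
The "count each possibility exactly once" requirement is easiest to see in a pair-based analogue of the partition function. The sketch below is not McCaskill's loop-based algorithm. It assumes every allowed pair contributes the same energy `pair_energy` (a toy value), uses kT of about 0.6 kcal/mole (roughly body temperature), and sums Boltzmann weights over all pseudoknot-free structures. `partition_function` is an illustrative name.

```python
import math
from functools import lru_cache

PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def partition_function(seq, pair_energy=-1.0, kT=0.6, min_loop=4):
    """Sum of exp(-E/kT) over all secondary structures of seq, where
    E = pair_energy * (number of pairs). Toy pair-based model."""
    w = math.exp(-pair_energy / kT)  # Boltzmann weight of one pair

    @lru_cache(maxsize=None)
    def Z(i, j):
        if i >= j - min_loop:
            return 1.0               # only the empty structure
        total = Z(i, j - 1)          # structures where j is unpaired
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                # structures where j pairs with exactly this k
                total += Z(i, k - 1) * w * Z(k + 1, j - 1)
        return total

    return Z(0, len(seq) - 1) if seq else 1.0
```

Every structure falls into exactly one case (j unpaired, or j paired with one specific k), so nothing is counted twice. Setting `pair_energy=0.0` makes every weight 1, so the function simply counts structures. The probability of any single structure is its own weight divided by the returned total.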

  12. Approaches to Structure Prediction
        Maximum Pairing
          + works on single sequences
          + simple
          - too inaccurate
        Minimum Energy
          + works on single sequences
          - ignores pseudoknots
          - only finds "optimal" fold
        Partition Function
          + finds all folds
          - ignores pseudoknots
      Accuracy
        Latest estimates suggest ~50-75% of base pairs predicted correctly in sequences of up to ~300 nt
        Definitely useful, but obviously imperfect
      Approaches, II
        Comparative sequence analysis
          + handles all pairings (incl. pseudoknots)
          - requires several (many?) aligned, appropriately diverged sequences
        Stochastic Context-free Grammars
          Roughly combines min energy & comparative, but no pseudoknots
        Physical experiments (x-ray crystallography, NMR)
      Summary
        RNA has important roles beyond mRNA
          Many unexpected recent discoveries
        Structure is critical to function
          True of proteins, too, but they're easier to find, due, e.g., to codon structure, which RNAs lack
        RNA secondary structure can be predicted (to useful accuracy) by dynamic programming
        Next time: RNA "motifs" (seq + 2-ary struct) well-captured by "covariance models"
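
The accuracy figure is stated as a fraction of base pairs predicted correctly. One common way to make that concrete, assumed here rather than taken from the slides, is to compare predicted pairs against a reference structure and report sensitivity (share of reference pairs recovered) and positive predictive value (share of predicted pairs that are right). `pair_accuracy` is a hypothetical helper name.

```python
def pair_accuracy(predicted, reference):
    """Return (sensitivity, ppv) for two collections of base pairs.

    Each pair is an (i, j) tuple; orientation is ignored, so (3, 9)
    and (9, 3) count as the same pair.
    """
    pred = {tuple(sorted(p)) for p in predicted}
    ref = {tuple(sorted(p)) for p in reference}
    correct = len(pred & ref)
    sensitivity = correct / len(ref) if ref else 0.0
    ppv = correct / len(pred) if pred else 0.0
    return sensitivity, ppv


# Example: two of the four reference pairs are recovered,
# and two of the three predicted pairs are correct.
sens, ppv = pair_accuracy(
    predicted=[(1, 10), (2, 9), (3, 8)],
    reference=[(1, 10), (2, 9), (4, 7), (12, 20)],
)
```

Reporting both numbers matters because a predictor can raise one at the expense of the other, for example by predicting very few pairs.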
