SLIDE 1

Current Trends: Non-coding RNAs

“Central Dogma” of molecular biology

DNA → RNA (mRNA) → Protein → Cellular functions

[Figure labels: reverse transcriptase (RNA → DNA); RNA virus replication (RNA → RNA); in vitro; ncRNA]

SLIDE 2

Non-coding RNAs

  • Found in prokaryotes (small RNAs) and eukaryotes (non-coding RNAs).
  • Well-characterized examples: tRNA, rRNA

Non-coding RNAs

  • Enzymatic activity
      • self-splicing introns
      • peptidyl transfer
      • viral replication
  • Regulation of other genes
      • eukaryotes: 21-25 nts; micro RNAs
      • prokaryotes: 50-550 nts; small RNAs

SLIDE 3

Eukaryotic vs. Prokaryotic ncRNAs

Gottesman, Trends in Genetics 21:399-404

RNAi RNAa…

ncRNAs can regulate gene expression at many steps

Storz et al., Ann. Rev. Biochem. 74:199-217

Red = bacterial; Blue = eukaryotic

SLIDE 4

Targets of RNA Gene Regulation

[Figure: the transcript of an RNA gene base-pairs with a complementary region of a messenger RNA; the pairing shown is imperfect, with mismatches and unpaired bases.]

SLIDE 5

Non-coding RNAs are elusive

  • Not annotated in genomes: lack of defined sequence features
  • Small, often missed in genetic studies
  • Missed in assays for protein function
  • None of 70-100 E. coli ncRNAs found by mutation

Drosophila bantam gene discovery

  • Overexpression of an intergenic region causes cell and tissue overgrowth
  • Deletion of intergenic region surrounding EP element results in slow growth

[Figure labels: 41 kb; EP]

SLIDE 6

bantam encodes a miRNA that regulates a conserved growth pathway

[Figure: pathway diagram with hippo, salvador and warts (growth suppressors), yorkie (growth promoter) and bantam (growth promoter); outputs shown: growth, cell death.]

Why study ncRNAs in bacteria?

  • We live in a bacterial world
  • Bacteria serve as useful model organisms
  • Bacteria are diverse
  • Understanding bacteria is useful in many important applications

Whooping cough, Meningitis, Botulism, Dental Cavities, Dysentery, The Black Plague, Syphilis, Tetanus, Scarlet Fever, Yaws, Pneumonia, Gonorrhea, Gastroenteritis, Typhoid Fever, Rocky Mountain Spotted Fever, Rheumatic Fever, Anthrax, Leprosy, Tuberculosis, Diphtheria, Cholera, Strep Throat, Food Poisoning, Lyme Disease, Peptic Ulcers

SLIDE 7

Shewanella oneidensis

  • Gram-negative γ-proteobacterium
  • Found primarily in deep water anaerobic habitats
  • Can use a wide variety of compounds as terminal electron acceptors
  • Bioremediation potential: reduces soluble chromium and uranium to insoluble forms

S. oneidensis genome overview

  • 5131416 bp total: chromosome is 4969803 bp; pMR-1 is 161613 bp
  • 45.9% G-C content; 85.5% of genome is coding
  • Total genes: 5066
      • Protein-coding genes: 4938 (97.5%)
      • tRNA and rRNA genes: 128 (2.5%)
  • Protein-coding gene categories (chart): genes assigned function, conserved hypothetical genes, hypothetical genes; chart values: 1159 (27%), 864 (17.5%), 2915 (59%)

SLIDE 8

Computational ncRNA prediction

  • Most ncRNAs are intergenic, function in trans
  • Bacterial ncRNAs have promoters, terminators
  • Genes have distinct nucleotide composition
  • Conserved secondary structures (stem-loop)
  • Tiling microarray data
  • Data can be integrated into a set of predictions

[Figure: genomic DNA sequence with ncRNA genes highlighted]

SLIDE 9

1) Nucleotide composition

  • Frequency of nucleotides, e.g. A 0.23, C 0.27, G 0.27, T 0.23 in some regions versus A 0.28, C 0.22, G 0.22, T 0.28 in others
  • Frequency of dinucleotides

[Figure: genomic DNA sequence annotated with regional nucleotide frequencies]
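The nucleotide and dinucleotide frequencies named on this slide can be computed with a few lines of Python. This is only a minimal sketch: the function name `kmer_frequencies` and the toy sequence are illustrative assumptions, not part of the original analysis.

```python
from collections import Counter


def kmer_frequencies(seq, k=1):
    """Return the relative frequency of each overlapping k-mer in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


# Toy sequence for illustration only; a real analysis would use
# intergenic regions and known ncRNA genes from the genome.
example = "ATGCATGCTAGTCATCGATCGATC"
mono = kmer_frequencies(example, k=1)  # frequency of nucleotides
di = kmer_frequencies(example, k=2)    # frequency of dinucleotides
```

Comparing such tables between known ncRNA genes and the rest of the genome is one way to check whether the two classes really differ in composition.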

2) Comparative Genomics: Mutation Patterns

[Figure: sequence from Genome (1) shown alongside sequence from Genome (2)]

Genomes compared: Shewanella amazonensis, Shewanella baltica, Shewanella denitrificans, Shewanella frigidimarina, Shewanella loihica, Vibrio cholerae, Yersinia pestis, Photorhabdus luminescens, Photobacterium profundum

SLIDE 10

2) Mutation Patterns that Conserve RNA Structure

Rivas and Eddy, BMC Bioinformatics 2:8

[Figure: two versions of an RNA stem-loop in which paired bases differ but base pairing is preserved]

Derive score based on:

  • # of compensatory mutations
  • Length of sequence
  • Sequence structure
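To make the idea of a compensatory mutation concrete, the sketch below counts stem positions where both bases changed between two aligned sequences yet still pair. It is a simplified illustration, not the scoring model of Rivas and Eddy; the function name, the hairpin and the list of paired positions are all hypothetical.

```python
# Watson-Crick pairs plus the G-U wobble pair (written with T, as in DNA).
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("G", "T"), ("T", "G")}


def count_compensatory(seq1, seq2, paired_positions):
    """Count stem positions where both bases changed yet pairing is kept."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = 0
    for i, j in paired_positions:
        both_changed = seq1[i] != seq2[i] and seq1[j] != seq2[j]
        pairs_in_1 = (seq1[i], seq1[j]) in PAIRS
        pairs_in_2 = (seq2[i], seq2[j]) in PAIRS
        if both_changed and pairs_in_1 and pairs_in_2:
            n += 1
    return n


# Hypothetical hairpin: positions 0-3 pair with 10-7; positions 4-6 form the loop.
stem = [(0, 10), (1, 9), (2, 8), (3, 7)]
genome1 = "GCATTTTATGC"
genome2 = "GCGTTTTACGC"  # the A-T pair at (2, 8) has become G-C
print(count_compensatory(genome1, genome2, stem))
```

A real score would also account for sequence length and the structure itself, as the slide lists; this count covers only the first of the three ingredients.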

2) Score of Conserved RNA Structure

[Figure: distribution of scores. x-axis: score; y-axis: frequency (probability) of score. Series: documented RNAs, intergenic regions, EVD for documented RNAs, EVD for intergenic regions.]
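The EVD curves in this plot can be approximated by fitting an extreme value distribution to a set of scores. The sketch below assumes the Gumbel form of the EVD and uses a simple method-of-moments fit; the slides do not say how the curves were fitted, and the background scores here are made up for illustration.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def fit_gumbel_moments(scores):
    """Method-of-moments estimate of Gumbel location (mu) and scale (beta)."""
    beta = statistics.stdev(scores) * math.sqrt(6) / math.pi
    mu = statistics.mean(scores) - EULER_GAMMA * beta
    return mu, beta


def gumbel_tail(x, mu, beta):
    """P(score >= x) under a Gumbel distribution with the given parameters."""
    return 1.0 - math.exp(-math.exp(-(x - mu) / beta))


# Hypothetical structure scores for intergenic (background) regions.
background = [-2.1, 0.4, 1.3, -0.7, 2.2, 0.9, -1.5, 3.1, 0.2, 1.8]
mu, beta = fit_gumbel_moments(background)
p_value = gumbel_tail(9.0, mu, beta)  # chance of a background score of 9 or more
```

A candidate whose score has a small tail probability under the intergenic fit is unlikely to be background, which is the kind of separation the plot is meant to show.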

SLIDE 11

3) DNA Microarray Data

Examine correlation of expression for probes near one another in the genome:

  1) for intergenic regions likely to produce RNA
  2) for those much less likely to produce RNA
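One way to measure the correlation of expression between neighboring probes is the Pearson coefficient across experiments. The sketch below is a minimal illustration under that assumption; the slides do not name the statistic used, and the probe profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # convention chosen here for a flat profile
    return cov / (sx * sy)


def neighbor_correlations(profiles):
    """Correlate each probe with the next probe along the genome."""
    return [pearson(profiles[i], profiles[i + 1])
            for i in range(len(profiles) - 1)]


# Hypothetical expression profiles: one list per probe, one value per experiment.
probes = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
]
print(neighbor_correlations(probes))
```

Adjacent probes that belong to the same transcript should rise and fall together across conditions, so high neighbor correlation is evidence that an intergenic region is transcribed as a unit.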

3) Correlation of Transcript Expression

[Figure: histogram. x-axis: correlation coefficient; y-axis: # of IG probes. Series: correlation in non-operon IG regions, correlation in operon IG regions.]

SLIDE 12

Integrating Heterogeneous Data

  1) Sequence Data
  2) Conserved Structure Data
  3) DNA Microarray Data

[Figure: the three data sources are combined in a general Markov model; the figure shows two sets of nucleotide frequencies, A 0.28, C 0.22, G 0.22, T 0.28 and A 0.23, C 0.27, G 0.27, T 0.23.]
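As a rough illustration of how a Markov model can segment a genome by composition, the sketch below runs the Viterbi algorithm on a two-state hidden Markov model. It uses sequence data only, whereas the model on the slide also integrates structure and microarray data. The two emission tables use the frequencies shown on the slide, but which table belongs to which state, the state names, and all transition and start probabilities are assumptions made for this example.

```python
import math

# Frequencies from the slide; the assignment to states is an assumption.
EMIT = {
    "background": {"A": 0.23, "C": 0.27, "G": 0.27, "T": 0.23},
    "ncRNA":      {"A": 0.28, "C": 0.22, "G": 0.22, "T": 0.28},
}
# Hypothetical transition and start probabilities.
TRANS = {
    "background": {"background": 0.99, "ncRNA": 0.01},
    "ncRNA":      {"background": 0.02, "ncRNA": 0.98},
}
START = {"background": 0.9, "ncRNA": 0.1}


def viterbi(seq):
    """Return the most probable state path for an uppercase ACGT sequence."""
    states = list(EMIT)
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in states}
    back = []
    for base in seq[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            score[s] = (prev[best] + math.log(TRANS[best][s])
                        + math.log(EMIT[s][base]))
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


labels = viterbi("GC" * 50 + "AT" * 200)
```

Extending this sketch to heterogeneous data would mean letting each state also emit a structure score and an expression signal, with the emission probabilities multiplied together at each position.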

SLIDE 13

Performance on known E. coli ncRNAs

[Figure: sensitivity (% actual ncRNAs found) plotted against 1.0 - specificity (false positive rate), both from 0% to 100%, with results for (1), (2) (QRNA), (3), (1,2), (1,3), (2,3) and (1,2,3).]

  (1) Primary sequence
  (2) Conserved structure
  (3) Expression data
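The two axes of this plot, sensitivity and 1.0 - specificity, follow directly from counts of true and false predictions at a chosen score threshold. The sketch below shows the arithmetic; the function name, scores and labels are hypothetical and are not taken from the E. coli evaluation.

```python
def roc_point(scores, labels, threshold):
    """Return (false_positive_rate, sensitivity) at a score threshold.

    labels[i] is True when region i is a known ncRNA.
    """
    tp = fp = tn = fn = 0
    for score, is_ncrna in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_ncrna:
            tp += 1
        elif predicted:
            fp += 1
        elif is_ncrna:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    false_positive_rate = fp / (fp + tn) if fp + tn else 0.0
    return false_positive_rate, sensitivity


# Hypothetical prediction scores and ground-truth labels.
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
known = [True, True, False, True, False]
fpr, sens = roc_point(scores, known, threshold=0.35)
```

Sweeping the threshold from high to low and plotting each (false positive rate, sensitivity) pair traces out one curve of the kind compared on this slide.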

Predicting ncRNAs in Shewanella

  • Have robust tiling microarray data set: 144 experiments, wide variety of growth conditions
  • Generated predictions of ~160 ncRNAs
      • Some may be orthologous to characterized ncRNAs
      • Some may be novel ncRNAs
      • Some may not be ncRNAs at all

SLIDE 14

Regulation of RyhB

  • Fur repressor is active when bound to Fe2+
  • Fe2+ limitation induces RyhB expression

[Figure labels: Fe2+; Fur; Fur regulon; RyhB; sodB; sdh]

Validating Shewanella ncRNA predictions

  • Putative ryhB northern blotting experiments

[Figure labels: bipyridine; Shewanella ryhB, ~100 nt]

SLIDE 15

Regulation of spot42 in E. coli

  • spot42 negatively regulates translation of galK but does not affect galE translation
  • spot42 expression increases the GalE:GalK ratio
  • Thus, glucose induces spot42 expression

[Figure labels: Glucose; CRP; cAMP; galK; spot42]

Validating Shewanella ncRNA predictions

  • Putative spot42 northern blotting experiments

[Figure label: ~110 nt]

SLIDE 16

Predicting ncRNA targets

  • Target prediction is an inexact science
      • ncRNA sequences not exact matches to targets
      • May be multiple targets
  • Validate targets using northern blots, western blots, exogenous expression

Putative Shewanella ncRNA targets
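Because ncRNAs are not exact matches to their targets, a target search has to tolerate mismatches. The sketch below scans an mRNA for the window that best pairs with an ncRNA segment, allowing a few mismatches and no gaps. It is a deliberately simple stand-in for real target-prediction tools; the function names, sequences and mismatch limit are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def best_target_site(ncrna, mrna, max_mismatches=2):
    """Find the ungapped mRNA window that best pairs with the ncRNA.

    Returns (start, mismatches) for the best window, or None when every
    window exceeds max_mismatches.
    """
    site = reverse_complement(ncrna)
    best = None
    for start in range(len(mrna) - len(site) + 1):
        window = mrna[start:start + len(site)]
        mismatches = sum(1 for a, b in zip(site, window) if a != b)
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (start, mismatches)
    return best


# Hypothetical ncRNA segment and mRNA fragment.
hit = best_target_site("UGUCGUGC", "AAGCACGUCAUU")
print(hit)
```

Several windows may pass the mismatch limit, which mirrors the point that one ncRNA may have multiple targets; any candidate from such a scan would still need the experimental validation listed on this slide.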

SLIDE 17

Questions

  • Is the interaction between an ncRNA and its target RNA positive or negative?
  • What conditions regulate ncRNA expression?
  • What can we learn that will improve ncRNA prediction and understanding of function?