CSE182-L12
Gene Finding
Quiz
Who are these people, and what is the occasion?
De novo Gene prediction: Summary
– Various signals distinguish coding regions from non-coding
– HMMs are a reasonable model for gene structures, and provide a uniform method for combining various signals.
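To make the HMM point concrete, here is a minimal sketch of Viterbi decoding over a toy two-state model (coding vs. non-coding). The state names, the transition and emission probabilities, and the function name `viterbi` are all invented for illustration; a real gene-finding HMM has many more states (exons, introns, splice signals) and trained parameters.

```python
import math

# Hypothetical two-state model: "C" = coding, "N" = non-coding.
# Every probability below is an invented illustration value, not a trained one.
STATES = ("C", "N")
START = {"C": 0.5, "N": 0.5}
TRANS = {"C": {"C": 0.9, "N": 0.1}, "N": {"C": 0.1, "N": 0.9}}
EMIT = {
    "C": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},  # assumed GC-rich
    "N": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},  # assumed AT-rich
}


def viterbi(seq):
    """Return the most probable state path for seq, computed in log space."""
    if not seq:
        return ""
    # Best log-score of any path ending in each state after the first symbol.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[k][s] = best predecessor of state s at position k + 1
    for x in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = prev
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][x])
            )
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))
```

Calling `viterbi` on a DNA string returns one label per base. The single combined score is what lets an HMM weigh several signals (here only base composition and state persistence) in a uniform way.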
improved signal detection
How many genes do we have?
Nature Science
detect genes
perhaps?), can you find the best parse of a genomic sequence that matches that target sequence
separately for introns, versus other gaps.
novo approach.
– Profiles/Regular Expression/HMMs
– Gene finding HMMs – DNA signals (splice signals)
stranded
separated, and a polymerase is used to copy the second strand.
terminate this process early.
here.
using electrophoresis allows us to get the position of each T
with every nucleotide. Color coding can help separate different nucleotides
detectors ‘read’ the terminating bases.
after 1000 bases.
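The chain-termination idea can be sketched in a few lines, under idealized assumptions: every occurrence of a base yields exactly one terminated copy, and fragment lengths are measured without error. The function names and the example strand are hypothetical.

```python
def terminated_fragment_lengths(strand, base="T"):
    """Lengths of the copies that stop at each occurrence of `base`.

    Idealized: each occurrence yields exactly one terminated fragment,
    so the sorted lengths are the 1-based positions of that base.
    """
    return [i + 1 for i, b in enumerate(strand) if b == base]


def read_from_ladders(strand_length, ladders):
    """Rebuild the sequence from four per-base ladders of fragment lengths."""
    seq = ["?"] * strand_length  # "?" marks a position no ladder covers
    for base, lengths in ladders.items():
        for n in lengths:
            seq[n - 1] = base
    return "".join(seq)


copied = "GATTACAT"  # hypothetical newly synthesized strand
ladders = {b: terminated_fragment_lengths(copied, b) for b in "ACGT"}
recovered = read_from_ladders(len(copied), ladders)
```

Sorting fragments by length is what electrophoresis does physically. Running one ladder per nucleotide (or one color per nucleotide) fills in every position of the read.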
(Mapping)
considered viable
researchers in 1999 proposed shotgunning the entire genome.
and introduce them into
bacteria multiply you will have many copies of the same clone.
back together from the pieces? Will be discussed in the next lecture.
to sequence, etc.?
– The answer to the statistical questions had already been given in the context of mapping, by Lander and Waterman.
G = Genome Length
L = Clone Length
N = Number of Clones
T = Required Overlap
c = Coverage = LN/G
a = N/G
q = T/L
s = 1-q
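As a quick way to keep these quantities straight, the definitions can be bundled into a small helper. The class name `ShotgunParams` and the numbers in the example are made up for illustration; only the formulas come from the definitions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShotgunParams:
    """Hypothetical container for the shotgun-sequencing parameters."""

    G: int  # genome length
    L: int  # clone length
    N: int  # number of clones
    T: int  # required overlap

    @property
    def c(self):
        """Coverage: total clone length divided by genome length."""
        return self.L * self.N / self.G

    @property
    def a(self):
        """Probability that a clone starts at a given position."""
        return self.N / self.G

    @property
    def q(self):
        """Fraction of a clone needed to detect an overlap."""
        return self.T / self.L

    @property
    def s(self):
        """Fraction of a clone that may remain unshared: 1 - q."""
        return 1 - self.q


# Invented example values, not taken from the lecture.
params = ShotgunParams(G=1_000_000, L=1000, N=5000, T=100)
```

With these example values, `params.c` is the coverage, and `params.c * params.s` is the exponent that appears in the island formulas.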
more areas of the genome are likely to be
expected number
increases at first, and gradually decreases.
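Let $X_i = 1$ if a clone began at position $i$ AND no clone began in the next $L-T$ positions (so that clone ends an island), and $X_i = 0$ otherwise. Since $a(L-T) = cs$:

$$E(X_i) = a(1-a)^{L-T} \approx a e^{-cs}$$

$$\text{Expected \# islands} = \sum_i E(X_i) = G a e^{-cs} = N e^{-cs}$$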
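The island count can be checked numerically. The sketch below evaluates N·exp(-cs) directly and compares it with a simple Monte Carlo placement of clones. It assumes uniformly random clone starts and ignores effects at the ends of the genome. The function names and all parameter values are illustrative choices, not part of the lecture.

```python
import math
import random


def expected_islands(G, L, N, T):
    """Expected number of islands, N * exp(-c * s), with c = LN/G and s = 1 - T/L."""
    c = L * N / G
    s = 1 - T / L
    return N * math.exp(-c * s)


def simulate_islands(G, L, N, T, rng):
    """Count islands for one random placement of N >= 1 clones.

    Assumption: clone starts are uniform on [0, G) and end effects are ignored.
    """
    starts = sorted(rng.randrange(G) for _ in range(N))
    islands = 1
    for prev, cur in zip(starts, starts[1:]):
        # Consecutive clones share L - (cur - prev) bases; joining them
        # requires at least T, so a larger gap starts a new island.
        if cur - prev > L - T:
            islands += 1
    return islands


# Invented example values.
G, L, N, T = 1_000_000, 1000, 5000, 100
rng = random.Random(0)
trials = [simulate_islands(G, L, N, T, rng) for _ in range(20)]
average_islands = sum(trials) / len(trials)
predicted_islands = expected_islands(G, L, N, T)
```

Evaluating `expected_islands` for increasing N with the other values fixed shows the count rising while clones are sparse and then falling as coverage grows and islands merge.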
probability $e^{-cs}$, it will never be continued. Therefore
Why?
$$L\left[\frac{e^{cs}-1}{c} + (1-s)\right]$$
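The expression above can be evaluated directly. The sketch below reads it as the expected island length from Lander and Waterman's analysis, and also computes the expected number of clones per island as exp(cs), which follows from dividing N clones among N·exp(-cs) islands. The function names and the sample values are assumptions made for illustration.

```python
import math


def expected_island_length(L, c, s):
    """Evaluate L * [ (exp(c*s) - 1) / c + (1 - s) ] for coverage c > 0."""
    # math.expm1(x) computes exp(x) - 1 accurately even for small x.
    return L * (math.expm1(c * s) / c + (1 - s))


def expected_clones_per_island(c, s):
    """N clones spread over N * exp(-c*s) islands gives exp(c*s) per island."""
    return math.exp(c * s)


# Invented example values: 1000-base clones, 5x coverage, s = 0.9.
length_at_5x = expected_island_length(1000, 5.0, 0.9)
clones_at_5x = expected_clones_per_island(5.0, 0.9)
```

As a sanity check, when coverage is very low the bracketed term approaches s + (1 - s) = 1, so an island is about one clone long, which matches the intuition that sparse clones rarely overlap.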