CSE182-L10
Gene Finding
November 09
HMM fair-coin example
States: F (fair coin), L (loaded coin)
Transitions: stay in the same state with probability 0.6, switch with probability 0.4; the chain starts in F with probability 1
Emissions: E_F(H) = 0.5, E_L(H) = 0.1
H H T T T is the observed sequence

      H      H        T        T        T
F     0.5    1.5e-1   4.5e-2   1.3e-2   5.8e-3
L     0      2e-2     5.4e-2   2.9e-2   1.6e-2
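The table values for H H T T T are consistent with the Viterbi (max over predecessors) recurrence on this two-state model. A minimal sketch that recomputes them, assuming the chain starts in F with probability 1; the function and variable names are illustrative and not taken from the slides:

```python
def viterbi(obs, states, start, trans, emit):
    # v[t][s]: probability of the best state path that emits obs[:t+1] and ends in s
    v = [{s: start[s] * emit[s][obs[0]] for s in states}]
    back = []
    for x in obs[1:]:
        row, ptr = {}, {}
        for s in states:
            # best predecessor of s at this step
            prev = max(states, key=lambda r: v[-1][r] * trans[r][s])
            row[s] = v[-1][prev] * trans[prev][s] * emit[s][x]
            ptr[s] = prev
        v.append(row)
        back.append(ptr)
    # trace back from the best final state
    path = [max(states, key=lambda s: v[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return v, path[::-1]

states = ["F", "L"]
start = {"F": 1.0, "L": 0.0}          # assumption: start in F
trans = {"F": {"F": 0.6, "L": 0.4}, "L": {"F": 0.4, "L": 0.6}}
emit = {"F": {"H": 0.5, "T": 0.5}, "L": {"H": 0.1, "T": 0.9}}

table, path = viterbi("HHTTT", states, start, trans, emit)
```

Under these assumptions the final column is about 5.8e-3 for F and 1.6e-2 for L, and the best path is F F L L L.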
The ribosome reads mRNA.
Each triplet (codon) is translated into a unique amino-acid until the STOP codon is encountered.
A special signal marks where translation starts, usually at the ATG (M) codon.
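The start-to-STOP reading rule can be sketched as a scan for the first ATG followed by in-frame codons up to a stop codon. This is a simplified illustration, assuming DNA letters (T rather than U) and the three standard stop codons; `find_orf` is a hypothetical helper name:

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orf(seq):
    """Return the first ATG..STOP open reading frame in seq, or None."""
    start = seq.find("ATG")
    if start < 0:
        return None                      # no start codon at all
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[start:i + 3]      # include the stop codon
    return None                          # ran off the end without a stop

orf = find_orf("CCATGAAATGGTAACC")
```

Real start-site selection is more involved than taking the first ATG; the sketch only shows the frame-by-frame scan.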
Given a DNA sequence, how many ways can you translate it?
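Coding regions lie in discontiguous regions (exons), separated by non-coding regions (introns).
Transcription copies the entire region into RNA.
Introns are spliced out to form the mature mRNA (message).
Translation starts from an initiating ATG somewhere in the message.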
[Figure: gene structure. Labels: transcription start, 5’ UTR, translation start (ATG), exon, intron, donor and acceptor splice sites, 3’ UTR]
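A gene can lie on either strand (relative to the reference genome), and each strand can be read in three frames, giving six reading frames.
5’ AGTAGAGTATAGTGGACG 3’
3’ TCATCTCATATCACCTGC 5’
Frame 1: S R V * W T
Frame 2: V E Y S G
Frame 3: * S I V D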
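The frame-by-frame translation of this sequence can be checked with a short sketch. It assumes the standard genetic code (built here from the usual TCAG-ordered amino-acid string) and drops any trailing partial codon; the function names are illustrative:

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' is STOP)
CODON = {a + b + c: AMINO[16 * i + 4 * j + k]
         for i, a in enumerate(BASES)
         for j, b in enumerate(BASES)
         for k, c in enumerate(BASES)}

def translate(dna):
    # translate complete codons only; a trailing 1-2 bases are ignored
    return "".join(CODON[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def six_frames(dna):
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames["+%d" % (offset + 1)] = translate(dna[offset:])
        frames["-%d" % (offset + 1)] = translate(rc[offset:])
    return frames

frames = six_frames("AGTAGAGTATAGTGGACG")
```

The forward frames come out as SRV*WT, VEYSG and *SIVD; the three reverse frames are read from the reverse complement CGTCCACTATACTCTACT.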
Gene prediction: given a genome, identify
– Location that codes for a protein
– The transcript sequence(s) that encode the protein
– The protein sequence(s)
AAAAAA  AAAAAC  AAAAAG  AAAAAT
10      5       20      10
10      5
– E = [10, 20]
– I = [10, 5]
– V3 = [6, 10]
– V4 = [9, 15]
[Plot: the vectors E, I, V3, V4; axis ticks at 5, 10, 15, 20]
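One way to compare the unknown vectors V3 and V4 with the exon vector E and the intron vector I is by direction, using cosine similarity. The comparison rule is not preserved in the transcript, so treat the choice of cosine similarity here as an assumption; `cosine` and `classify` are illustrative names:

```python
import math

def cosine(u, v):
    # cosine of the angle between two frequency vectors
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def classify(v, exon, intron):
    # assumption: label by whichever reference vector is closer in angle
    return "exon" if cosine(v, exon) >= cosine(v, intron) else "intron"

E, I = [10, 20], [10, 5]
V3, V4 = [6, 10], [9, 15]
labels = {"V3": classify(V3, E, I), "V4": classify(V4, E, I)}
```

V3 and V4 have the same 3:5 ratio, so they point in the same direction and differ only in length; an angle-based rule therefore gives both the same label.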
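With a 5th-order Markov chain, where T[w,x] is the probability that base x follows the 5-mer w:
Pr[AAAAAACGA…] = T[AAAAA,A] T[AAAAA,C] T[AAAAC,G] T[AAACG,A]……
A separate table T is used for each class: Exon, Intron.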
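A sketch of how such tables might be estimated and used to compare the two classes. The pseudocount smoothing, the log-odds comparison, and the toy training strings are assumptions made for illustration; they are not taken from the slides:

```python
import math
from collections import defaultdict

def train(seqs, k=5, pseudo=1.0):
    """Estimate T[context, base] from training sequences (k-th order)."""
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(k, len(s)):
            counts[s[i - k:i]][s[i]] += 1
    def T(context, base):
        row = counts.get(context, {})
        # assumption: add-one style smoothing over the 4 bases
        return (row.get(base, 0.0) + pseudo) / (sum(row.values()) + 4 * pseudo)
    return T

def log_prob(seq, T, k=5):
    # sum of log T[previous k bases, next base]; the first k bases are not scored
    return sum(math.log(T(seq[i - k:i], seq[i])) for i in range(k, len(seq)))

def log_odds(seq, T_exon, T_intron, k=5):
    # positive values favour the exon model
    return log_prob(seq, T_exon, k) - log_prob(seq, T_intron, k)

# toy training data, made up for the example
T_exon = train(["AAAAAC"])
T_intron = train(["AAAAAT"])
score = log_odds("AAAAAC", T_exon, T_intron)
```

Working in log space avoids underflow when the product of many transition probabilities is taken over a long sequence.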
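[Figure: sequence with exon boundaries marked at positions i1, i2, i3, i4]
Gene score = … + A-score[i3-1] + E-score[i3..i4] + …….
(A-score: acceptor-site score at a position; E-score: exon score of an interval)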
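A sketch of how a gene score could be accumulated from per-exon and per-site scores. Only the A-score[i3-1] and E-score[i3..i4] terms appear in the text; the donor term at the position right after each exon and the intron term over the gap are assumptions added to complete the pattern, and all scoring functions below are toy stand-ins:

```python
def gene_score(exons, e_score, i_score, d_score, a_score):
    """Sum scores over a list of (start, end) exons, inclusive and sorted."""
    total = 0.0
    for n, (start, end) in enumerate(exons):
        if n > 0:
            prev_end = exons[n - 1][1]
            total += d_score(prev_end + 1)             # assumed donor term
            total += i_score(prev_end + 1, start - 1)  # assumed intron term
            total += a_score(start - 1)                # A-score[i3-1] from the text
        total += e_score(start, end)                   # E-score[i3..i4] from the text
    return total

# toy scoring functions, made up for the example
e_score = lambda i, j: float(j - i + 1)
i_score = lambda i, j: -0.1 * (j - i + 1)
d_score = lambda p: 2.0
a_score = lambda p: 3.0

total = gene_score([(10, 19), (30, 49)], e_score, i_score, d_score, a_score)
```

With these toy functions the two exons contribute 10 and 20, the intron contributes -1, and the donor and acceptor contribute 2 and 3.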
– Einit, Efin, Emid (initial, final, and internal exons)
– I (intron), IG (intergenic)
Terms of the recurrence, summed over j<i and l ∈ Q:
– Emission Prob.: probability that you emitted Xi..Xj in state qk (given by the 5th-order Markov model)
– Forward Prob.: probability that you emitted i symbols and ended up in state qk
– Duration Prob.: probability that you stayed in state qk for j-i+1 steps
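One recurrence consistent with these three terms and with sums over j<i and l ∈ Q is the forward recurrence of an HMM with explicit state durations. The form below is a reconstruction under stated assumptions, not the slide's own formula: f denotes the forward probability, d the duration probability, e the emission probability, and a_{lk} is an assumed symbol for the transition probability from state q_l to q_k.

```latex
f_k(i) \;=\; \sum_{j<i} \; \sum_{l \in Q}
    f_l(j)\; a_{lk}\; d_k(i-j)\; e_k\!\left(X_{j+1} \ldots X_i\right)
```

Here the last segment X_{j+1}..X_i has length i-j, which matches the "j-i+1 steps" convention above once the segment endpoints are renamed.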
The number of steps spent in a state (its duration) is chosen according to a distribution.
Use a different penalty for introns, versus other gaps.
Protein sequence analysis, ESTs, Gene finding