Review Session I CS 466 Wesley Wei Qian March 10th 2020 Midterm



slide-1
SLIDE 1

Review Session I

CS 466

Wesley Wei Qian March 10th 2020

slide-2
SLIDE 2

Midterm Exam

  • This Thursday!
      ○ 03/12 class time
  • Different building!
      ○ Natural History Building Room 2079
  • Topics
slide-3
SLIDE 3

Topics we have covered:

  • Molecular Biology
  • Probability and Statistics
  • Sequence and Alignment
  • Pattern Matching
  • BLAST
slide-4
SLIDE 4

Molecular Biology

  • Molecules
      ○ DNA
      ○ RNA
      ○ Protein (polypeptide)
  • Molecular Process
      ○ Transcription
      ○ Translation
      ○ Protein folding
      ○ Gene splicing: intron/exon
      ○ Gene regulation
  • Genome
slide-5
SLIDE 5

Probability and Statistics

slide-6
SLIDE 6

Probability and Statistics

slide-7
SLIDE 7

Probability and Statistics

slide-8
SLIDE 8

Probability and Statistics

slide-9
SLIDE 9

Probability and Statistics

slide-10
SLIDE 10

Probability and Statistics

slide-11
SLIDE 11

Probability and Statistics

slide-12
SLIDE 12

Sequence and Alignment

slide-13
SLIDE 13

Sequence and Alignment

  • Global Alignment
  • Local Alignment

slide-14
SLIDE 14

Sequence and Alignment

      *    D    D    O    G    C
 *    0   -2   -4   -6   -8  -10
 D   -2    1   -1   -3   -5   -7
 O   -4   -1    0    0   -2   -4
 G   -6   -3   -2   -1    1   -1

Global Alignment: DDOGC vs DOG

+1 Match; -1 Mismatch; -2 Gap.

slide-18
SLIDE 18

Sequence and Alignment

      *    D    D    O    G    C
 *    0   -2   -4   -6   -8  -10
 D   -2    1   -1   -3   -5   -7
 O   -4   -1    0    0   -2   -4
 G   -6   -3   -2   -1    1   -1

Global Alignment: DDOGC vs DOG

+1 Match; -1 Mismatch; -2 Gap.

DDOGC
D-OG-

slide-19
SLIDE 19

Sequence and Alignment

      *    D    D    O    G    C
 *    0   -2   -4   -6   -8  -10
 D   -2    1   -1   -3   -5   -7
 O   -4   -1    0    0   -2   -4
 G   -6   -3   -2   -1    1   -1

Global Alignment: DDOGC vs DOG

+1 Match; -1 Mismatch; -2 Gap.

DDOGC
-DOG-
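
The global alignment table and traceback from these slides can be reproduced with a short dynamic programming routine. The following is a minimal sketch, assuming the Needleman-Wunsch recurrence with the slide's linear scores (+1 match, -1 mismatch, -2 gap). The function name `global_alignment` and the tie-breaking order in the traceback (diagonal, then left, then up) are illustrative choices, not something the slides specify.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Return (table, aligned_a, aligned_b) for a global alignment.

    `a` runs along the columns and `b` along the rows, as on the slides.
    """
    rows, cols = len(b) + 1, len(a) + 1
    F = [[0] * cols for _ in range(rows)]

    # First row and column: aligning a prefix against nothing costs gaps.
    for j in range(1, cols):
        F[0][j] = j * gap
    for i in range(1, rows):
        F[i][0] = i * gap

    def pair_score(i, j):
        return match if a[j - 1] == b[i - 1] else mismatch

    # Fill the table: best of diagonal (match/mismatch), up, or left (gaps).
    for i in range(1, rows):
        for j in range(1, cols):
            F[i][j] = max(
                F[i - 1][j - 1] + pair_score(i, j),
                F[i - 1][j] + gap,
                F[i][j - 1] + gap,
            )

    # Traceback from the bottom-right corner to the top-left corner.
    i, j = rows - 1, cols - 1
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + pair_score(i, j):
            top.append(a[j - 1])
            bottom.append(b[i - 1])
            i, j = i - 1, j - 1
        elif j > 0 and F[i][j] == F[i][j - 1] + gap:
            top.append(a[j - 1])
            bottom.append("-")
            j -= 1
        else:
            top.append("-")
            bottom.append(b[i - 1])
            i -= 1
    return F, "".join(reversed(top)), "".join(reversed(bottom))


table, top, bottom = global_alignment("DDOGC", "DOG")
print(table[-1][-1])  # optimal global score
print(top)
print(bottom)
```

With this tie-breaking order the traceback returns one of the optimal alignments; a different order can return another alignment with the same score.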
slide-20
SLIDE 20

Sequence and Alignment

      *    D    D    O    G    C
 *    0    0    0    0    0    0
 D    0    1    1    0    0    0
 O    0    0    0    2    0    0
 G    0    0    0    0    3    1

Local Alignment: DDOGC vs DOG

+1 Match; -1 Mismatch; -2 Gap.

slide-24
SLIDE 24

Sequence and Alignment

      *    D    D    O    G    C
 *    0    0    0    0    0    0
 D    0    1    1    0    0    0
 O    0    0    0    2    0    0
 G    0    0    0    0    3    1

Local Alignment: DDOGC vs DOG

+1 Match; -1 Mismatch; -2 Gap.

-DOG-
 DOG
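
The local alignment example can be sketched the same way. This is a minimal illustration, assuming the Smith-Waterman recurrence (scores are floored at 0 and the traceback starts at the largest cell) with the slide's scores. The function name `local_alignment` and the choice to keep the first maximum encountered are assumptions made for the example.

```python
def local_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Return (table, best_score, aligned_a, aligned_b) for a local alignment.

    `a` runs along the columns and `b` along the rows, as on the slides.
    """
    rows, cols = len(b) + 1, len(a) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    def pair_score(i, j):
        return match if a[j - 1] == b[i - 1] else mismatch

    # Fill the table; a cell never drops below 0, so an alignment can restart.
    for i in range(1, rows):
        for j in range(1, cols):
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + pair_score(i, j),
                H[i - 1][j] + gap,
                H[i][j - 1] + gap,
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Traceback from the best cell until a 0 cell is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + pair_score(i, j):
            top.append(a[j - 1])
            bottom.append(b[i - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i][j - 1] + gap:
            top.append(a[j - 1])
            bottom.append("-")
            j -= 1
        else:
            top.append("-")
            bottom.append(b[i - 1])
            i -= 1
    return H, best, "".join(reversed(top)), "".join(reversed(bottom))


table, score, top, bottom = local_alignment("DDOGC", "DOG")
print(score, top, bottom)
```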

slide-25
SLIDE 25

Sequence and Alignment

Complexity?

slide-26
SLIDE 26

Sequence and Alignment

Complexity? O(nm) time and space to fill the table for sequences of lengths n and m.

slide-27
SLIDE 27

Sequence and Alignment

Scoring function and BLOSUM matrix

rounding factor
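
The formula on this slide did not survive extraction. As a hedged illustration, BLOSUM-style scores are commonly described as a scaled log-odds ratio, rounded to an integer: the observed frequency of a residue pair divided by the frequency expected if the two residues were independent. The sketch below assumes that form; the function name, the `scale` parameter, and the example frequencies are made up for illustration and are not taken from the slides or from a real BLOSUM matrix.

```python
import math


def log_odds_score(p_ab, f_a, f_b, scale=2.0):
    """Rounded, scaled log-odds score for one residue pair.

    p_ab  : observed frequency of the aligned pair (a, b)
    f_a   : background frequency of residue a
    f_b   : background frequency of residue b
    scale : multiplier applied before rounding (assumption: scale=2 with
            log base 2 gives half-bit units)
    """
    return round(scale * math.log2(p_ab / (f_a * f_b)))


# Hypothetical frequencies, chosen only to show the sign of the score.
print(log_odds_score(0.02, 0.1, 0.1))   # pair seen more often than chance
print(log_odds_score(0.005, 0.1, 0.1))  # pair seen less often than chance
```

A positive score means the pair is aligned more often than independence would predict; a negative score means less often.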

slide-28
SLIDE 28

Sequence and Alignment

Affine Gap Penalty
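
To make the idea concrete: a linear gap penalty charges the same amount for every gap position, while an affine penalty charges more to open a gap than to extend it, so one long gap is cheaper than several short ones. The sketch below assumes the convention "cost = open + extend * (length - 1)"; other texts use "open + extend * length". The function names and default values are illustrative only.

```python
def linear_gap_cost(length, gap=2):
    """Cost of a gap of `length` positions under a linear penalty."""
    return gap * length


def affine_gap_cost(length, gap_open=5, gap_extend=1):
    """Cost of a gap of `length` positions under an affine penalty.

    Assumed convention: the first gap position costs `gap_open`,
    and each additional position costs `gap_extend`.
    """
    if length <= 0:
        return 0
    return gap_open + gap_extend * (length - 1)


# One gap of length 4 versus four separate gaps of length 1.
print(affine_gap_cost(4), 4 * affine_gap_cost(1))
print(linear_gap_cost(4), 4 * linear_gap_cost(1))
```

Under the linear penalty the two cases cost the same; under the affine penalty the single long gap is cheaper.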

slide-29
SLIDE 29

Pattern Matching

slide-30
SLIDE 30

Pattern Matching

Naive Approach

  • K: number of patterns
  • N: average length of pattern
  • M: length of the query string

Running Time: O(KMN)
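
The O(KMN) bound comes from three nested loops: every pattern, every start position in the query string, and a character-by-character comparison. A minimal sketch, with the function name `naive_search` and the example patterns chosen only for illustration:

```python
def naive_search(patterns, text):
    """Return sorted (start, pattern) pairs for every occurrence in `text`."""
    hits = []
    for pattern in patterns:                                 # K patterns
        for start in range(len(text) - len(pattern) + 1):    # up to M starts
            # The slice comparison costs up to N character checks.
            if text[start:start + len(pattern)] == pattern:
                hits.append((start, pattern))
    return sorted(hits)


print(naive_search(["he", "she", "hers"], "ushers"))
```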

slide-31
SLIDE 31

Pattern Matching

Keyword Tree

  • K: number of patterns
  • N: average length of pattern
  • M: length of the query string

Running Time: O(KN + NM)
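
The two terms of this bound correspond to building the tree once (O(KN)) and then walking down from every start position in the query string (O(NM), since a walk is never deeper than the longest pattern). The sketch below assumes a simple dictionary-based trie; the class and function names are illustrative.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # character -> TrieNode
        self.pattern = None  # set when a pattern ends at this node


def build_keyword_tree(patterns):
    """Insert every pattern into a trie: O(KN) total work."""
    root = TrieNode()
    for pattern in patterns:
        node = root
        for ch in pattern:
            node = node.children.setdefault(ch, TrieNode())
        node.pattern = pattern
    return root


def keyword_tree_search(root, text):
    """Restart at the root for every start position: O(NM) total work."""
    hits = []
    for start in range(len(text)):
        node = root
        for pos in range(start, len(text)):
            node = node.children.get(text[pos])
            if node is None:
                break  # no pattern continues with this character
            if node.pattern is not None:
                hits.append((start, node.pattern))
    return hits


tree = build_keyword_tree(["he", "she", "hers"])
print(keyword_tree_search(tree, "ushers"))
```

All K patterns are now checked in a single walk per start position, which removes the factor K from the search term.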

slide-32
SLIDE 32

Pattern Matching

Aho-Corasick

  • K: number of patterns
  • N: average length of pattern
  • M: length of the query string

Running Time: O(KN + M)
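
Aho-Corasick avoids restarting at the root by adding failure links to the keyword tree, so the query string is scanned once. The following is a compact sketch under a few assumptions: states are stored in parallel lists, failure links are computed breadth-first, and each state's output list is merged with the output of its failure target. The names `build_automaton` and `aho_corasick_search` are illustrative. The scan also spends time proportional to the number of reported matches.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure, and output tables for the given patterns."""
    goto, fail, out = [{}], [0], [[]]

    # Phase 1: keyword tree (state 0 is the root).
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pattern)

    # Phase 2: failure links, breadth-first. Depth-1 states fail to the root.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Patterns ending at the failure target also end here.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def aho_corasick_search(text, automaton):
    """Scan `text` once and return (start, pattern) pairs."""
    goto, fail, out = automaton
    state, hits = 0, []
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            hits.append((pos - len(pattern) + 1, pattern))
    return hits


automaton = build_automaton(["he", "she", "hers"])
print(sorted(aho_corasick_search("ushers", automaton)))
```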

slide-33
SLIDE 33

Pattern Matching

Aho-Corasick

One more example:

http://blog.ivank.net/aho-corasick-algorithm-in-as3.html

slide-34
SLIDE 34

Pattern Matching

A different setting

  • So far: fixed patterns with various query strings.
  • What if we have a fixed string but different query patterns?
  • Compile the patterns -> compile the string

slide-35
SLIDE 35

Pattern Matching

Suffix Tree for {abcabx}

  • N: average length of pattern
  • M: length of the query string

Build a keyword tree: {abcabx, bcabx, cabx, abx, bx, x}

Running time: O(M^2 + N)
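
The O(M^2 + N) construction can be sketched directly: insert every suffix of the string into a keyword tree (O(M^2) characters in total), then answer a query by walking the pattern from the root (O(N)). This is an uncompressed suffix trie rather than a compacted suffix tree, and the function names are illustrative.

```python
def build_suffix_trie(text):
    """Insert every suffix of `text` into a nested-dict trie: O(M^2)."""
    root = {}
    for start in range(len(text)):
        node = root
        for pos in range(start, len(text)):
            node = node.setdefault(text[pos], {})
    return root


def occurs_in(trie, pattern):
    """A pattern is a substring iff it is a prefix of some suffix: O(N)."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("abcabx")
print(occurs_in(trie, "cab"), occurs_in(trie, "abx"), occurs_in(trie, "abcx"))
```

Once the string is compiled, each new query pattern costs time proportional to its own length, independent of the string length.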

slide-36
SLIDE 36

Pattern Matching

Suffix Tree for {abcabx}

  • N: average length of pattern
  • M: length of the query string

Running time: O(M^2 + N)

O(M + N) with Ukkonen's algorithm, but it is not required!

slide-37
SLIDE 37

Good luck!