
CS 466 – Introduction to Bioinformatics – Lecture 2
Mohammed El-Kebir
August 30, 2019

Document history:
• 9/5/2018: Fixed typo in Section 1.4, $O(4^n / n)$ should have been $O(4^n / \sqrt{n})$.
• 9/5/2018: Included analysis of naive fitting alignment algorithm.
• 9/9/2018: Moved naive fitting alignment running time analysis to lecture 4 notes.
• 8/30/2019: Minor changes in Section 1.2.

Contents
1 Big Oh Notation
1.1 What is $O(n!)$?
1.2 What is $O(\log(n!))$?
1.3 What is $O(\binom{n}{k})$ where $k = O(1)$?
1.4 What is $O(\binom{2n}{n})$?

1 Big Oh Notation

Let $f, g : \mathbb{N}_{\ge 0} \to \mathbb{R}_{\ge 0}$. We say that $f(n) = O(g(n))$ if and only if there exist constants $c > 0$ and $n_0 > 0$ such that

$$f(n) \le c \cdot g(n), \quad \text{for all } n \ge n_0. \tag{1}$$

1.1 What is $O(n!)$?

Recall that $n! = \prod_{i=1}^{n} i$. If we multiply this out as $n(n-1)\cdots 1$, the largest term that appears is $n^n$. Thus, $n! = O(n^n)$ might be a good guess. In other words, we claim that there exist constants $c, n_0 > 0$ such that $n! \le c\,n^n$ for all $n \ge n_0$. Pick $c = 1$ and $n_0 = 1$. The claim now becomes $n! \le n^n$ for all integers $n \ge 1$. We prove this by induction on $n$.

• Base case: $n = 1$. It follows that $1! = 1 \le 1^1 = 1$.

• Step: $n > 1$. The induction hypothesis¹ is that $(n-1)! \le (n-1)^{n-1}$. We thus have

\begin{align}
n! &= n\,(n-1)! \tag{2} \\
   &\le n\,(n-1)^{n-1} \tag{3} \\
   &< n \cdot n^{n-1} \tag{4} \\
   &= n^n. \tag{5}
\end{align}

Note that (3) follows from the induction hypothesis.

Alternatively, we can use Stirling's approximation, which is defined as

$$n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^n. \tag{6}$$

Simple algebra yields

$$n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^n = \sqrt{2\pi}\,\frac{\sqrt{n}}{\exp(n)}\,n^n. \tag{7}$$

Using that $\sqrt{n} < \exp(n)$ for all $n > 0$, we obtain

$$\sqrt{2\pi}\,\frac{\sqrt{n}}{\exp(n)}\,n^n < \sqrt{2\pi}\,n^n = O(n^n). \tag{8}$$

We have that $n! = O(n^n)$, which can be rewritten as $O(2^{n \log n})$ (taking the logarithm base 2). Note that $O(2^n) \subset O(2^{n \log n})$.

1.2 What is $O(\log(n!))$?

Left as an exercise. Hint: use Stirling's approximation, or try to compute an upper bound directly.

1.3 What is $O(\binom{n}{k})$ where $k = O(1)$?

This expression arises when we have nested for loops. For instance, the running time of the pseudocode below is $O(\binom{n}{2})$.

    for i in {1, ..., n}
        for j in {i+1, ..., n}
            Constant time computation;

Recall that $\binom{n}{k} = \frac{n!}{(n-k)!\,k!}$. Thus, in the above case we have that $O(\binom{n}{2}) = O(n(n-1)/2) = O(n^2)$. Can we generalize this to an arbitrary constant $k$ (e.g. a $k$-nested for loop)?

$$\binom{n}{k} = \frac{n!}{(n-k)!\,k!} = \frac{1}{k!} \cdot \frac{n!}{(n-k)!} \tag{9}$$

¹ Do not forget to state the induction hypothesis!

Since $k = O(1)$, we have that $\frac{1}{k!} = O(1)$, yielding

$$\binom{n}{k} = O\!\left(\frac{n!}{(n-k)!}\right). \tag{10}$$

Observe that $n!/(n-k)! = n(n-1)\cdots(n-k+1)$. We can rewrite this as

\begin{align}
n(n-1)\cdots(n-k+1) &= n^k \cdot \frac{n}{n} \cdot \frac{n-1}{n} \cdots \frac{n-k+1}{n} \tag{11} \\
&= n^k \left(1 \cdot \left(1 - \frac{1}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right)\right). \tag{12}
\end{align}

Now for constant $k$, we have that $\lim_{n \to \infty} 1 \cdot \left(1 - \frac{1}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) = 1$. Hence, $\binom{n}{k} = O(n^k)$ for constant $k$.

1.4 What is $O(\binom{2n}{n})$?

What if $k = O(n)$? We have seen this before. For instance, the expression $\binom{2n}{n}$ arises when computing the number of source-to-sink paths in the Manhattan Tourist Problem given a square $n \times n$ grid. Can we simplify this expression?

Using that $\binom{n}{k} = \frac{n!}{(n-k)!\,k!}$, we have

$$\binom{2n}{n} = \frac{(2n)!}{n!\,n!} = \frac{(2n)!}{(n!)^2}. \tag{13}$$

We now use Stirling's approximation, yielding

\begin{align}
\frac{(2n)!}{(n!)^2} &\approx \frac{\sqrt{2\pi \cdot 2n}\left(\frac{2n}{e}\right)^{2n}}{\left(\sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}\right)^{2}} \tag{14} \\
&= \frac{\sqrt{2} \cdot \sqrt{2\pi n} \cdot (2n)^{2n} / e^{2n}}{2\pi n \cdot n^{2n} / e^{2n}} \tag{15} \\
&= \frac{\sqrt{2} \cdot 4^n \cdot n^{2n}}{\sqrt{2\pi n} \cdot n^{2n}} \tag{16} \\
&= \frac{4^n}{\sqrt{\pi n}}. \tag{17}
\end{align}

Thus, $\binom{2n}{n} = O(4^n / \sqrt{n})$.
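The bounds derived in Sections 1.1, 1.3 and 1.4 can be sanity-checked numerically for small $n$. The sketch below is only an illustration and not part of the original notes: the function names (`count_pair_iterations`, `stirling`, `central_binomial_estimate`) and the sample values of $n$ and $k$ are assumptions chosen for this example. It checks that $n! \le n^n$, that the doubly nested loop performs exactly $\binom{n}{2}$ iterations, that $\binom{n}{k} \le n^k$ for a fixed $k$, and it prints the ratio of $\binom{2n}{n}$ to the estimate $4^n/\sqrt{\pi n}$ from (17).

```python
import math


def count_pair_iterations(n: int) -> int:
    """Count iterations of the doubly nested loop from Section 1.3."""
    count = 0
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            count += 1  # stands in for the constant-time computation
    return count


def stirling(n: int) -> float:
    """Stirling's approximation sqrt(2*pi*n) * (n/e)^n, as in (6)."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


def central_binomial_estimate(n: int) -> float:
    """The estimate 4^n / sqrt(pi*n) for binom(2n, n), as in (17)."""
    return 4 ** n / math.sqrt(math.pi * n)


k = 3  # an arbitrary constant k, chosen only for illustration
for n in (1, 5, 10, 20):
    # Section 1.1: n! <= n^n (c = 1, n_0 = 1).
    assert math.factorial(n) <= n ** n
    # Section 1.3: the nested loop runs binom(n, 2) = n(n-1)/2 times.
    assert count_pair_iterations(n) == math.comb(n, 2) == n * (n - 1) // 2
    # Section 1.3: binom(n, k) <= n^k for constant k.
    assert math.comb(n, k) <= n ** k
    # Section 1.4: compare binom(2n, n) with 4^n / sqrt(pi*n).
    ratio = math.comb(2 * n, n) / central_binomial_estimate(n)
    print(n, ratio)
```

A finite check like this does not prove an asymptotic bound; it only confirms that the constants chosen in the proofs behave as claimed on the sampled inputs. `math.comb` requires Python 3.8 or later.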
