Codes and Chains




  1. Codes and Chains
     Paulo Orenstein
     Joint work with Juliana Freire
     Mathematics Department, PUC-Rio
     (Title animation: the correspondence [a b c d e f ... z] scrambled one letter swap at a time.)

  2-3. Some words -- or not?

  4-5. Correspondences: pairings of letters and ciphers, e.g. [a b d g ... h], [p e u l ... k], [m i t v ... r], ... There are 26! possible correspondences.

  6. All the correspondences

  7. But we want a single one...

  8-13. Counting letters
     Each vertex is a correspondence between letters and ciphers. The one we are looking for makes the text as similar to Portuguese as possible. Some correspondences are more plausible -- let’s quantify that. Counting the frequencies of letters is a good idea; counting the frequencies of pairs of letters is even better. As a reference text, Dom Casmurro, with 288,892 characters.
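The pair counting described above fits in a few lines; this is a minimal sketch (the example phrase is arbitrary), not the authors' code:

```python
from collections import Counter

def pair_counts(text):
    """Count how often each pair of consecutive letters occurs."""
    # Keep only letters, uppercased; spaces and punctuation are dropped,
    # so pairs may span word boundaries (an assumption of this sketch).
    letters = [ch for ch in text.upper() if ch.isalpha()]
    return Counter(a + b for a, b in zip(letters, letters[1:]))

counts = pair_counts("rasgada de passagem")
# 'AS' occurs twice here, in 'rASgada' and 'pASsagem'.
```

Running the same function over the full text of Dom Casmurro would produce the frequency table on the next slide.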

  14. What Dom Casmurro tells us
      Most frequent pairs:  AS (5511), RA (5374), OT (5186), ET (5019), DE (4902)
      Least frequent pairs: CJ, MG, PB (0); FG, XD, VC (1); WA, DC, HN (2); TN, LJ, BT (3); DJ, ZH (4)

  15. Imitating Portuguese
      The plausibility of a correspondence c is

          Pl(c) = Π over pairs of port(pair)^cod_c(pair),

      where port(pair) is the number of times the pair of letters appears in Dom Casmurro, and cod_c(pair) is the number of times the pair appears in the cipher text decoded with the correspondence c.
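This product is most easily computed in log scale; a minimal sketch, with a +1 smoothing that is my assumption (a pair with reference count zero would otherwise send the product to zero):

```python
import math
from collections import Counter

def log_plausibility(decoded_text, port_counts):
    """log Pl(c) = sum over pairs of cod_c(pair) * log(port(pair)).

    Logs prevent underflow for long texts; the +1 smoothing for pairs
    absent from the reference text is an assumption, not on the slide.
    """
    letters = [ch for ch in decoded_text.upper() if ch.isalpha()]
    cod = Counter(a + b for a, b in zip(letters, letters[1:]))
    return sum(n * math.log(port_counts.get(pair, 0) + 1)
               for pair, n in cod.items())

# Toy reference counts (the real ones come from Dom Casmurro).
port = Counter({"AS": 5511, "RA": 5374})
# A decoding full of frequent pairs scores higher than one whose
# pairs the reference text never contains.
```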

  16-18. What we have so far... (figure: correspondences annotated with plausibility values)

  19. Adjacent correspondences: two correspondences are adjacent when they differ by swapping two letters, e.g. [A B C D] and [A C B D].

  20. A graph of correspondences

  21. (figure: the graph of correspondences annotated with plausibility values)

  22-23. Our first Markov Chain
      A weather chain: each row gives today's weather, each column tomorrow's.
              Sun    Cloudy  Rain
      Sun     0.6    0.3     0.1
      Cloudy  0.35   0.35    0.3
      Rain    0.2    0.4     0.4
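The chain above can be simulated directly; a minimal NumPy sketch, iterating the distribution:

```python
import numpy as np

# Transition matrix from the slide: rows are today's weather
# (Sun, Cloudy, Rain), columns tomorrow's.
M = np.array([[0.60, 0.30, 0.10],
              [0.35, 0.35, 0.30],
              [0.20, 0.40, 0.40]])

# The distribution after n steps is pi0 @ M^n; iterate it.
pi = np.array([1.0, 0.0, 0.0])  # start from a sunny day
for _ in range(200):
    pi = pi @ M

# pi is now (numerically) the stationary distribution: it no longer
# changes when multiplied by M.
assert np.allclose(pi @ M, pi)
```

Starting from a rainy day instead gives the same limit, which is the convergence the later slides discuss.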

  24-30. Walking on the tetrahedron

  31. Stationary distribution: starting from any initial distribution, the chain converges to its stationary distribution.

  32. Stationary distribution (figure: a graph whose vertices carry stationary probabilities 2/16 and 3/16)

  33. Stationary distribution
      Indeed, this is the stationary distribution. With M the transition matrix of the random walk (each row spreads probability 1/2 or 1/3 uniformly over a vertex's neighbors) and π the vector of values 2/16 and 3/16 from the previous slide,

          π^T M = π^T.
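The identity above can be checked numerically. The slide's exact graph is not recoverable here, so the sketch below uses a hypothetical 7-vertex graph with the same degree pattern, five vertices of degree 2 and two of degree 3, for which the stationary probabilities are deg(v)/16:

```python
import numpy as np

# Hypothetical 7-vertex graph (not the slide's exact figure): a 7-cycle
# with one extra chord, so five vertices have degree 2, two have
# degree 3, and the total degree is 16.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 0), (2, 4)]
A = np.zeros((7, 7))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

deg = A.sum(axis=1)
M = A / deg[:, None]  # random walk: jump to a uniformly chosen neighbor

# For a random walk on a graph, pi(v) = deg(v) / (total degree),
# here deg(v) / 16 -- exactly the 2/16 and 3/16 values on the slide.
pi = deg / deg.sum()
assert np.allclose(pi @ M, pi)
```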

  34. Stationary distribution
      Theorem. Let M be a Markov chain on a finite state space X with stationary distribution π. If there exists n_0 such that M^n(x, y) > 0 for all n > n_0 and all x, y ∈ X, then M^n(x, y) → π(y) as n → ∞, for all x, y ∈ X.

  35-38. The key insight
      Walking around the graph, we want to find vertices of high plausibility. It would be good to have a Markov chain whose stationary distribution gives higher probability to more plausible vertices. Let’s construct a Markov chain that has as its stationary distribution precisely the (normalized) plausibility: the Metropolis-Hastings algorithm.

  39-41. Metropolis-Hastings
      Given a symmetric matrix M associated to a Markov chain and a vector P = [P(1) ... P(n)], let us find another chain M~ that has P as its stationary distribution:

          M~(x, y) = M(x, y)                 if x ≠ y and P(y) ≥ P(x),
          M~(x, y) = M(x, y) P(y) / P(x)     if x ≠ y and P(y) < P(x),
          M~(x, y) = M(x, y) + correction    if x = y,

      where the correction is such that the sum of the entries in each line of M~ is 1.
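The construction above can be written compactly using the acceptance factor min(1, P(y)/P(x)); a minimal sketch, not the authors' code:

```python
import numpy as np

def metropolis_hastings(M, P):
    """Build the chain of the slide: keep M(x, y) when P(y) >= P(x),
    shrink it by P(y)/P(x) otherwise, then fix each diagonal entry so
    the row sums to 1. P becomes the stationary distribution."""
    n = len(P)
    Mt = np.zeros_like(M, dtype=float)
    for x in range(n):
        for y in range(n):
            if x != y:
                Mt[x, y] = M[x, y] * min(1.0, P[y] / P[x])
        Mt[x, x] = 1.0 - Mt[x].sum()  # the "correction" term
    return Mt

# A symmetric base chain on 3 states and a target distribution
# with p1 > p3 > p2, as on the 3x3 slide.
M = np.full((3, 3), 1 / 3)
P = np.array([0.5, 0.2, 0.3])
Mt = metropolis_hastings(M, P)
assert np.allclose(P @ Mt, P)  # P is stationary for the new chain
```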

  42. Metropolis-Hastings, 3x3
      Let p = (p1 p2 p3), with p1 > p3 > p2. The matrix M~ indeed has p as its stationary distribution:

                       | α          M12 p2/p1   M13 p3/p1 |
          (p1 p2 p3)   | M12        β           M23       |  =  (p1 p2 p3),
                       | M13        M23 p2/p3   γ         |

      where α, β and γ are the diagonal corrections that make each row sum to 1.

  43. Decoding
      ‣ Pick a correspondence and calculate its plausibility.
      ‣ Randomly pick an adjacent correspondence by exchanging a pair of letters, and compare its plausibility with the previous one.
      ‣ If it improves, accept the candidate correspondence; otherwise, accept the change anyway with probability given by the plausibility ratio (usually very small).
      ‣ Repeat the last two steps several times.
      ‣ Read the ciphered text after making the prescribed substitutions.
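The steps above can be sketched as a Metropolis-Hastings loop. This is a minimal illustration, not the authors' code; the log-plausibilities and the +1 smoothing are assumptions of the sketch:

```python
import math
import random
from collections import Counter

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def log_pl(text, corr, port):
    """Log-plausibility of `text` decoded with correspondence `corr`.
    The +1 smoothing for unseen pairs is an assumption of this sketch."""
    decoded = [corr[ch] for ch in text if ch in corr]
    return sum(math.log(port.get(a + b, 0) + 1)
               for a, b in zip(decoded, decoded[1:]))

def decode(cipher_text, port, steps=10_000):
    """Metropolis-Hastings walk over correspondences, as on the slide."""
    corr = dict(zip(ALPHABET, ALPHABET))  # start from any correspondence
    score = log_pl(cipher_text, corr, port)
    for _ in range(steps):
        a, b = random.sample(ALPHABET, 2)  # adjacent correspondence:
        cand = dict(corr)                  # swap two letters
        cand[a], cand[b] = corr[b], corr[a]
        cand_score = log_pl(cipher_text, cand, port)
        # Accept improvements; otherwise accept with probability equal
        # to the plausibility ratio (usually very small).
        if cand_score > score or random.random() < math.exp(cand_score - score):
            corr, score = cand, cand_score
    return "".join(corr.get(ch, ch) for ch in cipher_text)
```

With `port` built from the pair counts of Dom Casmurro, the loop drifts toward correspondences whose decoded text looks like Portuguese.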

  44. It could all go wrong...
      - Is the text written from left to right?
      - English, Portuguese, Chinese?
      - Are there space characters?
      - Is there even any punctuation?
      - Uppercase, lowercase, accents?

  45-46. Cross your fingers... 1,692 characters!
