
Reconstructing Patterns of Information Diffusion from Incomplete Observations

Flavio Chierichetti (Sapienza University), Jon Kleinberg (Cornell University), David Liben-Nowell (Carleton College)

Internet Activism: a very important phenomenon.


  1. Iraq Chain Letter

     Dear all:

     The US Congress has just authorized the President of the US to go to war against Iraq. The UN is gathering signatures in an effort to avoid this tragic world event.

     Please consider this an urgent request: UN Petition for Peace - Stand for Peace. Islam is not the Enemy. War is NOT the Answer.

     Today we are at a point of imbalance in the world and are moving toward what may be the beginning of a THIRD WORLD WAR.

     Please COPY (rather than Forward) this e-mail in a new message, sign at the end of the list, and send it to all the people whom you know.

     If you receive this list with more than 500 names signed, please send a copy of the message to:
     usa@un.int
     president@whitehouse.gov

     Even if you decide not to sign, please consider forwarding the petition on instead of deleting it.

     1) Suzanne Dathe, Grenoble, France
     2) Laurence COMPARAT, Grenoble, France
     3) Philippe MOTTE, Grenoble, France
     4) Jok FERRAND, Mont St. Martin, France
     5) Emmanuelle PIGNOL, St Martin d'Heres, FRANCE
     6) Marie GAUTHIER, Grenoble, FRANCE
     7) Laurent VESCALO, Grenoble, FRANCE
     8) Mathieu MOY, St Egreve, FRANCE
     9) Bernard BLANCHET, Mont St Martin,FRANCE
     10) Tassadite FAVRIE, Grenoble, FRANCE
     11) Loic GODARD, St Ismier, FRANCE
     12) Benedicte PASCAL, Grenoble, FRANCE
     13) Khedaidja BENATIA, Grenoble, FRANCE
     14) Marie-Therese LLORET, Grenoble,FRANCE
     15) Benoit THEAU, Poitiers, FRANCE
     16) Bruno CONSTANTIN, Poitiers, FRANCE
     17) Christian COGNARD, Poitiers, FRANCE
     18) Robert GARDETTE, Paris, FRANCE
     19) Claude CHEVILLARD, Montpellier, FRANCE
     20) Gilles FREISS, Montpellier, FRANCE
     21) Patrick AUGEREAU, Montpellier, FRANCE
     22) Jean IMBER! T, Marseille, FRANCE
     23) Jean-Claude MURAT, Toulouse, France
     24) Anna BASSOLS, Barcelona, Catalonia
     25) Mireia DUNACH, Barcelona, Catalonia
     26) Michel VILLAZ, Grenoble, France
     27) Pages Frederique, Dijon, France
     28) Rodolphe FISCHMEISTER,Chatenay-Malabry, France
     29) Francois BOUTEAU, Paris, France
     30) Patrick PETER, Paris, France
     31) Lorenza RADICI, Paris, France
     32) Monika Siegenthaler, Bern, Switzerland
     33) Mark Philp,Glasgow,Scotland
     34) Tomas Andersson, Stockholm, Sweden
     35) Jonas Eriksson, Stockholm, Sweden
     36) Karin Eriksson, Stockholm, Sweden
     ...

  3. IRAQ revealed tree (Liben-Nowell, Kleinberg, PNAS’08)
     • 18,119 nodes
     • 17,079 nodes with one child (94%)
     • 620 exposed nodes
     • 557 (exposed) leaves
     Why is this fraction so high? What can we infer about the original, unknown Chain Letter Tree?

  8. Tree-Revealing Process (Liben-Nowell, Kleinberg, PNAS’08)
     [Figure: example tree on the nodes Aaron, Betty, Charles, David, Earl, Fran, George, Hilary]
     • Each node is exposed independently with prob. δ > 0.
     • Ancestors of exposed nodes are revealed.
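The two rules of the tree-revealing process can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `reveal`, the parent-map representation, and the parent links of the example tree are assumptions (the slides give only the node names, not the edges).

```python
import random

def reveal(parent, delta, rng):
    """Simulate the tree-revealing process on a rooted tree.

    parent maps each node to its parent (the root maps to None).
    Each node is exposed independently with probability delta; a node is
    revealed if it is exposed or is an ancestor of an exposed node.
    Returns the pair (exposed, revealed) as sets of nodes.
    """
    exposed = {v for v in parent if rng.random() < delta}
    revealed = set()
    for v in exposed:
        # Walk up towards the root; stop once we hit an already revealed path.
        while v is not None and v not in revealed:
            revealed.add(v)
            v = parent[v]
    return exposed, revealed

# Hypothetical example tree: names from the slides, edges invented.
parent = {
    "Aaron": None, "Betty": "Aaron", "Charles": "Aaron", "David": "Betty",
    "Earl": "David", "Fran": "David", "George": "Charles", "Hilary": "George",
}
exposed, revealed = reveal(parent, delta=0.3, rng=random.Random(0))
print(sorted(exposed), sorted(revealed))
```

By construction every exposed node is revealed, and the revealed set is closed under taking parents, so it always forms a subtree containing the root whenever at least one node is exposed.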

  15. Previous Work
     • Golub, Jackson, PNAS’10 perform simulations, using branching process trees near the critical threshold as the Chain Letter Trees, and exposing nodes as in Liben-Nowell, Kleinberg, PNAS’08.
     • They observe that the revealed tree has a high fraction of nodes with only one child (and some other properties).

  16. Our Contribution
     • Our 1st result, informally, states that the tree-revealing process is enough to explain the high fraction of single-child nodes, assuming only a degree bound on the unknown chain letter tree.

  18. Revealed vs. Unknown
     We see a “revealed” tree... we would like to study the “unknown” tree!
     [Figure: the revealed tree (Aaron, Betty, David, Earl, George, Hilary) next to the unknown tree (Aaron, Betty, Charles, David, Earl, Fran, George, Hilary, Ian, Jason, Kurt, Larry)]
     Size? Width? Height? Degree Distribution? ...

  22. Our Contribution
     • Our 2nd result, informally, states that (under reasonable assumptions) it is possible to estimate the size of the unknown chain letter tree with a small error, with high probability.
     • Observe that we do not know the exposing probability δ.
     • We use this theorem to estimate that ~173k people signed the IRAQ chain letter. This estimate is backed by a probability bound (on the probability space induced by the revealing process).
     • The chain letter generated ~3.5M emails.

  26. Single-Child Fraction
     • Nodes are exposed with probability δ > 0.
     • We assume that the unknown tree’s maximum degree is at most k.

  27. Single-Child Fraction
     We partition the tree into subforests, in such a way that each subforest $F$ has $\simeq \delta^{-1}$ nodes and the median height in the subforest is $\Omega(\log_{k-1} \delta^{-1})$.
     • [Figure: a subforest $F$ split at its median height into an upper and a lower half of $\delta^{-1}/2$ nodes each]
     • $\Pr[\text{some node is exposed in } F\text{’s lower half}] = \Theta(1)$
     • If this happens, $\Omega(\log_{k-1} \delta^{-1})$ nodes will be revealed in $F$.
     • # of forests $\simeq n \cdot \delta$
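The $\Theta(1)$ bound on the exposure probability can be checked with a short calculation. It assumes, as a simplification not spelled out on the slides, that the lower half of $F$ contains exactly $\delta^{-1}/2$ nodes (rounding ignored) and uses the independence of exposures:

```latex
\Pr[\text{some node is exposed in } F\text{'s lower half}]
  = 1 - (1-\delta)^{\delta^{-1}/2}
  \ge 1 - e^{-1/2} \approx 0.39,
\qquad
\Pr[\text{some node is exposed in } F\text{'s lower half}]
  \le \delta \cdot \frac{\delta^{-1}}{2} = \frac{1}{2}.
```

The lower bound uses $1-\delta \le e^{-\delta}$ and the upper bound is a union bound, so under this assumption the probability lies between two constants that do not depend on $\delta$.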

  40. Single-Child Fraction
     • $\Pr[\Omega(n \cdot \delta \cdot \log_{k-1} \delta^{-1}) \text{ nodes will be revealed}] = 1 - o(1)$
     • $\Pr[\text{at most } 2 \cdot n \cdot \delta \text{ nodes will be exposed}] = 1 - o(1)$
     • Each leaf in the revealed tree is an exposed node.
     • $\Pr[\text{the revealed tree will have at most } 2 \cdot n \cdot \delta \text{ leaves}] = 1 - o(1)$
     • In an arbitrary tree, the number of internal nodes with more than one child is upper-bounded by the number of leaves.
     • $\Pr[\text{the revealed tree has} \le 4 n \delta \text{ non-single-child nodes}] = 1 - o(1)$
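The step from "at most 2nδ leaves" to "at most 4nδ non-single-child nodes" rests on the counting fact about arbitrary trees: the non-single-child nodes are the leaves plus the nodes with more than one child, and the latter are no more numerous than the former. A minimal sketch that counts both quantities on a small tree (the function name and the example tree are hypothetical):

```python
from collections import Counter

def leaf_and_branch_counts(parent):
    """Return (#leaves, #nodes with more than one child) of a rooted tree.

    parent maps each node to its parent (the root maps to None).
    """
    # Counter gives 0 for nodes that never appear as a parent, i.e. leaves.
    children = Counter(p for p in parent.values() if p is not None)
    leaves = sum(1 for v in parent if children[v] == 0)
    branching = sum(1 for v in parent if children[v] > 1)
    return leaves, branching

# Hypothetical tree: 0 has children 1 and 2; 1 has children 3 and 4; 2 has child 5.
tree = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
leaves, branching = leaf_and_branch_counts(tree)
print(leaves, branching)  # 3 2
# Non-single-child nodes = leaves + branching nodes <= 2 * leaves.
assert branching <= leaves
```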

  47. Single-Child Fraction
     • $\Pr[\Omega(n \cdot \delta \cdot \log_{k-1} \delta^{-1}) \text{ nodes will be revealed}] = 1 - o(1)$
     • $\Pr[\text{the revealed tree has} \le 4 n \delta \text{ non-single-child nodes}] = 1 - o(1)$
     • The revealed tree therefore has $\Omega(n \cdot \delta \cdot \log_{k-1} \delta^{-1})$ nodes, of which only $O(n \cdot \delta)$ are non-single-child: a fraction $O\!\left(\frac{1}{\log_{k-1} \delta^{-1}}\right)$ of the set.
     • $\Pr\!\left[\text{the fraction of single-child nodes in the revealed tree is} \ge 1 - O\!\left(\frac{1}{\log_{k-1} \delta^{-1}}\right)\right] = 1 - o(1)$
     The high single-child fraction can be explained by assuming just a degree bound on the unknown tree.
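The conclusion can be explored empirically with a small simulation. This is a sketch under stated assumptions: the random tree model below (each new node attaches to a uniformly random node that still has fewer than `k` children) is invented for illustration and is not the model used in the talk, and `single_child_fraction` is a hypothetical helper name.

```python
import random
from collections import Counter

def single_child_fraction(n, k, delta, seed=0):
    """Fraction of single-child nodes in the revealed tree, or None if
    no node is exposed. The unknown tree has n nodes and at most k
    children per node (illustrative random model, see lead-in)."""
    rng = random.Random(seed)
    parent = [None]      # node 0 is the root
    child_count = [0]
    open_nodes = [0]     # nodes that can still accept a child
    for v in range(1, n):
        i = rng.randrange(len(open_nodes))
        p = open_nodes[i]
        parent.append(p)
        child_count.append(0)
        child_count[p] += 1
        if child_count[p] == k:
            # p is full: remove it from the pool (swap with last, then pop).
            open_nodes[i] = open_nodes[-1]
            open_nodes.pop()
        open_nodes.append(v)

    # Expose each node with probability delta and reveal its ancestors.
    revealed = set()
    for v in range(n):
        if rng.random() < delta:
            while v is not None and v not in revealed:
                revealed.add(v)
                v = parent[v]
    if not revealed:
        return None

    # Count, for each revealed node, how many of its children are revealed.
    kids = Counter(parent[v] for v in revealed if parent[v] is not None)
    return sum(1 for v in revealed if kids[v] == 1) / len(revealed)

for delta in (0.1, 0.01, 0.001):
    print(delta, single_child_fraction(200_000, 3, delta))
```

Under the theorem one would expect the printed fraction to grow as δ shrinks, but the exact values depend on the assumed tree model and the seed.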

  55. Number of Signers
     [Figure: the revealed tree (Aaron, Betty, David, Earl, George, Hilary) next to the unknown tree (Aaron, Betty, Charles, David, Earl, Fran, George, Hilary, Ian, Jason, Kurt, Larry)]
     How to guess the size of the unknown tree?

  56. Unknown Tree Exposure
     [Figure: the unknown tree, with nodes marked “?”; the number of marked nodes shrinks over the animation]

  66. Revealed Tree
     [Figure: the revealed tree, with 10 nodes marked “?”]
     Node exposures are IID here!

  68. Size Estimation
     Node exposures are IID here!
       1. Estimate $\delta$: $\delta \simeq \frac{3}{10}$
       2. Estimate $n \cdot \delta$ using the number of exposed nodes in the revealed tree: $n \cdot \delta \simeq 7$
       3. Take the ratio: $n \simeq 23.\overline{3}$
     What can go wrong? The “yellow area” could contain too few nodes for the estimation of $\delta$ to be successful.
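The three steps can be written out as a short function, using the numbers from the slide. This is a sketch, not the authors' implementation: the function and parameter names are hypothetical, and it assumes that δ ≃ 3/10 is the ratio of exposed nodes to all nodes inside the highlighted area where exposures are IID.

```python
def estimate_size(exposed_in_area, nodes_in_area, exposed_total):
    """Estimate the size n of the unknown tree.

    exposed_in_area / nodes_in_area estimates delta on the region where
    exposures are IID (assumed reading of the slide); exposed_total is
    the number of exposed nodes in the revealed tree, which estimates
    n * delta. Returns (delta_hat, n_hat).
    """
    if nodes_in_area == 0 or exposed_in_area == 0:
        # Too few observations: the failure mode the slide warns about.
        raise ValueError("cannot estimate delta from an empty sample")
    delta_hat = exposed_in_area / nodes_in_area  # step 1
    n_hat = exposed_total / delta_hat            # steps 2 and 3
    return delta_hat, n_hat

delta_hat, n_hat = estimate_size(exposed_in_area=3, nodes_in_area=10,
                                 exposed_total=7)
print(delta_hat, round(n_hat, 2))
```

With the slide's counts this gives δ ≃ 0.3 and n of roughly 23.3. The explicit error on an empty sample is a design choice of this sketch; it mirrors the case where the highlighted area holds too few nodes for the estimate of δ to be meaningful.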

  78. Theorem
     • The previous algorithm can guess the size with high probability if $n > \tilde{\Omega}\left(\max\left(\delta^{-2},\ \delta^{-1} \cdot k\right)\right)$, where $k$ is the maximum number of children in the unknown tree and $\delta$ is the exposing probability.
     • No algorithm can do it otherwise.
