
Influence Identification on Independent Cascade Model - PowerPoint PPT Presentation

Influence Identification on Independent Cascade Model Part 1 Introduction Part 2 Related Work Part 3 Algorithm Part 4 Experiment Part 5 Conclusion Part 6 References


  1. Influence Identification on Independent Cascade Model

  2. Part 1 Introduction; Part 2 Related Work; Part 3 Algorithm; Part 4 Experiment; Part 5 Conclusion; Part 6 References

  3. • Modeled as a graph
     • Spread of information, ideas, and influence
     • Relationships and interactions
     • From several sources to a scope of vertices
     • Plays a vital role in information diffusion
     • Instrumental in privacy protection, etc.

  4. • Viral marketing is a business strategy that uses existing social networks to promote a product.
     • A fraction of customers are given free copies of a product, and the retailer wants to maximize the number of adoptions triggered by these trials.
     • If the first customers are regarded as virus carriers, viral marketing aims for the largest possible scale of infection.

  5. • Select a fixed number of nodes.
     • The problem also has important applications beyond social graphs, such as placing sensors in water distribution networks to detect contamination.

  6. • Ranks nodes directly according to degree
     • Refined adaptive version of HD
     • Only needs local information
     • Recalculates the degrees after each removal
     • Easy to implement
     • Represents the one-body scenario, where the influencers are considered in isolation, resulting in a lack of the collective influence effects from the neighborhood
     • Cannot deal with the circumstance in which hubs form a tight community, such that their spreading areas heavily overlap
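The two degree-based rankings described on this slide can be sketched as follows. This is a minimal illustration, not the presenters' code: the function names, the dict-of-sets graph representation, and the tie-break by smallest node id are all assumptions made for the example.

```python
def build_adj(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def high_degree(adj, k):
    """Static HD: rank nodes once by degree and take the top k."""
    return sorted(adj, key=lambda v: (-len(adj[v]), v))[:k]


def high_degree_adaptive(adj, k):
    """Adaptive HD: pick the max-degree node, remove it, recompute degrees, repeat."""
    # Work on a copy so the caller's graph is left untouched.
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    seeds = []
    for _ in range(min(k, len(remaining))):
        best = min(remaining, key=lambda v: (-len(remaining[v]), v))
        seeds.append(best)
        for nbr in remaining.pop(best):
            remaining[nbr].discard(best)
    return seeds


# Hypothetical toy graph: hub 0 and node 1 share neighbors; hub 5 sits elsewhere.
toy = build_adj([(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3),
                 (5, 6), (5, 7), (5, 8)])
static_seeds = high_degree(toy, 2)
adaptive_seeds = high_degree_adaptive(toy, 2)
```

On this toy graph the static ranking picks two nodes from the same tight cluster, while the adaptive version moves to the second hub once the first is removed, which illustrates the overlap issue mentioned on the slide.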

  7. k-shell:
     • Nodes are ranked based on their ks values
     • k-shell decomposition
     • Iteratively remove nodes
     • All the nodes are divided into different shells according to their relative locations in networks
     • Core nodes have higher probabilities to cause large-scale diffusions
     • Influence areas would heavily overlap under the circumstance of multiple nodes
     PageRank:
     • Once used by Google to rank websites
     • Calculation formula: [formula not preserved]
     • C(A) is defined as the number of links going out of page A
     • It outputs a probability distribution used to represent the likelihood that a person randomly clicking on links will arrive at any particular page
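The k-shell decomposition mentioned on this slide (iteratively removing nodes and assigning each a ks value) can be sketched as below. This is an illustrative stdlib-only version; the function name `k_shell` and the dict-of-sets input format are assumptions, not taken from the slides.

```python
def k_shell(adj):
    """Return {node: ks} by repeatedly peeling nodes whose degree is at most k."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Removing a node can push its neighbors down to degree <= k,
        # so keep peeling at the same k until no such node is left.
        peel = [v for v in remaining if degree[v] <= k]
        while peel:
            v = peel.pop()
            if v not in remaining:
                continue  # already peeled via a duplicate entry
            remaining.discard(v)
            shell[v] = k
            for nbr in adj[v]:
                if nbr in remaining:
                    degree[nbr] -= 1
                    if degree[nbr] <= k:
                        peel.append(nbr)
        k += 1
    return shell


# Hypothetical example: a triangle (0, 1, 2) with a two-node tail 0-3-4.
example = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}}
ks_values = k_shell(example)
```

Nodes in the triangle end up in the innermost shell, while the tail nodes are peeled first and receive the lower ks value.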

  8. • The idea is to measure the influential power of each node based on its local topological structure. The seeds can then be selected greedily by choosing the node with the highest influential power.
     • This concept was originally proposed for solving the optimal percolation problem, that is, disconnecting the network by removing as few nodes as possible.
     • Define $\nu_{i \to j}$ as the probability that node $i$ belongs to the giant component in $G \setminus j$.

  9. • Clearly, all $\nu = 0$ is a solution. But to be attainable by iteration, it must be stable.
     • Define $\hat{M}$ as the Jacobian matrix at the point where all $\nu = 0$. The solution is stable if and only if the leading eigenvalue of $\hat{M}$ is less than 1.

  10. • The leading eigenvalue (spectral radius) can be computed by the power method.
      • The cost function can be simplified into the following form. [formula not preserved]
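As a rough illustration of the power method named on this slide, the sketch below estimates the largest-magnitude eigenvalue of a small matrix and checks it against the stability threshold of 1. The function name and the 2x2 matrix are hypothetical, and the sketch assumes the matrix has a single dominant eigenvalue so that the iteration converges.

```python
import math


def leading_eigenvalue(matrix, iterations=500, tol=1e-12):
    """Estimate the largest-magnitude eigenvalue of a square matrix by power iteration."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n  # unit-length starting vector
    estimate = 0.0
    for _ in range(iterations):
        product = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            return 0.0  # the current vector was mapped to zero
        vec = [x / norm for x in product]
        # For a unit vector close to the dominant eigenvector, |M v| approaches |lambda|.
        if abs(norm - estimate) < tol:
            return norm
        estimate = norm
    return estimate


jacobian = [[0.5, 0.2], [0.1, 0.3]]  # hypothetical matrix, for illustration only
lam = leading_eigenvalue(jacobian)
is_stable = lam < 1.0  # the zero solution is stable when the leading eigenvalue is below 1
```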

  11. • The cost function can be written as the sum of the collective influence of each node.
      • To make the zero solution stable, we can iteratively remove the node with the maximum CI value until the leading eigenvalue $\lambda < 1$.
      • $\ell$ is proportional to the order of the power iteration. Higher values of $\ell$ are more accurate but harder to compute.
      • Because CI depends on the status of neighboring nodes, it should be recomputed in each iteration.
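The adaptive removal described on this slide can be sketched as below. The slide's own formula is not preserved, so this uses the standard collective influence definition from the optimal percolation literature, CI(i) = (k_i - 1) times the sum of (k_j - 1) over nodes j at distance exactly `radius` from i; treating that as the slide's formula is an assumption. For brevity the sketch removes a fixed number of nodes instead of testing the eigenvalue condition, and all function names are hypothetical.

```python
def ball_frontier(adj, source, radius):
    """Nodes at shortest-path distance exactly `radius` from `source`."""
    seen = {source}
    frontier = {source}
    for _ in range(radius):
        nxt = set()
        for v in frontier:
            for nbr in adj[v]:
                if nbr not in seen:
                    seen.add(nbr)
                    nxt.add(nbr)
        frontier = nxt
    return frontier


def collective_influence(adj, node, radius):
    """CI(node) = (k_node - 1) * sum of (k_j - 1) over the ball frontier."""
    reduced_degree = len(adj[node]) - 1
    return reduced_degree * sum(len(adj[j]) - 1
                                for j in ball_frontier(adj, node, radius))


def ci_ranking(adj, count, radius=2):
    """Repeatedly remove the max-CI node, recomputing CI after each removal."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    chosen = []
    for _ in range(min(count, len(remaining))):
        # sorted() makes ties resolve to the smallest node id.
        best = max(sorted(remaining),
                   key=lambda v: collective_influence(remaining, v, radius))
        chosen.append(best)
        for nbr in remaining.pop(best):
            remaining[nbr].discard(best)
    return chosen


# Hypothetical toy graph: two hubs (0 and 3) joined by an edge, plus a short tail.
toy_graph = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4, 5}, 4: {3}, 5: {3, 6}, 6: {5}}
top_two = ci_ranking(toy_graph, 2, radius=1)
```

Recomputing CI on the reduced graph in every round is what makes the procedure adaptive; a larger `radius` looks further out but makes each evaluation more expensive.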

  12. • Collective influence was not designed for influence maximization, but it gives a way to measure the influential power of each node.
      • Our work is to derive the formulas of collective influence for the independent cascade model.
      • The problem is complicated because there are now two types of interactions: transmission of infection and the giant component.
      • Define $x_{ij}$ as the probability that node $i$ is infected in $G \setminus j$. Given the seed set, this can be calculated by iteration.
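One way the "calculated by iteration" step could look is sketched below. The update rule is an assumption, since the slides' equations are not preserved: it is the usual tree-like (cavity) approximation, in which a non-seed node stays uninfected only if none of its other neighbors infects it, with a single hypothetical transmission probability `p` on every edge. The function names are illustrative only.

```python
def infection_messages(adj, seeds, p, sweeps=50):
    """Iterate x[(i, j)]: probability that i is infected when neighbor j is removed."""
    x = {(i, j): (1.0 if i in seeds else 0.0) for i in adj for j in adj[i]}
    for _ in range(sweeps):
        updated = {}
        for (i, j) in x:
            if i in seeds:
                updated[(i, j)] = 1.0  # seeds are infected by definition
                continue
            not_infected = 1.0
            for k in adj[i]:
                if k != j:
                    not_infected *= 1.0 - p * x[(k, i)]
            updated[(i, j)] = 1.0 - not_infected
        x = updated
    return x


def infection_marginals(adj, seeds, p, sweeps=50):
    """Per-node infection probability, combining messages from all neighbors."""
    x = infection_messages(adj, seeds, p, sweeps)
    marginals = {}
    for i in adj:
        if i in seeds:
            marginals[i] = 1.0
            continue
        not_infected = 1.0
        for k in adj[i]:
            not_infected *= 1.0 - p * x[(k, i)]
        marginals[i] = 1.0 - not_infected
    return marginals


# Hypothetical check on a path 0-1-2 seeded at node 0 with p = 0.5.
path = {0: {1}, 1: {0, 2}, 2: {1}}
probs = infection_marginals(path, seeds={0}, p=0.5)
```

On a path the result matches the direct calculation (0.5 for the neighbor of the seed, 0.25 two steps away); on graphs with many short loops the same iteration is only an approximation.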

  13. • We are interested in the appearance of giant components among infected nodes. An outbreak of infection occurs when most infected nodes are connected to each other.
      • Now we define $\nu_{ij}$ as the probability that node $i$ is infected and belongs to the giant component without $j$.
      • To list the equations, three more symbols need to be defined:
      • $\alpha_{ij}$ and $\beta_{ij}$, the conditional probabilities of the event in $\nu_{ij}$ given the occurrence or non-occurrence of the event in $x_{ij}$. These two variables are independent for each edge $(i, j)$.
      • $\gamma_{ij}$, the conditional probability of the event in $\nu_{ij}$ given that node $j$ is not successfully infected by $i$.

  14. Now we can write the relationships between these variables, first for one pair of the probabilities defined above and then for the other pair. [equations not preserved]

  16. Then we can take partial derivatives to write the Jacobian matrix.

  17. Our formula for collective influence can then be written in a form similar to the one in [1].

  18. Finally, our algorithm for seed selection.

  19. • Language and module: Python 2.7, networkx 1.11
      • Network: a random graph that follows a power law
      • Node state: state = 1 if infected; otherwise state = 0
      • Spread range: [formula not preserved]
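The setup on this slide can be approximated without networkx as follows. This is a stdlib-only stand-in: the slides do not say which generator was used, so degree-proportional (preferential) attachment with one edge per new node is an assumption, chosen because it gives a power-law-like tree with n - 1 edges. The function name is hypothetical.

```python
import random


def preferential_attachment_tree(n, seed=None):
    """Grow a tree in which each new node attaches to one existing node,
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}
    # Every node appears in `targets` once per incident edge, so a uniform
    # draw from the list is a degree-proportional draw over nodes.
    targets = [0, 1]
    for new in range(2, n):
        old = rng.choice(targets)
        adj[new] = {old}
        adj[old].add(new)
        targets.extend([new, old])
    return adj


graph = preferential_attachment_tree(3000, seed=42)
edge_count = sum(len(nbrs) for nbrs in graph.values()) // 2
average_degree = 2.0 * edge_count / len(graph)

# Node state as described on the slide: 1 if infected, 0 otherwise.
state = {node: 0 for node in graph}
```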

  20. Essentially, graph traversal
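A spread simulation written as a graph traversal might look like the sketch below. It assumes the independent cascade rule with one uniform transmission probability `p`, and it assumes that the spread range is simply the number of infected nodes at the end; the slides do not preserve the exact definition, and the function names are hypothetical.

```python
import random
from collections import deque


def simulate_ic(adj, seeds, p, rng):
    """One independent cascade run as a breadth-first traversal.
    Returns {node: state} with state 1 for infected and 0 otherwise."""
    state = {v: 0 for v in adj}
    queue = deque()
    for s in seeds:
        if state[s] == 0:
            state[s] = 1
            queue.append(s)
    while queue:
        v = queue.popleft()
        for nbr in adj[v]:
            # Each infected node gets exactly one attempt per uninfected neighbor.
            if state[nbr] == 0 and rng.random() < p:
                state[nbr] = 1
                queue.append(nbr)
    return state


def spread_range(state):
    """Assumed definition: the number of infected nodes."""
    return sum(state.values())


# Hypothetical example on a four-node path.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
rng = random.Random(0)
full_spread = spread_range(simulate_ic(path_graph, [0], 1.0, rng))
no_spread = spread_range(simulate_ic(path_graph, [0], 0.0, rng))
```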

  21. • Sampled graph for fair comparison
      • Unfair circumstance in random information spread
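One reading of "sampled graph for fair comparison" is that every candidate seed set is evaluated on the same pre-sampled graphs, so that differences in spread are not due to different random draws. The sketch below follows that reading, which is an assumption: each undirected edge is kept with probability `p`, and the spread of a seed set is the number of nodes reachable from it in the sampled graph. All names are hypothetical, and node ids are assumed to be comparable (for example integers).

```python
import random


def sample_live_edges(adj, p, rng):
    """Keep each undirected edge independently with probability p."""
    live = {v: set() for v in adj}
    for u in adj:
        for v in adj[u]:
            if u < v and rng.random() < p:  # visit each undirected edge once
                live[u].add(v)
                live[v].add(u)
    return live


def reachable(live, seeds):
    """All nodes reachable from the seed set in the sampled graph."""
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        v = stack.pop()
        for nbr in live[v]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return seen


def compare_on_shared_samples(adj, candidates, p, samples=200, seed=0):
    """Average spread of each candidate seed set over the same sampled graphs."""
    rng = random.Random(seed)
    totals = {name: 0 for name in candidates}
    for _ in range(samples):
        live = sample_live_edges(adj, p, rng)
        for name, seeds in candidates.items():
            totals[name] += len(reachable(live, seeds))
    return {name: total / float(samples) for name, total in totals.items()}


# Hypothetical example: a star with hub 0, comparing the hub against one leaf.
star = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
averages = compare_on_shared_samples(star, {"hub": [0], "leaf": [1]}, p=0.5)
```

Because both seed sets see identical samples, the hub can never do worse than the leaf on any single sample in this example, which is the kind of like-for-like comparison the slide appears to aim for.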

  22. • 3000 nodes, 2999 edges, average degree 2
      • Our algorithm achieves the highest spread range at a very small q ratio (1-100 seed nodes).
      • HD, PageRank, and CI perform close to each other, but all slightly worse than our algorithm.
      • K-core performs worst because it destroys the cluster structure of the power-law graph.

  23. • Cannot outperform the other heuristic algorithms.
      • May perform better in a more complicated spreading model, such as the linear threshold model.
