Influence Identification on Independent Cascade Model


Influence Identification on Independent Cascade Model Part 1 Introduction Part 2 Related Work Part 3 Algorithm Part 4 Experiment Part 5 Conclusion Part 6 References


slide-1
SLIDE 1

Influence Identification on Independent Cascade Model

slide-2
SLIDE 2
  • Part 1 Introduction
  • Part 2 Related Work
  • Part 3 Algorithm
  • Part 4 Experiment
  • Part 5 Conclusion
  • Part 6 References
slide-3
SLIDE 3
slide-4
SLIDE 4
  • Modeled as a graph
  • Relationships and interactions
  • Plays a vital role in information diffusion
  • Spread of information, ideas, and influence
  • From several sources to a scope of vertices
  • Instrumental in privacy protection, etc.
slide-5
SLIDE 5
  • Viral marketing is a business strategy that uses existing social networks to promote a product.
  • A fraction of customers are given free copies of a product, and the retailer wants to maximize the number of adoptions triggered by these trials.
  • Treating the first customers as virus carriers, viral marketing aims for the largest possible scale of infection.
slide-6
SLIDE 6
  • Select a fixed number of nodes
  • The problem also has important applications beyond social graphs, such as placing sensors in water distribution networks to detect contamination.
slide-7
SLIDE 7
slide-8
SLIDE 8
  • HD (high degree):
  • Ranks nodes directly according to degree
  • Only needs local information
  • Easy to implement
  • Cannot handle the case where hubs form a tight community, so that their spreading areas overlap heavily
  • Adaptive HD:
  • Refined adaptive version of HD
  • Recalculates the degrees after each removal
  • Represents the one-body scenario where the influencers are considered in isolation, so the collective influence effects from the neighborhood are missing
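The difference between the two degree heuristics on this slide can be sketched in a few lines. This is a minimal illustration, not the presenters' code: the function names, the adjacency-dict graph format, the tie-breaking by node id, and the toy graph are all assumptions made for the example.

```python
def high_degree(adj, k):
    """Static HD: take the k largest-degree nodes of the original graph."""
    ranked = sorted(adj, key=lambda v: (-len(adj[v]), v))  # ties: smaller id first
    return ranked[:k]


def high_degree_adaptive(adj, k):
    """Adaptive HD: remove the current top-degree node, then recompute degrees."""
    live = {v: set(nbrs) for v, nbrs in adj.items()}  # working copy of the graph
    picked = []
    for _ in range(min(k, len(live))):
        best = min(live, key=lambda v: (-len(live[v]), v))
        picked.append(best)
        for u in live.pop(best):  # detach the removed node from its neighbors
            live[u].discard(best)
    return picked


# Toy graph (hypothetical): hub 0 is adjacent to hub 5, while hub 7 sits elsewhere.
toy = {
    0: [1, 2, 3, 5], 1: [0], 2: [0], 3: [0],
    4: [5], 5: [0, 4, 6], 6: [5],
    7: [8, 9, 10], 8: [7], 9: [7], 10: [7],
}
static_pick = high_degree(toy, 2)
adaptive_pick = high_degree_adaptive(toy, 2)
```

On this toy graph the static ranking picks the two adjacent hubs, whereas the adaptive version lowers the degree of node 5 once node 0 is removed and moves on to node 7, which is the overlap problem the slide describes.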
slide-9
SLIDE 9
  • K-shell (k-core):
  • Nodes are ranked based on their ks values
  • k-shell decomposition
  • Iteratively remove nodes
  • All nodes are divided into different shells according to their relative locations in the network
  • Core nodes have higher probabilities of causing large-scale diffusion
  • Influence areas overlap heavily when multiple nodes are selected
  • PageRank:
  • Once used by Google to rank websites
  • Calculation formula: C(A) is defined as the number of links going out of page A
  • It outputs a probability distribution that represents the likelihood that a person randomly clicking on links will arrive at any particular page
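The PageRank formula itself is not preserved in this transcript. A commonly quoted normalized form that uses the same C(A) notation is PR(A) = (1 - d)/N + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), where T1..Tn are the pages linking to A. The sketch below iterates that form; the damping factor d = 0.85, the handling of pages without out-links, and the three-page example are assumptions, not details from the slides.

```python
def pagerank(out_links, d=0.85, iters=100):
    """Iterate PR(A) = (1 - d)/N + d * sum(PR(T) / C(T)) over pages T linking to A.

    `out_links` maps every page to the list of pages it links to; every link
    target must also appear as a key.
    """
    pages = list(out_links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}  # start from the uniform distribution
    for _ in range(iters):
        # Rank held by pages with no out-links is spread evenly (a common convention).
        dangling = sum(pr[p] for p in pages if not out_links[p])
        nxt = {p: (1.0 - d) / n + d * dangling / n for p in pages}
        for t in pages:
            c_t = len(out_links[t])  # C(T): number of links going out of page T
            for a in out_links[t]:
                nxt[a] += d * pr[t] / c_t
        pr = nxt
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
```

Because each update redistributes exactly the rank mass it takes, the values keep summing to 1, which matches the probability-distribution reading on the slide.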
slide-10
SLIDE 10
  • The idea is to measure the influential power of each node based on its local topological structure.
  • The seeds can then be selected greedily, each time choosing the node with the highest influential power.
  • This concept was originally proposed for the optimal percolation problem, that is, disconnecting the network by removing as few nodes as possible.
  • Define $\nu_{i \to j}$ as the probability that node $i$ belongs to the giant component in $G \setminus j$.
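For reference, the self-consistent equation that [1] writes for this probability is reproduced below. Here $\nu_{i \to j}$ denotes the probability just defined, $\partial i \setminus j$ is the set of neighbors of $i$ other than $j$, and the occupation variable $n_i$ ($n_i = 0$ if node $i$ is removed, $n_i = 1$ otherwise) is notation taken from [1], not from this slide.

```latex
\nu_{i \to j} \;=\; n_i \left[ 1 - \prod_{k \in \partial i \setminus j} \bigl( 1 - \nu_{k \to i} \bigr) \right]
```

Setting every $\nu_{k \to i} = 0$ makes the product equal to 1, so the right-hand side vanishes and the all-zero assignment satisfies the equation.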
slide-11
SLIDE 11
  • Clearly $\nu_{i \to j} = 0$ for all $i \to j$ is a solution. But to be attainable by iteration, it must be stable.
  • Define $\hat{M}$ as the Jacobian matrix at the point where all $\nu_{i \to j} = 0$. The solution is stable if and only if the leading eigenvalue of $\hat{M}$ is less than 1.
slide-12
SLIDE 12
  • The leading eigenvalue (spectral radius) can be computed by the power method.
  • The cost function can be simplified into the following form.
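The power method mentioned on this slide can be sketched as follows. This is a generic illustration on a small dense matrix, not the computation from the slides: the function name, the list-of-lists matrix format, and the fixed iteration count are assumptions, and the sketch presumes the matrix has a single dominant eigenvalue that the start vector is not orthogonal to.

```python
import math


def leading_eigenvalue(matrix, iters=200):
    """Estimate the magnitude of the dominant eigenvalue by power iteration."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n  # unit-length start vector
    estimate = 0.0
    for _ in range(iters):
        # Multiply the current unit vector by the matrix.
        nxt = [sum(matrix[r][c] * vec[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0  # the vector was mapped to zero
        estimate = norm  # growth factor of a unit vector
        vec = [x / norm for x in nxt]
    return estimate


example = leading_eigenvalue([[2.0, 1.0], [1.0, 2.0]])  # eigenvalues are 3 and 1
```

Each multiplication amplifies the dominant direction relative to the others, so the growth factor of the normalized vector settles at the leading eigenvalue.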
slide-13
SLIDE 13
  • The cost function can be written as the sum of the collective influence of each node.
  • To make the zero solution stable, we can iteratively remove the node with the maximum CI value until the leading eigenvalue $\lambda < 1$.
  • $\ell$ is proportional to the order of the power iteration. Higher values of $\ell$ are more accurate but harder to compute.
  • Because CI depends on the status of neighboring nodes, it must be recomputed in each iteration.
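As a concrete reference, the collective influence of node i in [1] is (k_i - 1) times the sum of (k_j - 1) over the nodes j at distance exactly equal to the ball radius from i, where k denotes degree. The sketch below computes that quantity and applies the remove-and-recompute loop from this slide. It is a simplified illustration: it stops after a fixed number of picks rather than checking the leading eigenvalue, and the function names, graph format, and tie-breaking are assumptions.

```python
from collections import deque


def ball_frontier(adj, source, radius):
    """Return the nodes at shortest-path distance exactly `radius` from `source`."""
    dist = {source: 0}
    queue = deque([source])
    frontier = []
    while queue:
        v = queue.popleft()
        if dist[v] == radius:
            frontier.append(v)
            continue  # do not expand beyond the ball boundary
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return frontier


def collective_influence(adj, node, radius):
    """CI(node) = (k_node - 1) * sum of (k_j - 1) over the ball frontier."""
    boundary = sum(len(adj[j]) - 1 for j in ball_frontier(adj, node, radius))
    return (len(adj[node]) - 1) * boundary


def ci_seeds(adj, k, radius=2):
    """Pick k nodes by repeatedly removing the node with the largest CI."""
    live = {v: set(nbrs) for v, nbrs in adj.items()}
    picked = []
    for _ in range(min(k, len(live))):
        # CI is recomputed on the current graph; ties go to the smallest node id.
        best = max(sorted(live), key=lambda v: collective_influence(live, v, radius))
        picked.append(best)
        for u in live.pop(best):
            live[u].discard(best)
    return picked


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}  # toy path graph
middle_ci = collective_influence(path, 2, 1)
```

Recomputing CI inside the loop is the expensive part, and a larger radius enlarges every ball, which is the accuracy versus cost trade-off noted above.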
slide-14
SLIDE 14
slide-15
SLIDE 15
  • Collective influence was not designed for influence maximization, but it gives a way to measure the influential power of each node.
  • Our work derives the collective influence formulas for the independent cascade model.
  • The problem is complicated because there are now two types of interactions: transmission of infection and membership in the giant component.
  • Define $p_{ij}$ as the probability that node $i$ is infected in $G \setminus j$. Given the seed set, this can be calculated by iteration.
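One way such an iteration can look is sketched below. The update rule is a standard message-passing approximation for the independent cascade model on tree-like graphs and is an assumption here, since the slide does not show its equations: a seed is infected with probability 1, and any other node i is infected in the graph without j unless every remaining neighbor fails to infect it. The uniform transmission probability `beta` and all function names are likewise hypothetical.

```python
def infection_messages(adj, seeds, beta, iters=50):
    """Iterate msg[(i, j)]: probability that i is infected when j is removed."""
    msg = {(i, j): (1.0 if i in seeds else 0.0) for i in adj for j in adj[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in msg:
            if i in seeds:
                new[(i, j)] = 1.0
                continue
            miss = 1.0  # probability that no neighbor other than j infects i
            for k in adj[i]:
                if k != j:
                    miss *= 1.0 - beta * msg[(k, i)]
            new[(i, j)] = 1.0 - miss
        msg = new
    return msg


def infection_probability(adj, seeds, beta, iters=50):
    """Combine all incoming messages into a per-node infection probability."""
    msg = infection_messages(adj, seeds, beta, iters)
    prob = {}
    for i in adj:
        if i in seeds:
            prob[i] = 1.0
            continue
        miss = 1.0
        for k in adj[i]:
            miss *= 1.0 - beta * msg[(k, i)]
        prob[i] = 1.0 - miss
    return prob


chain = {0: [1], 1: [0, 2], 2: [1]}  # toy path 0 - 1 - 2
chain_prob = infection_probability(chain, {0}, 0.5)
```

On a tree such as this three-node path the iteration is exact: node 1 is infected with probability 0.5 and node 2 with probability 0.25. On graphs with short loops it is only an approximation.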
slide-16
SLIDE 16
  • We are interested in the appearance of giant components among infected nodes. An outbreak of infection occurs when most infected nodes are connected to each other.
  • Now we define $\nu_{ij}$ as the probability that node $i$ is infected and belongs to the giant component without $j$.
  • To list the equations, three more symbols need to be defined:
  • $\alpha_{ij}$ and $\beta_{ij}$, the conditional probabilities of the event in $\nu_{ij}$ given the occurrence or non-occurrence of the infection event (node $i$ infected in $G \setminus j$, with probability $p_{ij}$). These two variables are independent for each edge $(i, j)$.
  • $\gamma_{ij}$, the conditional probability of the event in $\nu_{ij}$ given that node $j$ is not successfully infected by $i$.
slide-17
SLIDE 17
  • Now we can write the relationships between these variables, one pair at a time.
slide-18
SLIDE 18
slide-19
SLIDE 19
  • Then we can take partial derivatives to write the Jacobian matrix.
slide-20
SLIDE 20
  • Our collective influence formula can then be written in a form similar to the one in [1].
slide-21
SLIDE 21
  • Finally, our algorithm for seed selection.
slide-22
SLIDE 22
slide-23
SLIDE 23
  • Language and module: Python 2.7, networkx 1.11
  • Network: a random graph that follows a power law
  • Node state: state = 1 if infected; otherwise, state = 0
  • Spread range:
slide-24
SLIDE 24
  • Essentially, graph traversal
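The traversal can be sketched as a breadth-first search in which each newly infected node gets one chance to infect each uninfected neighbor. This is a minimal standard-library illustration rather than the presenters' networkx code. The spread-range formula is not preserved in the transcript, so it is taken here to mean the average number of nodes with state = 1 at the end of a run; that reading, the uniform probability `p`, and the function names are assumptions.

```python
import random
from collections import deque


def simulate_ic(adj, seeds, p, rng):
    """Run one independent cascade and return the final state of every node."""
    state = {v: 0 for v in adj}  # 0 = not infected, 1 = infected
    queue = deque()
    for s in seeds:
        if state[s] == 0:
            state[s] = 1
            queue.append(s)
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            # Each newly infected node tries each uninfected neighbor once.
            if state[u] == 0 and rng.random() < p:
                state[u] = 1
                queue.append(u)
    return state


def spread_range(adj, seeds, p, runs=100, seed=0):
    """Average number of infected nodes over repeated cascades."""
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    total = 0
    for _ in range(runs):
        total += sum(simulate_ic(adj, seeds, p, rng).values())
    return total / float(runs)


ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}  # toy 6-node ring
full_spread = spread_range(ring, [0], 1.0)
no_spread = spread_range(ring, [0], 0.0)
```

With p = 1 the cascade reaches every node connected to a seed, and with p = 0 only the seeds stay infected, which gives two quick sanity checks for any implementation.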
slide-25
SLIDE 25
  • Use a sampled graph for a fair comparison
  • Purely random information spread can make the comparison unfair
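One reading of this slide is the live-edge view of the independent cascade model: keep each directed edge with probability p once, up front, and then measure what every seed set reaches in that same sampled graph, so that no method benefits from luckier random draws. The sketch below follows that reading; it is an assumption about what the slide means, and the function names and the star-graph example are hypothetical.

```python
import random
from collections import deque


def sample_live_graph(adj, p, rng):
    """Keep each directed edge independently with probability p."""
    return {v: [u for u in adj[v] if rng.random() < p] for v in adj}


def reachable(live, seeds):
    """Nodes reachable from the seeds in a sampled graph (plain BFS)."""
    seen = set(seeds)
    queue = deque(seen)
    while queue:
        v = queue.popleft()
        for u in live[v]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return seen


def compare_on_samples(adj, seed_sets, p, samples=200, seed=0):
    """Average spread of each named seed set over the same sampled graphs."""
    rng = random.Random(seed)
    totals = {name: 0 for name in seed_sets}
    for _ in range(samples):
        live = sample_live_graph(adj, p, rng)  # one sample shared by all methods
        for name, seeds in seed_sets.items():
            totals[name] += len(reachable(live, seeds))
    return {name: totals[name] / float(samples) for name in totals}


star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}  # toy star graph
certain = compare_on_samples(star, {"hub": [0], "leaf": [1]}, 1.0)
```

Sharing each sample across methods removes one source of variance from the comparison; differences between the averages then come from the seed sets alone.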
slide-26
SLIDE 26
  • 3000 nodes, 2999 edges, average degree 2
  • Our algorithm achieves the highest spread range at a very small q ratio (1 to 100 seed nodes)
  • HD, PageRank, and CI perform close to each other, but all slightly worse than our algorithm.
  • K-core performs worst because it destroys the cluster structure of the power-law graph.
slide-27
SLIDE 27
  • Cannot outperform the other heuristic algorithms.
  • May perform better under a more complicated spreading model, such as the linear threshold model.
slide-28
SLIDE 28
slide-29
SLIDE 29
  • We studied the influence maximization problem on the independent cascade (IC) model.
  • We derived a new algorithm based on the idea of collective influence, calculating the leading eigenvalue by power iteration.
  • We conducted simulation experiments on a random graph that follows a power-law distribution. The results show that our algorithm performs slightly better than the baselines.
  • Further work may include:
  • Optimization to reduce computational complexity
  • Testing the performance on more practical networks
  • Testing the performance on more complicated information cascade models
slide-30
SLIDE 30
  • Work division
slide-31
SLIDE 31
slide-32
SLIDE 32
[1] Flaviano Morone and Hernán A. Makse. “Influence maximization in complex networks through optimal percolation”. In: Nature 524, p. 65. URL: http://dx.doi.org/10.1038/nature14604
slide-33
SLIDE 33
