Inference of activity dates from a network of dated interactions
Fabrice Rossi & Pierre Latouche
SAMM EA 4543
JDS 2013
General setting
Decorated interaction networks
◮ interaction between “actors”
◮ each interaction is described by some characteristics
◮ multiple interactions between the same actors
[Figure: example interaction network with edges dated 1318, 1345, 1370 and 1370]
◮ very precise recording of
◮ not so precise description of the
◮ propagate information associated with interactions to actors
◮ for instance with notarial acts:
  ◮ dates of acts ⇒ living period
  ◮ geographical position of the goods ⇒ living area
  ◮ status in unbalanced interactions ⇒ social status
◮ temporal decoration: a time stamp is associated to each
◮ the network may outlive the actors (notarial acts)
◮ estimate a central date of activity for each actor, based on the
◮ an activity interval can be estimated in some situations
◮ “propagate” the characteristics associated with interactions to the actors
◮ summarize the data (if needed)
◮ central actor : 1318, 1345, 1370,
◮ other actors : their unique (or
◮ based only on local interactions, and not at all on the absence of interaction
◮ summarizes the characteristics but not the network
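The local summary described above can be sketched in a few lines of Python. This is only an illustration: the function name `local_average_dates`, the choice of the mean as the summary, and the star-shaped example network (a central actor `"c"` linked to four others) are assumptions made here, not details given on the slides.

```python
from collections import defaultdict


def local_average_dates(dated_edges):
    """Return {actor: mean date of the interactions the actor takes part in}."""
    dates_by_actor = defaultdict(list)
    for i, j, date in dated_edges:
        # Each interaction date is propagated to both of its endpoints.
        dates_by_actor[i].append(date)
        dates_by_actor[j].append(date)
    return {actor: sum(ds) / len(ds) for actor, ds in dates_by_actor.items()}


# Hypothetical star network using the dates of the running example.
edges = [("c", "a1", 1318), ("c", "a2", 1345), ("c", "a3", 1370), ("c", "a4", 1370)]
estimates = local_average_dates(edges)
```

With this toy input the central actor is summarized by the mean of its four dates, while each peripheral actor simply inherits the date of its single interaction.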
◮ interaction characteristics are close to the actors' characteristics
◮ interactions happen preferably between actors who share similar
◮ actor $i$ has characteristics $Z_i \in \mathcal{Z}$ (dissimilarity space)
◮ $i \leftrightarrow j$ with some probability decreasing with $d(Z_i, Z_j)$
◮ if $i \leftrightarrow j$, then the decoration is generated
  ◮ “around” $Z_i$ and $Z_j$ (same space $\mathcal{Z}$)
  ◮ or at least in a way “consistent” with $Z_i$ and $Z_j$ (possible in another
◮ data: $A$ adjacency matrix, $D$ decoration table
◮ parameters: $(Z_i)_{1 \le i \le N}$, $\theta$
◮ likelihood:
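One factorization compatible with this setting is sketched below. It rests on assumptions not stated on the slide: the network is undirected with at most one interaction per pair, pairs of actors are conditionally independent given the latent characteristics, and a decoration is only observed where an interaction exists.

```latex
\[
p(A, D \mid Z, \theta)
  = \prod_{i < j}
    p(A_{ij} \mid Z_i, Z_j, \theta)\,
    p(D_{ij} \mid Z_i, Z_j, \theta)^{A_{ij}}
\]
```

The exponent $A_{ij}$ switches the decoration term off for pairs without an interaction, so non-interactions still contribute through the connection term.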
◮ logistic connection model (related to Hoff et al., 2002):
◮ Gaussian decoration: $D_{ij} \mid Z_i, Z_j, \Sigma \sim \mathcal{N}\left(\frac{Z_i + Z_j}{2}, \Sigma\right)$
◮ connection probability: $P(A_{ij} = 1 \mid Z_i, Z_j, \alpha, \beta) =$
◮ $\frac{1}{1 + e^{-\alpha}}$: maximal density of the interaction network
◮ $\frac{1}{\beta}$: interaction “radius”
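A logistic form consistent with the two interpretations above can be written as follows. This is a plausible reconstruction rather than the formula from the slide: the linear dependence on the dissimilarity $d(Z_i, Z_j)$ is an assumption.

```latex
\[
P(A_{ij} = 1 \mid Z_i, Z_j, \alpha, \beta)
  = \frac{1}{1 + e^{-\alpha + \beta\, d(Z_i, Z_j)}}
\]
```

Under this form, two actors with identical characteristics connect with probability $\frac{1}{1 + e^{-\alpha}}$, which is the largest value the probability can take, and every increase of $\frac{1}{\beta}$ in dissimilarity lowers the log-odds of a connection by one unit.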
◮ $Z_i \in \mathbb{R}$: (central) activity date, $D_{ij} \sim \mathcal{N}\left(\frac{Z_i + Z_j}{2}, \sigma^2\right)$
◮ $\frac{1}{\beta}$ and $\sigma$: lifespan of actors
◮ here by maximum likelihood: non convex/concave optimization
◮ other techniques could be used
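To make the objective concrete, here is a minimal sketch of a log-likelihood that a maximum likelihood procedure could maximize over the dates. It assumes a one-dimensional model with log-odds `alpha - beta * |z_i - z_j|` for the connection and a Gaussian date centred on the midpoint of the two activity dates; these forms, and the names `softplus` and `log_likelihood`, are assumptions made for illustration. No optimizer is shown.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_likelihood(z, dated_edges, alpha, beta, sigma):
    """Log-likelihood of a dated network under the assumed 1-D model.

    z: list of activity dates, one per actor.
    dated_edges: dict {(i, j): date} with i < j, one entry per interaction.
    """
    n = len(z)
    log_norm = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            eta = alpha - beta * abs(z[i] - z[j])  # assumed log-odds
            date = dated_edges.get((i, j))
            if date is None:
                total -= softplus(eta)   # log P(A_ij = 0)
            else:
                total -= softplus(-eta)  # log P(A_ij = 1)
                midpoint = 0.5 * (z[i] + z[j])
                total += log_norm - (date - midpoint) ** 2 / (2.0 * sigma ** 2)
    return total


edges = {(0, 1): 1320.0, (1, 2): 1360.0}
ll_good = log_likelihood([1300.0, 1340.0, 1380.0], edges, 0.0, 0.05, 20.0)
ll_bad = log_likelihood([1500.0, 1340.0, 1380.0], edges, 0.0, 0.05, 20.0)
```

On this toy network, dates whose midpoints match the observed interaction dates score higher than a configuration that places one actor far away from its interaction.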
◮ data generated according to the
◮ realistic values for β and σ = 20
◮ α varies to simulate different
◮ the Zi are uniformly distributed in
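A simulation along these lines could look like the sketch below. Only the value σ = 20 comes from the slides; the interval length `span`, the values of `alpha` and `beta`, the absolute-difference distance and the midpoint mean are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(eta):
    """Numerically stable logistic function."""
    if eta >= 0.0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def simulate_network(n, alpha, beta, sigma, span, rng):
    """Draw activity dates, then dated interactions, from the assumed model."""
    # Activity dates uniformly distributed on [0, span] (span is an assumption).
    z = [rng.uniform(0.0, span) for _ in range(n)]
    dated_edges = {}
    for i in range(n):
        for j in range(i + 1, n):
            p = sigmoid(alpha - beta * abs(z[i] - z[j]))
            if rng.random() < p:
                # Interaction date drawn around the midpoint of the two dates.
                dated_edges[(i, j)] = rng.gauss(0.5 * (z[i] + z[j]), sigma)
    return z, dated_edges


rng = random.Random(0)
z_true, dated_edges = simulate_network(50, 1.0, 0.05, 20.0, 300.0, rng)
```

Increasing `alpha` raises the connection probability for every pair, which is one way to obtain networks of different densities.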
◮ mean square error (MSE) between the true $Z_i$ and its estimate
◮ baseline: local average
◮ quality: reduction in MSE with respect to the baseline
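The quality measure can be sketched as follows. The slides do not say whether the reduction is absolute or relative; the absolute difference used here is one natural reading and should be treated as an assumption, as should the function names.

```python
def mse(true_z, estimated_z):
    """Mean square error between true and estimated activity dates."""
    return sum((t - e) ** 2 for t, e in zip(true_z, estimated_z)) / len(true_z)


def mse_improvement(true_z, baseline_z, model_z):
    """Positive when the model beats the local-average baseline."""
    return mse(true_z, baseline_z) - mse(true_z, model_z)


# Toy values: the baseline is off by 2 everywhere, the model by 1.
improvement = mse_improvement([0.0, 10.0], [2.0, 12.0], [1.0, 11.0])
```

A negative value then means that the model estimate is worse than the plain local average.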
◮ roughly 2200 networks generated
◮ break even at ∼ 1.3 interaction
◮ (almost) systematic improvement
◮ some convergence issues (easy
[Figure: MSE improvement as a function of the average number of edges per vertex, noise-free case]
◮ very bad for low density network: below 1.1 interaction per actor,
◮ good with respect to misspecification of the date distribution, e.g.
◮ decorations are assumed to be exact or at least precise
◮ but they can be attached to a wrong pair of actors
◮ notarial acts were exact at the time they were drawn up
◮ but we lack an accurate registry of the persons; in particular, many
◮ this leads to ambiguous assignment of persons to acts
◮ generate a network
◮ select (randomly)
◮ choose (randomly) a
◮ keep the original
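The steps above are truncated in the source, so the following sketch fills them in with one possible reading: a fraction of the interactions is selected at random, each is moved to a randomly chosen pair of actors, and its original date is kept. The function name `rewire_edges`, the `noise_level` parameter and this interpretation are all assumptions.

```python
import random


def rewire_edges(dated_edges, n, noise_level, rng):
    """Move a fraction of the interactions to random pairs, keeping their dates.

    dated_edges: dict {(i, j): date} with i < j.
    Assumes the network is not complete, so a free pair always exists.
    """
    noisy = dict(dated_edges)
    pairs = list(dated_edges)
    n_moved = round(noise_level * len(pairs))
    for pair in rng.sample(pairs, n_moved):
        date = noisy.pop(pair)
        while True:
            i, j = sorted(rng.sample(range(n), 2))
            # Reject pairs that already carry an interaction or the edge's own pair.
            if (i, j) not in noisy and (i, j) != pair:
                break
        noisy[(i, j)] = date  # the original date is kept
    return noisy


edges = {(0, 1): 1300.0, (1, 2): 1320.0, (2, 3): 1340.0, (3, 4): 1360.0}
noisy_edges = rewire_edges(edges, 6, 0.5, random.Random(1))
```

The set of dates is unchanged by construction; only the assignment of dates to pairs of actors is perturbed.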
◮ roughly 2200 networks
◮ break even at ∼ 2.1 interaction
◮ good behavior after 3 interactions
◮ more convergence issues (easy
[Figure: MSE improvement as a function of the average number of edges per vertex, noise level 5%]
◮ a low level of noise (e.g. 1 %) has almost no effect on the
◮ a high level of noise (10 %) has strong adverse effects
◮ introduces a way to “push” edge decorations to agents
◮ estimates characteristics that explain both the network and the
◮ exhibits some robustness to misspecification
◮ real world data
◮ mixture model: generative model + a noise component (ongoing
◮ more complex model: explains the network with the