Breaking the News: Extracting the Sparse Citation Network Backbone of Online News Articles
Andreas Spitz and Michael Gertz
Heidelberg University, Institute of Computer Science, Database Systems Research Group
http://dbs.ifi.uni-heidelberg.de
Outline: Motivation, Data Extraction, Network Structure, Citation Characteristics, Applications, Traditional Networks, Summary
© Andreas Spitz
[Figure: article frequency by outlet (welt, zeit, faz) and by category (politics, business, none)]
[Figure: log-log plots in six panels: aggregated, politics, business, welt, zeit, faz]
[Figure: degree distributions on log-log axes; x-axis: degree]
[Figure: network measures over time (x-axis: days, 1 to 300) for the aggregated, politics, and business networks; panels: average degree, global clustering coefficient, undirected diameter, average path length]
[Figure: plot over degree (x-axis, 1 to 16); labels 14, 30, 60]
[Figure: modelled versus observed values of the measures ∆ and λ by node index, for α = −0.88, −0.93, −0.98 and for β = 0.33, 0.38, 0.43]
[Figure: goodness as a function of the temporal attachment exponent α (−2.0 to 0.0) and the neighbour connection probability β (0.00 to 1.00)]
[Figure: complementary cumulative probability of degree for the news outlets zeit and welt; panels: normalization by day, normalization by week, normalization by month]
din  pr-rank  outlet  category  date        headline
20   7        zeit    politics  2014.07.21  Ukraine – MH17-Absturz: was wann geschah
15   343      zeit    politics  2014.12.05  Ukraine-Krise – Wieder Krieg in Europa: Nicht in unserem Namen!
14   13       zeit    politics  2014.09.07  Ukraine – OSZE gibt Details des Minsker Abkommens bekannt
13   178      welt    politics  2014.10.15  Asylbewerber – Deutschland ist das Flüchtlingsheim Europas
12   312      zeit    business  2015.02.04  Yanis Varoufakis – "Ich bin Finanzminister eines bankrotten Staates"

din  pr-rank  outlet  category  date        headline
6    1        zeit    politics  2014.08.08  Erbil – Blitzvormarsch der Dschihadisten ließ USA angreifen
6    2        zeit    politics  2014.08.10  Irak – Zehntausende Jesiden bringen sich in Sicherheit
9    3        zeit    politics  2014.06.10  Irak – Aufständische besetzen Teile der Stadt Mossul
7    4        zeit    politics  2014.06.10  Al-Kaida in Mossul – Der Staat Irak schwindet
7    5        zeit    politics  2014.07.19  Irak – Tausende Christen fliehen aus Mossul
[Figure: three panels, one per news outlet (welt, zeit, faz); y-axis from 0.0 to 1.0]
[Figure: degree distribution on log-log axes (x-axis: degree) and centrality profile by news outlet (welt, zeit, faz)]