Monitoring Network Structure and Content Quality of Signal Processing Articles on Wikipedia

SLIDE 1

Monitoring Network Structure and Content Quality of Signal Processing Articles on Wikipedia

Tao C. Lee and Jayakrishnan Unnikrishnan

LCAV, EPFL

ICASSP, Vancouver, Canada

{tao.lee, jay.unnikrishnan}@epfl.ch

May 31, 2013

SLIDE 2

A Google User’s Impression

Searching for "sampling theorem" on Google ...

  • T. C. Lee (EPFL)

SP Wiki May 31, 2013 2 / 19

SLIDE 3

A Google User’s Impression

An article with rich information ...

SLIDE 4

A Google User’s Impression

Viewed by many people ...

SLIDE 5

A Google User’s Impression

Searching for "image denoising" on Google ...

SLIDE 6

A Google User’s Impression

An article with limited information ...

SLIDE 7

A Google User’s Impression

Viewed by some people ...

SLIDE 9

Wikipedia and Signal Processing

Wikipedia

  • A widely used resource
  • Freelance editing model: anyone can edit

Signal Processing (SP) articles on Wikipedia

  • >1000 articles, still growing
  • Grouped by subcategories
  • Need to monitor their quality!

SLIDE 13

Outline

  • Ranking article importance
  • Assessing article quality
  • Generating an improvement list
  • Conclusions & future work

SLIDE 14

Importance Ranking: PageRank and HITS

How to rank SP articles on Wikipedia ...

SLIDE 16

Importance Ranking: PageRank and HITS

PageRank [Brin98]

  • Rank the probability of visiting an article
  • A random walk model
  • An eigenvalue problem: find the eigenvector with eigenvalue 1 for a stochastic matrix
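The random-walk view can be sketched with a small power iteration. This is only an illustration under stated assumptions, not the authors' implementation: the damping factor of 0.85, the uniform handling of articles without outgoing links, and the four-article toy graph (`toy_links`) are all hypothetical choices that the slides do not specify.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration for PageRank.

    links maps each article to the list of articles it links to.
    damping=0.85 is an assumed value; the slides do not state one.
    """
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Probability mass sitting on articles with no outgoing links
        # is redistributed uniformly (an assumption of this sketch).
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            targets = links.get(src, [])
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new[dst] += share
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: article -> articles it links to.
toy_links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
scores = pagerank(toy_links)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

The returned scores sum to 1 and form a fixed point of the update, which is the eigenvector with eigenvalue 1 of the (damped) stochastic transition matrix.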

HITS [Kleinberg99]

  • Rank the authority of an article
  • Two scores
      - Authority: sum of the hub scores of the articles that link to it
      - Hubness: sum of the authority scores of the articles it links to
  • Iterative computation
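The iterative computation of the two scores can be sketched as follows. This is a minimal illustration, assuming L2 normalization after each round and a fixed iteration count; the toy graph (`toy_links`) is hypothetical and none of these choices are taken from the slides.

```python
import math


def _normalize(scores):
    """Scale scores to unit L2 norm (left unchanged if all are zero)."""
    norm = math.sqrt(sum(x * x for x in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {k: x / norm for k, x in scores.items()}


def hits(links, iterations=50):
    """Return (authority, hubness) dicts for a link graph.

    links maps each article to the list of articles it links to.
    """
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    auth = {v: 1.0 for v in nodes}
    hub = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        # Authority: sum of hub scores of the articles linking to this one.
        new_auth = {v: 0.0 for v in nodes}
        for src, targets in links.items():
            for dst in set(targets):
                new_auth[dst] += hub[src]
        # Hubness: sum of authority scores of the articles this one links to.
        new_hub = {
            v: sum(new_auth[dst] for dst in set(links.get(v, [])))
            for v in nodes
        }
        auth = _normalize(new_auth)
        hub = _normalize(new_hub)
    return auth, hub


# Hypothetical toy graph: article -> articles it links to.
toy_links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
authority, hubness = hits(toy_links)
print(max(authority, key=authority.get), max(hubness, key=hubness.get))
```

In this toy graph the article with the most incoming links from good hubs ends up as the top authority, while the article that links to the most authoritative pages ends up as the top hub.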

SLIDE 17

Top-15 Articles by PageRank

Ranking  Article
1        Kalman filter
2        Signal-to-noise ratio
3        Bilinear time–frequency distribution
4        Signal processing
5        Itakura–Saito distance
6        Ridge detection
7        Short-time Fourier transform
8        Thunder
9        Nyquist–Shannon sampling theorem
10       A-weighting
11       Image processing
12       Nyquist frequency
13       Hilbert transform
14       Wigner distribution function
15       Gaussian noise

SLIDE 18

Top-15 Articles by HITS

Ranking  Article
1        Dirac delta function
2        Dirac comb
3        Nyquist–Shannon sampling theorem
4        Whittaker–Shannon interpolation formula
5        Nyquist frequency
6        Fourier analysis
7        Discrete Fourier transform
8        Digital signal processing
9        Fast Fourier transform
10       LTI system theory
11       Kalman filter
12       Nyquist rate
13       Short-time Fourier transform
14       Discrete-time Fourier transform
15       Wiener filter

SLIDE 19

Island Structure: The Case of Itakura–Saito Distance

Island structure is favored by PageRank

SLIDE 20

Where Is Image Denoising?

Important but under-ranked

SLIDE 21

Where Is Image Denoising?

Visibility can be improved by adding links

SLIDE 22

Importance Ranking via Crowdsourcing

Contributed by 19/50 researchers from EPFL and elsewhere

Ranking  Article
1        Convolution
2        Fast Fourier transform
3        Nyquist–Shannon sampling theorem
4        Sampling (signal processing)
5        Filter (signal processing)
6        Fourier analysis
7        Kalman filter
8        Cross-correlation
9        Wavelet transform
10       Impulse response
11       Kalman filter
12       Discrete Fourier transform

SLIDE 23

Information Quality Analysis

Heuristics-based metrics [Stvilia07]

  • Reputation
  • Completeness

Metric = Σ (Parameter · Weight)

Metric        Parameter                                        Weight
Reputation    # editors                                        0.2
              # edits                                          0.2
              # articles connected through common editors      0.1
              # reverts                                        0.3
              # external links                                 0.2
              # registered user edits                          0.1
              # anonymous user edits                           0.2
Completeness  # internal links                                 0.4
              article length                                   0.6
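The weighted sum in the table can be written out directly. The sketch below uses the weights from the table; the parameter key names, the use of raw (unnormalized) counts, and the example article values are assumptions made for illustration only.

```python
# Weights copied from the table; the dictionary key names are hypothetical.
REPUTATION_WEIGHTS = {
    "num_editors": 0.2,
    "num_edits": 0.2,
    "num_connected_articles": 0.1,
    "num_reverts": 0.3,
    "num_external_links": 0.2,
    "num_registered_user_edits": 0.1,
    "num_anonymous_user_edits": 0.2,
}
COMPLETENESS_WEIGHTS = {
    "num_internal_links": 0.4,
    "article_length": 0.6,
}


def weighted_metric(parameters, weights):
    """Metric = sum over parameters of (parameter value * weight)."""
    return sum(weights[name] * parameters[name] for name in weights)


# Invented example values, not measurements from the dataset.
example_article = {
    "num_editors": 10,
    "num_edits": 40,
    "num_connected_articles": 5,
    "num_reverts": 2,
    "num_external_links": 3,
    "num_registered_user_edits": 30,
    "num_anonymous_user_edits": 10,
    "num_internal_links": 50,
    "article_length": 1000,
}
reputation = weighted_metric(example_article, REPUTATION_WEIGHTS)
completeness = weighted_metric(example_article, COMPLETENESS_WEIGHTS)
print(reputation, completeness)
```

Whether the parameters are normalized before weighting is not stated on the slide, so the raw counts used here should be read as one possible convention.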

SLIDE 24

Top-15 Articles by Reputation

Ranking  Article
1        Analog-to-digital converter
2        Charge-coupled device
3        Convolution
4        Noise
5        Microelectromechanical systems
6        Sensor
7        Digital signal processing
8        Discrete Fourier transform
9        Pixel
10       Computer vision
11       Relay
12       White noise
13       Doppler effect
14       Dirac delta function
15       Potentiometer

SLIDE 25

Top-15 Articles by Completeness

Ranking  Article
1        Geophysical MASINT
2        Dirac delta function
3        Kalman filter
4        Avizo (software)
5        Noise in music
6        Allan variance
7        Mathematics of radio engineering
8        Discrete Fourier transform
9        Mechanical filter
10       JPEG 2000
11       Ordinary least squares
12       Color vision
13       Maximum likelihood
14       Hilbert transform
15       Nyquist–Shannon sampling theorem

SLIDE 26

Information Quality vs. Importance

Scores

  • Importance score = (total articles − HITS ranking)
  • Information quality score = reputation/completeness scores

Proportional?

SLIDE 27

Information Quality vs. Importance

Strong fluctuations

[Figure panels: (c) Reputation vs. Importance; (d) Completeness vs. Importance]

SLIDE 29

Generating an Improvement List

Articles to be improved

  • High ranking difference between importance and information quality
  • High importance ranking (high HITS ranking)
  • Still incomplete (low completeness score)

Need For Improvement (NFI) score

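\[ \text{NFI score} = \Gamma \cdot \theta(d) \cdot \delta(c) \]

where \Gamma = (\text{total articles} - \text{HITS ranking}), d = difference score, c = completeness score, and

\[ \theta(d) = \begin{cases} d & : d > \text{threshold}_{\text{difference}} \\ & : \text{otherwise} \end{cases} \]

\[ \delta(c) = \begin{cases} c & : c < \text{threshold}_{\text{completeness}} \\ & : \text{otherwise} \end{cases} \]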
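A small sketch of the NFI computation follows. It is illustrative only: the slide text does not show the value of the "otherwise" branches of θ and δ, so this sketch assumes 0 there (which simply drops articles that fail either threshold). The function names, the article count of 1000, and the input values are hypothetical; the default thresholds (50, 600) are the ones quoted with the improvement list.

```python
def theta(d, threshold_difference):
    """Pass the difference score through only above the threshold.

    The 0.0 fallback is an assumption; the slide does not show it.
    """
    return d if d > threshold_difference else 0.0


def delta(c, threshold_completeness):
    """Pass the completeness score through only below the threshold.

    The 0.0 fallback is an assumption; the slide does not show it.
    """
    return c if c < threshold_completeness else 0.0


def nfi_score(total_articles, hits_ranking, d, c,
              threshold_difference=50, threshold_completeness=600):
    """NFI score = Gamma * theta(d) * delta(c)."""
    gamma = total_articles - hits_ranking
    return (gamma
            * theta(d, threshold_difference)
            * delta(c, threshold_completeness))


# Hypothetical inputs: 1000 articles in total, an article ranked 10th by HITS.
flagged = nfi_score(1000, 10, d=100, c=200)
small_difference = nfi_score(1000, 10, d=40, c=200)
already_complete = nfi_score(1000, 10, d=100, c=700)
print(flagged, small_difference, already_complete)
```

Under the assumed fallback, only articles that are both highly mismatched and still incomplete receive a nonzero score, and Γ pushes articles with a high HITS ranking towards the top of the list.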

SLIDE 30

Top-15 Articles on the Improvement List

Ranking  Article
1        Noise reduction
2        Continuous wavelets
3        Gabor limit
4        Gaussian noise
5        Modified Morlet wavelet
6        Noiselet
7        Spectral density estimation
8        Noise pollution
9        Noise spectral density
10       Periodic summation
11       Coherent sampling
12       N-jet
13       Bispectrum
14       Digital audio
15       Effective input noise temperature

(threshold_difference, threshold_completeness) = (50, 600)

SLIDE 36

Conclusions

Importance and quality of articles are mismatched

              High Importance                   Low Importance
Good quality  Nyquist–Shannon sampling theorem  Avizo (software)
Bad quality   Gaussian noise                    AutoCollage 2008

SLIDE 37

Conclusions

Some important articles are highlighted for improvement

SLIDE 38

Conclusions

Visibility of articles could be improved by adding links

SLIDE 39

Conclusions

Audio/speech articles could benefit from further improvement

SLIDE 40

Future Work

Multiple articles dealing with the same topic could be merged

SLIDE 41

Future Work

Exploring the interaction with other categories (e.g. mathematics)

SLIDE 42

Future Work

Crowdsourcing of article quality

SLIDE 43

Future Work

Realtime monitoring of SP articles (trailHead as a starting point)

SLIDE 44

Thank you, questions please.

{tao.lee, jay.unnikrishnan}@epfl.ch

Software, dataset, and results are available at http://lcav.epfl.ch/page-87349-en.html