Scientometrics & Altmetrics, Dr. Peter Kraker, VU Science 2.0



SLIDE 1

funded within the Austrian Competence Center Programme

www.know-center.at

Scientometrics & Altmetrics

Dr. Peter Kraker

VU Science 2.0, 25.11.2015

SLIDE 2

Why Metrics?

SLIDE 3

“One of the diseases of this age is the multiplicity of books; they doth so overcharge the world that it is not able to digest the abundance of idle matter that is every day hatched and brought forth into the world.”

Attributed to Barnaby Rich in 1613 (Price 1963)

SLIDE 4

Information Overload in Science

  • Information overload is NOT a contemporary problem in science
  • Science has been growing exponentially for the last 400 years (Price 1961, 1963)

[Figures: number of papers (Larsen/von Ins 2010); number of researchers (NSB 2010)]

Instruments to deal with the overload

  • Journals and conferences
  • Peer review
  • Quantitative analysis → Scientometrics

Price (1963)

SLIDE 5

Pathways through Science

Science Citation Index (Garfield 1955), Web of Science

An index of incoming citations

Purpose

  • Discovery of literature that is not linked thematically
  • Increased collaboration between researchers
  • Evaluation of science

Garfield et al. (1964)

Relational scientometrics

Evaluative scientometrics
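A minimal sketch of what "an index of incoming citations" means as a data structure, assuming each paper is known only by a hypothetical ID and its reference list. Inverting the reference lists yields, for every paper, the papers that cite it. The function name and the toy corpus are illustrative assumptions and are not taken from the Science Citation Index.

```python
from collections import defaultdict


def build_citation_index(references):
    """Invert reference lists: map each cited paper to the papers citing it."""
    cited_by = defaultdict(set)
    for citing, cited_list in references.items():
        for cited in cited_list:
            cited_by[cited].add(citing)
    # Sort the citing papers so the output is stable and easy to read.
    return {paper: sorted(citers) for paper, citers in cited_by.items()}


# Hypothetical toy corpus: paper ID -> IDs of the papers it references.
references = {
    "A": [],
    "B": ["A"],
    "C": ["A", "B"],
    "D": ["B"],
}

index = build_citation_index(references)
print(index["A"])  # papers citing A
print(index["B"])  # papers citing B
```

Looking up a paper in such an index leads forward in time to later work that builds on it, which is how literature can be discovered even when it is not linked thematically.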

SLIDE 6

Relational Scientometrics

  • Example: Genetics research (Garfield et al. 1964)
  • From the beginnings in the 1800s to the discovery of DNA
  • Relationships given by history of science (red), citations (yellow), and both (blue)

Garfield et al. (1964)

SLIDE 7

Map of Information Science

Van Eck and Waltman (2010)

SLIDE 8

Knowledge Domain Visualization Process (Börner et al. 2003)

  • 1. Selection of an appropriate data source
  • 2. Definition of unit of analysis
      • Words, articles, authors, journals, categories…
  • 3. Determination of measures & calculation of similarities
      • Linkages, co-occurrences, Vector Space Model…
  • 4. Ordination and/or detection of sub-areas
      • Dimensionality reduction (e.g. multidimensional scaling), cluster analysis, spatial configuration (e.g. force-directed placement)
  • 5. Visualization and interaction design
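Steps 2 to 4 can be sketched in a few lines, under loud assumptions: the articles and their texts are hypothetical, similarity is cosine similarity over word counts (one simple form of the Vector Space Model), and the grouping step is a plain similarity threshold that stands in for the cluster analysis named above. It is not multidimensional scaling or force-directed placement.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_by_threshold(names, sim, threshold):
    """Group items that are linked by a chain of similarities >= threshold."""
    groups = []
    unvisited = set(names)
    for start in names:
        if start not in unvisited:
            continue
        unvisited.discard(start)
        group, stack = [start], [start]
        while stack:
            current = stack.pop()
            for other in sorted(unvisited):
                if sim[current][other] >= threshold:
                    unvisited.discard(other)
                    group.append(other)
                    stack.append(other)
        groups.append(sorted(group))
    return groups


# Steps 1-2: hypothetical data source; unit of analysis = articles, described by words.
articles = {
    "a1": "citation analysis journals",
    "a2": "journal citation impact",
    "a3": "protein folding simulation",
    "a4": "protein structure simulation",
}
vectors = {name: Counter(text.split()) for name, text in articles.items()}

# Step 3: pairwise similarities in the vector space model.
names = sorted(vectors)
sim = {x: {y: cosine(vectors[x], vectors[y]) for y in names} for x in names}

# Step 4: a crude stand-in for the detection of sub-areas.
groups = group_by_threshold(names, sim, threshold=0.2)
print(groups)
```

The threshold of 0.2 is arbitrary; a real pipeline would replace this step with one of the ordination or clustering techniques listed in step 4 and pass the result on to the visualization in step 5.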

SLIDE 9

Citations in Retrieval: Google Scholar

SLIDE 10

Citation-based Metrics: h-Index

A metric to quantify the scientific output of an individual scientist

“A scientist has index h if h of his or her Np papers have at least h citations each and the other (Np – h) papers have ≤h citations each.” (Hirsch 2005)

SLIDE 11

Citation-based Metrics: h-Index

Source: Scopus

Paper     Citations
Paper 1   33
Paper 2   20
Paper 3   10
Paper 4   9
Paper 5   9
Paper 6   9
Paper 7   8
Paper 8   8
Paper 9   7
Paper 10  7
Paper 11  6
Paper 12  6
Paper 13  6
Paper 14  5
…         …
Paper 86  0
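A short sketch of how the h-index could be computed from a table like this one. Only the 14 citation counts visible in the table are used; the elided rows are not filled in. Assuming the table is sorted in descending order, as it appears to be, the elided papers have at most 5 citations each and therefore cannot change the result. The function name is an illustrative choice.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# The 14 citation counts visible in the table (elided rows are left out).
visible = [33, 20, 10, 9, 9, 9, 8, 8, 7, 7, 6, 6, 6, 5]
print(h_index(visible))  # prints 8
```

Paper 8 still has 8 citations, while paper 9 has only 7, so the count stops at h = 8.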

SLIDE 12

Citation-based Metrics: Impact Factor

$$\mathrm{IF}_{2013} = \frac{\text{Citations in 2013 to articles published by Journal Y in 2011 and 2012}}{\text{Number of articles published by Journal Y in 2011 and 2012}}$$

A measure to quantify the relative importance of a scientific journal

The average number of citations in a given year y to papers published by a journal in the years y-1 and y-2

SLIDE 13

Citation-based Metrics: Impact Factor

Source: Thomson Reuters

SLIDE 14

Citation-based Metrics: Exercise

  • Get together in groups of 2 or 3
  • Calculate the impact factor for 2013 for the two journals below and create a ranking.
  • Discuss the results: how justified is the ranking? Where do you see problems?

Journal X: Published 6 articles in 2011 and 2012.

Article ID         1   2   3   4   5   6
Citations in 2013  15  17  14  18  15  15

Journal Y: Published 6 articles in 2011 and 2012.

Article ID         1    2  3  4  5  6
Citations in 2013  100  2  1  2  1  2

$$\mathrm{IF}_{2013} = \frac{\text{Citations in 2013 to articles published by Journal Y in 2011 and 2012}}{\text{Number of articles published by Journal Y in 2011 and 2012}}$$

SLIDE 15

Citation-based Metrics: Exercise

Solution

Name       IF 2013  Rank  Median  Rank  Std. Dev.
Journal X  15.7     2     15      1     1.5
Journal Y  18       1     2       2     36.7

[Charts: number of citations per paper (papers 1 to 6) for Journal X and for Journal Y]
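The exercise can be checked with a few lines of Python. This is a sketch that uses only the citation counts given in the exercise; the function name is an illustrative choice, and the median is added to show how a single highly cited paper affects the mean-based impact factor.

```python
from statistics import median

# Citations in 2013 to the six articles each journal published in 2011 and 2012.
citations_2013 = {
    "Journal X": [15, 17, 14, 18, 15, 15],
    "Journal Y": [100, 2, 1, 2, 1, 2],
}


def impact_factor(citations):
    """Citations received divided by the number of articles (arithmetic mean)."""
    return sum(citations) / len(citations)


for name, counts in citations_2013.items():
    print(name, round(impact_factor(counts), 2), median(counts))
```

Ranked by impact factor, Journal Y comes first; ranked by median, Journal X does. Journal Y's lead rests on one article with 100 citations.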

SLIDE 16

Criticisms of the Impact Factor

  • The IF is volatile as it uses the arithmetic mean, even though citation distributions usually follow a power law
      • "Blockbuster" papers can skew the IF
  • A change in the number of "citable" papers can influence the IF considerably
  • The IF is field dependent – publication and citation behavior varies widely between fields

SLIDE 17

Criticisms of Citation-based Metrics

Citations take very long to appear in meaningful quantities

Source: Amin & Mabe (2000)

SLIDE 18

Criticisms of Citation-based Metrics

  • Citations take very long to appear in meaningful quantities
  • Citation metrics are dependent on the corpus that is used for calculation
  • A single indicator is not sufficient to assess impact

SLIDE 19

Setting the Stage for Alternative Metrics

Increased use of online services in the scientific community

  • E-Journals and pre-print/data archives
  • Collaborative reference management systems
  • (Micro-)blogs & social networks

Seeing academic literature through the eyes of the readers (Rowlands & Nicholas 2007)

  • Usage data (downloads, readership)
  • Links, likes and shares

SLIDE 20

Altmetrics

Altmetrics: alternative metrics based on data generated in online systems

Promises of altmetrics

  • Assess publications more quickly and on a broader scale
  • Consider all outputs of research, not just papers

The altmetrics manifesto: http://altmetrics.org

SLIDE 21

Example: PLOS Article-Level Metrics (ALM)

Source: http://www.plosone.org/article/metrics/info%3Adoi%2F10.1371%2Fjournal.pone.0047523#close

SLIDE 22

Examples: Altmetric.com

Source: http://www.altmetric.com/details.php?domain=www.altmetric.com&citation_id=843656

SLIDE 23

Example: ImpactStory

Source: https://impactstory.org/CarlBoettiger

SLIDE 24

Relational Altmetrics and KDViz

Based on implicit and explicit links created in altmetrics sources

Example: Bollen et al. (2009)

  • Based on user clickstreams in digital libraries and bibliographic databases
  • Co-occurrence matrix of journals in clickstreams
  • Force-directed placement applied to the matrix
  • Produces an overview map of all of science
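A minimal sketch of how a journal co-occurrence matrix might be derived from clickstreams. The sessions are hypothetical, and counting two journals as co-occurring whenever they appear in the same session is an assumption made for illustration; Bollen et al. (2009) may define co-occurrence differently. The matrix is stored sparsely as pair counts.

```python
from collections import Counter
from itertools import combinations


def journal_cooccurrence(sessions):
    """Count, for each unordered journal pair, the sessions in which both appear."""
    counts = Counter()
    for session in sessions:
        # Deduplicate and sort so each pair has one canonical key per session.
        for pair in combinations(sorted(set(session)), 2):
            counts[pair] += 1
    return counts


# Hypothetical clickstreams: each list is the sequence of journals one user visited.
sessions = [
    ["Scientometrics", "JASIST", "Scientometrics"],
    ["JASIST", "Scientometrics", "PLoS ONE"],
    ["PLoS ONE", "Nature"],
]

cooc = journal_cooccurrence(sessions)
print(cooc[("JASIST", "Scientometrics")])  # prints 2
print(cooc[("Nature", "PLoS ONE")])        # prints 1
```

The resulting counts can serve as edge weights for a layout step such as force-directed placement, where strongly co-occurring journals are pulled closer together.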

SLIDE 25

Bollen et al. (2009)

SLIDE 26

Relational Altmetrics

Example: Head Start (Kraker 2013)

  • Based on Mendeley readership
  • Co-readership as a measure of subject similarity
  • Matrix of document co-occurrences in user libraries
  • Multidimensional scaling and hierarchical clustering applied to the matrix; force-directed placement applied to the resulting map; naming heuristic for labels
  • Produces an overview map of a research field

SLIDE 27

http://openknowledgemaps.org

http://github.com/pkraker/headstart

SLIDE 28

Popular Altmetrics Data Sources

APIs

Name           Type                  Indicators       License      Open Data  URL
Mendeley       Reference Management  Readership       CC-BY 3.0    Yes        http://dev.mendeley.com/
figshare       Repository            Views/Downloads  CC0          Yes        http://api.figshare.com
PLOS ALM       Publisher             Various          CC0          Yes        http://api.plos.org
Altmetric.com  Meta-Provider         Various          Proprietary  No         http://api.altmetric.com/

SDKs

Name          Language   License  Data sources   URL
rAltmetric    R          CC0      Altmetric.com  http://ropensci.org/packages
alm           R          MIT      PLOS ALM       http://ropensci.org/packages
Mendeley SDK  Python/JS  Apache   Mendeley       http://dev.mendeley.com/code/sdks.html

More: https://pad.okfn.org/p/mozfest-visualization

SLIDE 29

Relationship between different indicators

Correlations between indicators (figure panels: JoSIS, I&M)

  • r=0.73, n=150; r=0.77, n=150; r=0.51, n=150
  • r=0.66, n=528; r=0.76, n=528; r=0.59, n=528

Source: Schlögl et al. (2014)
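For readers unfamiliar with r, the sketch below shows how a correlation coefficient between two indicators could be computed. It assumes r denotes the Pearson correlation coefficient, which the slide does not state, and it uses hypothetical per-article counts that are not data from Schlögl et al. (2014).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-article counts, NOT data from Schlögl et al. (2014).
downloads = [120, 340, 90, 560, 210]
citations = [3, 10, 1, 14, 8]
print(round(pearson_r(downloads, citations), 2))
```

Values close to 1 mean that articles ranking high on one indicator tend to rank high on the other; values around 0.5 to 0.8, as reported on the slide, indicate related but clearly not interchangeable indicators.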

SLIDE 30

Altmetrics: Exercise

  • Discuss the two examples below: what are possible reasons for these high altmetrics scores?

SLIDE 31

Problems of Altmetrics

  • Intention unknown: What does it mean to download/save/tweet a paper? What does it mean to aggregate these numbers?
  • Reliability and validity of altmetrics
  • Altmetrics are prone to sample biases (Bollen & van de Sompel 2008, Kraker et al. 2014)
  • Gaming is a potential threat
  • There is a need for a better understanding of altmetrics
  • Altmetrics data needs to be open and reproducible

SLIDE 32

References

Bollen, J., & Van de Sompel, H. (2008). Usage Impact Factor: The Effects of Sample Characteristics on Usage-Based Impact Metrics. Journal of the American Society for Information Science and Technology, 59(1), 136–149.

Bollen, J., Van de Sompel, H., Hagberg, A., Bettencourt, L., Chute, R., Rodriguez, M. A., & Balakireva, L. (2009). Clickstream data yields high-resolution maps of science. PLoS ONE, 4(3), e4803.

Börner, K., Chen, C., & Boyack, K. (2003). Visualizing knowledge domains. Annual Review of Information Science & Technology, 37, 1–58.

Garfield, E. (1955). Citation Indexes for Science: A New Dimension in Documentation through Association of Ideas. Science, 122(3159), 108–111.

Garfield, E., Sher, I., & Torpie, R. (1964). The use of citation data in writing the history of science (p. 75).

Kraker, P. (2013). Visualizing Research Fields based on Scholarly Communication on the Web. University of Graz.

Kraker, P., Schlögl, C., Jack, K., & Lindstaedt, S. (2014). Visualization of Co-Readership Patterns from an Online Reference Management System. Submitted to Journal of Informetrics. http://arxiv.org/abs/1409.0348

SLIDE 33

References

Amin, M., & Mabe, M. (2000). Impact factors: use and abuse. Perspectives in Publishing, 1, 1–6.

National Science Board. (2010). Science and Engineering Labor Force. Science and Engineering Indicators (Vol. 22 Suppl 1). National Science Foundation.

Larsen, P. O., & von Ins, M. (2010). The rate of growth in scientific publication and the decline in coverage provided by Science Citation Index. Scientometrics, 84(3), 575–603.

Price, D. (1961). Science since Babylon. Yale University Press.

Price, D. (1963). Little science, big science. Columbia Univ. Press.

Rowlands, I., & Nicholas, D. (2007). The missing link: journal usage metrics. Aslib Proceedings, 59(3), 222–228.

Schlögl, C., Gorraiz, J., Gumpenberger, C., Jack, K., & Kraker, P. (2014). Comparison of downloads, citations and readership data for two information systems journals. Scientometrics.

Van Eck, N. J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523–538.

Images on slides 7 and 25 by Maxi Schramm

SLIDE 34

Dr. Peter Kraker

Know-Center

pkraker@know-center.at

http://twitter.com/PeterKraker

http://science20.wordpress.com

Thank You For Your Attention! Questions?