
Collaboration Signatures Reveal Scientific Impact - PowerPoint PPT Presentation



  1. Collaboration Signatures Reveal Scientific Impact. Yuxiao Dong, Reid A. Johnson, Yang Yang, Nitesh V. Chawla. Interdisciplinary Center for Network Science and Applications, Department of Computer Science and Engineering, University of Notre Dame.

  2. Collaboration is an integral element of the scientific process that often leads to findings with significant impact.

  3. A real-world academic dataset from ArnetMiner [1, 2]: 1,712,433 authors, 2,092,356 papers, and 4,258,615 collaborators. References: [1] J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, Z. Su. ArnetMiner: Extraction and Mining of Academic Social Networks. KDD’08. [2] https://aminer.org/billboard/AMinerNetwork.

  4. Figure: year vs. number of authors per publication (x-axis: year, 1970 to 2010; y-axis: #authors per publication, 1.5 to 3.0). Research collaborations are becoming increasingly prevalent.

  5. Figure, three panels, year (1950-2010) vs.: number of publications, number of authors, and average papers per author (plotted series: #authors, #papers, #new-authors, #papers per author, #authors per paper; counts on logarithmic axes). Average publication output has remained roughly constant. Collaboration has substantially expanded.

  6. Figure: collaboration ego network of ego u with neighbors b, c, d, e, f, g, and v; the edge between u and v carries tie weight $w_{uv}$.
     Tie weight: $w_{uv} = \sum_{p \in P} \frac{1}{n_p}$
     Tie strength: $s_{uv} = \frac{w_{uv}}{\sum_{k \in \Gamma(u)} w_{uk}}$
     Here $P$ is the set of publications that u and v co-authored, $n_p$ is the number of authors of publication $p$, and $\Gamma(u)$ is the set of u’s collaborations in the ego network. u’s collaboration ego network consists of the ego u and u’s collaboration relationships, including the self-collaboration with u.
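
The tie weight and tie strength definitions on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the input format (one author list per paper), the function names, and the toy data are assumptions. It also assumes that the self-tie $w_{uu}$ sums $1/n_p$ over all of u's papers, solo papers included.

```python
from collections import defaultdict
from itertools import combinations_with_replacement


def tie_weights(papers):
    """Return w[u][v] = sum of 1/n_p over the papers p that u and v co-authored.

    `papers` is a list of author lists (hypothetical input format).
    The self-tie w[u][u] is accumulated over every paper of u (an assumption).
    """
    w = defaultdict(lambda: defaultdict(float))
    for authors in papers:
        authors = sorted(set(authors))  # ignore duplicated names on one paper
        n_p = len(authors)
        for u, v in combinations_with_replacement(authors, 2):
            w[u][v] += 1.0 / n_p
            if u != v:
                w[v][u] += 1.0 / n_p  # tie weight is symmetric
    return w


def tie_strengths(w, u):
    """Return s[v] = w_uv / sum_k w_uk for every v in u's ego network."""
    ego = w.get(u, {})
    total = sum(ego.values())
    if total == 0:
        return {}
    return {v: w_uv / total for v, w_uv in ego.items()}


# Toy data: three papers, the last one single-authored.
papers = [["a", "b"], ["a", "b", "c"], ["a"]]
w = tie_weights(papers)
s_a = tie_strengths(w, "a")
print(s_a)
```

Tie weight is symmetric, but tie strength is not: it is normalized by the ego's own total weight, so $s_{uv}$ and $s_{vu}$ generally differ.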

  7. Four signature metrics, illustrated on the ego network of u: Sociability, Dependence, Diversity, Self-Collaboration. Sociability: the number of collaborators, $|\Gamma(u)|$. This metric examines the number of collaboration relationships that researchers can maintain throughout their academic careers.
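
As a rough sketch, sociability can be counted directly from author lists. The input format and function name are hypothetical, and the ego is excluded from the count, which is an assumption: the slides do not say whether the self-collaboration tie counts toward $|\Gamma(u)|$.

```python
def sociability(papers, u):
    """Count the distinct co-authors of u across all papers.

    `papers` is a list of author lists (hypothetical input format).
    Assumption: the ego u is not counted as its own collaborator.
    """
    collaborators = set()
    for authors in papers:
        if u in authors:
            collaborators.update(authors)
    collaborators.discard(u)
    return len(collaborators)


papers = [["a", "b"], ["a", "b", "c"], ["a"]]
print(sociability(papers, "a"))
```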

  8. Dependence: the fraction of a researcher’s collaborators v fulfilling $s_{uv} > s_{vu}$, that is, $\frac{\sum_{v \in \Gamma(u)} I(s_{uv} > s_{vu})}{|\Gamma(u)|}$, where $I(\cdot)$ is the indicator function. This metric indicates the level of one’s research dependence.
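
A minimal sketch of the dependence score, assuming tie strengths are already available as a nested dictionary `s[u][v]`. The data layout, the function name, and the strength values below are hypothetical. The self-tie is excluded from both numerator and denominator, which is an assumption the slides do not settle.

```python
def dependence(s, u):
    """Fraction of u's collaborators v for which s_uv > s_vu.

    `s` maps each researcher to a dict of tie strengths (hypothetical layout).
    Assumption: the self-tie s[u][u] is left out of the comparison.
    """
    neighbours = [v for v in s.get(u, {}) if v != u]
    if not neighbours:
        return 0.0
    stronger = sum(
        1 for v in neighbours if s[u][v] > s.get(v, {}).get(u, 0.0)
    )
    return stronger / len(neighbours)


# Made-up strengths for three researchers; each row sums to 1.
s = {
    "a": {"a": 0.5, "b": 0.3, "c": 0.2},
    "b": {"b": 0.4, "a": 0.6},
    "c": {"c": 0.9, "a": 0.1},
}
print(dependence(s, "a"), dependence(s, "b"), dependence(s, "c"))
```

One way to read the comparison: when $s_{uv} > s_{vu}$, the tie takes up a larger share of u's collaboration than of v's.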

  9. Diversity: the Shannon entropy of the collaboration strength distribution: $-\sum_{v \in \Gamma(u)} s_{uv} \times \log s_{uv}$. This metric investigates how researchers distribute scientific collaborations among different collaborators.
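
The entropy above can be sketched in a few lines. The function name and input layout are hypothetical, and the natural logarithm is an assumption, since the slide does not state the log base.

```python
import math


def diversity(strengths):
    """Shannon entropy of a tie-strength distribution {collaborator: s_uv}.

    Assumptions: natural logarithm; zero-strength entries contribute nothing.
    """
    return -sum(s * math.log(s) for s in strengths.values() if s > 0)


# Evenly spread collaboration gives the maximum entropy for four ties.
even = {"b": 0.25, "c": 0.25, "d": 0.25, "e": 0.25}
# Concentrated collaboration gives a lower value.
skewed = {"b": 0.85, "c": 0.05, "d": 0.05, "e": 0.05}
print(diversity(even), diversity(skewed))
```

A researcher whose collaboration is concentrated on one partner scores near zero, while one who spreads effort evenly over many partners scores high.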

  10. Self-Collaboration: the fraction of ties that are self-collaboration, $s_{uu}$. This metric measures the efforts that are independent research, as compared to collaborative endeavors.
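
A small sketch of $s_{uu}$ computed straight from author lists, using the tie weight definition $w_{uv} = \sum_{p} 1/n_p$ from the earlier slide. The input format and function name are hypothetical. It assumes the self-tie is accumulated over all of u's papers, so each paper adds $1/n_p$ to $w_{uu}$ and a total of 1 to $\sum_k w_{uk}$.

```python
def self_collaboration(papers, u):
    """Return s_uu = w_uu / sum_k w_uk for researcher u.

    `papers` is a list of author lists (hypothetical input format).
    Assumption: w_uu sums 1/n_p over every paper of u, solo papers included.
    """
    w_self = 0.0   # w_uu
    w_total = 0.0  # sum of w_uk over k in u's ego network, including k = u
    for authors in papers:
        authors = set(authors)
        if u not in authors:
            continue
        n_p = len(authors)
        w_self += 1.0 / n_p
        w_total += 1.0  # n_p authors (including u) each receive 1/n_p
    return w_self / w_total if w_total else 0.0


papers = [["a", "b"], ["a", "b", "c"], ["a"]]
print(self_collaboration(papers, "a"))
```

Under these assumptions the score is the average of $1/n_p$ over u's papers, so a researcher who only publishes alone scores 1.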

  11. What are Turing Award winners’ collaboration signatures? Are they distinct from other researchers’? Do we have distinctive collaboration signatures conditioned on our scientific impact?
     - Turing Award winners
     - h-index
     - Number of top-venue publications
     - Big-hit publications

  12. h-index: a researcher’s h-index can be used to quantify his/her scientific impact. Reference: J. E. Hirsch. An Index to Quantify an Individual’s Scientific Research Output. PNAS 102(45). 2005.
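
For reference, the h-index is the largest h such that a researcher has h papers with at least h citations each. A minimal sketch with a hypothetical function name and made-up citation counts:

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


print(h_index([10, 8, 5, 4, 3]))  # prints 4
```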

  13. Given a researcher’s h-index in 2012 and the year of his/her first publication, his/her collaboration signature at each year is characterized. Figure: sociability (y-axis, 0 to 40) vs. career year (x-axis, 0 to 30; the x-th year of one’s research career), one curve per group: Turing winners and h-index [1, 9], [10, 19], [20, 29], [30, 39], [40, 49], [50, 59], [60, 123]. Researchers with higher h-indices have relatively greater sociability, though sociability increases to a peak for all groups.

  14. Figure: h-index (y-axis) vs. number of years since first publication (x-axis, #years). h-indices range from 25 to 83 in 2012.

  15. Given a researcher’s h-index in 2012 and the year of his/her first publication, his/her collaboration signature at each year is characterized. Figure: dependence (y-axis, 0 to 1) vs. career year (x-axis, 0 to 30; the x-th year of one’s research career). Researchers’ dependence scores generally decrease at the initial career stages and take time to increase.

  16. Given a researcher’s h-index in 2012 and the year of his/her first publication, his/her collaboration signature at each year is characterized. Figure: diversity (y-axis, 0 to 3) vs. career year (x-axis, 0 to 30; the x-th year of one’s research career). Between groups of researchers with different h-indices, diversity values tend to diverge over time, eventually stabilizing.

  17. Given a researcher’s h-index in 2012 and the year of his/her first publication, his/her collaboration signature at each year is characterized. Figure: self-collaboration (y-axis, 0 to 1) vs. career year (x-axis, 0 to 30; the x-th year of one’s research career). Between groups of researchers with different h-indices, a long-term difference in self-collaboration is identifiable early.

  18. Top Venues: a researcher’s number of top-venue publications can be used to quantify his/her scientific impact. Venues are extracted from 8 computer science focus areas, with the top 3 venues chosen for each area.

  19. Areas and top venues: Artificial Intelligence (AI): IJCAI, AAAI, ECAI; Information Retrieval (IR): SIGIR, ECIR, TREC; Computer Vision (CV): CVPR, ICCV, ECCV; Machine Learning (ML): ICML, NIPS, ECML; Theory (TH): FOCS, STOC, SODA; Databases (DB): SIGMOD, VLDB, ICDE; Data Mining (DM): KDD, ICDM, SDM; Natural Language Processing (NLP): ACL, EMNLP, COLING. Figure: sociability (y-axis, 0 to 250) vs. #top-venue papers (x-axis, logarithmic), one curve per area. Regardless of research area, the degree of sociability exhibited by researchers tends to increase with top-venue publications.

  20. Figure: dependence (y-axis, 0 to 1) vs. #top-venue papers (x-axis, logarithmic), one curve per area (AI, IR, CV, ML, TH, DB, DM, NLP; areas and venues as on slide 19). Regardless of research area, research dependence decreases with the number of publications in top venues.

  21. Figure: diversity (y-axis, 0 to 5) vs. #top-venue papers (x-axis, logarithmic), one curve per area (AI, IR, CV, ML, TH, DB, DM, NLP; areas and venues as on slide 19). Regardless of research area, the degree of diversity exhibited by researchers tends to increase with top-venue publications.

  22. Big-Hit Papers: a researcher’s most cited publication can be used to quantify his/her scientific impact.
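
A sketch of how researchers could be grouped by their most cited publication, using the citation bins from the figure legends on the following slides ([10, 100), [100, 1000), [1000, 10000), [10000, +)). The function name, input format, and handling of researchers below 10 citations are assumptions.

```python
def big_hit_bin(citations):
    """Return the legend bin for a researcher's most cited paper.

    `citations` is a list of per-paper citation counts (hypothetical input).
    Returns None when the top paper has fewer than 10 citations (assumption).
    """
    top = max(citations, default=0)
    for low, high in [(10, 100), (100, 1000), (1000, 10000)]:
        if low <= top < high:
            return f"[{low}, {high})"
    if top >= 10000:
        return "[10000, +)"
    return None


print(big_hit_bin([3, 250, 41]))  # prints [100, 1000)
```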

  23. Figure: sociability (y-axis, 0 to 150) vs. year (x-axis, 1975 to 2000), one curve per big-hit citation bin: [10, 100), [100, 1000), [1000, 10000), [10000, +). Researchers with high sociability tend to have big-hit publications.

  24. Figure: dependence (y-axis, 0 to 1) vs. year (x-axis, 1975 to 2000), one curve per big-hit citation bin: [10, 100), [100, 1000), [1000, 10000), [10000, +). Researchers with big-hit publications tend to have relatively low dependence.

  25. Figure: diversity (y-axis, 0 to 5) vs. year (x-axis, 1975 to 2000), one curve per big-hit citation bin: [10, 100), [100, 1000), [1000, 10000), [10000, +). Researchers with high diversity tend to have big-hit publications.

  26. Based on these findings, we use collaboration signatures to predict scientific impact.

  27. Predictiveness vs. First Year Publishing. Figure: $R^2$ and PCC (y-axis, 0 to 1) vs. year of first publication (x-axis, 1950 to 2000). Scientific impact can be reasonably inferred from our four simple collaboration signatures even across generations of researchers.

  28. Predictiveness vs. Number of Years Publishing. Figure: $R^2$ and PCC (y-axis, 0 to 1) vs. career year (x-axis, 0 to 30). With longer collaboration signatures, future scientific impact can be predicted with increasing fidelity (as measured by $R^2$ and PCC).
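
The two evaluation measures named on these slides, the coefficient of determination $R^2$ and the Pearson correlation coefficient (PCC), can be computed as below. The slides do not specify the prediction model, so this sketch covers only the metrics. The function names and the numbers are made up for illustration.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("PCC is undefined for a constant sequence")
    return cov / (sx * sy)


# Hypothetical observed vs. predicted impact scores.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(observed, predicted), pearson(observed, predicted))
```

$R^2$ penalizes any deviation between predicted and observed values, while PCC only measures how linearly the two vary together, so the two can diverge when predictions are systematically offset or scaled.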
