
Networks of Computational Social Science, Ian Dennis Miller



  1. Networks of Computational Social Science
     Ian Dennis Miller, 2018-11-22

  2. Introduction

  3. Objective
     In order to study the scholarly literature within which my own work is embedded, this work will discuss the construction of a citation library and the analysis of its co-authorship network to identify communities of collaboration.
     Figure 1: https://commons.wikimedia.org/wiki/File:Goal_Japan_vs_Uz_2009.JPG

  4. Structure
     - Introduction
     - Motivation
     - Literature Review
     - Background
     - Methods
     - Results
     - Discussion
     - Conclusion
     Figure 2: https://commons.wikimedia.org/wiki/File:Structure_Paris_les_Halles.jpg

  5. Motivation
     - Locate myself within the literature
     - No specific literature seems to exist
     - Relevant methods in “distant” literatures
     - Applied insights from network science
     - Network methods for scholarly synthesis
     - Goal: become the bridge
     - I need to discover the audience
     Figure 3: https://commons.wikimedia.org/wiki/File:Motivation%3F.jpg

  6. Starting Point
     - Reading list for my PhD oral defense
     - Also basis for a chapter in dissertation
     - Seeded bibliography with 90 articles
       - 30 articles about contagion
       - 30 articles about social network analysis
       - 30 articles about memes
     - Find common thread that ties articles together
     Figure 4: https://commons.wikimedia.org/wiki/File:Arokia_Rajiv(Silver)_For_India_Starts_To_Run.jpg

  7. Background

  8. Small World Problem
     - Travers and Milgram (1967)
     - Mail letters to Kansas
     - Return to Boston by hand (not mail)
     - On a “first-name basis” with peers
     - Do all our social circles overlap? (yes)
     - AKA: “Six Degrees of Separation”
     Figure 5: Path length distribution

  9. Strength of Weak Ties
     - Granovetter (1973): close is influential
     - Why? Network is denser
     - Many links among all possible connections
     - But weak ties connect distant “clumps”
     - A Small World: unexpectedly short distances
     - How much does this matter for scholarship?
     - Libraries and search tools vs. weak ties
     Figure 6: https://commons.wikimedia.org/wiki/File:Weak_tie_bridge.png

  10. Small World Networks
     - Watts and Strogatz (1998)
     - “Starting from a ring lattice...”
     - “with n vertices and k edges per vertex”
     - “rewire edges at random with probability p”
     - Regularity: p = 0
     - Disorder: p = 1
     - Small-world coupling facilitates epidemics
     - “Shortcuts” connect across long distances
     - Dynamic structure of co-authorship
     Figure 7: https://commons.wikimedia.org/wiki/File:Small-world-network-example.png
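The rewiring procedure quoted on this slide can be sketched in a few lines of standard-library Python. This is an illustrative reading of the quoted description, not code from the presentation: the function names `ring_lattice` and `rewire`, the `seed` parameter, and the choice to forbid self-loops and duplicate edges are all assumptions.

```python
import random


def ring_lattice(n, k):
    """Return the edge set of a ring of n vertices, each joined to its k nearest neighbours."""
    edges = set()
    for i in range(n):
        for offset in range(1, k // 2 + 1):
            j = (i + offset) % n
            edges.add((min(i, j), max(i, j)))
    return edges


def rewire(n, k, p, seed=None):
    """Start from a ring lattice and rewire each lattice edge with probability p."""
    rng = random.Random(seed)
    edges = ring_lattice(n, k)
    # Visit nearest-neighbour edges first, then next-nearest, and so on.
    for offset in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + offset) % n
            old = (min(i, j), max(i, j))
            if rng.random() < p:
                # Assumption: the new endpoint may not create a self-loop or a duplicate edge.
                candidates = [
                    v for v in range(n)
                    if v != i and (min(i, v), max(i, v)) not in edges
                ]
                if candidates:
                    new_j = rng.choice(candidates)
                    edges.remove(old)
                    edges.add((min(i, new_j), max(i, new_j)))
    return edges
```

With `p = 0` the function returns the regular lattice unchanged, and with `p = 1` every edge is rewired, which matches the "Regularity" and "Disorder" endpoints on the slide. The number of edges stays at `n * k / 2` throughout.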

  11. Scale-free Networks
     - Barabasi and Albert (1999)
     - Applies to: World Wide Web, genetics...
     - Scale-free; power-law distribution
     - Preferential attachment network
     - “Rich get richer”
     - Dynamic structure of scholarly citation
     - (These systems tend to be large)
     Figure 8: https://commons.wikimedia.org/wiki/File:Scale-free_network_sample.png
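A minimal sketch of preferential attachment may help make "rich get richer" concrete. It is not taken from the presentation: the function name `preferential_attachment`, the parameters `n` (final node count) and `m` (edges added per new node), and the fully connected starting core are assumptions chosen for illustration.

```python
import random


def preferential_attachment(n, m, seed=None):
    """Grow a graph to n nodes; each new node attaches to m existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Assumption: begin with a fully connected core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    # Every node appears in this list once per edge it touches, so a uniform
    # draw from the list is a degree-proportional draw over nodes.
    endpoints = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((t, new))
            endpoints.extend((t, new))
    return edges


def degrees(edges):
    """Count how many edges touch each node."""
    counts = {}
    for a, b in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts
```

Calling `degrees(preferential_attachment(200, 2, seed=0))` and sorting the values typically shows a few heavily connected hubs alongside many nodes that keep only their initial `m` links, which is the skewed pattern the slide attributes to scholarly citation.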

  12. Scientific Collaboration Networks
     - Newman (2001)
     - Examination of coauthorship
     - Mined: biomedical, physics, comp sci
     - Cornell arXiv
     - Detected small world networks
     - How are “silos” even possible?
     - Clustering, from a network perspective
     - Average path lengths are longer
     Figure 9: Newman (2001), PNAS, p. 408

  13. Scholarly Communication and Bibliometrics
     - Borgman and Furner (2002)
     - Review of bibliometrics
     - Information Sciences perspective
     - Provides taxonomy:
       - Behavior: writing, linking, submit, collaboration
       - Aggregation: person, group, domain, nation
       - Format: paper, lit review, reference
     Figure 10: Borgman & Furner (2002), p. 9

  14. Co-authorship: Structural/Socio-academic Groups
     - Rodriguez and Pepe (2008)
     - Community detection study
     - “Even in interdisciplinary research...”
     - “coauthorship is driven by...”
     - “departmental”
     - “institutional affiliation.”
     Figure 11: Rodriguez & Pepe (2008)

  15. Choosing coauthorship
     - Identify the people to identify the beliefs
     - Assume authors endorse the publication
     - Stronger ties than citation network
     - Loosely tracks institutional network
     - Can leverage biographical info
     - Fundamentally: a tractable proposition
     - Does not require comprehensive data mining
     - Meaningful results from under 2,500 articles
     Figure 12: https://commons.wikimedia.org/wiki/File:Accademia_-_Maggiotto_Self-portrait_with_two_students.jpg

  16. Alternatives to coauthorship
     - Citation
       - Easy to acquire structured data
     - Co-citation
       - Related works are probably co-cited
     - Acknowledgment
       - Stronger ties than citations
     - Institutional/mentorship
       - Totally unstructured
     Figure 13: https://commons.wikimedia.org/wiki/File:Option-key.jpg

  17. Hypotheses
     - Coauthorship indicates shared belief
     - Beliefs accumulate into a discipline
     - Metaphors/synonyms:
       - Colleges, schools, arms, lines of reasoning
     - Weak ties connect the silos
     - But weak ties may be harder to spot
     - For weak ties that would be bridges:
       - Longer chains of strong ties also exist
     Figure 14: https://commons.wikimedia.org/wiki/File:Mad_scientist_caricature.png

  18. Methods

  19. Methods Overview
     - Data Methods
       - Acquisition, storage
     - Scholarship Methods
       - Search
     - Analysis Methods
       - Networks, statistics
     - Reporting
       - Visualization, interaction
     Figure 15: https://commons.wikimedia.org/wiki/File:The_Earth_seen_from_Apollo_17.jpg

  20. Data Methods: BibTeX
     - Imperfect but ubiquitous
     - No canonical standard
     - Highest adoption rate format, online
     - Many incomplete parsers/writers
     - Parsers for all languages
       - Zotero, LaTeX, R, and Python
     - Good compromise
     Figure 16: https://commons.wikimedia.org/wiki/File:Example_bibtex2.png

  21. Data Methods: Zotero
     - Open source citation manager
     - Manage database of citations
     - Provides plug-in system
     - Native import/export BibTeX
     - BetterBibTeX: handy plug-in
       - Sync Zotero library to .bib file
     - Good integration with web browsers
     - (Facilitates scholarship)
     Figure 17: My Zotero library

  22. Scholarship Methods
     - Search tools
       - Google Scholar
         - Problems with rate limits
         - Dystopia
       - WorldCat, Citeseer
       - DBLP; APA; arXiv
     - Library portal
       - Multiple publisher licenses
       - Permissions aggregation
     Figure 18: University of Toronto Library Website

  23. Observations of academic publishing over time
     - post-2010: nearly complete, largely computer-readable
     - post-2000: excellent availability, maybe not OCR
     - post-1990: good availability, lower OCR rate, irregularities
     - 1950-1990: good indexing, okay availability, irregular
     - 1920-1950: okay indexing, some availability, very irregular
     - pre-1920: classic offline scholarship methods
     Figure 19: https://commons.wikimedia.org/wiki/File:Osgoode_Library_Stacks_2007.jpg

  24. Biographic methods
     - Necessary farther back in history
     - Single-author papers
     - Sources
       - Wikipedia
       - University biographical resources
     - Used to identify
       - Contemporaries
       - Mentors
     Figure 20: https://commons.wikimedia.org/wiki/File:Isaac_Newton_grave_in_Westminster_Abbey.jpg

  25. Analysis Methods
     - Preparation
       - Co-authorship
       - Connected Components
     - Calculate
       - Path Length
       - Clustering Coefficient
       - Modularity
     - Have I observed a small world?
     Figure 21: https://commons.wikimedia.org/wiki/File:Mechanical-calculator-Brunsviga-800-02.jpg
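Three of the quantities named on this slide (connected components, average path length, clustering coefficient) can be sketched with the standard library alone. The code below is an illustration under assumptions, not the analysis used in the presentation: the function names and the toy edge list are hypothetical, the graph is treated as undirected and unweighted, and modularity is left out.

```python
from collections import deque
from itertools import combinations


def adjacency(edges):
    """Build an undirected adjacency map from a list of (a, b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def connected_components(adj):
    """Return a list of node sets, one per connected component."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in component:
                    component.add(v)
                    queue.append(v)
        seen |= component
        components.append(component)
    return components


def average_path_length(adj, component):
    """Mean shortest-path length over all node pairs in one component (BFS from each node)."""
    total, pairs = 0, 0
    for source in component:
        dist, queue = {source: 0}, deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with fewer than two neighbours score 0."""
    scores = []
    for node, nbrs in adj.items():
        if len(nbrs) < 2:
            scores.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        scores.append(2 * links / (len(nbrs) * (len(nbrs) - 1)))
    return sum(scores) / len(scores)


# Toy co-authorship graph (made up): a triangle with a pendant author, plus a separate pair.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("E", "F")]
adj = adjacency(toy_edges)
components = connected_components(adj)
largest = max(components, key=len)
print(len(components), average_path_length(adj, largest), average_clustering(adj))
```

Answering "Have I observed a small world?" would additionally require comparing these values against a random graph of the same size and density, which this sketch does not attempt.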

  26. Analysis: Extract Co-authorships
     - Challenge: fix irregularities with citations
     - Regularize .bib file with bibclean (Beebe, 2015)
     - Load .bib into R via bibtex package (Francois, et al., 2017)
     - Clean author names
       - Capitalization
       - Abbreviations
       - Accents
       - Spaces
     - Iterate through citations in bibliography
     - Extract pairwise author combinations
     - Append author pairs to edge list
     Figure 22: https://commons.wikimedia.org/wiki/File:Sorting_machine_(Census)_LCCN2016823355.jpg
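The slide describes this pipeline in R with the bibtex package. The sketch below restates the last four steps in Python purely for illustration; it swaps in the standard library for the R tooling and assumes the author fields have already been read from the .bib file as BibTeX-style strings with names separated by " and ". The function names, the normalization rules, and the sample records are all hypothetical.

```python
import unicodedata
from collections import Counter
from itertools import combinations


def clean_name(name):
    """Normalize one author name: strip accents, reorder 'Last, First',
    drop abbreviation periods, collapse spaces, and unify capitalization."""
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if "," in plain:
        last, first = plain.split(",", 1)
        plain = f"{first} {last}"
    tokens = plain.replace(".", " ").split()
    return " ".join(token.capitalize() for token in tokens)


def coauthor_edges(author_fields):
    """Turn a list of BibTeX author fields into a list of author-pair edges."""
    edges = []
    for field in author_fields:
        authors = sorted({clean_name(a) for a in field.split(" and ")})
        # Every pair of authors on the same citation becomes one edge.
        edges.extend(combinations(authors, 2))
    return edges


# Made-up author fields showing the irregularities listed on the slide.
sample_fields = [
    "Müller, Anna and SMITH,  J. and Lee, Kim",
    "smith, j and Lee, Kim",
    "Okafor, Ada",
]
edge_list = coauthor_edges(sample_fields)
edge_weights = Counter(edge_list)
```

Sorting the cleaned names before pairing keeps each undirected edge in one canonical order, so repeated collaborations accumulate on a single key in `edge_weights`. Single-author entries contribute no edges, which is consistent with the deck's note that biographic methods are needed for single-author papers.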
