Computational semantics for the humanities (Diarmuid Ó Séaghdha)

Computational semantics for the humanities
Diarmuid Ó Séaghdha
Natural Language and Information Processing Group
Computer Laboratory, University of Cambridge
do242@cam.ac.uk
Translation and the Digital, 25 April 2014

Introduction
◮ “Big Data” revolution:
  ◮ We have access to more textual data than any human could ever read.
  ◮ We can perform some kinds of automated analysis over large datasets.
◮ For humanities researchers:
  ◮ Data mining is a tool that facilitates asking questions about language use.
  ◮ Data mining is not a question or an answer.
◮ Natural Language Processing (NLP) research gives us computational methods for analysing and interpreting text.

Corpus frequency
[Figure: proportional frequencies of "computer" and "mouse" in the Google Books corpus, 1900 to 2000. Vertical axis in units of 10^-4, ranging from 0 to 2.5.]
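A proportional frequency of the kind plotted here is a word's count in a given year divided by the total number of tokens for that year. The following is a minimal sketch of that calculation. The function name `proportional_frequencies`, the input layout (a dict from year to token list) and the toy data are all assumptions made for illustration. They are not taken from the Google Books corpus or from the talk.

```python
from collections import Counter


def proportional_frequencies(tokens_by_year, word):
    """Return {year: count(word) / total tokens in that year}."""
    result = {}
    for year, tokens in sorted(tokens_by_year.items()):
        counts = Counter(t.lower() for t in tokens)
        total = sum(counts.values())
        # Guard against empty years so we never divide by zero.
        result[year] = counts[word] / total if total else 0.0
    return result


# Toy data, invented for illustration only.
toy_corpus = {
    1900: "the mouse ran past the cat".split(),
    2000: "the computer mouse and the computer screen".split(),
}

computer_freq = proportional_frequencies(toy_corpus, "computer")
mouse_freq = proportional_frequencies(toy_corpus, "mouse")
```

Dividing by the yearly total matters because the amount of printed text grows over time, so raw counts alone would make almost every word appear to become more frequent.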

Semantics: The distributional hypothesis
◮ Imagine that tezgüino is a rare English word, and you saw the word used in the following sentences:
  1. A bottle of tezgüino is on the table.
  2. Everyone likes tezgüino.
  3. Tezgüino makes you drunk.
  4. We make tezgüino out of corn.
  (Lin, 1998)
◮ Can you guess what tezgüino means?
◮ What kind of things do you expect will be similar to tezgüino?
◮ The Distributional Hypothesis: Two words are expected to be semantically similar if they have similar patterns of co-occurrence in observed text.
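One simple way to make "patterns of co-occurrence" concrete is to count the words that appear near a target word. The sketch below does this for the four example sentences. The function name `context_profile`, the whitespace tokenisation and the window size of three tokens are assumptions chosen for illustration, and real systems often use syntactic dependencies instead of a fixed window.

```python
from collections import Counter


def context_profile(sentences, target, window=3):
    """Count words appearing within `window` tokens of `target`."""
    profile = Counter()
    for sentence in sentences:
        # Crude tokenisation: split on whitespace, strip punctuation, lowercase.
        tokens = [t.strip(".,").lower() for t in sentence.split()]
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    profile[tokens[j]] += 1
    return profile


sentences = [
    "A bottle of tezgüino is on the table.",
    "Everyone likes tezgüino.",
    "Tezgüino makes you drunk.",
    "We make tezgüino out of corn.",
]

profile = context_profile(sentences, "tezgüino")
```

Context words such as "bottle", "drunk" and "corn" end up in the profile, and these are the same cues a human reader uses to guess that the word names an alcoholic drink.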

Co-occurrences and similarity
◮ We can produce a distributional “profile” of a word from a corpus:
  farmer: part-time, sheep, peasant, tenant, wife, crop, ...
  doctor: nurse, junior, prescribe, consult, patient, surgery, ...
  hospital: psychiatric, memorial, discharge, admission, clinic, ...
◮ We can compute similarity between words by comparing their profiles.
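Cosine similarity is one common way to compare two profiles. It is used here as an illustrative choice, since the slides do not say which measure was applied. In the sketch below, the counts attached to each context word are invented for illustration and the function name `cosine_similarity` is an assumption.

```python
import math
from collections import Counter


def cosine_similarity(p, q):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


# Invented counts; only some of the context words come from the slide.
doctor = Counter({"nurse": 4, "patient": 5, "prescribe": 2, "surgery": 3})
hospital = Counter({"nurse": 3, "patient": 4, "clinic": 2, "admission": 2})
farmer = Counter({"sheep": 5, "crop": 4, "tenant": 2, "wife": 1})

sim_doctor_hospital = cosine_similarity(doctor, hospital)
sim_doctor_farmer = cosine_similarity(doctor, farmer)
```

With these toy counts, "doctor" and "hospital" share context words and score well above zero, while "doctor" and "farmer" share none and score exactly zero.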

Semantic space visualisation
[Figure: two-dimensional plot of a semantic space built from the British National Corpus, top 5000 dependencies. Words plotted: kangaroo, shark, woman, pet, worker, man, doctor, cat, nurse, chicken, surgeon, dog, vet, fish, apple, wine, hospital, food, factory, salad, cinema, hammer, beer, pizza, surgery, tool, computer.]

Discovering semantic classes
BNC nouns, method related to Latent Dirichlet Allocation (topic modelling)

Class 1     Class 2       Class 3   Class 4
attack      test          line      university
raid        examination   axis      college
assault     check         section   school
campaign    testing       circle    polytechnic
operation   exam          path      institute
incident    scan          track     institution
bombing     assessment    arrow     library
offensive   sample        curve     hospital

Tracking meaning over time
◮ Ongoing project (with Meng Zhang)
◮ We know that language changes over time.
◮ Words change their meaning by adding and losing senses and associations.
◮ Can we study this behaviour in a large corpus?
◮ Goal: “word biographies”.
◮ A historian of ideas might be interested in what a word meant to people at different points in time.

Tracking meaning over time
[Figure: meaning consistency of "computer" and "mouse" in the Google Books corpus, 1900 to 2000. Vertical axis from 0.2 to 1.]
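The slides do not define how "meaning consistency" is computed. One plausible reading, offered purely as an assumption, is the similarity between a word's co-occurrence profile in one period and its profile in the previous period. The sketch below implements that reading with cosine similarity. The function names and the per-period counts for "mouse" are invented for illustration and do not reproduce the project's method or data.

```python
import math
from collections import Counter


def cosine(p, q):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def consistency_series(profiles_by_period):
    """Map each period to its profile's similarity with the previous period."""
    periods = sorted(profiles_by_period)
    return {
        later: cosine(profiles_by_period[earlier], profiles_by_period[later])
        for earlier, later in zip(periods, periods[1:])
    }


# Hypothetical context counts for "mouse" in three periods.
mouse_profiles = {
    1960: Counter({"cat": 5, "trap": 3, "cheese": 2}),
    1980: Counter({"cat": 4, "trap": 2, "cheese": 2, "computer": 1}),
    2000: Counter({"computer": 6, "click": 4, "cat": 2, "keyboard": 3}),
}

series = consistency_series(mouse_profiles)
```

Under this reading, a sharp drop in the series marks a period in which the word picked up new associations, as the toy "mouse" data does between 1980 and 2000.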

Conclusion
◮ We have methods for extracting meaning from document collections:
  ◮ Comparing words and texts
  ◮ Clustering words/concepts
  ◮ Identifying themes in a corpus
  ◮ Identifying associations between words/concepts
◮ We need users in other fields to provide interesting questions.
◮ If you have ideas, say hi! Or send me an email at do242@cam.ac.uk.
