Cross-domain Authorship Attribution
Overview of the Author Identification Task at PAN-2018
PAN@CLEF2018, Avignon, 11 September 2018
Mike Kestemont, Efstathios Stamatatos, Walter Daelemans, Benno Stein, Martin Potthast
Authorship attribution
• Closed-set: assign an anonymous text to one author from a set of candidate authors (classification problem)
• Importance and difficulty of benchmarking: need for
  • Large but varied corpora
  • Accessible data (free of rights)
  • Control over topic and genre (domain)
  • Multilingual, yet comparable datasets
What is fan fiction?
• Fiction produced by non-professional authors
• Explicitly builds on previously published fiction (characters, themes, settings, etc.)
Canon Fandom
Attractive?
Characteristic            Advantage
Online, open platforms    Digitally accessible
Unmediated                No editorial interference
Explicit about canon      Rich metadata
Global phenomenon         Language-independent
Balanced cross-domain design
• All test texts, across 5 languages (!), come from a target fandom (Harry Potter) that is not represented in the training data
• Each author: at least 7 training texts
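The design above can be sketched as a simple split by fandom. This is only an illustration under stated assumptions: the record layout `(author, fandom, text)`, the author names, the non-target fandoms, and the helper names `cross_domain_split` and `make_records` are all hypothetical and not taken from the PAN-2018 data or tooling. It also assumes that, in a closed-set setting, test texts by authors without enough training material are dropped.

```python
from collections import defaultdict


def cross_domain_split(records, target_fandom, min_train_texts):
    """Split (author, fandom, text) records into training and test material.

    Texts from the target fandom go to the test set; everything else is
    training data. Authors with too few training texts are removed, and
    their test texts with them (closed-set assumption).
    """
    train = defaultdict(list)
    test = []
    for author, fandom, text in records:
        if fandom == target_fandom:
            test.append((author, text))
        else:
            train[author].append(text)

    # Keep only candidate authors with enough training texts.
    candidates = {a: ts for a, ts in train.items() if len(ts) >= min_train_texts}
    # Closed set: every test text must belong to a remaining candidate.
    test = [(a, t) for a, t in test if a in candidates]
    return candidates, test


def make_records():
    """Build a made-up toy corpus: three authors, two training fandoms."""
    records = []
    for author, n_train in [("author_a", 7), ("author_b", 8), ("author_c", 3)]:
        for i in range(n_train):
            fandom = "Sherlock" if i % 2 == 0 else "Star Wars"
            records.append((author, fandom, f"{author} training story {i}"))
        records.append((author, "Harry Potter", f"{author} test story"))
    return records


candidates, test_set = cross_domain_split(make_records(), "Harry Potter", 7)
print(sorted(candidates))  # ['author_a', 'author_b'] (author_c has only 3 training texts)
print(len(test_set))       # 2
```

Because the split is keyed on the fandom label alone, no test-fandom text can leak into training, which is the property that makes the setup cross-domain.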
Submissions
• Compared to an SVM baseline using character 3-grams
Effect of number of authors
Significance
Model criticism
• Dominance of n-grams (TF-IDF), instance-based approaches, and SVMs
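To make the n-gram (TF-IDF) and instance-based ingredients concrete, here is a minimal pure-Python sketch. It is not the task baseline and not any participant's system: a 1-nearest-neighbour rule with cosine similarity stands in for the classifier (no SVM is trained), the smoothed IDF formula is one common choice, and the function names and toy texts are assumptions made for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def fit_idf(docs, n=3):
    """Compute a smoothed inverse document frequency per n-gram."""
    df = Counter()
    for doc in docs:
        df.update(set(char_ngrams(doc, n)))  # count each n-gram once per doc
    total = len(docs)
    return {g: math.log((1 + total) / (1 + c)) + 1.0 for g, c in df.items()}


def tfidf_vector(text, idf, n=3):
    """Build an L2-normalised sparse TF-IDF vector (dict) for a text."""
    counts = char_ngrams(text, n)
    vec = {g: c * idf[g] for g, c in counts.items() if g in idf}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {g: v / norm for g, v in vec.items()} if norm else vec


def cosine(u, v):
    """Dot product of two L2-normalised sparse vectors."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(g, 0.0) for g, w in u.items())


def attribute(unknown, train, n=3):
    """Instance-based attribution: return the author of the most similar text."""
    idf = fit_idf([text for text, _ in train], n)
    query = tfidf_vector(unknown, idf, n)
    best = max(train, key=lambda pair: cosine(query, tfidf_vector(pair[0], idf, n)))
    return best[1]


# Made-up toy data: two authors with deliberately different styles.
train = [
    ("I do not know whether it shall rain; nevertheless, I shall go.", "author_a"),
    ("Whether or not it shall please them, I shall nevertheless write.", "author_a"),
    ("gonna grab some food lol, u wanna come??", "author_b"),
    ("lol that movie was sooo good, gonna watch it again!!", "author_b"),
]
unknown = "I shall nevertheless remain, whether it rains or not."
print(attribute(unknown, train))  # author_a on this toy data
```

Character n-grams pick up function words, affixes, and punctuation habits together, which is one plausible reason they remain hard to beat; swapping the nearest-neighbour rule for a linear SVM over the same vectors would give the other dominant configuration named on the slide.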
Post-hoc analyses
• More varied training data helps (cf. Sapkota 2014)
• Influence of the original author is not a major factor
Observations
• Fan fiction validated: feasible, but not easy, so room for progress
• (Stylistic) influence of canon author not an issue? Focus on (semantic) domain
• Some stagnation in the field, both in feature extraction and classification
• (Where is deep learning? Cf. Bagnall@PAN2016)
Stay tuned
• Next year at PAN 2019 (Lugano)
• Focus on open-set attribution in fan fiction
• No longer a single target fandom: more “adversarial” setup
• Less restricted design: larger, more complex problems to push innovation
References
• Bagnall, D.: Authorship Clustering Using Multi-headed Recurrent Neural Networks. Notebook for PAN at CLEF 2016.
• Kestemont, M. et al.: Overview of the Author Identification Task at PAN-2018: Cross-domain Authorship Attribution and Style Change Detection. PAN 2018.
• Hellekson, K., Busse, K. (eds.): The Fan Fiction Studies Reader. University of Iowa Press (2014).
• Sapkota, U. et al.: Not all character n-grams are created equal: A study in authorship attribution. COLING 2014.
• Stamatatos, E.: A Survey of Modern Authorship Attribution Methods. Journal of the American Society for Information Science and Technology 60, 538–556 (2009).
