ELECTRONIC TEXT REUSE ACQUISITION PROJECT: INTRODUCTION & MOTIVATION

SLIDE 1

ELECTRONIC TEXT REUSE ACQUISITION PROJECT INTRODUCTION & MOTIVATION

Marco Büchler, Greta Franzini, Emily Franzini

SLIDE 2

TABLE OF CONTENTS

  • 1. Who are we?
  • 2. Motivation

2/16

SLIDE 3

WHO ARE WE?

SLIDE 4

WHO AM I?

  • 2001-2002: Head of the Quality Assurance department at a software company;

  • 2006: Diploma in Computer Science on large-scale co-occurrence analysis;

  • 2007: Consultant for several SMEs in IT sector;
  • 2008: Technical project management of the eAQUA project;
  • 2011: PI and project manager of the eTRACES project;
  • 2013: PhD in Digital Humanities on Text Reuse;
  • 2014: Head of the Early Career Research Group eTRAP at the University of Göttingen.

4/16

SLIDE 5

ABOUT ME

Education

  • Humanities & Further Maths Diploma (IT)
  • Classics BA Honours (UK)
  • Digital Humanities MA (UK)
  • Part-time PhD student (UCLDH, UK): Digital Editions
      • Catalogue of Digital Editions (now a collaboration with ACDH)
      • Digital edition of an ancient Latin manuscript

Work

  • Full-time post-doctoral researcher for the eTRAP Early Career Research Group (DE): Automatic Text Reuse Detection and Analysis

5/16

SLIDE 6

WHO AM I?

  • 2008-2011: BA Latin & Ancient Greek at University College London
  • 2011-2012: MSc Management Science & Innovation at University College London

  • 2012-2013: Liaison Officer and Administration for the preservation of cultural assets at FAI
  • 2013-2014: Research Associate at University of Leipzig (Chair for Digital Humanities)

  • 2014-2016: Research Associate at University of Göttingen (Digital Humanities in Dept. Computer Science)

6/16

SLIDE 7

MOTIVATION

SLIDE 8

VENICE 2016 - TRACER TUTORIAL

8/16

SLIDE 9

WHO IS THIS PERSON?

9/16

SLIDE 10

“REUSE FROM SAME SOURCE”: COMMONALITIES & DIFFERENCES

10/16

SLIDE 11

WITTGENSTEIN’S “FAMILY RESEMBLANCE”

Family resemblance is an equivalence relation that clusters together common objects with similar, but not identical, characteristics.

Family resemblance is hierarchical, as in the preceding examples: “Greta”, “Franzinis”, “human”, “creature”.
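
One way to make the clustering idea concrete is the following minimal sketch. It is an illustration under stated assumptions, not eTRAP's method: the Jaccard measure, the union-find linking, the function names and the toy data are all hypothetical choices made here. Direct similarity alone is not transitive (two objects can each resemble a third without resembling each other), so the sketch links objects along chains of resemblance to obtain a proper partition. Lowering the threshold yields coarser clusters, which mirrors the hierarchical reading.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of characteristics that two objects have in common."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def resemblance_clusters(objects, threshold):
    """Partition objects by chains of pairwise resemblance.

    Two objects are linked if their Jaccard similarity reaches
    `threshold`; links are made transitive with union-find, so the
    result is a partition (the classes of an equivalence relation).
    """
    parent = {name: name for name in objects}

    def find(x):
        # Follow parent pointers to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x, y in combinations(objects, 2):
        if jaccard(objects[x], objects[y]) >= threshold:
            parent[find(x)] = find(y)

    clusters = {}
    for name in objects:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy data: each object is a set of characteristics.
toy_objects = {
    "quote_a": {"in", "the", "beginning", "was", "word"},
    "quote_b": {"in", "the", "beginning", "was", "logos"},
    "quote_c": {"at", "the", "beginning", "was", "logos"},
    "other": {"stealing", "from", "many", "is", "research"},
}

print(resemblance_clusters(toy_objects, 0.6))
```

In the toy data, quote_a and quote_c are only weakly similar to each other, but both resemble quote_b, so at a threshold of 0.6 all three end up in one cluster while "other" stays apart.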

11/16

SLIDE 12

FORENSIC VIEW

Evaluation of the reuse detection process by forensic criteria (standard in biometry):

  • Universality: How universal can a characteristic be? (Example: for about 2% of all humans no fingerprint can be taken.)
  • Uniqueness: Different and independent “instances” should not share common characteristics.
  • Permanence: How resistant is a characteristic over time?
  • Collectability: Characteristics should be easy and simple to detect.
  • Performance: This includes the precision, speed and robustness of the measuring technique.
  • Acceptability: Acceptance of the technique in (academic) usage.
  • Circumvention: It should be as difficult as possible to cheat a detection system.
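
The precision part of the Performance criterion can be illustrated with a small sketch. It assumes, purely for illustration, that a detector outputs pairs of passage identifiers and that a hand-checked gold standard of true reuse pairs is available; the function name and the data are hypothetical, and recall is added here as the usual companion measure although the slide does not name it.

```python
def precision_recall(detected, gold):
    """Compare detected reuse pairs against a gold standard.

    precision: share of detected pairs that are true reuse
    recall:    share of true reuse pairs that were detected
    """
    detected, gold = set(detected), set(gold)
    true_positives = len(detected & gold)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example: pairs of (source passage id, target passage id).
detected_pairs = {(1, 7), (2, 9), (3, 4), (5, 6)}
gold_pairs = {(1, 7), (2, 9), (8, 10)}

precision, recall = precision_recall(detected_pairs, gold_pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four detected pairs are correct and two of the three true pairs are found.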

12/16

SLIDE 13

ETRAP’S OBJECTIVE

Title: eTRAP - electronic Text Reuse Acquisition Project

Premise: Language is a changing system. Compared to biometry, its volatility is much higher.

  • Research on the characteristics
      • What are good characteristics?
      • Which characteristics are stable, and which are volatile and therefore not helpful in the detection process?
  • Research on the reuse process
      • Begins with: Why do we quote what we quote?
      • Continues with: If changes in the reuse process happen, why do they happen and what is the model behind them (if one exists)?
      • Ends with: Understanding paraphrases and allusions

13/16

SLIDE 14

ABOUT ETRAP

Electronic Text Reuse Acquisition Project (eTRAP)

Interdisciplinary Early Career Research Group funded by the German Ministry of Education & Research (BMBF).

Budget: €1.6M.
Duration: March 2015 - February 2019. Research since October 2015.
Team: 4 core staff; 5-9 research & student assistants; Bachelor's, Master's and PhD thesis students.

  • Interdisciplinary: Classics, Computer Science, German Literature, Mathematics, Philosophy, Cognitive Psychology and Literature Studies.
  • International: Currently eight nationalities.

14/16

SLIDE 15

CONTACT

Visit us
http://www.etrap.eu
contact@etrap.eu

Stealing from one is plagiarism, stealing from many is research (Wilson Mizner, 1876-1933)

15/16

SLIDE 16

LICENCE

The theme this presentation is based on is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Changes to the theme are the work of eTRAP.


16/16