Betweenness relation orientated by Guttman effect in critical edition




slide-1
SLIDE 1

Betweenness relation orientated by Guttman effect in critical edition

  • M. Le Pouliquen (1)
  • M. Csernel (2)

(1) Telecom Bretagne, Lab-STICC UMR 3192, BP 832, 29285 Brest Cedex, France
(2) AXIS, INRIA Rocquencourt, BP 105, 78180 Le Chesnay Cedex, France

CARME 2011: The 10th International Conference on Correspondence Analysis and Related Methods

1 / 32

slide-2
SLIDE 2

Plan of the presentation

  1. Introduction
  2. Different characterizations for betweenness
  3. Textual tradition De Nuptiis and first result
  4. Application to seriation
  5. Conclusions

2 / 32

slide-3
SLIDE 3

Plan

  1. Introduction
  2. Different characterizations for betweenness
  3. Textual tradition De Nuptiis and first result
  4. Application to seriation
  5. Conclusions

3 / 32

slide-4
SLIDE 4

Critical edition

A critical edition project consists of several steps:

  1. Inventory of the manuscripts in the corpus, called witnesses.
  2. Codicological and paleographic study of the manuscripts, in order to carry out a first classification.
  3. Text collation.
  4. Development of the stemma codicum, in order to explain the history of the text.
  5. Selection of readings.
  6. Critical edition.

4 / 32

slide-5
SLIDE 5

Example of stemma codicum

FIGURE: Stemma codicum established by Danuta Shanzer. Each letter is a manuscript; the Greek letters indicate lost or supposed manuscripts.

5 / 32

slide-6
SLIDE 6

Idea of Don Quentin

Don Quentin came up with the idea of using betweenness to draw up a stemma. He reconstructed small chains of three manuscripts in which one lies between the other two, then assembled these chains to infer the complete tree.

FIGURE: Diagram of five manuscripts labelled Q, G, I, F, P.

6 / 32

slide-7
SLIDE 7

Idea of Don Quentin

Don Quentin came up with the idea of using betweenness to draw up a stemma. He reconstructed small chains of three manuscripts in which one lies between the other two, then assembled these chains to infer the complete tree.

FIGURE: Diagram of five manuscripts labelled Q, G, I, F, P.

The manuscript Q is between I and G. This is written: (I, Q, G)

7 / 32

slide-8
SLIDE 8

Idea of Don Quentin

Don Quentin came up with the idea of using betweenness to draw up a stemma. He reconstructed small chains of three manuscripts in which one lies between the other two, then assembled these chains to infer the complete tree.

FIGURE: Diagram of five manuscripts labelled Q, G, I, F, P.

  • (I, Q, G)
  • (I, Q, F)
  • ...

8 / 32

slide-9
SLIDE 9

Idea of Don Quentin

Method: if B is between A and C, then:

(i) The readings of the middle manuscript B agree alternately with those of A or C.
(ii) The readings of A and C never agree against those of B.

9 / 32

slide-10
SLIDE 10

Idea of Don Quentin

Method: if B is between A and C, then:

(i) The readings of the middle manuscript B agree alternately with those of A or C.
(ii) The readings of A and C never agree against those of B.

As an example, consider the following three versions of the same sentence, taken from manuscripts copied one from the other:

A = This is a sentence invented for the example
B = Here is a sentence invented for the example
C = Here is a sentence built for the example

10 / 32

slide-11
SLIDE 11

Idea of Don Quentin

Method: if B is between A and C, then:

(i) The readings of the middle manuscript B agree alternately with those of A or C.
(ii) The readings of A and C never agree against those of B.

As an example, consider the following three versions of the same sentence, taken from manuscripts copied one from the other:

A = This is a sentence invented for the example
B = Here is a sentence invented for the example
C = Here is a sentence built for the example

11 / 32

slide-12
SLIDE 12

Idea of Don Quentin

Method: if B is between A and C, then:

(i) The readings of the middle manuscript B agree alternately with those of A or C.
(ii) The readings of A and C never agree against those of B.

As an example, consider the following three versions of the same sentence, taken from manuscripts copied one from the other:

B = Here is a sentence invented for the example
A = This is a sentence invented for the example
C = Here is a sentence built for the example
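The two conditions can be checked mechanically on the example sentences. The following is a minimal sketch, not the authors' implementation: it assumes the three texts are already aligned word by word (same number of words), and the function name `quentin_between` is hypothetical. It only tests that B always agrees with at least one neighbour; it does not test that the agreements actually alternate.

```python
def quentin_between(a, b, c):
    """Return True if b satisfies both conditions as the middle of a and c.

    a, b, c are lists of readings aligned position by position
    (an assumption of this sketch).
    """
    for ra, rb, rc in zip(a, b, c):
        # Condition (i): the reading of b agrees with a or with c.
        agrees_with_neighbour = rb == ra or rb == rc
        # Condition (ii): a and c must not agree against b.
        outvoted = ra == rc and rb != ra
        if not agrees_with_neighbour or outvoted:
            return False
    return True


A = "This is a sentence invented for the example".split()
B = "Here is a sentence invented for the example".split()
C = "Here is a sentence built for the example".split()

print(quentin_between(A, B, C))  # B as the middle manuscript
print(quentin_between(A, C, B))  # C as the middle manuscript
```

With C in the middle, A and B agree on "invented" against "built", so the second call fails condition (ii).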

12 / 32

slide-13
SLIDE 13

Plan

  1. Introduction
  2. Different characterizations for betweenness
  3. Textual tradition De Nuptiis and first result
  4. Application to seriation
  5. Conclusions

13 / 32

slide-14
SLIDE 14

Metric betweenness

Among the many geometrical characterizations of betweenness, Menger introduced the following definition under the name of metric betweenness.

Definition: A ternary relation (·, ·, ·) on a set E is a metric betweenness relation if there is a metric d on E such that:

(a, b, c) ⇔ d(a, b) + d(b, c) = d(a, c)

14 / 32

slide-15
SLIDE 15

Metric betweenness

The metric d between two strings is the number of operations (substitution, deletion, insertion) required to transform one of them into the other (a kind of word edit distance). In our example, the calculations of d give:

A = This is a sentence invented for the example
B = Here is a sentence invented for the example
C = Here is a sentence built for the example

d(A, B) = 1, d(B, C) = 1, d(A, C) = 2

We have (A, B, C) because d(A, C) = d(A, B) + d(B, C), but not (A, C, B).
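As an illustration, the distances above can be reproduced with a word-level Levenshtein distance. This is a sketch under the assumption that d counts one operation per substituted, deleted or inserted word; the function names `word_edit_distance` and `metric_between` are hypothetical.

```python
def word_edit_distance(s, t):
    """Levenshtein distance between two sentences, counted in words."""
    a, b = s.split(), t.split()
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # delete wa
                            curr[j - 1] + 1,      # insert wb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def metric_between(a, b, c, d=word_edit_distance):
    """Menger's condition: b is between a and c for the metric d."""
    return d(a, b) + d(b, c) == d(a, c)


A = "This is a sentence invented for the example"
B = "Here is a sentence invented for the example"
C = "Here is a sentence built for the example"

print(word_edit_distance(A, B), word_edit_distance(B, C), word_edit_distance(A, C))
print(metric_between(A, B, C), metric_between(A, C, B))
```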

15 / 32

slide-16
SLIDE 16

Metric betweenness

The metric d between two strings is the number of operations (substitution, deletion, insertion) required to transform one of them into the other (a kind of word edit distance). In our example, the calculations of d give:

A = This is a sentence invented for the example
B = Here is a sentence invented for the example
C = Here is a sentence built for the example

d(A, B) = 1, d(B, C) = 1, d(A, C) = 2

In applications, we use the index

$I_M = \frac{d(A,B) + d(B,C) - d(A,C)}{d(A,C)}$

which is zero if (A, B, C).
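A short sketch of the index, taking the three distances as inputs. The function name `metric_index` and the error raised for d(A, C) = 0 are assumptions of this sketch, since the slides do not say how that case is handled.

```python
def metric_index(d_ab, d_bc, d_ac):
    """Index I_M for the triple (A, B, C), with B as the candidate middle."""
    if d_ac == 0:
        # Not covered by the slides; this sketch simply refuses the case.
        raise ValueError("d(A, C) must be non-zero")
    return (d_ab + d_bc - d_ac) / d_ac


# Distances from the example: d(A, B) = 1, d(B, C) = 1, d(A, C) = 2.
print(metric_index(1, 1, 2))  # B in the middle
print(metric_index(2, 1, 1))  # C in the middle: d(A, C), d(C, B), d(A, B)
```

The first call gives 0 because the triangle inequality is an equality; the second gives 2, so C is far from being between A and B.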

16 / 32

slide-17
SLIDE 17

Betweenness defined by a score built with Don Quentin’s conditions

We want to build an index to detect whether a manuscript is between two others. For a candidate triple in which B would be between A and C:

  • Let n be the number of readings in B.
  • Let n1 be the number of readings in B which belong neither to A nor to C. Set $I_{Q1} = \frac{n_1}{n}$. Therefore $I_{Q1} \in [0, 1]$, and if $I_{Q1} = 0$ the first condition of Don Quentin is satisfied.
  • Let n2 be the number of readings common to A and C which do not belong to B. Set $I_{Q2} = \frac{n_2}{n}$. Therefore $I_{Q2} \in [0, 1]$, and if $I_{Q2} = 0$ the second condition of Don Quentin is satisfied.

So $I_Q = 0.8\, I_{Q2} + 0.2\, I_{Q1} = 0$ if both of Don Quentin’s conditions are satisfied.

17 / 32

slide-18
SLIDE 18

Betweenness defined by a score built with Don Quentin’s conditions

With the preceding example:

A = This is a sentence invented for the example
B = Here is a sentence invented for the example
C = Here is a sentence built for the example

n = 2, n1 = 0, n2 = 0 ⇒ IQ = 0. Therefore B is between A and C.

A = This is a sentence invented for the example
C = Here is a sentence built for the example
B = Here is a sentence invented for the example

n = 2, n1 = 1, n2 = 1 ⇒ IQ = 0.8 × 0.5 + 0.2 × 0.5 = 0.5 ≠ 0. Therefore C is not between A and B.
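The score can be sketched with Python sets, treating each manuscript as the set of its variant readings. The function name `quentin_score` and the representation of manuscripts as sets are assumptions of this sketch; the weights 0.8 and 0.2 are those given in the definition of IQ.

```python
def quentin_score(a, b, c, w2=0.8, w1=0.2):
    """Index I_Q for the triple (a, b, c), with b as the candidate middle.

    a, b, c are collections of variant readings (assumed representation).
    """
    a, b, c = set(a), set(b), set(c)
    n = len(b)                 # readings of the middle manuscript
    n1 = len(b - (a | c))      # readings of b found neither in a nor in c
    n2 = len((a & c) - b)      # readings shared by a and c, absent from b
    return w2 * n2 / n + w1 * n1 / n


A = {"This", "invented"}
B = {"Here", "invented"}
C = {"Here", "built"}

print(quentin_score(A, B, C))  # B in the middle
print(quentin_score(A, C, B))  # C in the middle
```

With B in the middle both counts are zero. With C in the middle, "built" belongs to C alone (n1 = 1) and "invented" is shared by A and B against C (n2 = 1), so the score is strictly positive.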

18 / 32

slide-19
SLIDE 19

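Betweenness relation based on set theory

Definition: According to Restle, a set B is between two other sets A and C if and only if:

(i) $A \cap \bar{B} \cap C = \emptyset$
(ii) $\bar{A} \cap B \cap \bar{C} = \emptyset$

FIGURE: Venn diagrams of A, B and C showing the regions $A \cap \bar{B} \cap C$ and $\bar{A} \cap B \cap \bar{C}$.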

19 / 32

slide-20
SLIDE 20

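Betweenness relation based on set theory

Definition: According to Restle, a set B is between two other sets A and C if and only if:

(i) $A \cap \bar{B} \cap C = \emptyset$
(ii) $\bar{A} \cap B \cap \bar{C} = \emptyset$

We use the index

$I_E = \frac{\mathrm{Card}(A \cap \bar{B} \cap C) + \mathrm{Card}(\bar{A} \cap B \cap \bar{C})}{\mathrm{Card}(B)}$

which is zero if (A, B, C).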
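The index translates directly into set operations. In this sketch the function name `restle_index` is hypothetical, and the complements are taken relative to the union of the three sets, so that the complement of B inside A ∩ C is simply a set difference.

```python
def restle_index(a, b, c):
    """Index I_E for the triple (a, b, c), with b as the candidate middle."""
    a, b, c = set(a), set(b), set(c)
    skipped = (a & c) - b      # A ∩ not-B ∩ C: shared by a and c, missing in b
    private = b - (a | c)      # not-A ∩ B ∩ not-C: found in b only
    return (len(skipped) + len(private)) / len(b)


A = {"This", "invented"}
B = {"Here", "invented"}
C = {"Here", "built"}

print(restle_index(A, B, C))  # B in the middle
print(restle_index(A, C, B))  # C in the middle
```

For the example sets, B in the middle leaves both regions empty, while C in the middle leaves "invented" in the first region and "built" in the second.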

20 / 32

slide-21
SLIDE 21

Betweenness relation based on set theory

Using the preceding example again: the set A contains the variants { This, invented }, B contains { Here, invented } and C contains { Here, built }.

FIGURE: Diagram of the sets A, B and C with the variants This, invented, Here, built.

21 / 32

slide-22
SLIDE 22

Betweenness relation based on set theory

Using the preceding example again: the set A contains the variants { This, invented }, B contains { Here, invented } and C contains { Here, built }.

FIGURE: Diagram of the sets A, B and C with the variants This, invented, Here, built.

$A \cap \bar{B} \cap C = \emptyset$

22 / 32

slide-23
SLIDE 23

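Betweenness relation based on set theory

Using the preceding example again: the set A contains the variants { This, invented }, B contains { Here, invented } and C contains { Here, built }.

FIGURE: Diagram of the sets A, B and C with the variants This, invented, Here, built.

$A \cap \bar{B} \cap C = \emptyset$
$\bar{A} \cap B \cap \bar{C} = \emptyset$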

23 / 32

slide-24
SLIDE 24

Plan

  1. Introduction
  2. Different characterizations for betweenness
  3. Textual tradition De Nuptiis and first result
  4. Application to seriation
  5. Conclusions

24 / 32

slide-25
SLIDE 25

Corpus De Nuptiis

  • We use the textual tradition engendered by the poem which opens book IX of Nuptiis Philologiae et Mercurii Tertullianus by Martianus Capella (5th century AD).
  • Collected by Jean-Baptiste Guillaumin.
  • Stemma, certainly incomplete, drawn up by Danuta Shanzer.
  • The corpus is made up of 18 poems indicated by the following letters: Co = (A, B, C, D, E, F, H, K, L, M, O, P, R, S, U, V, W, Z)
  • The collation table consists of 234 readings.

25 / 32

slide-26
SLIDE 26

Results on the corpus De Nuptiis

Betweenness index IQ, in increasing order, for some triplets:

Triplet   SFV    MFV    KFV    UFV    LFV    PFV    RFV    HF…
IQ        0.086  0.087  0.087  0.087  0.087  0.087  0.088  0.0…

  • There is no intermediate manuscript in the real corpus.
  • Manuscripts E and F are present in the first 70 triplets. Are E and F therefore between the textual sub-tradition CEFV and the rest of the corpus?
  • We must relax the definitions and define the notion of weak betweenness.

Definition: Let I (= IM, IE or IQ) be a betweenness index. B is weak-between A and C if:

(i) I(A, B, C) ≤ I(B, A, C) and I(A, B, C) ≤ I(A, C, B)
(ii) There is a threshold index IS such that I(A, B, C) ≤ IS
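The weak-betweenness test can be written independently of the index chosen. The sketch below is illustrative only: the function name `weak_between`, the toy set-based index used to exercise it, and the threshold value 0.1 are assumptions, not values taken from the corpus.

```python
def weak_between(index, a, b, c, threshold):
    """Return True if b is weak-between a and c for the given index.

    index(x, y, z) must return the betweenness index with y in the middle.
    """
    score = index(a, b, c)
    # (i) b is the best candidate for the middle position of the triple.
    best_middle = score <= index(b, a, c) and score <= index(a, c, b)
    # (ii) the index stays below the chosen threshold I_S.
    return best_middle and score <= threshold


def set_index(a, b, c):
    """Toy set-based index (same form as I_E), used only for this example."""
    a, b, c = set(a), set(b), set(c)
    return (len((a & c) - b) + len(b - (a | c))) / len(b)


A = {"This", "invented"}
B = {"Here", "invented"}
C = {"Here", "built"}

print(weak_between(set_index, A, B, C, threshold=0.1))  # B in the middle
print(weak_between(set_index, A, C, B, threshold=0.1))  # C in the middle
```

Because condition (i) compares the three possible middles of the same triple, at least one of them always passes it; the threshold in condition (ii) is what rejects triples with no plausible middle.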

26 / 32

slide-27
SLIDE 27

Plan

  1. Introduction
  2. Different characterizations for betweenness
  3. Textual tradition De Nuptiis and first result
  4. Application to seriation
  5. Conclusions

27 / 32

slide-28
SLIDE 28

Application to seriation

The first goal of this seriation is to determine the original manuscript (the archetype).

  • Use of the betweenness relation to generate a seriation
  • Use of the Guttman effect with the R package FactoMineR
  • Archaeological seriation with the R package seriation
      • Kendall’s multidimensional scaling
      • Hodson’s hierarchical clustering
      • Correspondence Analysis, taken up in archaeology by F. Djindjian
  • Ordination methods for ecologists with the R package vegan
      • Minchin’s Non-Metric Multidimensional Scaling (NMDS)
      • Eigenvector ordination: Principal Components Analysis (PCA) and Correspondence Analysis (CA), used by Goodall
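To make the idea of seriation by Correspondence Analysis concrete, here is a minimal sketch in pure Python. It is not the FactoMineR, seriation or vegan workflow listed above: it computes row scores on the first non-trivial CA axis by reciprocal averaging, and sorts the rows by that score. The banded toy table, the shuffle and the function name `reciprocal_averaging` are all assumptions made for the illustration. When such a serial order exists, the second CA axis is approximately a quadratic function of the first, which is the Guttman effect (the parabola); this sketch only computes the first axis.

```python
def reciprocal_averaging(table, iterations=500):
    """Row scores on the first non-trivial correspondence-analysis axis."""
    n_rows, n_cols = len(table), len(table[0])
    row_sums = [sum(row) for row in table]
    col_sums = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    total = sum(row_sums)
    x = [float(i) for i in range(n_rows)]  # arbitrary non-constant start
    for _ in range(iterations):
        # Column scores: weighted average of the row scores.
        y = [sum(table[i][j] * x[i] for i in range(n_rows)) / col_sums[j]
             for j in range(n_cols)]
        # Row scores: weighted average of the column scores.
        x = [sum(table[i][j] * y[j] for j in range(n_cols)) / row_sums[i]
             for i in range(n_rows)]
        # Remove the trivial constant solution, then rescale.
        mean = sum(r * v for r, v in zip(row_sums, x)) / total
        x = [v - mean for v in x]
        norm = sum(r * v * v for r, v in zip(row_sums, x)) ** 0.5
        x = [v / norm for v in x]
    return x


# Toy table with a perfect serial structure: row i has ones in columns i..i+2.
band = [[1 if i <= j <= i + 2 else 0 for j in range(12)] for i in range(10)]
shuffle = [3, 7, 0, 9, 5, 1, 8, 2, 6, 4]      # hidden order of the rows
table = [band[k] for k in shuffle]

scores = reciprocal_averaging(table)
order = sorted(range(len(table)), key=lambda i: scores[i])
recovered = [shuffle[i] for i in order]
print(recovered)  # the hidden order, up to reversal
```

On a real collation table the structure is rarely this clean, which is consistent with the deck's observation that only local seriations are found on the De Nuptiis corpus.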

28 / 32

slide-29
SLIDE 29

Comparison of the betweenness algorithm and Guttman’s method

  • We find sub-traditions such as (C, F, E, V) or (W, R, D, B), which are already known.
  • There is no real Guttman parabola.

29 / 32

slide-30
SLIDE 30

Comparison with other methods

  • Another sub-tradition, (S, K, O), seems relevant.
  • There is no interesting global seriation.

30 / 32

slide-31
SLIDE 31

Plan

  1. Introduction
  2. Different characterizations for betweenness
  3. Textual tradition De Nuptiis and first result
  4. Application to seriation
  5. Conclusions

31 / 32

slide-32
SLIDE 32

Conclusions

  • The betweenness models and the seriation work well when tested on fictitious corpora.
  • On a real corpus, global seriation gives way to local seriation.
  • Using several methods makes it possible to spot local seriations.
  • There is more information in betweenness than in distances, but triplets remain more difficult to handle.

32 / 32