
SLIDE 1

Grey Relational Analysis and Natural Language Processing

Arjab Singh Khuman (1), Yingjie Yang (1), Sifeng Liu (2)

(1) Centre for Computational Intelligence, De Montfort University, Leicester, United Kingdom

(2) College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing, China

September 2015

SLIDE 2

Outline for the Presentation

1. Introduction
2. Natural Language Processing
3. Grey Relational Analysis
4. Proposal
5. Observations
6. Conclusion

A. S. Khuman

(C.C.I.) The Leverhulme Trust September 2015 2 / 30

SLIDE 3

Introduction

We will investigate the validity of using Grey Relational Analysis for Natural Language Processing

Providing a theoretical overview from which further research can be undertaken

Describing what Grey Relational Analysis and Natural Language Processing entail

We look towards the use of Grey Incidence Analysis for inspection and quantification

Understanding the traditional use of Grey Incidence allows one to better understand our intended use for Natural Language Processing

We describe the varying components of our framework, highlighting problem areas and possible solutions

We conclude and put forward suggestions for possible enhancements


SLIDE 5

Natural Language Preliminaries

Natural Language Processing is primarily concerned with the interaction between machines and human-based linguistics

It has been a hot topic within Computer Science and Artificial Intelligence since the 1950s

It is an umbrella term which encompasses many sub-domains, including Natural Language Understanding, which is associated with deriving meaning and sentiment

There are many examples of experiments and programs that are associated with Natural Language Processing

The Georgetown experiment in 1954, in which over 60 Russian sentences were automatically translated into interpretable English equivalents

The creation of ELIZA, a system which simulated a person-centred (Rogerian) psychotherapist


SLIDE 6

Natural Language Preliminaries

The 1970s saw the introduction of conceptual ontologies, which were concerned with structuring real-world information into data that was machine understandable

The likes of MARGIE, SAM, PAM and POLITICS, all of which are examples of conceptual ontology programs

The introduction of chatterbots, programs that could interact with users and engage in menial conversation, at least to some extent

The likes of PARRY, a program written to simulate a paranoid schizophrenic

Racter, which was supposedly able to generate English language prose: short pieces of grammatically structured work with a rudimentary natural flow

Jabberwacky, a chatterbot created to synthesize natural human chatter in an interesting, entertaining manner


SLIDE 7

Natural Language Preliminaries

Modern Natural Language Processing algorithms are based on machine learning, in particular statistical machine learning

Prior implementations of language-processing tasks typically involved the hard-coding of a large number of deterministic rules

Modern day machine learning algorithms are still firmly rooted in statistical inference

There are several different classes of machine learning which operate in similar ways, taking as input large sets of features obtained from the input data

The current trend is still very much to make use of statistical models, which allow for soft, probabilistic decisions based on attaching a weight to each identified input feature

There are certain characteristics that make Natural Language Processing well suited to Grey Theory


SLIDE 9

Grey Relational Analysis Preliminaries

Grey Relational Analysis falls under the remit of Grey Incidence Analysis, whereby the main ethos is to understand which factors of a system are more important than others

Establishing which factors can be identified as being favourable and, equally, which factors are detrimental

This is done by taking a characteristic sequence, a sequence that represents an ideal of the system, and comparing it against behavioural factors to ascertain how much the sequences are alike, or how much the behavioural factors impact upon the characteristic sequence itself

This information can then be used to identify whether more emphasis should be applied to a particular behaviour or not

Given that incidence analysis is mainly used for the inspection of a system, there is little to no literature regarding the use of incidence analysis for Natural Language Processing


SLIDE 10

Grey Relational Analysis Preliminaries

The characteristic sequences of a system, Y1, Y2, ..., Yn, are compared against its behavioural factor sequences X1, X2, ..., Xm, all of which must be of the same magnitude

Γ = [γij] is the resulting matrix, where each entry in the ith row is the degree of grey incidence between the corresponding characteristic sequence Yi and the relevant behavioural factors X1, X2, ..., Xm

Each entry in the jth column refers to the degrees of grey incidence between the characteristic sequences Y1, Y2, ..., Yn and the behavioural factor Xj

For the inspection and analysis of the sequences, there are several variations of the degree of incidence one could employ...

However, we are merely concerned with the Absolute degree of grey incidence


SLIDE 11

Degrees of Grey Incidence

Absolute degree of grey incidence

Assume that Xi and Xj ∈ U are two sequences of data with the same magnitude, defined as the sum of the distances between consecutive time points, whose zero starting points have already been computed:

$$ s_i = \int_1^n \left(X_i - x_i(1)\right)\,dt, \qquad s_j = \int_1^n \left(X_j - x_j(1)\right)\,dt \tag{1} $$

$$ s_i - s_j = \int_1^n \left(X_i^0 - X_j^0\right)\,dt \tag{2} $$

where X_i^0 = X_i − x_i(1) denotes the zero starting point image of Xi, and likewise for Xj

The absolute degree is associated with the absolute relationships that exist between characteristic sequences and their behaviours
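
The slides stop at equations (1) and (2) and do not show how the two areas are turned into a coefficient. The Python sketch below assumes the closing formula commonly given in the grey systems literature, ε = (1 + |s_i| + |s_j|) / (1 + |s_i| + |s_j| + |s_i − s_j|), and unit time steps, so that each integral reduces to a trapezoidal sum. The function names are illustrative and are not taken from the presentation.

```python
def zero_start(seq):
    """Zero starting point image: subtract the first value from every value."""
    return [v - seq[0] for v in seq]


def area(seq):
    """Signed area s = integral of (X - x(1)) dt, trapezoidal rule, unit steps.

    The first point of the zero-start image is 0, so only the interior
    points and half of the last point contribute.
    """
    z = zero_start(seq)
    return sum(z[1:-1]) + 0.5 * z[-1]


def absolute_degree(x, y):
    """Absolute degree of grey incidence under the assumed closing formula."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    sx, sy = area(x), area(y)
    total = 1 + abs(sx) + abs(sy)
    return total / (total + abs(sx - sy))


if __name__ == "__main__":
    print(absolute_degree([1, 2, 3, 4], [1, 2, 3, 4]))  # identical sequences
    print(absolute_degree([1, 2, 3, 4], [1, 3, 5, 7]))  # same start, steeper curve
```

Because the zero starting point image removes the first value, two sequences that differ only by a constant offset score exactly 1 under this sketch, so an exact match is sufficient for a coefficient of 1 but not necessary.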


SLIDE 13

The Concept

We are merely interested in the analysis of the sequences

Assume that a hard-wired linguistic sequence exists in the system, which may execute an associated command; this can be treated as a characteristic sequence

Also assume that a user input stream, a behavioural sequence, is presented to the system; incidence analysis can then be carried out to establish how similar or dissimilar the sequences are

If the returned coefficient surpasses a threshold value, the associated output command is executed

This harks back to the fact that the more recent Natural Language Processing algorithms make use of statistically based models

Allowing for soft, probabilistic decisions to be undertaken, with the advantage of expressing relative certainty for any number of possible answers rather than just one


SLIDE 14

The Concept

Multiple input streams could be compared to multiple target streams in a pairwise manner to establish which input is best suited to which output

This is achieved by measuring the space enclosed between the geometric curves of the sequences being compared

As the sequences themselves are made up of discretised data points, pointwise comparisons can be made to garner the relative similarity between sequences

The use of the absolute degree of grey incidence provides the means of computation, returning a coefficient value of absoluteness

The value itself falls within the range (0, 1]; the more similar the sequences are, the closer to 1 the coefficient will be, and vice versa
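
The pairwise comparison and threshold test described in the concept could be sketched as follows. This is a minimal illustration only: the closing formula for the absolute degree, the command names "open" and "close", the numeric sequences and the threshold of 0.9 are all assumptions, since the slides give none of them.

```python
def area(seq):
    """Signed area under the zero-start image of seq (unit time steps)."""
    z = [v - seq[0] for v in seq]
    return sum(z[1:-1]) + 0.5 * z[-1]


def absolute_degree(x, y):
    """Assumed closing formula for the absolute degree of grey incidence."""
    sx, sy = area(x), area(y)
    total = 1 + abs(sx) + abs(sy)
    return total / (total + abs(sx - sy))


def best_match(input_seq, targets, threshold):
    """Return (command, score) for the best-scoring target, or None.

    Only targets with the same length as the input are considered, and the
    best score must reach the threshold for a command to be returned.
    """
    scored = [
        (absolute_degree(input_seq, seq), name)
        for name, seq in targets.items()
        if len(seq) == len(input_seq)
    ]
    if not scored:
        return None
    score, name = max(scored)
    return (name, score) if score >= threshold else None


# Hypothetical hard-wired target sequences and their commands.
TARGETS = {"open": [1, 2, 3, 4], "close": [1, 3, 5, 7]}

if __name__ == "__main__":
    # Several input streams, each compared against every target.
    for stream in ([1, 2, 3, 5], [1, 3, 5, 6], [9, 1, 9, 1]):
        print(stream, best_match(stream, TARGETS, threshold=0.9))
```

Returning the score alongside the command keeps the "relative certainty" of the decision available to the caller instead of reducing it to a yes or no.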


SLIDE 16

Observations

We will present some of the core individual aspects that contribute to the framework

Small examples are demonstrated to further enhance the understanding of using such an approach

Also identified are the weak points and the assumptions that are placed upon the concept

Possible solutions to circumvent these weak areas and unrealistic assumptions are discussed

Some key application areas are described where real world applicability is feasible

The overall evaluation of the framework is also discussed, remarking on the individual aspects of the framework

SLIDE 17

Observations

Envision that a linguistic term is coded into a sequence using simple symbolic association: a = 1, b = 2, c = 3, ..., i = n

Assume the word ‘would’ is the characteristic sequence and its associated valued sequence is: s0 = [23, 15, 21, 12, 4]

It is noteworthy that if the word is spelled correctly, the sequence it generates will be completely unique

There will be no other exact sequence other than the sequence you are referring to
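
The symbolic association above can be written in a few lines of Python. The function name `encode` is an assumption for illustration, and the sketch accepts only the 26 letters of the English alphabet.

```python
import string


def encode(word):
    """Map each letter to its 1-based alphabet position (a = 1, ..., z = 26)."""
    letters = string.ascii_lowercase
    # str.index raises ValueError for any character that is not a letter.
    return [letters.index(ch) + 1 for ch in word.lower()]


if __name__ == "__main__":
    print(encode("would"))
```

Two different spellings always yield different sequences, which is the uniqueness property the slide relies on.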


SLIDE 18

Observations

The s0 sequence is the characteristic sequence. Assume that the input stream presented to the system is ‘could’, with the following valued sequence: s1 = [3, 15, 21, 12, 4]

The returned absolute degree of grey incidence for these two sequences is: 0.888

A high-scoring coefficient, indicating that the similarity of the two sequences is high

Obviously, if the input sequence and the target sequence matched exactly, the output for the incidence would be an absolute 1.

The use of either the relative or synthetic degree of incidence for analysis is actually not needed
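
The arithmetic behind the coefficient can be worked through by hand. The derivation below is a sketch under stated assumptions: unit time steps, the trapezoidal form of the integrals, and the closing formula commonly given in the grey systems literature, none of which appear on the slides. The areas are written A_0 and A_1 (the quantities called s_i and s_j in equations (1) and (2)) to avoid a clash with the sequence names s0 and s1.

```latex
\begin{align*}
X_0^0 &= (0,\,-8,\,-2,\,-11,\,-19), &
A_0 &= -8 - 2 - 11 + \tfrac{1}{2}(-19) = -30.5,\\
X_1^0 &= (0,\,12,\,18,\,9,\,1), &
A_1 &= 12 + 18 + 9 + \tfrac{1}{2}(1) = 39.5,\\[4pt]
\varepsilon_{01} &= \frac{1 + |A_0| + |A_1|}{1 + |A_0| + |A_1| + |A_0 - A_1|}
 = \frac{71}{71 + 70} \approx 0.504,\\[4pt]
\varepsilon'_{01} &= \frac{1 + |A_0| + |A_1|}{1 + |A_0| + |A_1| + \bigl|\,|A_0| - |A_1|\,\bigr|}
 = \frac{71}{71 + 9} = 0.8875.
\end{align*}
```

Under these assumptions the reported value of 0.888 corresponds to the second form, in which the difference is taken between the absolute areas; the slides do not state which form was used.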


SLIDE 19

Observations

In the English language, as in many others, there are several ways to refer to the same initial observation

One could use the Queen’s English and produce a grammatically perfect, well-structured sentence, or one could use broken English and still maintain the underlying sentiment

Sentence 1 below is a grammatically correct statement which describes the colour of a door.

Sentence 2 is a broken sentence, but it contains the underlying sentiment of sentence 1 using only two words.

1. THE(1) DOOR(2) WAS(3) A(4) VIVID(5) GREEN(6)
2. DOOR(2) GREEN(6)

There will always be a statement of the smallest possible length, one which will contain all the relevant sentiment and key features of a more grammatically correct statement


SLIDE 20

Observations

Sentence 1 would have an associated sequence of:

|20, 8, 5|_1 27 |4, 15, 15, 18|_2 27 |23, 1, 19|_3 27 |1|_4 27 |22, 9, 22, 9, 4|_5 27 |7, 18, 5, 5, 14|_6

The value of 27 represents a white space and indicates the start of a new word. Given the sequence and the values it contains, that sequence can only ever refer to that sentence.

Sentence 2 is the target sequence, therefore it contains the following information: |4, 15, 15, 18|_2 27 |7, 18, 5, 5, 14|_6

Tokens 2 and 6 of sentence 1 are identical to the target tokens, and therefore it can be concluded that sentence 1 is indeed a possible match for sentence 2
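
The sentence encoding and the exact token comparison above can be sketched in Python. The function names are hypothetical; the only values taken from the slides are the letter codes and the use of 27 for white space.

```python
SPACE = 27  # code the slides use for white space


def encode_sentence(sentence):
    """Encode letters as 1..26 and each space as 27."""
    return [
        SPACE if ch == " " else ord(ch) - ord("a") + 1
        for ch in sentence.lower()
    ]


def tokenise(codes):
    """Split an encoded sentence into word tokens at every 27."""
    tokens, current = [], []
    for code in codes:
        if code == SPACE:
            if current:
                tokens.append(current)
            current = []
        else:
            current.append(code)
    if current:
        tokens.append(current)
    return tokens


def matching_positions(input_codes, target_codes):
    """1-based positions of input tokens that equal some target token."""
    target_tokens = tokenise(target_codes)
    return [
        pos
        for pos, tok in enumerate(tokenise(input_codes), start=1)
        if tok in target_tokens
    ]


if __name__ == "__main__":
    sentence_1 = encode_sentence("THE DOOR WAS A VIVID GREEN")
    sentence_2 = encode_sentence("DOOR GREEN")
    print(matching_positions(sentence_1, sentence_2))
```

This exact comparison only covers correctly spelt input; scoring near matches needs the degree of incidence rather than equality.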


SLIDE 21

Observations

A requirement of Grey Relational Analysis is that the sequences being compared must have the same length

An input stream may have an unknown length, as compared to the known length of the target stream

There is a high likelihood that some words may not be spelt correctly

Identifying key features and having those compared against the target sequence would lessen the burden of exactness

As the sequences themselves can be tokenised and parsed, these individual elements can be inspected using incidence analysis

If the key features of an inspected input stream return high coefficient values, there is a high likelihood they are a positive match


SLIDE 22

Observations

[Figure: plot of the sequence for THE DOOR WAS A VIVID GREEN, with the letters A to Z along the axis]


SLIDE 23

Observations

It is this concatenated statement that would be the target sequence

As it is, there is no abstraction or understanding of what the word or words mean; it is merely a collection of letters.

Therefore, Natural Language Understanding applications at this stage would not be a key domain

However, morphological segmentation most definitely would be

This is the separation of words into individual grammatical units; the smallest meaningful units of a word, such as syllables, would still provide a unique sequence when concatenated

If the individual syllables have their sequences mapped and stored in a system, those syllables collected and presented in a certain way would only ever refer to the word that was intended


SLIDE 24

Observations

By tokenising a sentence we have effectively created word blocks, which will have their own geometric patterns for their associated sequences

If the input stream can be isolated and tokenised, those individual tokens could then be compared to individual tokens from the target sequence

Assuming that an input stream has been tokenised and parsed into the system, the collective geometric curves of the statement could be permuted to see if they fit a possible target sequence

The degree of incidence could then be computed on a word by word basis, with every high-scoring result for its coefficient being collected and stored

These stored coefficient values could then be sequenced to see how similar the overall comparison is

One would hope to see a geometric curve that is as straight as possible and as close to 1 as possible throughout its duration
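
The word by word procedure above could be sketched as follows. The closing formula for the absolute degree is an assumption, as are the function names, the misspelt input "grean" and the choice to score a target token as 0.0 when the input has no token of the same length.

```python
SPACE = 27  # code the slides use for white space


def encode(text):
    """Letters become 1..26, spaces become 27."""
    return [SPACE if ch == " " else ord(ch) - ord("a") + 1 for ch in text.lower()]


def tokenise(codes):
    """Split an encoded stream into word tokens at every 27."""
    tokens, current = [], []
    for code in codes:
        if code == SPACE:
            if current:
                tokens.append(current)
            current = []
        else:
            current.append(code)
    if current:
        tokens.append(current)
    return tokens


def area(seq):
    """Signed area under the zero-start image of seq (unit time steps)."""
    z = [v - seq[0] for v in seq]
    return sum(z[1:-1]) + 0.5 * z[-1]


def absolute_degree(x, y):
    """Assumed closing formula for the absolute degree of grey incidence."""
    sx, sy = area(x), area(y)
    total = 1 + abs(sx) + abs(sy)
    return total / (total + abs(sx - sy))


def coefficient_profile(input_text, target_text):
    """Best coefficient per target token, using same-length input tokens only."""
    input_tokens = tokenise(encode(input_text))
    profile = []
    for target in tokenise(encode(target_text)):
        scores = [
            absolute_degree(tok, target)
            for tok in input_tokens
            if len(tok) == len(target)
        ]
        profile.append(max(scores, default=0.0))  # 0.0: nothing comparable
    return profile


if __name__ == "__main__":
    print(coefficient_profile("the door was a vivid grean", "door green"))
```

The returned list is the sequence of stored coefficients; a flat profile close to 1 would indicate that every target token found a near match in the input.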


SLIDE 25

Observations

The area of Named Entity Recognition would be a possible avenue for further research

Parsing is another area in which grey analysis could be deployed with some degree of success

This concatenated, tokenised and parsed sequence would be the target, which would be compared against input streams

The individual tokens of the input stream could be compared against segments of the target stream

This would circumvent the problem of requiring the sequences themselves to have exactly the same magnitude, as there is a higher likelihood of comparing a token of the input with a token of the target of the same magnitude


SLIDE 26

Observations

This again has associated problems, the main one being that one has to assume that the input stream contains correctly spelt words

The use of sequencing to represent syllables would allow for this problem to be somewhat alleviated

The target sequence could in theory be a collection of target sequences for a specific output, all of which contain possible variations of how a word may be pronounced using permutations of syllable ordering

This would be applicable for the area of Word Sense Disambiguation

If the target sequences are those of a word with associated disambiguation, then several permutations of that word could be given meaning

Using a grey approach for Natural Language Processing can be evaluated from both the intrinsic and extrinsic perspectives


SLIDE 28

Final Remarks

We touched upon the validity of using Grey Relational Analysis techniques in certain Natural Language Processing domains

The main approach adopts Grey Incidence Analysis for the inspection of sequences

A uniquely sequenced word or sentence will only ever refer to what was intended

As such, that word or statement will always have the exact same geometric pattern for its sequence

It would be far-fetched to assume that every input stream would contain the correct spelling

In which case the inspection of the segmentation of the word may offer an alternative, such as the syllables that make up the word

SLIDE 29

Final Remarks

Having multiple target sequences which are slight permutations of the intended meaning would help overcome the problem of not distinguishing between homonyms/homophones

The returned coefficient for any inspected pair of sequences provides a measure of similarity

The closer the value is to 1, the greater the likeness of the two sequences, and vice versa

The approach could be further enhanced via the possible inclusion of Radial Analysis

Another enhancement could be the inclusion of grey bounds, upper and lower bounds which would contain the input sequence itself, providing a realm of containment

A comparison of not only the sequences themselves, but also of the realms, could be undertaken to gauge similarity
