Analyzing Time Structured Corpora


Analyzing Time Structured Corpora. Corpus Statistics Research Group launch event, Birmingham, 11th Feb 2016. Tony Hennessey (University of Nottingham), joint work with R. Carrington, Y. van Gennip, M. Mahlberg, S. Preston, K. Severn, V. Wiegand.


slide-1
SLIDE 1

Analyzing Time Structured Corpora

Corpus Statistics Research Group launch event

Birmingham, 11th Feb 2016

Tony Hennessey (University of Nottingham)

joint work with R. Carrington, Y. van Gennip, M. Mahlberg, S. Preston, K. Severn, V. Wiegand

Tony Hennessey (UoN) 1 / 18

slide-2
SLIDE 2

Overview

Overview

  • How to look at the time dependency in the properties of a corpus.
  • Recap terminology and describe the main example used throughout the presentation.
  • Binning data and how to think about binning mathematically.
  • Using kernels, which are better than bins.

slide-4
SLIDE 4

Setting the scene (and a bit of a recap)

X - some matrix representation of the corpus, here a document-term matrix

Terms (columns): aardvark, abacus, badger, bandicoot, bonsai, ...

doc 01    2  2  1  ...
doc 02    2  1  1  ...
doc 03    1  1  1  ...
doc 04    1  1  1  ...
doc 05    1  1  1  ...
...
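
A document-term matrix like the one on this slide can be sketched in a few lines of Python. This is only an illustration, not the pipeline behind the talk: the function name `document_term_matrix`, the whitespace tokenizer and the toy documents are all assumptions, chosen so that the counts resemble the first rows shown above.

```python
from collections import Counter

def document_term_matrix(docs):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    # Naive whitespace tokenization; a real corpus needs more careful cleaning.
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    column = {term: j for j, term in enumerate(vocabulary)}
    matrix = []
    for toks in tokenized:
        row = [0] * len(vocabulary)
        for term, count in Counter(toks).items():
            row[column[term]] = count
        matrix.append(row)
    return vocabulary, matrix

# Hypothetical toy corpus (not the data used in the talk).
docs = [
    "aardvark aardvark abacus abacus badger",
    "aardvark aardvark abacus badger",
    "aardvark abacus badger",
]
vocabulary, X = document_term_matrix(docs)
```

Each row of `X` is one document and each column one term, so per-document metadata such as a date can be kept in a parallel list indexed by row.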

slide-6
SLIDE 6

Setting the scene (and a bit of a recap)

f(X) - some function that we apply to the corpus

Here f is the cosine of the angle between words in a vector space derived using a matrix factorization (the singular value decomposition, $X = U S V^{T}$).

This measure quantifies the degree of association between words, i.e. a bigger value implies a closer association.
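
One way such a measure could be computed is sketched below with NumPy. The slide does not say how many dimensions are kept or how the term vectors are scaled, so the `rank` parameter, the scaling by singular values and the name `word_cosine` are assumptions made for illustration.

```python
import numpy as np

def word_cosine(X, i, j, rank):
    """Cosine of the angle between terms i and j in a rank-`rank` SVD space."""
    # X is documents x terms, so the columns of Vt describe the terms.
    U, s, Vt = np.linalg.svd(np.asarray(X, dtype=float), full_matrices=False)
    # Keep `rank` dimensions and scale each one by its singular value (assumed).
    term_vectors = (Vt[:rank, :] * s[:rank, None]).T
    a, b = term_vectors[i], term_vectors[j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy matrix: terms 0 and 1 always occur together, term 2 never with them.
X = [[2, 2, 0],
     [1, 1, 0],
     [0, 0, 3]]
together = word_cosine(X, 0, 1, rank=2)
apart = word_cosine(X, 0, 2, rank=2)
```

In this toy case `together` is close to 1 and `apart` close to 0, matching the reading that a bigger value implies a closer association.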

slide-8
SLIDE 8

Setting the scene (and a bit of a recap)

X (document-term matrix): 11,543,110 documents, 472,331 terms

Meta-data for each document includes a date.

slide-9
SLIDE 9

How does the corpus change with time?

Let us try binning the data using dates.


slide-14
SLIDE 14

Binning by date

X(t): one matrix per date, i.e. X(t = 1st Jan 1785), X(t = 2nd Jan 1785), X(t = 3rd Jan 1785), ...

[Plot: f(X(t)) against t, with one point for each of 1st Jan 1785, 2nd Jan 1785 and 3rd Jan 1785.]
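
The binning step can be sketched as grouping the rows of X by document date and applying f to each group. In the sketch below, `bin_rows_by_date` and `total_count` are hypothetical names, the rows are invented, and `total_count` is only a stand-in for f (the talk uses an SVD-based cosine).

```python
from collections import defaultdict
from datetime import date

def bin_rows_by_date(rows, dates):
    """Group the rows of a document-term matrix by their document date."""
    bins = defaultdict(list)
    for row, day in zip(rows, dates):
        bins[day].append(row)
    return dict(bins)

def total_count(rows):
    """Stand-in for f: total number of term occurrences in the bin."""
    return sum(sum(row) for row in rows)

# Hypothetical data: three documents on each of two days.
rows = [[2, 2, 0], [0, 2, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1], [4, 2, 0]]
dates = [date(1785, 1, 1)] * 3 + [date(1785, 1, 2)] * 3

X_t = bin_rows_by_date(rows, dates)
f_t = {day: total_count(block) for day, block in sorted(X_t.items())}
```

`f_t` then holds one value per date, which is what the plot of f(X(t)) against t displays.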

slide-15
SLIDE 15

Binning by date

Identity matrix: $X = I X$

[Matrix display: the rows of X shown on the slide are reproduced unchanged when X is multiplied on the left by the identity matrix I, which has ones on the diagonal and zeros elsewhere.]

slide-16
SLIDE 16

Binning by date

Filter by date: $X(t) = b(t)\,X$, where t = ‘1st Jan 1785’

[Matrix display: b(t) keeps only the first three diagonal ones of the identity matrix and is zero elsewhere, so X(t) keeps the first three rows of X (the documents dated t) and is zero in every other row.]

slide-17
SLIDE 17

Binning by date

Filter by date: $X(t) = b(t)\,X$, where t = ‘2nd Jan 1785’

[Matrix display: b(t) now has ones only in the fourth to sixth diagonal positions, so X(t) keeps the fourth to sixth rows of X (the documents dated t) and is zero in every other row.]
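
The filter X(t) = b(t) X can be written out directly for a toy case. The sketch below uses NumPy; the name `selector`, the date strings and the matrix entries are assumptions for illustration only.

```python
import numpy as np

def selector(dates, t):
    """Diagonal 0/1 matrix b(t): one where a document's date equals t."""
    return np.diag([1.0 if d == t else 0.0 for d in dates])

# Hypothetical data: six documents, three per day.
dates = ["1785-01-01"] * 3 + ["1785-01-02"] * 3
X = np.array([[2, 2, 0],
              [0, 2, 0],
              [1, 0, 0],
              [1, 1, 0],
              [1, 1, 1],
              [4, 2, 0]], dtype=float)

# Rows dated 2nd Jan survive; all other rows become zero.
X_t = selector(dates, "1785-01-02") @ X

# The selectors for all dates add up to the identity, recovering X = I X.
total = selector(dates, "1785-01-01") + selector(dates, "1785-01-02")
```

A dense diagonal matrix is only practical for small examples; for a corpus with millions of documents a boolean mask or a sparse matrix would play the same role.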

slide-18
SLIDE 18

Binning by date

How wide should the bins be?

  • It depends on your research question, e.g. over what time scale are you interested in examining change?
  • It depends on your data, e.g. how sparsely distributed are the traits you are looking at likely to be?

slide-20
SLIDE 20

Binning by date

An example of binning using the TDA

just showing f(X(t)) for ‘smoking’ and ‘cancer’

[Plot: f(X(t)) (vertical axis, 0.0 to 0.5) against year (1900 to 2000), one point per bin.]

We used 5-year bins because

  • articles about smoking are quite sparsely distributed
  • we are mainly interested in long-term trends

slide-23
SLIDE 23

Binning by date

Sliding the bins

[Figure: time axis t with yearly ticks from 1900 to 1916. Sliding the bin along the axis gives overlapping bins: the builds of this slide go from 3 points for non-overlapping bins, to 5, to 13 as the bin slides.]
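
The difference between fixed and sliding bins can be sketched as a choice of step size. In the example below the 5-year width, the 1-year step, the function names and the yearly counts are all assumptions, and a simple total stands in for f.

```python
def windows(first_year, last_year, width, step):
    """Return (start, end) year pairs, end exclusive, that fit inside the range."""
    last_start = last_year - width + 1
    return [(s, s + width) for s in range(first_year, last_start + 1, step)]

def f_per_window(count_by_year, spans):
    """Stand-in for f(X(t)): total of a yearly count inside each window."""
    return [sum(count_by_year.get(y, 0) for y in range(s, e)) for s, e in spans]

fixed = windows(1900, 1914, width=5, step=5)    # non-overlapping bins
sliding = windows(1900, 1916, width=5, step=1)  # same width, moved a year at a time

counts = {year: 1 for year in range(1900, 1917)}  # hypothetical: one article per year
fixed_values = f_per_window(counts, fixed)
sliding_values = f_per_window(counts, sliding)
```

Sliding the window yields many more evaluation points than fixed bins of the same width, at the cost of neighbouring points sharing most of their data.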

slide-25
SLIDE 25

Binning by date

Sliding the bins for the TDA example

revisit f(X(t)) for ‘smoking’ and ‘cancer’

[Plot: f(X(t)) (vertical axis, 0.0 to 0.5) against year (1900 to 2000), with one point per position of the sliding bin; the final build adds a second, sparser series of points.]

slide-26
SLIDE 26

Using a kernel

Can we do better?

  • Yes. Use a kernel.


slide-28
SLIDE 28

Using a kernel

Why use a kernel? Why not just bin?

A kernel takes account of the width of your data collection window, i.e. if you bin, then as your bins get wider the measured effect gets bigger; with a kernel, which divides by the width w, it does not.

$$k(t) = \frac{1}{w}\, b(t)$$

[Matrix display: k(t) has 1/w in each diagonal position where b(t) has a one, and zero everywhere else.]
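
A small numerical sketch of this point is given below. It assumes that w is the window width in years and uses an invented, constant yearly count; the function names are hypothetical.

```python
def bin_total(count_by_year, centre, width):
    """Unnormalized bin b(t): sum over a window of `width` years (width odd)."""
    half = width // 2
    return sum(count_by_year.get(y, 0)
               for y in range(centre - half, centre + half + 1))

def uniform_kernel_estimate(count_by_year, centre, width):
    """Uniform kernel k(t) = b(t) / w: the same window divided by its width."""
    return bin_total(count_by_year, centre, width) / width

# Hypothetical data: a constant 10 articles per year.
counts = {year: 10 for year in range(1900, 1921)}

narrow_bin = bin_total(counts, 1910, width=3)
wide_bin = bin_total(counts, 1910, width=5)
narrow_kernel = uniform_kernel_estimate(counts, 1910, width=3)
wide_kernel = uniform_kernel_estimate(counts, 1910, width=5)
```

With a constant underlying rate the bin total grows with the width of the bin, while the kernel estimate stays the same for both widths.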

slide-30
SLIDE 30

Using a kernel

With a kernel we can control smoothing.

[Plot: weight (vertical axis, 0.0 to 1.0) against year (1908 to 1912) for three windows: Bin, Uniform and Triweight.]

slide-31
SLIDE 31

Using a kernel

With a kernel we can control smoothing.

[Figure: example kernel weights 0.05, 0.2, 0.4, 0.2, 0.05, largest at the centre of the window.]

slide-32
SLIDE 32

Using a kernel

Some examples of kernels

[Figure: kernel shapes plotted on a common axis: Uniform, Triangle, Epanechnikov, Quartic, Triweight, Gaussian, Cosine. Image by Brian Amberg (Wikimedia Commons).]
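
For reference, the kernels named on this slide have standard textbook forms, sketched below as functions of a scaled distance u. The dictionary name `KERNELS` is an assumption, and the slide itself does not give the formulas.

```python
import math

def _on_unit_interval(shape):
    """Wrap a shape so that it is zero outside |u| <= 1."""
    return lambda u: shape(u) if abs(u) <= 1.0 else 0.0

# Each kernel integrates to one over its support.
KERNELS = {
    "uniform": _on_unit_interval(lambda u: 0.5),
    "triangle": _on_unit_interval(lambda u: 1.0 - abs(u)),
    "epanechnikov": _on_unit_interval(lambda u: 0.75 * (1.0 - u * u)),
    "quartic": _on_unit_interval(lambda u: 15.0 / 16.0 * (1.0 - u * u) ** 2),
    "triweight": _on_unit_interval(lambda u: 35.0 / 32.0 * (1.0 - u * u) ** 3),
    "cosine": _on_unit_interval(
        lambda u: math.pi / 4.0 * math.cos(math.pi * u / 2.0)),
    # The Gaussian is the only one here with unbounded support.
    "gaussian": lambda u: math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi),
}

# Weight of a document two years from the centre of a window with bandwidth 5.
weight = KERNELS["triweight"](2 / 5)
```

Dividing the time difference by a bandwidth before calling the kernel, as in the last line, is what lets the same shape be made narrower or wider.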

slide-38
SLIDE 38

Using a kernel

Sliding the kernel

[Figure: time axis t with yearly ticks from 1900 to 1916. The kernel is slid along the axis and a point of f(X(t)) is plotted at each position; the builds of this slide add one point at a time, up to six.]
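
Sliding a kernel can be sketched as weighting every document row by the kernel value at its distance from t and then applying f. The sketch below assumes a triweight kernel with a bandwidth of 5 years and uses a plain column cosine as a stand-in for f (the talk's f includes an SVD step); all names and data are hypothetical.

```python
import math

def triweight(u):
    """Triweight kernel on [-1, 1]."""
    return 35.0 / 32.0 * (1.0 - u * u) ** 3 if abs(u) <= 1.0 else 0.0

def kernel_weighted_rows(rows, years, centre, bandwidth, kernel=triweight):
    """X(t) = k(t) X: scale each document row by the kernel weight of its year."""
    weights = [kernel((year - centre) / bandwidth) / bandwidth for year in years]
    return [[w * x for x in row] for w, row in zip(weights, rows)]

def column_cosine(rows, i, j):
    """Stand-in for f: cosine between term columns i and j (no SVD step)."""
    dot = sum(r[i] * r[j] for r in rows)
    norm_i = math.sqrt(sum(r[i] ** 2 for r in rows))
    norm_j = math.sqrt(sum(r[j] ** 2 for r in rows))
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0  # no usable data under the kernel at this position
    return dot / (norm_i * norm_j)

# Hypothetical data: the two terms never co-occur in 1900 but always do in 1910.
rows = [[1, 0], [0, 1], [1, 1], [2, 2]]
years = [1900, 1900, 1910, 1910]

curve = {
    t: column_cosine(kernel_weighted_rows(rows, years, t, bandwidth=5), 0, 1)
    for t in range(1900, 1911)
}
```

Unlike a bin, the kernel gives documents near t more weight than documents near the edge of the window, so the curve changes more gradually as t moves.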

slide-40
SLIDE 40

Using a kernel

Triweight kernel for the TDA example

revisit f(X(t)) for ‘smoking’ and ‘cancer’

[Plot: f(X(t)) (vertical axis, 0.0 to 0.5) against year (1900 to 2000), evaluated with a triweight kernel slid along the time axis.]

slide-41
SLIDE 41

Using a kernel

Triweight kernel for the TDA example

revisit f(X(t)) for ‘smoking’ and ‘cancer’

Information from BBC News article online - Timeline: Smoking and disease

  • 1951: Dr Richard Doll and Prof Austin Bradford Hill conduct first large-scale study of link between smoking and lung cancer.
  • 1954: Dr Doll and his team publish a paper confirming the link.
  • 1962: Royal College of Physicians report concludes that smoking is a cause of lung cancer and bronchitis, and probably contributes to coronary heart disease.
  • 1971: Government health warnings to be carried on all cigarette packets sold in the UK.
  • 1983: Latest Royal College of Physicians report features passive smoking for the first time.
  • 1984: Smoking banned on London Underground trains.

slide-45
SLIDE 45

The End

The End.
