The Internet of Animals Professor Stephen Hailes UCL New Frontiers - - PowerPoint PPT Presentation



slide-1
SLIDE 1

The Internet of Animals…

Professor Stephen Hailes UCL

slide-2
SLIDE 2

New Frontiers in IoT

Well, kind of.

- Sheep (x2) (with Cambridge PDN & RVC)
- Leopards (RVC & BCPT)
- Wild dogs (RVC & BCPT)
- Baboons (Swansea)
- Birds (RVC)
- This is collaborative work with:
  - Prof Jenny Morton (Cambridge)
  - Prof Alan Wilson (RVC)
  - Dr Andrew King (Swansea)
  - … and many others

slide-3
SLIDE 3

New Frontiers in IoT

Batten Disease (source NIH)

- Nature
  - A type of neurodegenerative disorder.
  - Autosomal recessive
  - Evidence suggests it is caused by problems with the brain's ability to remove and recycle proteins.
- Symptoms
  - Abnormally increased muscle tone or spasm (myoclonus)
  - Blindness or vision problems
  - Dementia
  - Lack of muscle coordination
  - Mental retardation with decreasing mental function
  - Movement disorder (choreoathetosis)
  - Seizures
  - Unsteady gait (ataxia)

slide-4
SLIDE 4

New Frontiers in IoT

- Prognosis
  - Symptoms normally appear at age 5-10
  - Early signs can be subtle: personality and behaviour changes, slow learning, clumsiness, or stumbling.
  - Over time, affected children suffer mental impairment, worsening seizures, and progressive loss of sight and motor skills.
  - Eventually, children with Batten disease become blind, bedridden, and demented. Batten disease is often fatal by the late teens or twenties.
  - No specific treatment is known that can halt or reverse the symptoms of Batten disease.
  - Palliative care (anticonvulsants, physical therapy) can help.

slide-5
SLIDE 5

New Frontiers in IoT

NZ trip (Feb/March 2011)

- Cohort 1 (69 sheep: 40 ewes, 19 rams)
  - 2010 (~6 months old), mixed:
    - Normal sheep (17)
    - Batten disease (CLN5/6): Homozygous (11), Heterozygous (17)
    - Cataract sheep: Blind (11), Impaired (5), Sighted (8)
- Cohort 2 (11 ewes)
  - 2009 (~18 months old) shed ewes, mixed:
    - Homozygous (5), Heterozygous (6)
- Cohort 3 (11 sheep: 2 ewes, 9 rams)
  - 2009 (~18 months old), mixed:
    - Homozygous (9), Heterozygous (2)

slide-6
SLIDE 6

New Frontiers in IoT

Cohort 1


slide-7
SLIDE 7

New Frontiers in IoT

Cohort 2


slide-8
SLIDE 8

New Frontiers in IoT

GPS-based work

- Data obtained from GPS/IMU units
  - GPS at 1 sample/s
  - IMU at 50 samples/s
  - Over max 22-24 h periods
- Attached using harness...
- Issues: C1 sheep were small, shorn and in poor condition

slide-9
SLIDE 9

New Frontiers in IoT

Can be used to derive position fixes for each individual....

slide-10
SLIDE 10

New Frontiers in IoT

Cohort 3 - 3/4/11


slide-11
SLIDE 11

New Frontiers in IoT

Sheep 1171 - Affected


slide-12
SLIDE 12

New Frontiers in IoT

Sheep 1004 - Affected


slide-13
SLIDE 13

New Frontiers in IoT

Which sometimes throws up some surprises....

slide-14
SLIDE 14

New Frontiers in IoT

Sheep 1008 - Heterozygous


slide-15
SLIDE 15

New Frontiers in IoT

Sheep 1106 - Affected


slide-16
SLIDE 16

New Frontiers in IoT

Cohort 1 + 2..... 30/3/11


slide-17
SLIDE 17

New Frontiers in IoT


slide-18
SLIDE 18

New Frontiers in IoT

Q: How can we identify phenotype from the data?

Try analysis of distance covered....by phenotype


slide-19
SLIDE 19

New Frontiers in IoT


slide-20
SLIDE 20

New Frontiers in IoT

Try analysis of distance covered....by time of day


slide-21
SLIDE 21

New Frontiers in IoT

(Plot; time axis in UTC)

slide-22
SLIDE 22

New Frontiers in IoT

- What about IMU information?
- Produce a measure of activity:
  - Take 50 Hz 3D accelerometry signal, calculate magnitude of resultant
  - (Roughly – calibration offset)
  - Integrate numerically over 1 minute for measure of activity
  - Subtract mean calculated over whole day to look at variation in activity relative to the mean
  - And we get....
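The activity measure described on this slide could be sketched roughly as follows. This is only an illustration: the function name `activity_per_minute`, the rectangle-rule integration, and the decision to ignore any calibration offset are assumptions, and the mean is taken over whatever record is passed in rather than strictly one day.

```python
import math

def activity_per_minute(samples, rate_hz=50):
    """samples: list of (ax, ay, az) accelerometer readings at rate_hz.

    Returns one value per complete minute: the integrated magnitude of the
    resultant acceleration, minus the mean over the whole record.
    """
    dt = 1.0 / rate_hz
    per_minute = rate_hz * 60
    activity = []
    for start in range(0, len(samples) - per_minute + 1, per_minute):
        window = samples[start:start + per_minute]
        # Rectangle-rule integral of |a| over the minute (assumed method).
        total = sum(math.sqrt(ax * ax + ay * ay + az * az) for ax, ay, az in window)
        activity.append(total * dt)
    if not activity:
        return []
    mean = sum(activity) / len(activity)
    # Variation in activity relative to the mean of the record.
    return [a - mean for a in activity]
```

Positive values then mark minutes that are more active than the record average, negative values quieter ones.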


slide-23
SLIDE 23

New Frontiers in IoT

Activity – Cohort 1+2 30/3/11


slide-24
SLIDE 24

New Frontiers in IoT


slide-25
SLIDE 25

New Frontiers in IoT


slide-26
SLIDE 26

New Frontiers in IoT

Back to the GPS...


slide-27
SLIDE 27

New Frontiers in IoT

slide-28
SLIDE 28

New Frontiers in IoT

Path analysis

Four cohort 2 sheep, 00:06 – 00:18, 22/03/11:

- 1123 - Affected
- 1169 - Hetero
- 1156 - Affected
- 1187 - Hetero

slide-29
SLIDE 29

New Frontiers in IoT

Path analysis - numerically

| | 1156 (Homo) | 1123 (Homo) | 1169 (Hetero) | 1187 (Hetero) |
|---|---|---|---|---|
| Path length | 16.59 | 253.70 | 36.11 | 46.12 |
| Mean step size | 0.023 | 0.352 | 0.050 | 0.064 |
| SD step size | 0.045 | 0.290 | 0.118 | 0.102 |
| P(Turn same dir) | 0.570 | 0.827 | 0.566 | 0.525 |
| 95% c.i. Psame | 0.534 – 0.607 | 0.800 – 0.855 | 0.530 – 0.603 | 0.489 – 0.562 |
| p-value Psame≠0.5 | 0.0002 | << 0.0001 | 0.0004 | 0.1784 |
| Correlation between adj. turn angles | 0.0009 | 0.0002 | 0.0017 | 0.0022 |
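Statistics of this kind could be computed from a sequence of position fixes along the lines below. This is a sketch under assumptions, not the analysis actually used: it reads "P(Turn same dir)" as the fraction of consecutive turns that share a sign, takes the turn sign from a cross product, and uses an exact two-sided binomial test against 0.5. The names `path_stats` and `two_sided_binomial_p` are hypothetical.

```python
import math

def path_stats(points):
    """points: list of (x, y) fixes in time order (at least two)."""
    steps = [math.dist(p, q) for p, q in zip(points, points[1:])]
    mean_step = sum(steps) / len(steps)
    # Sample standard deviation of step size (0.0 if there is a single step).
    if len(steps) > 1:
        sd_step = math.sqrt(sum((s - mean_step) ** 2 for s in steps) / (len(steps) - 1))
    else:
        sd_step = 0.0
    # Sign of each turn from the cross product of successive displacements.
    turns = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if cross != 0:  # skip straight or stationary steps
            turns.append(1 if cross > 0 else -1)
    pairs = max(len(turns) - 1, 0)
    same = sum(a == b for a, b in zip(turns, turns[1:]))
    return {
        "path_length": sum(steps),
        "mean_step": mean_step,
        "sd_step": sd_step,
        "same": same,
        "pairs": pairs,
        "p_same": same / pairs if pairs else float("nan"),
    }

def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for k successes in n trials against P = 0.5."""
    if n == 0:
        return 1.0
    dev = abs(k - n / 2)
    tail = sum(math.comb(n, j) for j in range(n + 1) if abs(j - n / 2) >= dev)
    return tail / 2 ** n
```

An animal circling persistently in one direction gives `p_same` close to 1, which is the pattern sheep 1123 shows in the table.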


slide-30
SLIDE 30

New Frontiers in IoT

SOCIAL STRUCTURE

slide-31
SLIDE 31

New Frontiers in IoT

Statistics

- Motivation
  - Much of the work done on social structure lacks a mathematical foundation
  - This matters
    - We care about the identification of groups in a social network, and about the nature of change with time
    - Existing measures offer little in the way of robust evidence.
- Aim:
  - To provide significance tests that allow the inference of social networks, or of important features of social networks such as group separation, from movement data.

slide-32
SLIDE 32

New Frontiers in IoT

Quick intro

- In social network analysis, a graph is constructed to represent the social structure of a group
  - = a sociogram
- Nodes are individuals
- Edges represent relationships
- Centrality (betweenness, closeness, degree)
- Position (structural)
- Strength of ties (strong/weak, weighted/discrete)
- Cohesion (groups, cliques)
- Division (structural holes, partition)

slide-33
SLIDE 33

New Frontiers in IoT

Our problem

- Typically SNA assumes that the structure of the network is observable.
  - E.g. who is friends with whom on Facebook
- Not the case for us:
  - We only have GPS data available and so…
  - We must infer the underlying social network before analysing it

slide-34
SLIDE 34

New Frontiers in IoT

Existing approaches for animals

- The most common approach is...
- The Gambit of the Group
  - Data split into time windows
  - A separate social network constructed for each time window
    - Put an edge if two animals are said to be “in the same place at the same time” during that time window
  - Once we have this collection, amalgamate into a single (weighted) network
  - Then threshold this to remove ‘weak’ links
- Arguable for animals in which ‘place’ has a clear meaning, e.g. roosting bats
- Less clear for situations in which place has less meaning
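As a rough illustration of the windowing, amalgamation and thresholding steps, assuming the "same place at the same time" groups have already been identified for each window (how that is done is not specified here), a sketch might look like this. The function name and the use of the fraction of windows as the edge weight are assumptions.

```python
from collections import Counter
from itertools import combinations

def gambit_of_the_group(windows, threshold):
    """windows: list of time windows, each a list of sets of animal ids
    observed together in that window.

    Returns (weights, edges): weights maps each pair to the fraction of
    windows in which it was seen together; edges keeps pairs whose weight
    reaches the threshold.
    """
    counts = Counter()
    for groups in windows:
        for group in groups:
            # One binary network per window: an edge for every co-located pair.
            for pair in combinations(sorted(group), 2):
                counts[pair] += 1
    # Amalgamate the per-window networks into a single weighted network.
    weights = {pair: c / len(windows) for pair, c in counts.items()}
    # Threshold to remove 'weak' links.
    edges = {pair for pair, w in weights.items() if w >= threshold}
    return weights, edges
```

The choice of `threshold` is exactly the kind of free parameter the later slides try to avoid.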

slide-35
SLIDE 35

New Frontiers in IoT

Existing approaches II

- There is a relationship between A and B if animal A stays within x metres of animal B for at least t seconds
  - But this is parameterised by x and t, and it is not clear how to choose these: often arbitrary or anthropomorphic.
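A minimal sketch of the distance/time rule, assuming both tracks are sampled at the same instants and that each sample stands for `dt` seconds (both assumptions, as is the function name):

```python
import math

def stays_close(track_a, track_b, x_metres, t_seconds, dt=1.0):
    """True if A stays within x_metres of B for at least t_seconds in a row.

    track_a, track_b: lists of (x, y) positions in metres, sampled together.
    """
    needed = math.ceil(t_seconds / dt)  # consecutive samples required
    run = 0
    for pa, pb in zip(track_a, track_b):
        if math.dist(pa, pb) <= x_metres:
            run += 1
            if run >= needed:
                return True
        else:
            run = 0  # the animals separated, so start counting again
    return False
```

Changing `x_metres` or `t_seconds` can flip the answer for the same pair, which is the arbitrariness the slide points out.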

slide-36
SLIDE 36

New Frontiers in IoT

Our approach

- We assume:
  - That the social network of the group can be directly associated with the correlation structure of the group’s movement patterns
- We aim to detect any significant correlation between the movement of two members of the group
- And do this through the construction of an appropriate significance test
- Given that similarity in movement patterns is statistically significant, we place an edge in the social network. Else we don’t.

slide-37
SLIDE 37

New Frontiers in IoT

Notation

- Given a data set, we use:
  - $N \in \mathbb{N}$ to denote the number of animals
  - $H \in \mathbb{N}$ to denote the number of time points in the data set
  - $(x_t^{(n)}, y_t^{(n)}) \in \mathbb{R}^2$ for position of animal $n \in \mathbb{N}_N$ at time $t \in \mathbb{N}_H$
  - $x_t, y_t \in \mathbb{R}^N$ for coordinates of the entire group at time $t \in \mathbb{N}_H$
  - $x_{1:H}, y_{1:H} \in \mathbb{R}^{NH}$ for coordinates of the entire group
- Assume that the entire group of animals is always contained within a bounded region, $D$.

slide-38
SLIDE 38

New Frontiers in IoT

Inferring social structure

- Assume social structure corresponds to correlation structure in the movement patterns of the group.
  - When there is a relationship, movement patterns are correlated
  - When there is not, they are independent
- A standard statistical approach to such a problem is:
  - Construct a generative model for group movement, i.e. a probabilistic model over the space of possible movement patterns.
  - Given the observed movement pattern, either obtain a point estimate of the model parameters, through e.g. likelihood maximisation, or obtain the posterior of the model parameters through Bayes’ rule.
  - Given the point estimate or posterior, the correlation structure of the group’s movements is then directly obtainable from the generative model.

slide-39
SLIDE 39

New Frontiers in IoT

But…

- It is extremely difficult to construct a generative model that is both:
  - sufficiently rich to model the complex movement patterns seen in real-life data sets
  - sufficiently constrained so as to avoid over-fitting and (feasibly) allow parameter optimisation, or posterior inference.
- Various ‘swarm models’ have been proposed in the literature but to the best of our knowledge… no statistical inference has been performed on real-life data sets through these models.

slide-40
SLIDE 40

New Frontiers in IoT

Our approach

- By defining an appropriate null model we:
  - construct a novel significance test that infers the social structure of the group
  - obviate the need to construct a model for the collective movements of the group.
- Null hypothesis: the movements of each animal are independent of the other members of the group
- Given this, it is simple to train a separate generative model for each individual animal
- Given the observed movement patterns we use our set of individual generative models to determine whether any similarity in the movements of any two animals is significant, or simply due to chance.

slide-41
SLIDE 41

New Frontiers in IoT

Geospatial approach

- Step 1: Partition space into subregions, e.g. take bounding rectangle for field and divide into equal-sized squares
- Reason: given a generative model for the movements of each animal, it is meaningful to calculate the probability that two animals are in the same sub-region of the partition at the same point in time
- Given such probabilities, we can then determine whether the number of times that two animals were observed to be in the same sub-region is significant or simply down to chance.
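The partition step could be sketched as a simple grid lookup. The function name, the row-major numbering and the 0-based indices are assumptions made for illustration; the cells are square only if the bounding rectangle and grid dimensions are chosen accordingly.

```python
def region_index(x, y, bounds, n_cols, n_rows):
    """Index of the grid cell containing (x, y).

    bounds: (xmin, ymin, xmax, ymax) of the bounding rectangle.
    Cells are numbered row by row, starting from 0 at (xmin, ymin).
    """
    xmin, ymin, xmax, ymax = bounds
    # Points on the far edge are assigned to the last column / row.
    col = min(int((x - xmin) / (xmax - xmin) * n_cols), n_cols - 1)
    row = min(int((y - ymin) / (ymax - ymin) * n_rows), n_rows - 1)
    return row * n_cols + col
```

Applying this to every fix of every animal turns each track into a sequence of sub-region indices, which is the representation the individual movement models work on.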

slide-42
SLIDE 42

New Frontiers in IoT

The key – individual movement models

- We learn a movement model for each animal in the group
- There are various possibilities: e.g. a multinomial distribution and a Markov model.
- To construct such models, we represent the observations of each animal’s movements in terms of the partition of space
- For each animal, $n \in \mathbb{N}_N$, and each observation, $t \in \mathbb{N}_H$, we use the notation $i_{t,n} \in \mathbb{N}_D$ to denote the index of the sub-region that contains the point $(x_t^{(n)}, y_t^{(n)})$
  - i.e. $(x_t^{(n)}, y_t^{(n)}) \in D_{i_{t,n}}$

slide-43
SLIDE 43

New Frontiers in IoT

Simple approach – multinomial distribution

- Takes no account of the temporal structure of the data
- Simply calculate the probability that animal n will be in subregion $D_i$

$$\hat{p}_n(i) = \frac{C_{i,n}}{\sum_{i' \in \mathbb{N}_D} C_{i',n}}$$

Where: $C_{i,n}$ is a count of the number of times animal n was in region i.
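The multinomial estimate is just a normalised count of visits. A minimal sketch, assuming the track has already been converted to 0-based sub-region indices (the function name is hypothetical):

```python
from collections import Counter

def multinomial_model(regions, n_regions):
    """regions: sequence of sub-region indices visited by one animal.

    Returns a list p where p[i] is the fraction of observations in region i.
    """
    counts = Counter(regions)  # counts[i] plays the role of C_{i,n}
    total = len(regions)
    return [counts[i] / total for i in range(n_regions)]
```

For example, `multinomial_model([0, 0, 1, 2], 3)` gives `[0.5, 0.25, 0.25]`.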

slide-44
SLIDE 44

New Frontiers in IoT

Not enough

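- Consider an animal walking in a circle
  - Where it has come from is important to where it is going to go next
- Construct a Markov model: give a transition matrix and initial location

$$\hat{T}_n(i \mid j) = \frac{\sum_{t=2}^{H} I[i_{t,n}, i]\, I[i_{t-1,n}, j]}{C_{j,n} - I[i_{H,n}, j]}$$

$$\hat{p}^{0}_n(i) = I[i_{1,n}, i]$$

Where: $I[a, b]$ is 1 if $a = b$ and 0 otherwise, so the numerator counts moves from region j to region i and the denominator counts the visits to region j that are followed by another observation.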
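A maximum-likelihood fit of such a Markov model amounts to counting transitions. The sketch below assumes the transition estimate is read as the probability of moving to region i given that the previous region was j, and leaves the column for a never-left region at zero; both choices, and the function name, are assumptions.

```python
def markov_model(regions, n_regions):
    """regions: sequence of 0-based sub-region indices for one animal.

    Returns (T, p0) where T[i][j] estimates P(next = i | current = j)
    and p0 is the initial location as a one-hot distribution.
    """
    moves = [[0] * n_regions for _ in range(n_regions)]
    for prev, nxt in zip(regions, regions[1:]):
        moves[nxt][prev] += 1
    # Visits to j that are followed by another observation.
    left = [sum(moves[i][j] for i in range(n_regions)) for j in range(n_regions)]
    T = [[moves[i][j] / left[j] if left[j] else 0.0 for j in range(n_regions)]
         for i in range(n_regions)]
    p0 = [1.0 if i == regions[0] else 0.0 for i in range(n_regions)]
    return T, p0
```

With `regions = [0, 1, 0, 1, 1]`, region 0 is always followed by region 1, while region 1 is followed by each region half the time.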

slide-45
SLIDE 45

New Frontiers in IoT

Determining significant interactions

- Determine the number of times a pair of animals were in the same sub-region at the same time
- For each pair of animals, $n, n' \in \mathbb{N}_N$, we denote this count

$$e^{H}_{n,n'} = \sum_{i \in \mathbb{N}_D} \sum_{t \in \mathbb{N}_H} I[i_{t,n'}, i]\, I[i_{t,n}, i]$$

- In the case of the Markov model, the probability of the colocation of two animals at the same time is:

$$p^{t}_{n,n'} = \sum_{i \in \mathbb{N}_D} \hat{p}_n(i_t = i)\, \hat{p}_{n'}(i_t = i)$$

- Where the $\hat{p}$ values are the marginals under MLE of the transition matrix and initial state distribution
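The observed colocation count and the per-time colocation probabilities could be computed as sketched below. It assumes a transition matrix stored as `T[i][j]` = P(next = i | current = j) and an initial distribution `p0`; these conventions and all function names are illustrative assumptions.

```python
def marginals(T, p0, horizon):
    """Marginal region distributions for t = 1..horizon under a Markov model."""
    out = [list(p0)]
    n = len(p0)
    for _ in range(horizon - 1):
        prev = out[-1]
        # Propagate one step: p_t(i) = sum_j T(i | j) p_{t-1}(j).
        out.append([sum(T[i][j] * prev[j] for j in range(n)) for i in range(n)])
    return out

def colocation_probs(marg_a, marg_b):
    """Probability, at each time, that two independent animals share a region."""
    return [sum(pa * pb for pa, pb in zip(ma, mb)) for ma, mb in zip(marg_a, marg_b)]

def colocation_count(regions_a, regions_b):
    """Number of time points at which both animals are in the same region."""
    return sum(a == b for a, b in zip(regions_a, regions_b))
```

Independence between the two animals is what the null hypothesis asserts, which is why the joint probability factorises into a product of marginals.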

slide-46
SLIDE 46

New Frontiers in IoT

- Using $E^{H}_{n,n'}$ to denote the random variable for the number of colocations between n and n’, given our generative models
- We can calculate the distribution of $E^{H}_{n,n'}$ either analytically, for the multinomial, or iteratively, for the Markov model
- We reject the null hypothesis if:

$$p\left(E^{H}_{n,n'} \geq e^{H}_{n,n'}\right) \leq \alpha$$

- i.e. if the probability of at least as many random colocations as actual colocations is no greater than the chosen significance level α
- If we reject the null hypothesis, we add an edge into the social network
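For the multinomial model the test has a closed form, because every time point is an independent draw with the same colocation probability, so the number of colocations is binomial. The sketch below covers only that case; the Markov case needs an iterative calculation that is not reproduced here. Function names are hypothetical.

```python
import math

def colocation_p_value(p_a, p_b, horizon, observed):
    """P(E >= observed) under independent multinomial models p_a and p_b.

    p_a, p_b: per-region probabilities for the two animals.
    horizon: number of time points H; observed: colocation count e.
    """
    q = sum(a * b for a, b in zip(p_a, p_b))  # colocation probability per time point
    return sum(math.comb(horizon, k) * q ** k * (1 - q) ** (horizon - k)
               for k in range(observed, horizon + 1))

def has_edge(p_a, p_b, horizon, observed, alpha):
    """Reject the null hypothesis (and add an edge) if the tail is at most alpha."""
    return colocation_p_value(p_a, p_b, horizon, observed) <= alpha
```

With two regions visited equally often by both animals and four time points, four colocations out of four has tail probability 1/16, so an edge is added at α = 0.1 but not at α = 0.05.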

slide-47
SLIDE 47

New Frontiers in IoT

Artificial mixing experiment

- In the artificial mixing experiment we manipulate the data in such a manner that it is known a priori that the flock is formed of two sub-groups
  - To obtain two sub-groups we amalgamate pairs of data sets.
  - To ensure a clear demarcation between the two sub-groups we amalgamate data sets from different days, e.g., 1st and 2nd March
  - We only consider pairs of data sets from the same field
- A total of twenty three different amalgamated data sets, with an average of one hundred and forty animals.
- Split the area into twenty five equally-sized sub-regions
- Take the median position of each individual over a five minute period as an observation
- Consider a Markov model, and use the data of the entire group to construct a single model. We use the significance test to construct a single binary network, and consider a 0.5% level of significance.

slide-48
SLIDE 48

New Frontiers in IoT

Artificial mixing experiment

slide-49
SLIDE 49

New Frontiers in IoT

False positives

- The proportion of connections between the two flocks, i.e. the false-positive rate, was 4.9 ± 1.6%
- Slightly higher than the expected false-positive rate when using this level of significance

slide-50
SLIDE 50

New Frontiers in IoT

Real mixing experiment

- Used a data set that consists of ninety one individuals.
- Flock is formed of two sub-groups that were put into the same field on the day of data collection.
- Used a six hour period during which the two sub-groups were fairly well separated to consider the social network of the group during this period.
- Other parameters the same.

slide-51
SLIDE 51

New Frontiers in IoT

slide-52
SLIDE 52

New Frontiers in IoT

Real mixing experiment

slide-53
SLIDE 53

New Frontiers in IoT

Social network – cohort 2, NZ1

slide-54
SLIDE 54

New Frontiers in IoT

Classification experiment

- Comparative data:
  - View animation
  - For each pair, consider the movements of the pair, in relation to the movements of the entire group
  - Subjectively determine whether an edge is present between the pair in the social network
- Construct a binary network for data from six different days.
- For each data set we considered a six hour period, selecting periods with a high amount of movement activity
- Significance test has to determine whether there is a significant amount of interaction during the six hour period.
- Each data set consists of a flock of eleven animals, so that there were fifty five possible edges in each of the six social networks.

slide-55
SLIDE 55

New Frontiers in IoT

Result

- Over the six data sets there was a total of three hundred and thirty possible edges.
- Our significance test obtained a classification accuracy of 90.61 ± 1.61% (of the edges).
- Comparison:
  - Within 3 m for 3 minutes: 65.45 ± 2.62%.
  - Optimise parameters to give the best results for this distance/time approach:
    - α = proportion of 3 minute period (optimum = 0.1)
    - β = threshold for formation of binary net (= 0.7)
    - γ = distance (= 6.0)
  - => classification accuracy of 89.09 ± 1.72%

slide-56
SLIDE 56

New Frontiers in IoT

kNN–based approach

- Assume we have a flock consisting of two groups A and B.
- For each animal n, and each time point t, calculate the proportion of the 5 nearest neighbours that are from the same group as n at time t
- For each time point, average this across all animals
- Calculate the significance…. Easier in this case:
  - For 1000 iterations
    - At random, split the flock into two partitions, A’ and B’ of the same size as A and B
    - Calculate the proportion of nearest neighbours from the same group as before, for each time point
  - Determine what proportion of the iterations are at least as extreme as the observation
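For a single time point, the nearest-neighbour statistic and its permutation test could be sketched as below. The function names, the fixed random seed and the one-sided reading of "at least as extreme" (statistic at least as large as observed) are assumptions.

```python
import math
import random

def knn_same_group(points, labels, k=5):
    """Mean, over animals, of the fraction of the k nearest neighbours
    that carry the same group label as the animal itself."""
    total = 0.0
    for n, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != n),
                        key=lambda j: math.dist(p, points[j]))
        total += sum(labels[j] == labels[n] for j in others[:k]) / k
    return total / len(points)

def permutation_p_value(points, labels, k=5, iterations=1000, seed=0):
    """Share of random relabellings whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = knn_same_group(points, labels, k)
    shuffled = list(labels)
    hits = 0
    for _ in range(iterations):
        rng.shuffle(shuffled)  # shuffling keeps the two group sizes fixed
        if knn_same_group(points, shuffled, k) >= observed:
            hits += 1
    return observed, hits / iterations
```

Repeating this for every time point gives a time series of p-values showing when the two groups are spatially segregated and when they have mixed.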

slide-57
SLIDE 57

New Frontiers in IoT

Ewe2/Ewe3 06/09/2012

slide-58
SLIDE 58

New Frontiers in IoT

Ewe2/Ewe3 08/09/2012 – by cohort

slide-59
SLIDE 59

New Frontiers in IoT

Ewe2/Ewe3 28/02/2013

slide-60
SLIDE 60

New Frontiers in IoT

Ewe2/Ewe3 01/03/2013

slide-61
SLIDE 61

New Frontiers in IoT

04/09/2012 – Ewe2 by genotype

slide-62
SLIDE 62

New Frontiers in IoT

Selfish herd behaviour

slide-63
SLIDE 63

New Frontiers in IoT

Herding sheep

slide-64
SLIDE 64

New Frontiers in IoT

slide-65
SLIDE 65

New Frontiers in IoT

Cheetah

slide-66
SLIDE 66

New Frontiers in IoT

slide-67
SLIDE 67

New Frontiers in IoT

Leopards

slide-68
SLIDE 68

New Frontiers in IoT

Leopards

slide-69
SLIDE 69

New Frontiers in IoT

Wild dogs

slide-70
SLIDE 70

New Frontiers in IoT

Wild dogs

slide-71
SLIDE 71

New Frontiers in IoT

Birds

slide-72
SLIDE 72

New Frontiers in IoT

Hefted Sheep

slide-73
SLIDE 73

New Frontiers in IoT

Experimental Computer Science?


slide-74
SLIDE 74

New Frontiers in IoT

With thanks to…

- Jenny Morton, Liz Skillings and others at Cambridge
- Alan Wilson, Jim Usherwood, John Lowe, Steve Portugal and many others at RVC
- Dave Palmer, Nadia Mitchell and others from U. Lincoln, NZ
- Andy King, Gaelle Fehlmann and others at U. Swansea
- Skye Rudiger and others at SARDI, Aus
- Tom Furmston
- Sarah Chisholm
- Daniel Strömbom in Uppsala
- Tico McNutt and others at BCPT
- …. Many others.