SLIDE 1

Multiresolution Cluster Analysis—Addressing Trust in Climate Classifications

Derek DeSantis†, Phil Wolfram, Boian Alexandrov

Los Alamos National Laboratory, Center for Nonlinear Studies†

AMS Annual Meeting, January 2020

SLIDE 2

Introduction: Köppen-Geiger Model

Figure: Köppen-Geiger map of North America (Peel et al.)

SLIDE 7

Introduction: Problems with Köppen-Geiger

Problem
  • Climate depends on more than temperature and precipitation.
  • The model can only resolve land.
  • It does not adapt to a changing climate.
  • The cut-offs in the model are, to some extent, arbitrary.
  • There is no universal agreement on how many classes there should be.

SLIDE 11

Introduction: Problems with clustering

Problem
  • Results depend on the choice of algorithm and its hyperparameters.

Figure: Many clusterings combined into a single consensus clustering. Diagram labels: Dataset; Cluster 1, Cluster 2, ..., Cluster n; Consensus Clustering.

  • Clustering is ill-posed: it lacks a measure of “trust”.
  • Results depend on “hidden parameters” such as the scale of the data.

SLIDE 15

Introduction: Proposed Solution

Solution
1 Leverage the discrete wavelet transform to classify across a multitude of scales.
2 Use information theory to discover the most important scales to classify on.
3 Taking these scales, combine the classifications to produce a fuzzy clustering that assesses the trust at each point.

Figure: Diagram labels: Dataset; Cluster 1, Cluster 2, ..., Cluster n; Consensus Clustering; CGC 1, CGC 2, ..., CGC L1; CGC 1, CGC 2, ..., CGC L2; CGC 1, CGC 2, ..., CGC Ln.

SLIDE 17

Preliminary Tools: Discrete Wavelet Transform and Mutual Information

The DWT splits a signal into high- and low-frequency components. The low-frequency temporal signal captures climatology (seasons, years, decades), while the low-frequency spatial signal captures regional features (city, county, state).

Figure: DWT of Tensor, with panels DWT Space and DWT Time.

Definition: Given partitions of the data $U = \{U_j\}_{j=1}^{k}$ and $V = \{V_j\}_{j=1}^{l}$, the Mutual Information $NI(U, V)$ measures how knowledge of one clustering reduces our uncertainty about the other.

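The definition above can be made concrete with a short sketch. The slide does not say how NI(U, V) is normalized, so the code below assumes the geometric-mean form NI(U, V) = I(U, V) / sqrt(H(U) H(V)), which is one common choice; the function names are illustrative and not taken from the talk.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (in nats) of a labelling given as a list of cluster ids."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(u, v):
    """I(U, V) computed from the joint label counts of two partitions."""
    n = len(u)
    count_u, count_v, count_uv = Counter(u), Counter(v), Counter(zip(u, v))
    return sum(
        (c / n) * log(c * n / (count_u[a] * count_v[b]))
        for (a, b), c in count_uv.items()
    )


def normalized_mutual_information(u, v):
    """NI(U, V) = I(U, V) / sqrt(H(U) H(V)). The normalization is an assumption."""
    hu, hv = entropy(u), entropy(v)
    if hu == 0.0 or hv == 0.0:
        # A one-cluster partition carries no information: return 1 if both
        # partitions are trivial, 0 otherwise.
        return 1.0 if hu == hv else 0.0
    return mutual_information(u, v) / sqrt(hu * hv)
```

Two labellings that differ only by a renaming of the clusters score 1 under this definition, and two statistically independent labellings score 0, which is what makes the quantity usable as a similarity between clusterings.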
SLIDE 18

Preliminary Tools: L15 Gridded Climate Dataset (Livneh et al.)

Gridded climate data set of North America. Each grid cell is six kilometers across and holds monthly data from 1950 to 2013. Variables used, from those available: precipitation, maximum temperature, minimum temperature.

SLIDE 19

Coarse-Grain Clustering (CGC): Proposed Solution

Solution
1 Leverage the discrete wavelet transform to classify across a multitude of scales.
2 Use information theory to discover the most important scales to classify on.
3 Taking these scales, combine the classifications to produce a fuzzy clustering that assesses the trust at each point.

Figure: Diagram labels: Dataset; Cluster 1, Cluster 2, ..., Cluster n; Consensus Clustering; CGC 1, CGC 2, ..., CGC L1; CGC 1, CGC 2, ..., CGC L2; CGC 1, CGC 2, ..., CGC Ln.

SLIDE 25

Coarse-Grain Clustering (CGC): The Algorithm

Figure: The CGC algorithm in six numbered steps. Recoverable labels: DWT (three times), Stack, Vectorize, Cluster, Label.
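The labels in the algorithm figure (DWT, Stack, Vectorize, Cluster, Label) suggest a pipeline along the following lines. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses the Haar wavelet, treats space as a 1-D strip of grid cells rather than a 2-D map, and uses a plain Lloyd-style k-means with evenly spaced initial centres. All function names are hypothetical.

```python
from math import sqrt


def haar_lowpass(x, levels):
    """Keep only the Haar approximation coefficients, `levels` times over.
    A trailing sample is dropped when the length is odd."""
    for _ in range(levels):
        x = [(x[i] + x[i + 1]) / sqrt(2) for i in range(0, len(x) - 1, 2)]
    return x


def coarse_grain(field, ls, lt):
    """field[t][i] is one variable at time t and cell i.
    Low-pass along space (ls levels), then along time (lt levels)."""
    rows = [haar_lowpass(row, ls) for row in field]
    # zip(*rows) yields one time series per coarse cell.
    return [haar_lowpass(list(series), lt) for series in zip(*rows)]


def kmeans(points, k, iters=50):
    """Plain Lloyd iterations; assumes k <= len(points)."""
    step = max(1, len(points) // k)
    centres = [list(points[j * step]) for j in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cgc(fields, k, ls, lt):
    """fields[v][t][i]: coarse-grain each variable, stack, vectorize, cluster,
    then copy each coarse label back onto the 2**ls fine cells it covers."""
    coarse = [coarse_grain(f, ls, lt) for f in fields]
    # Stack + vectorize: one feature vector per coarse cell, all variables joined.
    vectors = [sum((var[i] for var in coarse), []) for i in range(len(coarse[0]))]
    coarse_labels = kmeans(vectors, k)
    return [lab for lab in coarse_labels for _ in range(2 ** ls)]


# Toy input: 2 variables, 4 time steps, 8 cells; the right half is warmer/wetter.
row = [0.0] * 4 + [10.0] * 4
fields = [[row[:] for _ in range(4)] for _ in range(2)]
labels = cgc(fields, k=2, ls=1, lt=1)  # -> [0, 0, 0, 0, 1, 1, 1, 1]
```

Running `cgc` over a grid of (ℓs, ℓt) pairs would give one labelling per resolution, which is the kind of ensemble the later slides reduce and combine.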

SLIDE 26

Coarse-Grain Clustering (CGC): Results - Effect of Coarse-Graining

Figure: CGC: K-means k = 10, (ℓs, ℓt) = (1, 1)

SLIDE 27

Coarse-Grain Clustering (CGC): Results - Effect of Coarse-Graining

Figure: CGC: K-means k = 10, (ℓs, ℓt) = (4, 1)

SLIDE 29

Coarse-Grain Clustering (CGC): Results - Effect of Coarse-Graining

Figure: CGC: K-means k = 10, (ℓs, ℓt) = (1, 6)

SLIDE 31

Coarse-Grain Clustering (CGC): Results - Effect of Coarse-Graining

Figure: CGC: K-means k = 10, (ℓs, ℓt) = (4, 6)

SLIDE 32

Mutual Information Ensemble Reduce (MIER): Proposed Solution

Solution
1 Leverage the discrete wavelet transform to classify across a multitude of scales.
2 Use information theory to discover the most important scales to classify on.
3 Taking these scales, combine the classifications to produce a fuzzy clustering that assesses the trust at each point.

Figure: Diagram labels: Dataset; Cluster 1, Cluster 2, ..., Cluster n; Consensus Clustering; CGC 1, CGC 2, ..., CGC L1; CGC 1, CGC 2, ..., CGC L2; CGC 1, CGC 2, ..., CGC Ln.

SLIDE 36

Mutual Information Ensemble Reduce (MIER): The Algorithm

Figure: The MIER algorithm in five numbered steps. Recoverable labels: "Graph Cut", "Find" and "Representative".
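The figure labels only name a graph cut and a representative-finding step, so the details below are guesses. As a rough sketch of the idea, the code scores every pair of clusterings by normalized mutual information, groups clusterings that are similar, and keeps one representative per group. The grouping here is a simple threshold-plus-connected-components stand-in for the graph cut, the representative is the member most similar to its own group, and `threshold` and all function names are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def nmi(u, v):
    """Normalized mutual information with geometric-mean normalization (assumed)."""
    n = len(u)
    cu, cv, cuv = Counter(u), Counter(v), Counter(zip(u, v))

    def entropy(counts):
        return -sum(c / n * log(c / n) for c in counts.values())

    mi = sum(c / n * log(c * n / (cu[a] * cv[b])) for (a, b), c in cuv.items())
    denom = sqrt(entropy(cu) * entropy(cv))
    return mi / denom if denom > 0 else 0.0


def reduce_ensemble(ensemble, threshold):
    """ensemble maps a resolution (ls, lt) to a list of labels.
    Returns one representative resolution per group of similar clusterings."""
    keys = list(ensemble)
    sim = {(a, b): nmi(ensemble[a], ensemble[b]) for a in keys for b in keys}

    # Stand-in for the graph cut: drop edges below `threshold`, then take
    # connected components of what remains.
    groups, seen = [], set()
    for start in keys:
        if start in seen:
            continue
        seen.add(start)
        group, frontier = [], [start]
        while frontier:
            a = frontier.pop()
            group.append(a)
            for b in keys:
                if b not in seen and sim[a, b] >= threshold:
                    seen.add(b)
                    frontier.append(b)
        groups.append(group)

    # Representative: the member with the largest total similarity to its group.
    return [max(g, key=lambda a: sum(sim[a, b] for b in g)) for g in groups]


ensemble = {
    (1, 1): [0, 0, 1, 1, 2, 2],
    (1, 2): [1, 1, 0, 0, 2, 2],  # same partition as (1, 1), relabelled
    (4, 6): [0, 1, 0, 1, 0, 1],  # unrelated partition
}
representatives = reduce_ensemble(ensemble, threshold=0.5)
```

In this toy ensemble the first two resolutions carry the same information and collapse to one representative, while the third survives on its own; the reduced ensemble therefore has two members.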

SLIDE 37

Mutual Information Ensemble Reduce (MIER): Results - Example for K-means K=10

Figure: Results from the graph cut algorithm. The highlighted resolutions are the final ensemble. Vertical number = ℓs, horizontal bar = ℓt.

SLIDE 38

Mutual Information Ensemble Reduce (MIER): Results - Example for K-means K=10

Figure panels: (a) (ℓs, ℓt) = (2, 1); (b) (ℓs, ℓt) = (2, 4); (c) (ℓs, ℓt) = (3, 5); (d) (ℓs, ℓt) = (4, 4)

SLIDE 39

Consensus Clustering and Trust Algorithm: Proposed Solution

Solution
1 Leverage the discrete wavelet transform to classify across a multitude of scales.
2 Use information theory to discover the most important scales to classify on.
3 Taking these scales, combine the classifications to produce a fuzzy clustering that assesses the trust at each point.

Figure: Diagram labels: Dataset; Cluster 1, Cluster 2, ..., Cluster n; Consensus Clustering; CGC 1, CGC 2, ..., CGC L1; CGC 1, CGC 2, ..., CGC L2; CGC 1, CGC 2, ..., CGC Ln.

SLIDE 44

Consensus Clustering and Trust Algorithm: The Algorithm

Figure: The consensus clustering and trust algorithm in five numbered steps. Recoverable labels: Class Labels; Signals (C1, C2, ..., Ck); Distance from Signals (example values d = 0.8, d = 0.2, d = 0.1); Assign Labels and Trust (example outputs (C1, 0.8), (C2, 0.75), (Ck, 1.0)).
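The figure names four ingredients: per-point class labels, a set of signals C1, ..., Ck, a distance from each point to the signals, and a final (label, trust) pair. The sketch below is one possible reading, with every concrete choice an assumption: each point's "class labels" are its tuple of labels across the reduced ensemble, the signals are the k most frequent tuples, the distance is the fraction of ensemble members that disagree (normalized Hamming distance), and trust is one minus the distance to the nearest signal. It does not attempt to reproduce the example numbers on the slide.

```python
from collections import Counter


def consensus_with_trust(ensemble, k):
    """ensemble: list of labellings, each a list with one label per point.
    Returns one (consensus_label, trust) pair per point."""
    m = len(ensemble)
    # Class labels: one tuple per point, holding its label in every member.
    vectors = list(zip(*ensemble))
    # Signals (assumed): the k most frequent label tuples.
    signals = [vec for vec, _ in Counter(vectors).most_common(k)]

    result = []
    for vec in vectors:
        # Distance from signals: fraction of members that disagree.
        dists = [sum(a != b for a, b in zip(vec, s)) / m for s in signals]
        best = min(range(len(signals)), key=dists.__getitem__)
        # Assign label and trust: nearest signal, trust = 1 - distance.
        result.append((best, 1.0 - dists[best]))
    return result


ensemble = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],  # this member disagrees about the last point
    [0, 0, 1, 1, 1],
]
pairs = consensus_with_trust(ensemble, k=2)
# -> [(0, 1.0), (0, 1.0), (1, 1.0), (1, 1.0), (1, 0.75)]
```

Because whole label tuples are compared, the members of the ensemble do not need to agree on what each cluster id means. A point where all members agree with a signal gets trust 1.0, and the last point drops to 0.75 because one of four members places it elsewhere.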

SLIDE 45

Consensus Clustering and Trust Algorithm: Results - Example for K-means K=10

Figure: Consensus clustering from the reduced ensemble of clusters for k = 10, along with the trust. Grey = multi-class. Darker hue = lower trust.

SLIDE 48

Conclusion

Summary
  • The DWT brings forth structure hidden at different scales within the data.
  • Mutual information allows us to effectively represent the diversity across all scales.
  • Using this reduced ensemble, we produce a fuzzy clustering that has an interpretable trust metric at each point in space.

SLIDE 49

Conclusion: Results - Effect of k

Figure: CGC: K-means k = 4, (ℓs, ℓt) = (2, 3)

SLIDE 50

Conclusion: Results - Effect of k

Figure: CGC: K-means k = 8, (ℓs, ℓt) = (2, 3)

SLIDE 51

Conclusion: Results - Effect of k

Figure: CGC: K-means k = 12, (ℓs, ℓt) = (2, 3)

SLIDE 52

Conclusion: Results - Effect of k

Figure: CGC: K-means k = 16, (ℓs, ℓt) = (2, 3)