Multiresolution Cluster Analysis—Addressing Trust in Climate Classifications Introduction
Derek DeSantis, Phil Wolfram, Boian Alexandrov, Los Alamos National
Introduction - Köppen-Geiger Model
Figure: Köppen-Geiger map of North America (Peel et al.)
Introduction - Problems with Köppen-Geiger
Problem
Climate depends on more than temperature and precipitation.
Can only resolve land.
Does not adapt to changing climate.
The cut-offs in the model are, to some extent, arbitrary.
No universal agreement on how many classes there should be.
Introduction - Problems with clustering
Problem
Dependence on the choice of algorithm and hyperparameters.
Clustering is ill-posed: it lacks a measurement of "trust".
Dependence on "hidden parameters", such as the scale of the data.
Figure: Many clusterings combined into a single consensus clustering. Diagram labels: Dataset; Cluster 1, Cluster 2, ..., Cluster n; Consensus Clustering.
Introduction - Proposed Solution
Solution
1 Leverage the discrete wavelet transform to classify across a multitude of scales.
2 Use information theory to discover the most important scales to classify on.
3 Taking these scales, combine classifications to produce a fuzzy clustering that assesses the trust at each point.
Diagram labels: Dataset; Cluster 1, Cluster 2, ..., Cluster n; CGC 1, CGC 2, ..., CGC L1 through CGC 1, CGC 2, ..., CGC Ln; Consensus Clustering.
Preliminary Tools - Discrete Wavelet Transform and Mutual Information
The DWT splits a signal into high- and low-frequency components. The low-frequency temporal signal captures climatology (seasons, years, decades), while the low-frequency spatial signal captures regional features (city, county, state).
Diagram labels: Tensor; DWT Space; DWT Time; DWT of Tensor.
Definition: Given partitions of the data $U = \{U_j\}_{j=1}^{k}$ and $V = \{V_j\}_{j=1}^{l}$, the Mutual Information $NI(U, V)$ measures how knowledge of one clustering reduces our uncertainty about the other.
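The definition above can be made concrete with a small sketch. The slides do not say how NI(U, V) is normalized, so the version below assumes the mutual information is divided by the arithmetic mean of the two entropies; the function name `normalized_mutual_information` is hypothetical and is not taken from the talk.

```python
import numpy as np

def normalized_mutual_information(u, v):
    """NMI of two labelings of the same points.

    Assumption: normalization by the mean of the two entropies.
    """
    u = np.asarray(u).ravel()
    v = np.asarray(v).ravel()
    n = u.size
    # Contingency table: counts[i, j] = number of points in U_i and V_j.
    _, u_idx = np.unique(u, return_inverse=True)
    _, v_idx = np.unique(v, return_inverse=True)
    counts = np.zeros((u_idx.max() + 1, v_idx.max() + 1))
    np.add.at(counts, (u_idx, v_idx), 1)
    p_uv = counts / n
    p_u = p_uv.sum(axis=1)
    p_v = p_uv.sum(axis=0)
    # I(U, V) = sum p_uv * log(p_uv / (p_u * p_v)), skipping empty cells.
    nz = p_uv > 0
    mi = np.sum(p_uv[nz] * np.log(p_uv[nz] / np.outer(p_u, p_v)[nz]))
    h_u = -np.sum(p_u * np.log(p_u))
    h_v = -np.sum(p_v * np.log(p_v))
    denom = 0.5 * (h_u + h_v)
    # Two single-cluster partitions carry no uncertainty; treat them as identical.
    return 1.0 if denom == 0 else float(mi / denom)
```

Under this normalization, two partitions that differ only by a renaming of the classes score 1, and two statistically independent partitions score 0.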
Preliminary Tools - L15 Gridded Climate Dataset (Livneh et al.)
Gridded climate data set of North America. Each grid cell is six kilometers across and holds monthly data from 1950-2013. Variables used: precipitation, maximum temperature, minimum temperature.
Coarse-Grain Clustering (CGC) - The Algorithm
Figure: pipeline in six numbered steps, with stage labels DWT (three in parallel), Stack, Vectorize, Cluster, Label.
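The stages named in the algorithm figure (DWT, stack, vectorize, cluster, label) can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the slides: the wavelet is Haar and only the low-frequency part is kept, each variable is an array shaped (time, lat, lon) whose spatial dimensions are divisible by 2**ls, the clustering is a plain k-means with a deterministic farthest-point initialization, and the "Label" step copies each coarse label back onto the block of fine cells it covers. All function names are hypothetical.

```python
import numpy as np

def haar_lowpass(x, axis, levels):
    """Keep only the Haar approximation coefficients along one axis."""
    for _ in range(levels):
        n = x.shape[axis] - x.shape[axis] % 2  # drop a trailing odd sample
        even = np.take(x, np.arange(0, n, 2), axis=axis)
        odd = np.take(x, np.arange(1, n, 2), axis=axis)
        x = (even + odd) / np.sqrt(2.0)
    return x

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with farthest-point initialization (deterministic)."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def coarse_grain_cluster(variables, ls, lt, k):
    """variables: list of (time, lat, lon) arrays. Returns (lat, lon) labels."""
    coarse = []
    for var in variables:
        c = haar_lowpass(np.asarray(var, dtype=float), axis=0, levels=lt)  # time
        c = haar_lowpass(c, axis=1, levels=ls)                             # space
        c = haar_lowpass(c, axis=2, levels=ls)
        coarse.append(c)
    stacked = np.concatenate(coarse, axis=0)            # stack the variables
    nlat, nlon = stacked.shape[1:]
    vectors = stacked.reshape(stacked.shape[0], -1).T   # one row per coarse cell
    coarse_labels = kmeans(vectors, k).reshape(nlat, nlon)
    # Label: copy each coarse label onto the 2**ls by 2**ls block it covers.
    return np.repeat(np.repeat(coarse_labels, 2 ** ls, axis=0), 2 ** ls, axis=1)
```

Calling `coarse_grain_cluster([precip, tmax, tmin], ls=4, lt=6, k=10)` on three equally shaped arrays would correspond to the (ℓs, ℓt) = (4, 6) setting; in practice the variables would likely need to be standardized first so that no single variable dominates the distance.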
Coarse-Grain Clustering (CGC) - Results - Effect of Coarse-Graining
Figure: CGC: K-means k = 10, (ℓs, ℓt) = (1, 1)
Figure: CGC: K-means k = 10, (ℓs, ℓt) = (4, 1)
Figure: CGC: K-means k = 10, (ℓs, ℓt) = (1, 6)
Figure: CGC: K-means k = 10, (ℓs, ℓt) = (4, 6)
Mutual Information Ensemble Reduce (MIER) - The Algorithm
Figure: pipeline in five numbered steps, with stage labels Graph Cut and Find Representative.
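The slides name a graph cut followed by a representative-finding step but give no further detail. The sketch below is therefore only an illustration of the general idea, not the authors' procedure: it takes a symmetric matrix of pairwise mutual-information scores between the clusterings at different resolutions, substitutes a simple threshold cut (connected components of the graph whose edges have similarity at or above `threshold`) for the graph cut, and picks as representative the member with the largest total similarity to its own group. The function name and the `threshold` parameter are hypothetical.

```python
import numpy as np

def reduce_ensemble(similarity, threshold):
    """Group clusterings by similarity and pick one representative per group.

    similarity: symmetric (n, n) array of pairwise scores, e.g. NMI.
    Returns (group, representatives): a group id per clustering and
    the index of the representative clustering of each group.
    """
    similarity = np.asarray(similarity, dtype=float)
    n = similarity.shape[0]
    adjacency = similarity >= threshold
    group = [-1] * n
    n_groups = 0
    # Threshold cut: depth-first search over the thresholded graph.
    for start in range(n):
        if group[start] != -1:
            continue
        group[start] = n_groups
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if adjacency[i, j] and group[j] == -1:
                    group[j] = n_groups
                    stack.append(j)
        n_groups += 1
    # Representative: the member most similar, in total, to its own group.
    representatives = []
    for g in range(n_groups):
        members = [i for i in range(n) if group[i] == g]
        sub = similarity[np.ix_(members, members)]
        representatives.append(members[int(sub.sum(axis=1).argmax())])
    return group, representatives
```

The returned representatives play the role of the reduced ensemble: one clustering per group of mutually informative resolutions.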
Mutual Information Ensemble Reduce (MIER) - Results - Example for K-means K=10
Figure: Results from graph cut algorithm. The highlighted resolutions are the final ensemble. Vertical number = ℓs, horizontal bar = ℓt.
Figure panels: (a) (ℓs, ℓt) = (2, 1); (b) (ℓs, ℓt) = (2, 4); (c) (ℓs, ℓt) = (3, 5); (d) (ℓs, ℓt) = (4, 4)
Consensus Clustering and Trust Algorithm - The Algorithm
Figure: pipeline in five numbered steps. Stage labels: Class Labels; Signals (C1, C2, ..., Ck); Distance from Signals (example values d = 0.8, 0.2, 0.1); Assign Labels and Trust (example outputs (C1, 0.8), (C2, 0.75), (Ck, 1.0)).
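The figure suggests that each grid point carries a vector of class labels (one per ensemble member), that k reference vectors ("signals") stand for the consensus classes, and that each point is assigned to its nearest signal together with a trust value. The slides do not define the signals, the distance, or the trust, so the sketch below fills these in with assumptions: signals are the k most frequent label vectors, the distance is the normalized Hamming distance, and trust is one minus that distance. The function name is hypothetical, and ties (the multi-class case) are broken by taking the first nearest signal.

```python
import numpy as np

def consensus_with_trust(label_vectors, k):
    """Assign a consensus class and a trust value to every point.

    label_vectors: (n_points, n_members) integer array; column m holds
    the labels produced by ensemble member m.
    Returns (labels, trust, signals).
    """
    label_vectors = np.asarray(label_vectors)
    # Signals (assumption): the k most frequent label vectors.
    patterns, counts = np.unique(label_vectors, axis=0, return_counts=True)
    order = np.argsort(-counts, kind="stable")[:k]
    signals = patterns[order]
    # Normalized Hamming distance from every point to every signal.
    dist = (label_vectors[:, None, :] != signals[None, :, :]).mean(axis=2)
    labels = dist.argmin(axis=1)
    # Trust (assumption): fraction of members that agree with the signal.
    trust = 1.0 - dist.min(axis=1)
    return labels, trust, signals
```

Because a signal is a whole vector of member labels, the members do not need to share a common label numbering; a point on which every member agrees with the signal receives trust 1.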
Consensus Clustering and Trust Algorithm - Results - Example for K-means K=10
Figure: Consensus clustering from the reduced ensemble of clusterings for k = 10, along with the trust. Grey = multi-class. Darker hue = lower trust.
Conclusion
Summary
The DWT brings forth structure hidden at different scales within the data.
Mutual information allows us to effectively represent the diversity across all scales.
Using this reduced ensemble, we produce a fuzzy clustering that has an interpretable trust metric at each point in space.
Conclusion - Results - Effect of k
Figure: CGC: K-means k = 4, (ℓs, ℓt) = (2, 3)
Figure: CGC: K-means k = 8, (ℓs, ℓt) = (2, 3)
Figure: CGC: K-means k = 12, (ℓs, ℓt) = (2, 3)