 
A Cluster-Target Similarity Based Principal Component Analysis for Interval-Valued Data
University of Tsukuba, School of Systems and Information Engineering
Mika Sato-Ilic
Energy Evaluation Data

Energy              JP1        JP2        …   UK1       UK2       …
1. Oil              [60,90]    [60,70]    …   [81,91]   [51,71]   …
2. Coal             [90,120]   [80,95]    …   [80,91]   [80,91]   …
3. Coal with CCS    [60,100]   [20,40]    …   [50,60]   [65,75]   …
4. Nuclear          [70,120]   [50,85]    …   [45,65]   [60,80]   …
5. Geothermal       [60,80]    [30,45]    …   [0,20]    [0,20]    …
6. Solar PV         [30,70]    [30,40]    …   [0,10]    [10,40]   …
7. Biomass          [40,100]   [20,35]    …   [60,70]   [60,70]   …
8. On. Wind, large  [70,100]   [50,60]    …   [60,72]   [50,70]   …
9. Mun/Ind Waste    [83,111]   [50,65]    …   [60,80]   [80,90]   …
10. Hydro           [70,100]   [40,60]    …   [65,75]   [60,70]   …
11. Gas             [80,120]   [65,85]    …   [87,97]   [87,97]   …

Joint research: ESRC-funded Sussex Energy Group at SPRU (Science and Technology Policy Research, University of Sussex); Sustainable Energy/Environment & Public Policy (SEPP), University of Tokyo
(ESRC: Economic and Social Research Council)
Principal Component Analysis based on Classification Structure by Fuzzy Clustering

[Figure: objects 1 to 11 plotted in the multi-dimensional space (p-dimensional space); p: number of variables]
Adaptable classification structure
Adaptable number of clusters
Principal Component Analysis for Metric Projection

Metric projection: $P_L : X \to L$
$$P_L(\mathbf{x}) = \{\mathbf{y} \in L : \|\mathbf{x} - \mathbf{y}\| = d(\mathbf{x}, L)\}, \quad \mathbf{x} \in X, \qquad d(\mathbf{x}, L) = \inf_{\mathbf{y} \in L} \|\mathbf{x} - \mathbf{y}\|$$
X: inner product space
L: nonempty subset of X
Principal Component Analysis for Metric Projection

The metric projection $P_L : X \to L$ is nonexpansive:
$$\|P_L(\mathbf{x}) - P_L(\mathbf{y})\| \le \|\mathbf{x} - \mathbf{y}\|, \quad \mathbf{x}, \mathbf{y} \in X$$
L: convex Chebyshev set. For each $\mathbf{x} \in X$, there exists at least one nearest point in L (and, because L is convex, that point is unique).

Principal component analysis:
$$C(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\| - \|P_L(\mathbf{x}) - P_L(\mathbf{y})\|$$
$\|\mathbf{x} - \mathbf{y}\|$: dissimilarity of objects
$\|P_L(\mathbf{x}) - P_L(\mathbf{y})\|$: dissimilarity of objects on the projected space
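The nonexpansive property can be checked numerically on a small case. The sketch below assumes the simplest convex Chebyshev set, a line through the origin in ordinary Euclidean space, and uses hypothetical vectors `u`, `x` and `y` that do not come from the slides. It only illustrates that the dissimilarity on the projected space never exceeds the original dissimilarity.

```python
import math

def project_onto_line(x, u):
    """Metric projection of x onto L = span{u}: the nearest point of L to x."""
    scale = sum(xi * ui for xi, ui in zip(x, u)) / sum(ui * ui for ui in u)
    return [scale * ui for ui in u]

def distance(x, y):
    """Euclidean distance ||x - y||."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

u = [1.0, 2.0, 2.0]    # hypothetical direction spanning the line L
x = [3.0, 0.0, 4.0]    # hypothetical object
y = [-1.0, 5.0, 2.0]   # hypothetical object

px, py = project_onto_line(x, u), project_onto_line(y, u)
original = distance(x, y)      # dissimilarity of the objects
projected = distance(px, py)   # dissimilarity on the projected space
loss = original - projected    # non-negative because the projection is nonexpansive
```

Projecting `px` a second time returns `px` again, which is the idempotence that the projection-matrix slides rely on.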
Principal Component Analysis

$$X = (\tilde{\mathbf{x}}_1, \dots, \tilde{\mathbf{x}}_n)', \quad \tilde{\mathbf{x}}_i = (x_{i1}, \dots, x_{ip}), \quad i = 1, \dots, n$$
$$X = (\mathbf{x}_1, \dots, \mathbf{x}_p), \quad \mathbf{x}_a = (x_{1a}, \dots, x_{na})', \quad a = 1, \dots, p$$
n: number of objects
p: number of variables

Minimize
$$F = \sum_{a=1}^{p} (\mathbf{x}_a - X\mathbf{l}_a)'(\mathbf{x}_a - X\mathbf{l}_a)$$
$$\mathbf{l}_a = (X'X)^{-1}X'\mathbf{x}_a$$
$$F^* = \sum_{a=1}^{p} (\mathbf{x}_a - X(X'X)^{-1}X'\mathbf{x}_a)'(\mathbf{x}_a - X(X'X)^{-1}X'\mathbf{x}_a)$$
$\mathbf{x} \mapsto X(X'X)^{-1}X'\mathbf{x}$: projection to the subspace spanned by $\mathbf{x}_1, \dots, \mathbf{x}_p$
$$\sum_{a,b=1}^{p} \{(\mathbf{x}_a - \mathbf{x}_b) - X(X'X)^{-1}X'(\mathbf{x}_a - \mathbf{x}_b)\}'\{(\mathbf{x}_a - \mathbf{x}_b) - X(X'X)^{-1}X'(\mathbf{x}_a - \mathbf{x}_b)\}$$
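Principal Component Analysis

$\mathbf{x} \mapsto X(X'X)^{-1}X'\mathbf{x}$: projection to the subspace spanned by $\mathbf{x}_1, \dots, \mathbf{x}_p$
$$P_X = X(X'X)^{-1}X'$$
$P_X P_X = P_X$: idempotent
$P_X' = P_X$: symmetric
$\mathbf{x}_a, \mathbf{x}_b \in V_p^n$

$$F^* = \sum_{a=1}^{p} (\mathbf{x}_a - P_X\mathbf{x}_a)'(\mathbf{x}_a - P_X\mathbf{x}_a) = \sum_{a=1}^{p} (\mathbf{x}_a'\mathbf{x}_a - \mathbf{x}_a'P_X\mathbf{x}_a)$$
$$\sum_{a,b=1}^{p} \{(\mathbf{x}_a - \mathbf{x}_b) - P_X(\mathbf{x}_a - \mathbf{x}_b)\}'\{(\mathbf{x}_a - \mathbf{x}_b) - P_X(\mathbf{x}_a - \mathbf{x}_b)\}$$
$$= \sum_{a,b=1}^{p} \{(\mathbf{x}_a - \mathbf{x}_b)'(\mathbf{x}_a - \mathbf{x}_b) - (\mathbf{x}_a - \mathbf{x}_b)'P_X(\mathbf{x}_a - \mathbf{x}_b)\}$$
$$= \underbrace{2p \sum_{a=1}^{p} (\mathbf{x}_a'\mathbf{x}_a - \mathbf{x}_a'P_X\mathbf{x}_a)}_{F^*} - \underbrace{2 \sum_{a,b=1}^{p} (\mathbf{x}_a'\mathbf{x}_b - \mathbf{x}_a'P_X\mathbf{x}_b)}_{\text{covariance between variables}}$$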
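The two properties of the projection matrix, and the way the pairwise sum splits into a per-variable term and a cross term, can be verified on a toy case. The sketch below is an illustration under assumptions: the 4 x 2 matrix `X` and the vectors in `Z` are hypothetical, and `Z` is deliberately chosen outside the column space of `X` so that the residuals are non-zero. Plain Python lists are used to keep it dependency-free.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def projection_matrix(X):
    """P_X = X (X'X)^{-1} X' for an n x 2 matrix X with independent columns."""
    Xt = transpose(X)
    return matmul(matmul(X, inverse_2x2(matmul(Xt, X))), Xt)

def quad(u, M, v):
    """Bilinear form u' M v."""
    return sum(ui * sum(m * vj for m, vj in zip(row, v)) for ui, row in zip(u, M))

def diff(u, v):
    return [a - b for a, b in zip(u, v)]

X = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0], [2.0, 1.0]]   # hypothetical 4 x 2 basis
P = projection_matrix(X)
n = len(P)
I_minus_P = [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)] for i in range(n)]

# Hypothetical vectors that do not all lie in the column space of X.
Z = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 3.0, -1.0], [2.0, 0.0, 1.0, 1.0]]
p = len(Z)

# Pairwise form: sum over all (a, b) of (z_a - z_b)'(I - P)(z_a - z_b).
lhs = sum(quad(diff(za, zb), I_minus_P, diff(za, zb)) for za in Z for zb in Z)
# Split form: 2p * sum_a z_a'(I - P)z_a  -  2 * sum_{a,b} z_a'(I - P)z_b.
rhs = (2 * p * sum(quad(za, I_minus_P, za) for za in Z)
       - 2 * sum(quad(za, I_minus_P, zb) for za in Z for zb in Z))
```

Here `lhs` and `rhs` agree up to rounding, and `matmul(P, P)` reproduces `P`, which matches the idempotent and symmetric properties stated on the slide.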
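Fuzzy Cluster based Principal Component Analysis

$$\sum_{a,b=1}^{p} \{(\mathbf{x}_a - \mathbf{x}_b) - P_X(\mathbf{x}_a - \mathbf{x}_b)\}'\{(\mathbf{x}_a - \mathbf{x}_b) - P_X(\mathbf{x}_a - \mathbf{x}_b)\}$$
$$= \sum_{a,b=1}^{p} \{(\mathbf{x}_a - \mathbf{x}_b)'(\mathbf{x}_a - \mathbf{x}_b) - (\mathbf{x}_a - \mathbf{x}_b)'P_X(\mathbf{x}_a - \mathbf{x}_b)\}$$
$$= \underbrace{2p \sum_{a=1}^{p} (\mathbf{x}_a'\mathbf{x}_a - \mathbf{x}_a'P_X\mathbf{x}_a)}_{F^*} - \underbrace{2 \sum_{a,b=1}^{p} (\mathbf{x}_a'\mathbf{x}_b - \mathbf{x}_a'P_X\mathbf{x}_b)}_{\text{covariance between variables}}$$

Adaptable classification structure based on an appropriate number of clusters
Dissimilarity structure of objects in higher dimensional space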
Selection of an Appropriate Number of Clusters

$S \to U^{(a)} \to \tilde{S}^{(a)}$ (K = a)
$S \to U^{(b)} \to \tilde{S}^{(b)}$ (K = b)

K: number of clusters
S: observed similarity data
$U^{(l)}$: classification structure for S when the number of clusters is l (a result of fuzzy clustering), $l = 1, \dots, K$
$\tilde{S}^{(l)}$: similarity restored by using $U^{(l)}$

Select the $\tilde{S}^{(l)}$ closest to S among $\{\tilde{S}^{(2)}, \dots, \tilde{S}^{(K)}\}$: this selects the most explainable classification structure for the original data.
The appropriate number of clusters is l.
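The selection step can be sketched as a search over candidate numbers of clusters. The slide does not say how closeness between the observed and restored similarity is measured, so the sum of squared element-wise differences below is an assumption. The matrices and the function names are hypothetical, and the restored matrices are taken as given rather than produced by an actual fuzzy clustering run.

```python
def squared_gap(S, S_restored):
    """Sum of squared element-wise differences between two square matrices."""
    return sum((s - r) ** 2
               for row_s, row_r in zip(S, S_restored)
               for s, r in zip(row_s, row_r))

def select_number_of_clusters(S, restored_by_l):
    """Return the l whose restored similarity is closest to the observed S."""
    return min(restored_by_l, key=lambda l: squared_gap(S, restored_by_l[l]))

# Hypothetical observed similarity data (asymmetric).
S = [[1.0, 0.8, 0.1],
     [0.7, 1.0, 0.2],
     [0.2, 0.1, 1.0]]

# Hypothetical restored similarities, keyed by the number of clusters l.
restored_by_l = {
    2: [[0.9, 0.8, 0.2], [0.7, 0.9, 0.2], [0.2, 0.2, 0.9]],
    3: [[0.6, 0.5, 0.4], [0.5, 0.6, 0.4], [0.4, 0.4, 0.6]],
}

best_l = select_number_of_clusters(S, restored_by_l)
```

With these toy values the restored matrix for l = 2 is the closer one, so `best_l` is 2.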
Asymmetric Similarity of Interval-Valued Data

$$Y = (y_{ia}) = ([\underline{y}_{ia}, \overline{y}_{ia}]), \quad i = 1, \dots, n, \quad a = 1, \dots, p$$
Dissimilarity between $\mathbf{y}_i = (y_{i1}, \dots, y_{ip})$ and $\mathbf{y}_j = (y_{j1}, \dots, y_{jp})$:
$$d_{ij} = \sum_{a=1}^{p} \sup\{d(x, y_{ja}) \mid x \in y_{ia}\}, \quad d(x, y_{ja}) = \inf\{d(x, y) \mid y \in y_{ja}\}$$
$$d_{ji} = \sum_{a=1}^{p} \sup\{d(y, y_{ia}) \mid y \in y_{ja}\}, \quad d(y, y_{ia}) = \inf\{d(x, y) \mid x \in y_{ia}\}$$
$d_{ij} \neq d_{ji}$ $(i \neq j)$
$$s_{ij} = 1 - d_{ij} / \max_{i,j}\{d_{ij}\}, \quad i, j = 1, \dots, n$$
$s_{ij} \neq s_{ji}$ $(i \neq j)$
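As an illustration, the directed dissimilarity can be computed in closed form for closed intervals. The sketch assumes the point distance is d(x, y) = |x - y|, which the slide does not state. Under that assumption the supremum over x in [lo_i, hi_i] of the distance to [lo_j, hi_j] is attained at an endpoint and equals max(0, lo_j - lo_i, hi_i - hi_j). The data are the first three rows and the four visible columns (JP1, JP2, UK1, UK2) of the energy evaluation table. The elided columns are not filled in, so the numbers describe this subset only.

```python
def directed_dissimilarity(obj_i, obj_j):
    """d_ij: sum over variables of sup_{x in y_ia} inf_{y in y_ja} |x - y|."""
    total = 0.0
    for (lo_i, hi_i), (lo_j, hi_j) in zip(obj_i, obj_j):
        # For closed intervals the sup is attained at an endpoint of y_ia.
        total += max(0.0, lo_j - lo_i, hi_i - hi_j)
    return total

def dissimilarity_and_similarity(objects):
    """Return (d, s) with s_ij = 1 - d_ij / max_{i,j} d_ij."""
    n = len(objects)
    d = [[directed_dissimilarity(objects[i], objects[j]) for j in range(n)]
         for i in range(n)]
    d_max = max(max(row) for row in d)
    s = [[1.0 - d[i][j] / d_max for j in range(n)] for i in range(n)]
    return d, s

objects = [
    [(60, 90), (60, 70), (81, 91), (51, 71)],    # 1. Oil
    [(90, 120), (80, 95), (80, 91), (80, 91)],   # 2. Coal
    [(60, 100), (20, 40), (50, 60), (65, 75)],   # 3. Coal with CCS
]
d, s = dissimilarity_and_similarity(objects)
```

On this subset the dissimilarity from Oil to Coal is 79 while the one from Coal to Oil is 76, so the resulting similarity matrix is asymmetric.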
Asymmetric Fuzzy Clustering Model (Sato and Sato, 1995)

Asymmetric similarity data:
$$S = (s_{ij}), \quad s_{ij} \neq s_{ji} \ (i \neq j), \quad i, j = 1, \dots, n$$
$$s_{ij} = \sum_{k=1}^{K} \sum_{l=1}^{K} w_{kl} u_{ik} u_{jl} + \varepsilon_{ij}, \quad i, j = 1, \dots, n$$
$s_{ij}$: asymmetric similarity between objects i and j
$u_{ik}$: degree of belongingness of an object i to a cluster k
$w_{kl}$: asymmetric similarity between clusters k and l
$\varepsilon_{ij}$: error
$$u_{ik} \in [0, 1], \quad \sum_{k=1}^{K} u_{ik} = 1, \quad m \in (1, \infty), \quad w_{kl} \neq w_{lk}, \quad s_{ij} \neq s_{ji}$$
n: number of objects
K: number of clusters
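The systematic part of the model, the double sum of w_kl u_ik u_jl over clusters k and l, is the (i, j) entry of the matrix product U W U'. The sketch below evaluates it for hypothetical memberships `U` and cluster similarities `W`. It does not estimate those quantities from data, and the error term is left out.

```python
def restored_similarity(U, W):
    """s_ij = sum_k sum_l w_kl * u_ik * u_jl, i.e. the (i, j) entry of U W U'."""
    n, K = len(U), len(W)
    return [[sum(W[k][l] * U[i][k] * U[j][l] for k in range(K) for l in range(K))
             for j in range(n)]
            for i in range(n)]

# Hypothetical degrees of belongingness: each row lies in [0, 1] and sums to 1.
U = [[0.9, 0.1],
     [0.2, 0.8],
     [0.5, 0.5]]

# Hypothetical asymmetric similarity between clusters (w_12 != w_21).
W = [[1.0, 0.6],
     [0.2, 0.9]]

S_model = restored_similarity(U, W)
```

Because `W` is asymmetric, the restored similarity is asymmetric as well: with these values `S_model[0][1]` is about 0.688 and `S_model[1][0]` is about 0.408.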
Asymmetric Similarity of Clusters

$$\tilde{w}_{kl}^{(K)} = \frac{1}{1 + \exp(w_{kl}^{(K)})}, \quad k, l = 1, \dots, K$$
$$w_{kl}^{(K)} = \frac{1}{2}\left\{ \log\frac{|\Sigma_{(k,K)}|}{|\Sigma_{(l,K)}|} + \mathrm{tr}\left(\Sigma_{(k,K)}^{-1}\Sigma_{(l,K)} - I\right) + (\boldsymbol{\mu}_{(k,K)} - \boldsymbol{\mu}_{(l,K)})'\,\Sigma_{(k,K)}^{-1}\,(\boldsymbol{\mu}_{(k,K)} - \boldsymbol{\mu}_{(l,K)}) \right\}$$
$\boldsymbol{\mu}_{(k,K)}$: expected value of data in cluster k
$\Sigma_{(k,K)}$: variance-covariance matrix for cluster k
$$\tilde{w}_{kl}^{(K)} \neq \tilde{w}_{lk}^{(K)} \ (k \neq l), \quad \tilde{w}_{kl}^{(K)} \in [0, 1]$$
K: number of clusters