SLIDE 1
Dependencies in Interval-valued Symbolic Data
Lynne Billard University of Georgia lynne@stat.uga.edu
Tribute to Professor Edwin Diday: Paris, France; 5 September 2007
SLIDE 2
Naturally occurring Symbolic Data -- Mushrooms
SLIDE 3
Patient Records – Single Hospital, Cardiology
Patient     Hospital    Age   Smoker   ...
Patient 1   Fontaines   74    heavy
Patient 2   Fontaines   78    light
Patient 3   Beaune      69    no
Patient 4   Beaune      73    heavy
Patient 5   Beaune      80    light
Patient 6   Fontaines   70    heavy
Patient 7   Fontaines   82    heavy
...
SLIDE 4
Patient     Hospital    Age   Smoker   ...
Patient 1   Fontaines   74    heavy
Patient 2   Fontaines   78    light
Patient 3   Beaune      69    no
Patient 4   Beaune      73    heavy
Patient 5   Beaune      80    light
Patient 6   Fontaines   70    heavy
Patient 7   Fontaines   82    heavy
...
Hospital    Age        Smoker
Fontaines   [70, 82]   {light ¼, heavy ¾}
Beaune      [69, 80]   {no, light, heavy}
...
Patient Records by Hospital -- aggregate over patients.
Result: Symbolic Data
SLIDE 5
Histogram-valued Data -- Weight by Age Distribution:
SLIDE 6
Logical dependency rule
E.g., Y1 = age, Y2 = # children
Classical: Ya = (10, 0), Yb = (20, 2), Yc = (18, 1)
Aggregation → Symbolic: ξ = [10, 20] × {0, 1, 2}
I.e., ξ implies the classical Yd = (10, 2) is possible.
Need rule ν: {If Y1 < 15, then Y2 = 0}
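To make the rule concrete, here is a minimal Python sketch (my own illustration, not from the talk; the tuples and the name rule_v are mine) of aggregating the three classical records and trimming the impossible region with ν:

records = [(10, 0), (20, 2), (18, 1)]   # classical (Y1 = age, Y2 = # children)

ages = [y1 for y1, _ in records]
children = sorted({y2 for _, y2 in records})
xi = {"Y1": (min(ages), max(ages)), "Y2": children}   # xi = [10, 20] x {0, 1, 2}
print(xi)

def rule_v(y1, y2):
    # Rule v: if Y1 < 15 then Y2 = 0 (children impossible below age 15).
    return y2 == 0 if y1 < 15 else True

print(rule_v(10, 2))   # False: (10, 2) lies in xi but is ruled out by v
print(rule_v(18, 1))   # True:  (18, 1) remains admissible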
SLIDE 7
Interval-valued data ξ(2): e.g., Y2 = 149 hits is not possible when Y1 < 149 at-bats.
Team u   Y1 = # At-Bats   Y2 = # Hits   |   Team u   Y1 = # At-Bats   Y2 = # Hits
1        (289, 538)       (75, 162)     |   11       (212, 492)       (57, 151)
2        (88, 422)        (49, 149)     |   12       (177, 245)       (189, 238)
3        (189, 223)       (201, 254)    |   13       (342, 614)       (121, 206)
4        (184, 476)       (46, 148)     |   14       (120, 439)       (35, 102)
5        (283, 447)       (86, 115)     |   15       (80, 468)        (55, 115)
6        (24, 26)         (133, 141)    |   16       (75, 110)        (75, 110)
7        (168, 445)       (37, 135)     |   17       (116, 557)       (95, 163)
8        (123, 148)       (137, 148)    |   18       (197, 507)       (52, 53)
9        (256, 510)       (78, 124)     |   19       (167, 203)       (48, 232)
10       (101, 126)       (101, 132)    |
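A short sketch (my own, not the talk's) of how one might flag interval observations that conflict with the rule Y2 ≤ Y1 (hits cannot exceed at-bats); the three rows are taken from the table above, and the containment test is my own phrasing:

# Rows u = 3, 6, 10 from the table above; intervals as (min, max).
teams = {3: ((189, 223), (201, 254)),
         6: ((24, 26), (133, 141)),
         10: ((101, 126), (101, 132))}

for u, ((a, b), (c, d)) in teams.items():
    if c > b:                  # even the fewest hits exceed the most at-bats
        print(f"team {u}: every point in the rectangle violates Y2 <= Y1")
    elif d > a:                # some corner of the rectangle violates the rule
        print(f"team {u}: part of the rectangle violates Y2 <= Y1")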
SLIDE 8
[Figure: the rectangle of observation ξ(2) = [88, 422] × [49, 149], with the line Y2 = αY1 cutting it into regions R1, R2, R3, R4.]
SLIDE 9
SLIDE 10
Dependencies between Variables – Interval-valued Variables
E.g., Regression Analysis
Dependent variable: Y = (Y_1, \ldots, Y_q), e.g., q = 1
Predictor/regressor variables: X = (X_1, \ldots, X_p)
Multiple regression model: Y = \beta_0 + \beta_1 X_1 + \ldots + \beta_p X_p + e
Error: e with E(e) = 0, Var(e) = \sigma^2, Cov(e_i, e_k) = 0, i \neq k.
SLIDE 11
Multiple Regression Model: Y = \beta_0 + \beta_1 X_1 + \ldots + \beta_p X_p + e
In vector terms, Y = X\beta + e
Observation vector: Y' = (Y_1, \ldots, Y_n)
Design matrix:

X = \begin{pmatrix} 1 & X_{11} & \cdots & X_{1p} \\ \vdots & \vdots & & \vdots \\ 1 & X_{n1} & \cdots & X_{np} \end{pmatrix}

Regression coefficient vector: \beta' = (\beta_0, \beta_1, \ldots, \beta_p)
Error vector: e' = (e_1, \ldots, e_n)
SLIDE 12
Model: Y = X\beta + e. Least squares estimator of \beta is

\hat{\beta} = (X'X)^{-1} X'Y

When p = 1,

\hat{\beta}_1 = \frac{\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^n (X_i - \bar{X})^2} = \frac{Cov(X, Y)}{Var(X)}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X},

where \bar{Y} = \frac{1}{n} \sum_{i=1}^n Y_i and \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i.
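As a quick numerical check of the p = 1 formulas, a minimal NumPy sketch (the data are invented, for illustration only):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])      # invented regressor values
y = np.array([2.1, 3.9, 6.2, 7.8])      # invented responses

beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()
print(beta0, beta1)                      # agrees with np.polyfit(x, y, 1)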
SLIDE 13
Model: Y = \beta_0 + \beta_1 X_1 + \ldots + \beta_p X_p + e
Or, write as

Y - \bar{Y} = \beta_1 (X_1 - \bar{X}_1) + \ldots + \beta_p (X_p - \bar{X}_p) + e

where

\bar{X}_j = \frac{1}{n} \sum_{i=1}^n X_{ij}, \quad j = 1, \ldots, p.

Then,

\beta_0 \equiv \bar{Y} - (\beta_1 \bar{X}_1 + \ldots + \beta_p \bar{X}_p)
SLIDE 14
For the centered model Y - \bar{Y} = \beta_1 (X_1 - \bar{X}_1) + \ldots + \beta_p (X_p - \bar{X}_p) + e, the least squares estimator of \beta is

\hat{\beta} = \big[ (X - \bar{X})'(X - \bar{X}) \big]^{-1} (X - \bar{X})'(Y - \bar{Y})

where

(X - \bar{X})'(X - \bar{X}) = \begin{pmatrix} \Sigma (X_1 - \bar{X}_1)^2 & \cdots & \Sigma (X_1 - \bar{X}_1)(X_p - \bar{X}_p) \\ \vdots & & \vdots \\ \Sigma (X_p - \bar{X}_p)(X_1 - \bar{X}_1) & \cdots & \Sigma (X_p - \bar{X}_p)^2 \end{pmatrix} = \Big( \sum_i (X_{i j_1} - \bar{X}_{j_1})(X_{i j_2} - \bar{X}_{j_2}) \Big), \quad j_1, j_2 = 1, \ldots, p,

(X - \bar{X})'(Y - \bar{Y}) = \Big( \sum_i (X_{ij} - \bar{X}_j)(Y_i - \bar{Y}) \Big), \quad j = 1, \ldots, p.
SLIDE 15
Interval-valued data:

Y_{uj} = [a_{uj}, b_{uj}], \quad j = 1, \ldots, p, \quad u \in E = \{w_1, \ldots, w_u, \ldots, w_m\}

Bertrand and Goupil (2000): Symbolic sample mean is

\bar{Y}_j = \frac{1}{2m} \sum_{u \in E} (b_{uj} + a_{uj}),

Symbolic sample variance is

S_j^2 = \frac{1}{3m} \sum_{u \in E} (b_{uj}^2 + b_{uj} a_{uj} + a_{uj}^2) - \frac{1}{4m^2} \Big[ \sum_{u \in E} (b_{uj} + a_{uj}) \Big]^2

Notice, e.g., m = 1, Y = Weight:
Y_1 = [132, 138] → \bar{Y} = 135, S_1^2 = 3
Y_2 = [129, 141] → \bar{Y} = 135, S_2^2 = 12
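A minimal Python sketch of these two statistics (the function names are mine); it reproduces the two weight intervals above:

def interval_mean(intervals):
    # Bertrand-Goupil symbolic sample mean: (1/2m) * sum(a + b).
    m = len(intervals)
    return sum(a + b for a, b in intervals) / (2 * m)

def interval_variance(intervals):
    # Bertrand-Goupil symbolic sample variance.
    m = len(intervals)
    return (sum(b * b + a * b + a * a for a, b in intervals) / (3 * m)
            - interval_mean(intervals) ** 2)

print(interval_mean([(132, 138)]), interval_variance([(132, 138)]))   # 135.0 3.0
print(interval_mean([(129, 141)]), interval_variance([(129, 141)]))   # 135.0 12.0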
SLIDE 16
Can rewrite

S_j^2 = \frac{1}{3m} \sum_{u \in E} \big[ (a_{uj} - \bar{Y}_j)^2 + (a_{uj} - \bar{Y}_j)(b_{uj} - \bar{Y}_j) + (b_{uj} - \bar{Y}_j)^2 \big]

Then, by analogy, for interval-valued variables Y_1 and Y_2, the empirical covariance function Cov(Y_1, Y_2) is

Cov(Y_1, Y_2) = \frac{1}{3m} \sum_{u \in E} G_{u1} G_{u2} \big[ Q_{u1} Q_{u2} \big]^{1/2}

where, for j = 1, 2,

Q_{uj} = (a_{uj} - \bar{Y}_j)^2 + (a_{uj} - \bar{Y}_j)(b_{uj} - \bar{Y}_j) + (b_{uj} - \bar{Y}_j)^2,
G_{uj} = -1 \text{ if } \bar{Y}_{uj} \le \bar{Y}_j, \; +1 \text{ if } \bar{Y}_{uj} > \bar{Y}_j, \qquad \bar{Y}_{uj} = (a_{uj} + b_{uj})/2.

Notice, special cases:
(i) Cov(Y_1, Y_1) \equiv S_1^2
(ii) If a_{uj} = b_{uj} = y_{uj} for all u, i.e., classical data,

Cov(Y_1, Y_2) = \frac{1}{m} \sum_u (y_{u1} - \bar{Y}_1)(y_{u2} - \bar{Y}_2)
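A hedged sketch of this covariance (function name mine), written directly from the G, Q definitions above; applied to a variable against itself it returns the sample variance, as in special case (i):

def interval_cov(y1, y2):
    # Empirical covariance: (1/3m) * sum of G_u1 * G_u2 * sqrt(Q_u1 * Q_u2).
    m = len(y1)
    ybar1 = sum(a + b for a, b in y1) / (2 * m)
    ybar2 = sum(a + b for a, b in y2) / (2 * m)
    total = 0.0
    for (a1, b1), (a2, b2) in zip(y1, y2):
        q = g = 1.0
        for a, b, yb in ((a1, b1, ybar1), (a2, b2, ybar2)):
            q *= (a - yb) ** 2 + (a - yb) * (b - yb) + (b - yb) ** 2
            g *= -1.0 if (a + b) / 2 <= yb else 1.0
        total += g * q ** 0.5
    return total / (3 * m)

y = [(44, 68), (60, 72), (56, 90)]       # invented intervals for the check
print(interval_cov(y, y))                # equals the sample variance S^2 of y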
SLIDE 17
Back to Bertrand and Goupil (2000). Sample variance is

S_j^2 = \frac{1}{3m} \sum_{u \in E} (b_{uj}^2 + b_{uj} a_{uj} + a_{uj}^2) - \frac{1}{4m^2} \Big[ \sum_{u \in E} (a_{uj} + b_{uj}) \Big]^2

This is a total variance.
Take Total Sum of Squares = Total SS_j = m S_j^2.
Then, we can show

Total SS_j = Within Objects SS_j + Between Objects SS_j

where:
SLIDE 18
Between Objects SS_j = \sum_{u \in E} \big[ (a_{uj} + b_{uj})/2 - \bar{Y}_j \big]^2

with

\bar{Y}_{uj} = (a_{uj} + b_{uj})/2, \qquad \bar{Y}_j = \frac{1}{2m} \sum_{u \in E} (a_{uj} + b_{uj}).

Within Objects SS_j = \frac{1}{3} \sum_{u \in E} \big[ (a_{uj} - \bar{Y}_{uj})^2 + (a_{uj} - \bar{Y}_{uj})(b_{uj} - \bar{Y}_{uj}) + (b_{uj} - \bar{Y}_{uj})^2 \big]

Classical data: a_{uj} = b_{uj} = Y_{uj} → Within Objects SS_j = 0.

Recall Total SS_j = m S_j^2 = \frac{1}{3} \sum_{u \in E} \big[ (a_{uj} - \bar{Y}_j)^2 + (a_{uj} - \bar{Y}_j)(b_{uj} - \bar{Y}_j) + (b_{uj} - \bar{Y}_j)^2 \big].
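A small numerical check of the decomposition (the intervals are invented for illustration); note that the Within term simplifies to Σ(b − a)²/12, the Uniform(a, b) variance summed over objects:

intervals = [(44, 68), (60, 72), (56, 90), (70, 112)]   # invented data
m = len(intervals)
ybar = sum(a + b for a, b in intervals) / (2 * m)

# Within SS reduces to sum (b - a)^2 / 12: the Uniform variance per object.
within = sum((b - a) ** 2 for a, b in intervals) / 12
between = sum(((a + b) / 2 - ybar) ** 2 for a, b in intervals)
total = sum((a - ybar) ** 2 + (a - ybar) * (b - ybar) + (b - ybar) ** 2
            for a, b in intervals) / 3                   # Total SS_j = m * S_j^2

print(abs(total - (within + between)) < 1e-9)            # True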
SLIDE 19
So, for Y_j, we have the Sum of Squares SS decomposition

Total SS_j = Within Objects SS_j + Between Objects SS_j.

Likewise, for a pair (Y_i, Y_j), we have the Sum of Products SP decomposition

Total SP_{ij} = Within Objects SP_{ij} + Between Objects SP_{ij}.
SLIDE 20
Can rewrite (as on Slide 16)

S_j^2 = \frac{1}{3m} \sum_{u \in E} \big[ (a_{uj} - \bar{Y}_j)^2 + (a_{uj} - \bar{Y}_j)(b_{uj} - \bar{Y}_j) + (b_{uj} - \bar{Y}_j)^2 \big]

and, by analogy, for interval-valued variables Y_1 and Y_2,

Cov(Y_1, Y_2) = \frac{1}{3m} \sum_{u \in E} G_{u1} G_{u2} \big[ Q_{u1} Q_{u2} \big]^{1/2}

with Q_{uj} and G_{uj} as defined on Slide 16.
SLIDE 21
In this covariance, the (Total) SP part can be replaced by

Total SP = \frac{1}{6} \sum_u \big[ 2(a_u - \bar{Y}_1)(c_u - \bar{Y}_2) + (a_u - \bar{Y}_1)(d_u - \bar{Y}_2) + (b_u - \bar{Y}_1)(c_u - \bar{Y}_2) + 2(b_u - \bar{Y}_1)(d_u - \bar{Y}_2) \big]

where Y_{u1} = [a_u, b_u] and Y_{u2} = [c_u, d_u].
SLIDE 22
How is this obtained? Recall that for a Uniform distribution, Y ~ U(a, b), Var(Y) = (b - a)^2 / 12.
By analogy, we can show, for u = 1, \ldots, m observations with Y_{u1} = [a_u, b_u] and Y_{u2} = [c_u, d_u],

Within SP = \frac{1}{12} \sum_{u=1}^m (a_u - b_u)(c_u - d_u)

Between SP = \sum_{u=1}^m \Big( \frac{a_u + b_u}{2} - \bar{Y}_1 \Big) \Big( \frac{c_u + d_u}{2} - \bar{Y}_2 \Big)

where

\bar{Y}_1 = \frac{1}{m} \sum_{u=1}^m \frac{a_u + b_u}{2}, \qquad \bar{Y}_2 = \frac{1}{m} \sum_{u=1}^m \frac{c_u + d_u}{2}.
SLIDE 23
Hence, from

Total SP = Within SP + Between SP
= \frac{1}{12} \sum_{u=1}^m (a_u - b_u)(c_u - d_u) + \sum_{u=1}^m \Big( \frac{a_u + b_u}{2} - \bar{Y}_1 \Big) \Big( \frac{c_u + d_u}{2} - \bar{Y}_2 \Big)

we obtain

Total SP = \frac{1}{6} \sum_{u=1}^m \big[ 2(a_u - \bar{Y}_1)(c_u - \bar{Y}_2) + (a_u - \bar{Y}_1)(d_u - \bar{Y}_2) + (b_u - \bar{Y}_1)(c_u - \bar{Y}_2) + 2(b_u - \bar{Y}_1)(d_u - \bar{Y}_2) \big].
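The same check for the Sum of Products decomposition, again on invented intervals:

y1 = [(44, 68), (60, 72), (56, 90)]       # invented [a_u, b_u]
y2 = [(90, 110), (90, 130), (140, 180)]   # invented [c_u, d_u]
m = len(y1)
y1bar = sum(a + b for a, b in y1) / (2 * m)
y2bar = sum(c + d for c, d in y2) / (2 * m)

within = sum((a - b) * (c - d) for (a, b), (c, d) in zip(y1, y2)) / 12
between = sum(((a + b) / 2 - y1bar) * ((c + d) / 2 - y2bar)
              for (a, b), (c, d) in zip(y1, y2))
total = sum(2 * (a - y1bar) * (c - y2bar) + (a - y1bar) * (d - y2bar)
            + (b - y1bar) * (c - y2bar) + 2 * (b - y1bar) * (d - y2bar)
            for (a, b), (c, d) in zip(y1, y2)) / 6

print(abs(total - (within + between)) < 1e-9)             # True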
SLIDE 24
      Y            X1            X2
u     Pulse        Systolic      Diastolic
      Rate         Pressure      Pressure
1     [44, 68]     [90, 110]     [50, 70]
2     [60, 72]     [90, 130]     [70, 90]
3     [56, 90]     [140, 180]    [90, 100]
4     [70, 112]    [110, 142]    [80, 108]
5     [54, 72]     [90, 100]     [50, 70]
6     [70, 100]    [134, 142]    [80, 110]
7     [72, 100]    [130, 160]    [76, 90]
8     [76, 98]     [110, 190]    [70, 110]
9     [86, 96]     [138, 180]    [90, 110]
10    [86, 100]    [110, 150]    [78, 100]
11    [63, 75]     [60, 100]     [140, 150]

Rule: X2 = Diastolic Pressure < Systolic Pressure = X1
SLIDE 25
SLIDE 26
For Y = Pulse Rate, X1 = Systolic Pressure:
\bar{Y} = 79.1, \bar{X} = 131.5
Std Dev(Y) = 14.692, Std Dev(X1) = 26.013
Cov(Y, X1) = 277.217, ρ(Y, X1) = 0.725
The regression equation becomes

Y = 25.228 + 0.410 X1
SLIDE 27
Prediction
For Y = Pulse Rate, X1 = Systolic Pressure, with Y = 25.228 + 0.410 X1:

\hat{Y}_u = [\hat{a}_{u1}, \hat{b}_{u1}]

with

\hat{a}_{u1} = 25.228 + 0.410 a_{u2}, \qquad \hat{b}_{u1} = 25.228 + 0.410 b_{u2}.
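A minimal sketch applying this interval prediction to a few rows of the blood-pressure data (the printed values can differ from the slide's table in the last decimals because the slope 0.410 is rounded):

b0, b1 = 25.228, 0.410                   # coefficients from the fitted line

rows = [((44, 68), (90, 100)),           # (pulse [a_u, b_u], systolic [a_u2, b_u2])
        ((60, 72), (90, 130)),
        ((54, 72), (90, 100))]

for (a, b), (a2, b2) in rows:
    a_hat, b_hat = b0 + b1 * a2, b0 + b1 * b2
    print(f"predicted [{a_hat:.3f}, {b_hat:.3f}], "
          f"residual [{a - a_hat:.3f}, {b - b_hat:.3f}]")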
SLIDE 28
Symbolic Prediction Equation
SLIDE 29
Symbolic Prediction Intervals
SLIDE 30
Symbolic Prediction Intervals and Equation
SLIDE 31
Original Intervals: dotted (……); Prediction Intervals: dashed (-------)
SLIDE 32
Data Intervals: dotted (…….); Prediction Intervals: dashed (------)
SLIDE 33
Predicted Pulse Rates and Residuals
Y_u = Pulse Rate = [a_u, b_u]; \hat{Y}_u = Predicted Pulse Rate = [\hat{a}_u, \hat{b}_u]; Residual = [Res_a, Res_b]

u     Observed Y     Observed X1    Predicted \hat{Y}_u    Residuals
1     [44, 68]       [90, 100]      [62.099, 66.195]       [-18.099, 1.805]
2     [60, 72]       [90, 130]      [62.099, 78.485]       [-2.099, -6.485]
3     [56, 90]       [140, 180]     [82.582, 98.969]       [-26.582, -8.969]
4     [70, 112]      [110, 142]     [70.292, 83.402]       [-0.292, 28.599]
5     [54, 72]       [90, 100]      [62.099, 66.195]       [-8.099, 5.805]
6     [70, 100]      [130, 160]     [78.486, 90.776]       [-8.486, 9.224]
7     [72, 100]      [130, 160]     [78.486, 90.776]       [-6.486, 9.224]
8     [76, 98]       [110, 190]     [70.292, 103.066]      [5.708, -5.066]
9     [86, 96]       [138, 180]     [81.763, 98.969]       [4.237, -2.969]
10    [86, 100]      [110, 150]     [70.292, 86.679]       [15.708, 13.321]
SLIDE 34
Sum of Residuals for Symbolic Fit
Sum of Min Residuals: Σ_u Res_{a,u} = -44.488; Sum of Max Residuals: Σ_u Res_{b,u} = 44.488
Sum of Squared Residuals for Symbolic Fit
Sum of Min Squared Residuals = 1515.592 Sum of Max Squared Residuals = 1359.434
SLIDE 35
Classical Regression on Midpoints

Y^c_u = (a_{u1} + b_{u1})/2, \qquad X^c_{ju} = (a_{uj} + b_{uj})/2, \quad j = 1, 2

→ Y^c = 28.322 + 0.386 X1

Prediction: \hat{Y}^c_u = [\hat{a}^c_u, \hat{b}^c_u] with

\hat{a}^c_u = 28.322 + 0.386 a_{u2}, \qquad \hat{b}^c_u = 28.322 + 0.386 b_{u2}.
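A hedged sketch of the midpoint fit, on an arbitrary subset of rows (my own selection, so the coefficients will not match exactly; with all ten rule-respecting rows the fit should be close to Y^c = 28.322 + 0.386 X1):

import numpy as np

pulse = [(60, 72), (70, 112), (54, 72)]      # a few rows, for illustration
systolic = [(90, 130), (110, 142), (90, 100)]

yc = np.array([(a + b) / 2 for a, b in pulse])      # Y midpoints
xc = np.array([(a + b) / 2 for a, b in systolic])   # X1 midpoints

b1, b0 = np.polyfit(xc, yc, 1)        # classical fit to the midpoints
for a2, b2 in systolic:
    print([b0 + b1 * a2, b0 + b1 * b2])             # predicted interval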
SLIDE 36
Classical Regression through Midpoints
SLIDE 37
Symbolic Regression ---- Classical Regression ----
SLIDE 38
Comparison of Regression Fits
Sum of Residuals for Symbolic Fit:
  Sum of Min Residuals = -44.488; Sum of Max Residuals = 44.488
Sum of Squared Residuals for Symbolic Fit:
  Sum of Min Squared Residuals = 1515.592; Sum of Max Squared Residuals = 1359.434
Sum of Residuals for Classical Fit:
  Sum of Min Residuals = -48.652; Sum of Max Residuals = 48.652
Sum of Squared Residuals for Classical Fit:
  Sum of Min Squared Residuals = 1544.889; Sum of Max Squared Residuals = 1364.639
SLIDE 39
Centers and Range Regression
De Carvalho, Lima Neto, Tenorio, Freire, ... (2004, 2005, …)
Midpoint: Y^c = (a + b)/2, X^c = (c + d)/2
Range: Y^r = (b - a)/2, X^r = (d - c)/2

\hat{Y}^c = 28.322 + 0.386 X^c
\hat{Y}^r = 25.444 - 0.05875 X^r
SLIDE 40
Centers and Range Regression
De Carvalho, Lima Neto, Tenorio, Freire, ... (2004, 2005, …)
Midpoint: Y^c = (a + b)/2, X^c = (c + d)/2; Range: Y^r = (b - a)/2, X^r = (d - c)/2

Single (separate fits):
\hat{Y}^c = 28.322 + 0.386 X^c, \qquad \hat{Y}^r = 25.444 - 0.05875 X^r

Multiple (center and range as joint regressors):
\hat{Y}^c = 31.788 + 0.3300 X^c_1 + 0.111 X^r_1
\hat{Y}^r = 7.866 + 0.170 X^c_1 - 0.194 X^r_1
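A minimal sketch of the centers-and-range idea (my own function names; this is the "single" variant with separate center and range fits, not the authors' exact estimator):

import numpy as np

def fit_center_range(y_ints, x_ints):
    # Separate classical fits: one to midpoints, one to half-ranges.
    yc = np.array([(a + b) / 2 for a, b in y_ints])
    yr = np.array([(b - a) / 2 for a, b in y_ints])
    xc = np.array([(a + b) / 2 for a, b in x_ints])
    xr = np.array([(b - a) / 2 for a, b in x_ints])
    return np.polyfit(xc, yc, 1), np.polyfit(xr, yr, 1)

def predict_interval(x_int, c_fit, r_fit):
    # Rebuild the predicted interval as [Yc - Yr, Yc + Yr].
    xc = (x_int[0] + x_int[1]) / 2
    xr = (x_int[1] - x_int[0]) / 2
    yc, yr = np.polyval(c_fit, xc), np.polyval(r_fit, xr)
    return (yc - yr, yc + yr)

pulse = [(60, 72), (56, 90), (70, 112), (54, 72), (72, 100)]
systolic = [(90, 130), (140, 180), (110, 142), (90, 100), (130, 160)]
c_fit, r_fit = fit_center_range(pulse, systolic)
print(predict_interval((90, 130), c_fit, r_fit))

One caveat worth noting: nothing in this sketch keeps the fitted range positive; later variants of the method add constraints on the range regression for that reason.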
SLIDE 41
Centers and Range Regression -- Predictions

Obs   Y            Single [\hat{Y}_a, \hat{Y}_b]   Multiple [\hat{Y}_a, \hat{Y}_b]
1     [44, 68]     [52.572, 77.439]                [53.195, 75.299]
2     [60, 72]     [59.230, 82.365]                [63.089, 81.937]
3     [56, 90]     [78.537, 101.672]               [75.334, 102.695]
4     [70, 112]    [65.178, 88.774]                [65.349, 88.470]
5     [54, 72]     [52.572, 77.439]                [53.195, 75.299]
6     [70, 100]    [72.457, 96.168]                [69.587, 96.331]
7     [72, 100]    [72.457, 96.168]                [69.587, 96.331]
8     [76, 98]     [75.831, 96.655]                [81.180, 99.092]
9     [86, 96]     [78.209, 101.228]               [75.504, 102.308]
10    [86, 100]    [66.953, 90.087]                [67.987, 90.241]
SLIDE 42
Symbolic Principal Components -- BATS
Y1 = Head, Y2 = Tail, Y3 = Height, Y4 = Forearm

Obs   [Y1a, Y1b]   [Y2a, Y2b]   [Y3a, Y3b]   [Y4a, Y4b]
1     [33, 52]     [26, 33]     [4, 7]       [27, 32]
2     [38, 50]     [30, 40]     [7, 8]       [32, 37]
3     [43, 48]     [34, 39]     [6, 7]       [31, 38]
4     [44, 48]     [34, 44]     [7, 8]       [31, 36]
5     [41, 51]     [30, 39]     [8, 11]      [33, 41]
6     [40, 45]     [39, 44]     [9, 9]       [36, 42]
7     [45, 53]     [35, 38]     [10, 12]     [39, 44]
8     [44, 58]     [41, 54]     [6, 8]       [35, 41]
9     [47, 53]     [43, 53]     [7, 9]       [37, 41]
10    [50, 69]     [30, 43]     [11, 13]     [51, 61]
11    [65, 80]     [48, 60]     [12, 16]     [55, 68]
12    [82, 87]     [46, 57]     [11, 12]     [58, 63]
SLIDE 43
Symbolic Principal Components -- BATS
Y1 = Head, Y2 = Tail, Y3 = Height, Y4 = Forearm

Obs   [PC1a, PC1b]        [PC2a, PC2b]
1     [45.276, 62.471]    [11.935, 22.006]
2     [53.826, 67.716]    [13.788, 24.556]
3     [57.185, 66.275]    [17.708, 24.377]
4     [58.198, 67.908]    [17.736, 27.816]
5     [56.421, 71.418]    [11.433, 23.055]
6     [61.999, 70.061]    [19.368, 25.247]
7     [64.941, 74.123]    [14.485, 19.875]
8     [62.968, 80.264]    [22.096, 36.217]
9     [66.990, 77.698]    [23.402, 33.956]
10    [72.282, 94.342]    [6.237, 21.763]
11    [90.753, 112.874]   [18.529, 34.738]
12    [99.870, 110.547]   [21.800, 32.763]
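For the PC intervals, a hedged sketch of one common construction (PCA fitted on the interval midpoints, then each observation's 2^p vertices projected and the min and max taken per component); this is illustrative only and may not be the exact method behind the table, whose scores are on a different origin and scale:

import numpy as np
from itertools import product

# First three bats from the table on Slide 42; shape (n, p, 2).
X = np.array([[[33, 52], [26, 33], [4, 7], [27, 32]],
              [[38, 50], [30, 40], [7, 8], [32, 37]],
              [[43, 48], [34, 39], [6, 7], [31, 38]]], dtype=float)

centers = X.mean(axis=2)                   # interval midpoints, shape (n, p)
mu = centers.mean(axis=0)
_, _, Vt = np.linalg.svd(centers - mu)     # rows of Vt are the PC loadings

for obs in X:                              # project all 2^p vertices of each obs
    scores = [(np.asarray(v) - mu) @ Vt[0] for v in product(*obs)]
    print([min(scores), max(scores)])      # PC1 interval for this bat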
SLIDE 44
Symbolic Principal Component Analysis -- BATS