A Plot for Visualizing Multivariate Data
Rida E. A. Moustafa George Mason University ADM Group,AAL rmoustaf@galaxy.gmu.edu rmustafa@aalcpas.com
Talk Outline
The Theory of MV-plot.
Detecting Linear Structures with MV-plot.
Detecting Non-Linear Structures with MV-plot.
Comparisons with other methods and application on real data.
Given an observation x = (x_1, x_2, …, x_d), we define m and v as follows:

$$m = f(x) = \frac{1}{d}\sum_{j=1}^{d} x_j, \qquad v = g(x, f(x)) = \frac{1}{d}\sum_{j=1}^{d} \left| x_j - f(x) \right|$$

Computing m and v for every observation produces a vector of m values and a vector of v values. What is the relationship between m and v?
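As a minimal sketch of the mapping, the following assumes that m is the mean of an observation's coordinates and v is the mean absolute deviation from m. That is one reading of the slide's definitions; the exact normalization used in the talk may differ. The names `mv_point` and `mv_transform` and the sample rows are hypothetical.

```python
from statistics import fmean


def mv_point(x):
    """Map one observation (a sequence of d numbers) to an (m, v) pair.

    Assumption: m is the coordinate mean and v is the mean absolute
    deviation from m; the slide's exact normalization may differ.
    """
    m = fmean(x)
    v = fmean([abs(xj - m) for xj in x])
    return m, v


def mv_transform(data):
    """Apply mv_point to every observation (row) of a data set."""
    return [mv_point(row) for row in data]


# Illustrative data: three observations with d = 3 coordinates each.
data = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0], [0.0, 10.0, 5.0]]
points = mv_transform(data)

for row, (m, v) in zip(data, points):
    print(row, "->", (round(m, 4), round(v, 4)))
```

Each observation, whatever its dimension d, becomes a single point in the plane, so the whole data set can be drawn as one 2D scatter plot of v against m.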
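For d = 2, suppose the data lie on a line, x_{i2} = w_0 + w_1 x_{i1}. Then

$$m_i = \frac{1}{2}\sum_{j=1}^{2} x_{ij} = \frac{1}{2}\left[(1 + w_1)\,x_{i1} + w_0\right], \qquad v_i = \frac{1}{2}\sum_{j=1}^{2} \left| x_{ij} - m_i \right| = \frac{1}{2}\left|(1 - w_1)\,x_{i1} - w_0\right|$$

so m_i = a_1 x_{i1} + a_2 and, wherever the sign of (1 − w_1) x_{i1} − w_0 does not change, v_i = a_3 x_{i1} + a_4.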
If the data are linear in the original space, they will be linear in the MV-space.
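The claim can be checked numerically. The sketch below assumes m is the coordinate mean and v the mean absolute deviation (one reading of the slide's definitions), takes d = 2, and uses hypothetical coefficients `w0` and `w1` chosen so that x_{i1} − x_{i2} keeps the same sign over the sampled range. It then tests whether the (m, v) images are collinear.

```python
from statistics import fmean


def mv_point(x):
    """(m, v) for one observation; assumes mean and mean absolute deviation."""
    m = fmean(x)
    v = fmean([abs(xj - m) for xj in x])
    return m, v


# Hypothetical line in the original 2D space: x2 = w0 + w1 * x1.
w0, w1 = 1.0, 2.0
line = [(float(x1), w0 + w1 * x1) for x1 in range(10)]

# Image of the line in the MV-space.
mv = [mv_point(p) for p in line]

# Collinearity check: the cross product of (P1 - P0) with (P - P0)
# vanishes for every image point P when all points lie on one line.
(m0, v0), (m1, v1) = mv[0], mv[1]
max_cross = max(
    abs((m1 - m0) * (v - v0) - (v1 - v0) * (m - m0)) for m, v in mv
)
print("largest deviation from collinearity:", max_cross)
```

If the sampled range crosses the point where x_{i1} = x_{i2}, the absolute value in v changes sign and the image becomes two line segments joined at v = 0 rather than a single line.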
[Equations garbled in extraction: expressions built from sums over j of weighted coordinates x_ij, with upper limit d − 1.]
MV-plot can detect nonlinear structure.
$$(x, \cos x) \;\to\; m = x + \cos x, \quad v = \left| x - \cos x \right|$$

$$(x, \sin x) \;\to\; m = x + \sin x, \quad v = \left| x - \sin x \right|$$
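To see what these two examples look like in the MV-space, the sketch below takes the slide's two-dimensional expressions at face value (m as the sum of the two coordinates, v as the absolute value of their difference, with no normalizing constant) and compares the images of the cosine and sine curves with the image of a straight line. The helper names, the sampling grid, and the comparison line x2 = 2 x1 + 1 are assumptions made for illustration.

```python
import math


def mv_pair(x1, x2):
    """(m, v) for a 2D point, using the unnormalized forms shown on the slide."""
    return x1 + x2, abs(x1 - x2)


def mv_curve(func, xs):
    """Image in the MV-space of the curve (x, func(x)) sampled at xs."""
    return [mv_pair(x, func(x)) for x in xs]


def is_collinear(points, tol=1e-9):
    """True when every point lies on the line through the first two points."""
    (m0, v0), (m1, v1) = points[0], points[1]
    return all(
        abs((m1 - m0) * (v - v0) - (v1 - v0) * (m - m0)) <= tol
        for m, v in points[2:]
    )


xs = [k * math.pi / 8 for k in range(17)]  # 17 samples on [0, 2*pi]

cos_image = mv_curve(math.cos, xs)
sin_image = mv_curve(math.sin, xs)
line_image = mv_curve(lambda x: 2.0 * x + 1.0, xs)  # hypothetical straight line

print("line stays a line:", is_collinear(line_image))
print("cosine curve stays curved:", not is_collinear(cos_image))
print("sine curve stays curved:", not is_collinear(sin_image))
```

The straight line maps to collinear points, while the two trigonometric curves do not, which is the sense in which the plot separates linear from nonlinear structure in this small example.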
Case I: [equation garbled in extraction: sums over j = 1, …, d of squared terms in x_ij, together with a radius term R^2]
Case II: [equation garbled in extraction: the same form as Case I, with constants c_j entering alongside each x_ij]
Fisher’s IRIS data (150 × 4): 3 classes of 50 points each
Process control data (600 × 60): 6 classes of 100 points each
Pollen data (3,848 × 5) (Wegman’s data): 2 classes (linear and nonlinear)
Multidimensional Scaling, Fisher Discriminant Analysis, Principal Component Analysis
Other methods require more storage and more computing time. Even if they work, we expect poor results on this particular data (Wegman 2002).
Mixed linear and nonlinear structures.
17 + 16 + 18 + 17 + 14 + 16 = 98 linear points, 3,750 nonlinear points
The MV-algorithm can discover linear and nonlinear structures.
The MV-algorithm can discover symmetric data.
The MV-algorithm deals with large multivariate data.