

slide-1
SLIDE 1

Chapter 14 Reduce Items and Attributes

Vis/Visual Analytics, Chap 14 Reduce CGGM Lab., CS Dept., NCTU Jung Hong Chuang 1

slide-2
SLIDE 2

The Big Picture

  • Datasets are large and complex
    – Showing everything in a single view leads to visual clutter
    – There are five options for handling complexity
      • Change view over time
        – The most obvious, most popular, and most flexible option
      • Derive new data (chap. 4)
      • Facet into multiple views (chap. 13)
      • Reduce items and attributes (chap. 14)
      • Embed: focus + context in a single view (chap. 15)
    – The options are not mutually exclusive, and various combinations of them are common

slide-3
SLIDE 3

The Big Picture

  • Reduce what is shown at once within a view
    – Filter: eliminate elements
      • Items or attributes
    – Aggregate: combine many elements together
      • Items or attributes

slide-4
SLIDE 4

The Big Picture


slide-5
SLIDE 5

Why Reduce?

  • Reducing the amount of data shown in a view
    – An obvious way to reduce its visual complexity
    – The challenge is to minimize the chance that information important to the task is hidden
  • Filtering simply eliminates elements
  • Aggregation
    – Creates a single new element that stands in for the multiple elements it replaces

slide-6
SLIDE 6

Why Reduce?

  • Trade-off between filtering and aggregation
    – Filtering is very straightforward for users to understand and to compute
      • But people tend to have an “out of sight, out of mind” mentality about missing information
    – Aggregation can be somewhat safer cognitively
      • The stand-in element is designed to convey information about the entire set of elements it replaces
      • It cannot convey all omitted information. The challenge is how and what to summarize in a way that matches well with the dataset and task

slide-7
SLIDE 7

Filter

  • Applied to both items and attributes
    – The challenge comes in designing a vis system where filtering can be used to explore a dataset effectively
      • To filter, users need to select a “range”; the problem is that users do not know the dataset yet
  • In an interactive vis context, filtering is often accomplished through dynamic queries
    – There is a tightly coupled loop between visual encoding and interaction, so that users can immediately see the result of the intervention
    – Filtering is often controlled with standard GUI widgets
      » Sliders, buttons, comboboxes, text fields

slide-8
SLIDE 8

Filter Item Filtering

  • To eliminate items based on their values w.r.t. specific attributes
    – Fewer items are shown; the number of attributes does not change
  • Ex: FilmFinder
    – Data
      • A movie database: a table with 9 value attributes
        – Genre, year made, title, actors, actresses, directors, rating, popularity, length
    – Encoding
      • Interactive scatterplot
        – Items are movies, color coded by genre
        – Axes: year made vs. popularity

slide-9
SLIDE 9

Filter Item Filtering

  • Ex: FilmFinder
    – Filtering
      • Dual slider: select both a minimum and a maximum
      • Several alpha sliders: tuned for selection with text strings
    – Display
      • Multiform overview-detail views

slide-10
SLIDE 10

Filter Item Filtering

FilmFinder features tightly coupled interactive filtering, where the result of moving sliders and pressing buttons is immediately reflected in the visual encoding. (a) Exploration begins with an overview of all movies in the dataset. (b) Moving the actor slider to select Sean Connery filters out most of the other movies, leaving enough room to draw labels. (c) Clicking on the mark representing a movie brings up a detail view.
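
The item filtering described for FilmFinder can be sketched in a few lines. This is only an illustration, not FilmFinder's implementation: the `Movie` record, the sample data, and the `filter_items` helper are all hypothetical. The year range plays the role of the dual slider and the actor name plays the role of an alpha slider.

```python
from dataclasses import dataclass


@dataclass
class Movie:
    title: str
    year: int
    popularity: float
    actors: tuple


# Hypothetical sample records, invented purely for illustration.
MOVIES = [
    Movie("Film A", 1964, 7.5, ("Actor X", "Actor Y")),
    Movie("Film B", 1987, 6.1, ("Actor Y",)),
    Movie("Film C", 1990, 8.0, ("Actor X",)),
    Movie("Film D", 1975, 4.2, ("Actor Z",)),
]


def filter_items(movies, year_range, actor=None):
    """Keep movies whose year lies in [lo, hi] and, optionally, that feature `actor`.

    Only items are removed; every surviving item keeps all of its attributes.
    """
    lo, hi = year_range
    return [
        m for m in movies
        if lo <= m.year <= hi and (actor is None or actor in m.actors)
    ]


# In a dynamic query, this call would be re-run on every slider movement
# and the scatterplot redrawn from the result.
visible = filter_items(MOVIES, (1960, 1989), actor="Actor Y")
print([m.title for m in visible])
```

Because the filter is a pure function of the widget state, re-running it on each interaction gives the tightly coupled loop the slides describe.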

slide-11
SLIDE 11

Filter Attribute Filtering

  • To eliminate attributes
    – Shows the same number of items, but fewer attributes for each item
    – Often used in conjunction with attribute ordering
      • If attributes can be ordered according to a derived attribute that measures the similarity between them, all of the high-scoring or low-scoring attributes can be filtered out
  • Item filtering and attribute filtering can be combined

slide-12
SLIDE 12

Filter Attribute Filtering

Ex: DOSFA

  • Attribute filtering with attribute ordering
  • Dataset
    – 215 attributes, representing word counts
    – 298 points, representing documents
  • Encoding
    – Star plots
      • The original star plots are so densely packed that little structure can be seen
      • After attribute ordering and filtering, the star plots show clear patterns

slide-13
SLIDE 13

Filter Attribute Filtering


The DOSFA idiom shown on star glyphs with a medical records dataset of 215 dimensions and 298 points. (a) The full dataset is so dense that patterns cannot be seen. (b) After ordering on similarity and filtering on both similarity and importance, the star glyphs show structure.

slide-14
SLIDE 14

Filter Attribute Filtering


slide-15
SLIDE 15

Aggregate

  • A group of elements is represented by a new derived element that stands in for the entire group
    – Typically involves the use of a derived attribute
      • Simple examples: average, minimum, maximum, count, sum
        – These are rarely adequate solutions!
    – Challenge: avoid eliminating the interesting signal in the process of summarization
    – A powerful design choice, particularly when used within interactive idioms
      • Change the level of aggregation on the fly to inspect the dataset at different levels of detail

slide-16
SLIDE 16

Aggregate

Different datasets with the same mean, variance, and correlation
  • Dataset 1: OK
  • Dataset 2: nonlinear
  • Dataset 3: a single outlier leads to a misleading regression line!
  • Dataset 4: dramatically misleading!

slide-17
SLIDE 17

Aggregate Item Aggregation

  • The most straightforward use of item aggregation is within static visual encoding idioms
    – Its full power and flexibility can be harnessed by interactive idioms where the view changes dynamically
  • Examples
    – Histograms
    – Continuous scatterplots
    – Boxplot charts
    – SolarPlot
    – Hierarchical parallel coordinates

slide-18
SLIDE 18

Aggregate Item Aggregation

  • Histograms
    – Show the distribution of items within an original attribute
      • Ex.: the distribution of weights for all the cats in a neighborhood, binned into 5-pound blocks
    – Compared to bar charts
      • Histograms can be continuous
      • Histograms do not show the original table directly
        – A histogram is an aggregation idiom that shows a derived table that is more concise than the original dataset
      • The choice of bin size is crucial and tricky
        – Compute the number of bins based on the dataset characteristics
        – Or let the user control it interactively, to see how the histogram changes
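
The binning step behind a histogram can be sketched as follows. The cat weights are made-up values for illustration, and `histogram` is a hypothetical helper; the point is only that the derived table (bin start, count) is much smaller than the item list, and that changing the bin width changes the picture.

```python
from collections import Counter


def histogram(values, bin_width):
    """Return {bin_start: count} for fixed-width bins [start, start + bin_width)."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical cat weights in pounds, invented for illustration.
weights = [4.5, 7.2, 8.9, 9.5, 10.0, 11.3, 12.8, 14.1, 16.0, 21.4]

print(histogram(weights, 5))   # 5-pound blocks, as in the slide's example
print(histogram(weights, 10))  # a coarser binning of the same items
```

Calling the function again with a different `bin_width` is the computational core of the user-controlled, interactive bin-size option mentioned above.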

slide-19
SLIDE 19

Aggregate Item Aggregation


The histogram idiom aggregates an arbitrary number of items into a concise representation of their distribution.

slide-20
SLIDE 20

Aggregate Item Aggregation


slide-21
SLIDE 21

Aggregate Item Aggregation

  • Continuous scatterplots
    – Solve the occlusion problem by plotting an aggregate value rather than drawing every single item as an individual point
      » Size coding and text labels exacerbate the occlusion problem
    – Use color coding at each pixel to indicate the density of overplotting, often with transparency
      • Uses an aggregate derived attribute: overplot density
    – Ex: tornado air-flow dataset
      • Horizontal axis: magnitude of the velocity
      • Vertical axis: z-direction velocity
      • The density is shown with a log-scale sequential colormap with monotonically increasing luminance
        » Starts with dark blue at the low end, continues with reds, and ends with yellows and whites at the high end
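
A rough sketch of the derived attribute "overplot density" is given below. It is a simplification: it counts points per grid cell rather than reproducing the continuous scatterplot algorithm itself, and the function names, grid size, and sample points are assumptions made for illustration. The log scaling mirrors the log-scale colormap mentioned above.

```python
import math


def density_grid(points, x_range, y_range, nx, ny):
    """Count how many points fall into each cell of an nx-by-ny grid."""
    (x0, x1), (y0, y1) = x_range, y_range
    grid = [[0] * nx for _ in range(ny)]
    for x, y in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # ignore points outside the plotted region
        col = min(int((x - x0) / (x1 - x0) * nx), nx - 1)
        row = min(int((y - y0) / (y1 - y0) * ny), ny - 1)
        grid[row][col] += 1
    return grid


def log_scale(grid):
    """Map counts to [0, 1] via log(1 + count), ready to index a sequential colormap."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0.0] * len(row) for row in grid]
    return [[math.log1p(c) / math.log1p(peak) for c in row] for row in grid]


# Hypothetical points; three of them overplot in the same cell.
points = [(0.1, 0.1), (0.15, 0.12), (0.12, 0.18), (0.9, 0.9), (0.5, 0.5)]
grid = density_grid(points, (0.0, 1.0), (0.0, 1.0), nx=4, ny=4)
shades = log_scale(grid)
print(grid[0][0], shades[0][0])
```

Each cell value in `shades` would then be looked up in the colormap, so that dense regions stay distinguishable no matter how many items overlap.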

slide-22
SLIDE 22

Aggregate Item Aggregation


The continuous scatterplot idiom uses color to show the density at each location, solving the problem of occlusion from overplotting and allowing scalability to large datasets.

slide-23
SLIDE 23

Aggregate Item Aggregation


slide-24
SLIDE 24

Aggregate Item Aggregation

  • Boxplots
    – Show an aggregate statistical summary of all the values that occur within the distribution of a single quantitative attribute
      • Median (50% point)
      • Lower and upper quartiles (25% and 75%)
      • Upper and lower fences
        » Values beyond the fences are counted as outliers
    – Encoding of these 5 numbers
      • Glyph that relies on vertical spatial position
      • Core box stretches between the lower and upper quartiles
      • A horizontal line at the median
      • Whiskers: vertical lines that extend from the core box to the fences
      • Outliers: discrete dots
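
The five derived values of a boxplot can be computed as in the sketch below. Two choices here are assumptions, not something the slides specify: quartiles use linear interpolation between sorted values, and fences sit 1.5 times the interquartile range beyond the quartiles (a common convention). The sample data are invented.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def boxplot_summary(values, k=1.5):
    """Return the five boxplot numbers plus the list of outliers.

    Assumption: fences are placed k * IQR beyond the quartiles.
    """
    data = sorted(values)
    q1, median, q3 = (quantile(data, q) for q in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "lower_fence": lower_fence,
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Hypothetical values with one obvious outlier.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
print(summary)
```

Ten items collapse into five numbers plus one outlier dot, which is the aggregation the glyph then encodes with vertical position.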

slide-25
SLIDE 25

Aggregate Item Aggregation

  • Boxplot
    – Shows the spread, i.e., the degree of dispersion, with the extent of the box
    – Handles unimodal data
  • Vase plot
    – A variable-width variant of the boxplot
    – Uses an additional spatial dimension within the glyph by altering the width of the core box according to the density
    – Allows a visual check to see whether the distribution is instead multimodal, with multiple peaks
  • Boxplot charts
    – Feature multiple boxplots within a shared frame to contrast different attribute distributions

slide-26
SLIDE 26

Aggregate Item Aggregation

The boxplot is an idiom presenting summary statistics for the distribution of a quantitative attribute, using five derived values. These plots illustrate four kinds of distributions: normal (n), skewed (s), peaked (k), and multimodal (mm). (a) Standard box plots. (b) Vase plots, which use horizontal spatial position to show density directly.

slide-27
SLIDE 27

Aggregate Item Aggregation


slide-28
SLIDE 28

Aggregate Item Aggregation

  • Controllable aggregation
    – Interesting uses of aggregation in vis involve a mapping between individual items and the aggregated visual mark that changes on the fly
      • Simple case
        – Allow users to explicitly request aggregation and deaggregation of item sets
      • More sophisticated cases
        – Do these operations automatically as a result of higher-level interaction and navigation
  • SolarPlot
    – A radial histogram with interactively controlled aggregation
      • Changing the circle size changes the aggregation level

slide-29
SLIDE 29

Aggregate Item Aggregation

The SolarPlot circular histogram idiom provides indirect control of aggregation level by changing the circle size. (a) The small circle shows the increase in ticket sales over time. (b) Enlarging the circle shows seasonal patterns in addition to the gradual increase. (Dataset: ticket sales over time, 30 years in total)

slide-30
SLIDE 30

Aggregate Item Aggregation


slide-31
SLIDE 31

Aggregate Item Aggregation

  • Hierarchical aggregation
    – Construct derived data in the form of a hierarchical clustering of the items in the original dataset, and allow the user to interactively control the level of detail (LOD) to show
  • Hierarchical parallel coordinates
    – Compute derived data: a hierarchical clustering of the items
    – Compute statistics about each cluster
      » Number of items, mean, minimum, maximum, depth
    – A cluster is coded by a band of varying width and opacity
      • Clusters are color coded according to their proximity in the hierarchy

slide-32
SLIDE 32

Aggregate Item Aggregation


Hierarchical parallel coordinates provide multiple levels of detail. (a) The single top cluster has large extent. (b) When several clusters are shown, each has a smaller extent. (c) When many clusters are shown, the proximity-based coloring helps them remain distinguishable from each other.

slide-33
SLIDE 33

Aggregate Item Aggregation


slide-34
SLIDE 34

Aggregate Attribute Aggregation

  • Simple approach
    – Group attributes by a similarity measure
    – Synthesize the new attribute by calculating an average
  • More complex methods
    – Dimensionality reduction (DR)
      • Preserves the meaningful structure of a dataset while using fewer synthetic attributes to represent the items
        – Hinges on the assumption that there is hidden structure and significant redundancy in the dataset because the underlying latent variables could not be measured directly
      • Methods
        – Multidimensional scaling (MDS), PCA

slide-35
SLIDE 35

Aggregate Attribute Aggregation

Ex: DR for document collections

  • A document collection is transformed into a derived high-dimensional table
    – Feature vector: the word count vector
      • Very sparse
    – Table: one feature vector per item (document)
  • DR
    – Analyze how the distribution of words differs between documents, or find clusters of related documents
    – Typically shown with a scatterplot for cluster discovery

slide-36
SLIDE 36

Aggregate Attribute Aggregation


Dimensionality reduction of a large document collection using Glimmer for multidimensional scaling. The results are laid out in a single 2D scatterplot, allowing the user to verify that the conjectured clustering shown with color coding is partially supported by the spatial layout.

slide-37
SLIDE 37

Aggregate Attribute Aggregation

  • A typical analysis scenario: a chained sequence
    1. A low-dimensional table is derived using MDS
    2. The low-dimensional data is encoded as a color-coded scatterplot
    3. Annotations are produced by adding text labels to the verified clusters

slide-38
SLIDE 38

Aggregate Attribute Aggregation

A chained sequence of what–why–how analysis instances for the scenario of dimensionality reduction of document collection data.

slide-39
SLIDE 39

Aggregate Attribute Aggregation


slide-40
SLIDE 40

Aggregate How to show DR Data?

  • Reduced dim > 2
    – Scatterplot matrix
  • Reduced dim = 2
    – Scatterplot
      • The DR is designed so that these two dimensions will not be correlated
        – Different from the general scatterplot
      • Only relative distances matter
        – The absolute position is not meaningful
      • Should be used ONLY to find or verify large-scale cluster structure
        – Due to the information lost in DR, fine-grained structure is not reliable; minor differences in distances may not be meaningful!

slide-41
SLIDE 41

Aggregate More on MDS


slide-42
SLIDE 42

Aggregate More on MDS

Isomap

  • MDS and PCA do not perform well when the data points have complicated, nonlinear relationships to one another
  • Isomap can successfully handle such datasets
    – For each data point
      • Find its nearest neighbors and create a neighborhood graph
        – Connect the point to its nearest neighbors
    – Compute the geodesic distances between points in the neighborhood graph
    – Apply classical MDS
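
The first two Isomap steps listed above can be sketched in pure Python. This is a minimal illustration under stated assumptions: the neighborhood graph is made symmetric, geodesic distances come from Floyd-Warshall shortest paths, and the final classical MDS step is omitted. The helper names and the arc-shaped sample data are hypothetical.

```python
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph as a dense matrix of edge lengths."""
    n = len(points)
    graph = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = graph[j][i] = d  # connect the point to its neighbor
    return graph


def geodesic_distances(graph):
    """All-pairs shortest paths (Floyd-Warshall) over the neighborhood graph."""
    n = len(graph)
    dist = [row[:] for row in graph]
    for mid in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][mid] + dist[mid][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


# Seven points along a half circle: a curved, nonlinear 1D structure in 2D.
arc = [(math.cos(t), math.sin(t)) for t in (i * math.pi / 6 for i in range(7))]
geo = geodesic_distances(knn_graph(arc, k=2))

straight = math.dist(arc[0], arc[-1])
print(straight, geo[0][-1])
```

The geodesic distance between the two endpoints follows the arc and is therefore longer than the straight-line distance; feeding `geo` into classical MDS is what lets Isomap unroll the curve.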

slide-43
SLIDE 43

Aggregate PCA

  • The goal of PCA is to identify the most meaningful basis to re-express a dataset
    – Filter out the noise
    – Reveal hidden structure
  • Dataset X: an m×n matrix with m attributes and n samples
    – Each column is a single sample
    – The naïve basis
      • An orthonormal basis
      • Reflects the method by which we gathered the data

slide-44
SLIDE 44

Aggregate PCA

  • Change of basis
    – Is there another basis, which is a linear combination of the original basis, that best re-expresses the dataset?
      • Makes one stringent but powerful assumption: linearity
      • Linearity vastly simplifies the problem by restricting the set of potential bases
    – Let Y be another m×n matrix related to X by a linear transformation P, that is, Y = PX

slide-45
SLIDE 45

Aggregate PCA

  • The rows of P are a set of new basis vectors for expressing the columns of X
    – Each coefficient of y_i is a dot product of x_i with the corresponding row of P
    – The j-th coefficient of y_i is the projection of x_i onto p_j
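
Written out, the statement above reads as follows, assuming p_1, …, p_m denote the rows of P and x_1, …, x_n the columns of X (the slide's equation image did not survive, so this is a reconstruction from the bullet text):

```latex
Y = PX =
\begin{bmatrix} p_1 \\ \vdots \\ p_m \end{bmatrix}
\begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix},
\qquad
y_i = P x_i =
\begin{bmatrix} p_1 \cdot x_i \\ \vdots \\ p_m \cdot x_i \end{bmatrix}
```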

slide-46
SLIDE 46

Aggregate PCA

  • Finding the appropriate change of basis
    – The row vectors {p_1, p_2, …, p_m} will become the principal components of X
    – Questions arise
      • What is the best way to re-express X?
      • What is a good choice of basis P?
      • To answer these questions, we need to ask ourselves “What features would we like Y to exhibit?”
      • Data has noise and redundancy!

slide-47
SLIDE 47

Aggregate PCA

  • Variance and the goal
    – What does “best express” the data mean?
    – Noise and rotation
      • All noise is quantified relative to the signal strength
      • A common measure is the signal-to-noise ratio (SNR)
        – A high SNR (>> 1) indicates a high-precision measurement
        – A low SNR indicates very noisy data
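
The slides do not spell out the SNR formula. Assuming the usual definition as a ratio of variances, it would read:

```latex
\mathrm{SNR} = \frac{\sigma^2_{\text{signal}}}{\sigma^2_{\text{noise}}}
```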

slide-48
SLIDE 48

Aggregate PCA


Basis vector: the direction of largest variance!!

slide-49
SLIDE 49

Aggregate PCA

  • Variance and the goal
    – Redundancy
      • Reexamine the figure on the previous slide
        – We should ask whether it was really necessary to record two variables
      • See the next figure!

slide-50
SLIDE 50

Aggregate PCA


slide-51
SLIDE 51

Aggregate PCA

Variance and the goal

– Covariance matrix

  • Covariance of two sets of measurements with zero means

slide-52
SLIDE 52

Aggregate PCA

Variance and the goal

– Covariance matrix

  • Covariance of two sets of measurements with zero means


slide-53
SLIDE 53

Aggregate PCA

Variance and the goal

– Covariance matrix

  • Covariance of two sets of measurements with zero means
    – Measures the degree of the linear relationship between two variables
    – A large positive covariance value indicates positively correlated data
    – A large negative covariance value indicates negatively correlated data
    – The absolute magnitude of the covariance measures the degree of redundancy
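
The equations on these slides were images and are missing from the transcript. A reconstruction consistent with the 1/n normalization used for the covariance matrix later in the chapter, assuming two zero-mean measurement sets A = {a_1, …, a_n} and B = {b_1, …, b_n} written as row vectors a and b:

```latex
\sigma_A^2 = \frac{1}{n} \sum_{i=1}^{n} a_i^2,
\qquad
\sigma_B^2 = \frac{1}{n} \sum_{i=1}^{n} b_i^2,
\qquad
\sigma_{AB}^2 = \frac{1}{n} \sum_{i=1}^{n} a_i b_i = \frac{1}{n}\, a\, b^T
```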

slide-54
SLIDE 54

Aggregate PCA

Variance and the goal

– Covariance matrix

The ij-th element of C_X is the dot product between the vector of the i-th measurement type and the vector of the j-th measurement type, scaled by 1/n

slide-55
SLIDE 55

Aggregate PCA

Variance and the goal

– Covariance matrix

  • C_X is a square symmetric m×m matrix
  • The diagonal elements of C_X are the variances of particular measurement types
  • The off-diagonal elements of C_X are the covariances between measurement types
  • That is, C_X captures the covariance between all possible pairs of measurements
    – So the covariance values reflect the noise and redundancy in the measurements!
      » Large diagonal elements correspond to interesting structure!
      » Large off-diagonal magnitudes correspond to high redundancy!
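
A small sketch of the covariance matrix C_X = (1/n) X X^T, with X stored as m rows (measurement types) of n samples each. The function name and the sample matrix are assumptions for illustration; the second row is deliberately twice the first so that the redundancy shows up as a large off-diagonal entry.

```python
def covariance_matrix(X):
    """Return C_X = (1/n) X X^T after subtracting the mean of each row of X."""
    n = len(X[0])
    centered = []
    for row in X:
        mean = sum(row) / n
        centered.append([v - mean for v in row])
    return [
        [sum(a * b for a, b in zip(ri, rj)) / n for rj in centered]
        for ri in centered
    ]


# Hypothetical data: m = 3 measurement types, n = 4 samples.
X = [
    [1.0, 2.0, 3.0, 4.0],    # measurement type 1
    [2.0, 4.0, 6.0, 8.0],    # type 2: exactly 2x type 1, fully redundant
    [1.0, -1.0, -1.0, 1.0],  # type 3: uncorrelated with the other two
]
C = covariance_matrix(X)
for row in C:
    print(row)
```

The diagonal holds the three variances, the (1, 2) entry is large because rows 1 and 2 carry the same information, and the entries involving row 3 are zero.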

slide-56
SLIDE 56

Aggregate PCA

Variance and the goal

– Our goals

  • Y = PX
  • What features do we want to optimize in C_Y?
    – Minimize redundancy, measured by the magnitude of the covariance
    – Maximize the signal, measured by the variance
  • What would the optimized covariance matrix C_Y look like?
    – All off-diagonal elements should be 0. Thus, C_Y must be a diagonal matrix; equivalently, Y is decorrelated
    – Each successive dimension in Y should be rank-ordered according to variance!

slide-57
SLIDE 57

Aggregate PCA

Solving PCA using eigenvector decomposition

  • Goal
    – Find some orthogonal matrix P in Y = PX such that C_Y = (1/n) Y Y^T is a diagonal matrix. The rows of P are the principal components of X

slide-58
SLIDE 58

Aggregate PCA

Solving PCA using eigenvector decomposition

  • Any symmetric matrix A is diagonalized by an orthogonal matrix of its eigenvectors, i.e.,
    – A = E D E^T, where D is a diagonal matrix and E is a matrix of eigenvectors of A arranged in columns
    – C_Y = P C_X P^T
  • We select the matrix P to be a matrix where each row p_i is an eigenvector of (1/n) X X^T, i.e., P = E^T
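
The step from this choice of P to a diagonal C_Y can be written out as follows. It uses only the definitions given in the slides (Y = PX, C_X = (1/n) X X^T = E D E^T, P = E^T) and the fact that E is orthogonal, so E^T E = I:

```latex
\begin{aligned}
C_Y &= \tfrac{1}{n}\, Y Y^T
     = \tfrac{1}{n}\, (PX)(PX)^T
     = P \left( \tfrac{1}{n}\, X X^T \right) P^T
     = P\, C_X\, P^T \\
    &= P \left( E D E^T \right) P^T
     = \left( E^T E \right) D \left( E^T E \right)
     = D
\end{aligned}
```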

slide-59
SLIDE 59

Aggregate PCA

Solving PCA using eigenvector decomposition


slide-60
SLIDE 60

Aggregate PCA

Solving PCA using eigenvector decomposition

  • The principal components of X are the eigenvectors of C_X = (1/n) X X^T
  • The i-th diagonal value of C_Y is the variance of X along p_i
  • Computing PCA in practice involves
    – Subtracting off the mean of each measurement type
    – Computing the eigenvectors of C_X
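
The two practical steps above can be sketched in pure Python. This is an illustration under assumptions, not the method the slides prescribe for the eigenvector computation: it finds only the first principal component, and it uses power iteration as a stand-in for a full eigen-solver (in practice a numerical library routine would normally be used). The helper names and the sample data are hypothetical.

```python
import math


def center(X):
    """Step 1: subtract the mean of each measurement type (each row of X)."""
    return [[v - sum(row) / len(row) for v in row] for row in X]


def covariance(Xc):
    """C_X = (1/n) X X^T for already centered rows."""
    n = len(Xc[0])
    return [[sum(a * b for a, b in zip(ri, rj)) / n for rj in Xc] for ri in Xc]


def leading_eigenvector(C, iterations=500):
    """Step 2 (first component only): power iteration on the symmetric matrix C.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    m = len(C)
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        w = [sum(C[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the variance along v.
    eigenvalue = sum(v[i] * C[i][j] * v[j] for i in range(m) for j in range(m))
    return v, eigenvalue


# Hypothetical data: two measurement types where the second is roughly 2x the first.
X = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 7.8, 10.0],
]
C = covariance(center(X))
p1, variance_along_p1 = leading_eigenvector(C)
print(p1, variance_along_p1)
```

For this nearly redundant pair of measurements, almost all of the total variance (the trace of C) lies along the first principal component, which is the situation in which dropping the remaining dimension loses little.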