Chapter 14 Reduce Items and Attributes
Vis/Visual Analytics, Chap 14 Reduce CGGM Lab., CS Dept., NCTU Jung Hong Chuang 1
The Big Picture
– Datasets are large and complex
– Showing everything in a view leads to visual clutter
– There are five ...
– most obvious, most popular, and most flexible one
– There is a tightly coupled loop between visual encoding and interaction, so users immediately see the result of each intervention
– Filtering is often controlled with standard GUI widgets
  » Sliders, buttons, comboboxes, text fields
– Genre, year made, title, actors, actresses, directors, rating, popularity, length
– Items are movies, color coded by genre
– Axes: year made vs. popularity
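The filtering loop for a movie dataset like this one can be sketched as a function that is re-run whenever a widget value changes. This is only an illustrative sketch: the `Movie` record, the `filter_items` helper, the popularity scale, and the four sample records are all hypothetical and not taken from any real system or dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movie:
    title: str
    genre: str
    year: int
    popularity: float  # hypothetical 0-10 scale


def filter_items(items, year_range, min_popularity, genres):
    """Return the items that pass every active widget setting."""
    lo, hi = year_range
    return [
        m for m in items
        if lo <= m.year <= hi
        and m.popularity >= min_popularity
        and m.genre in genres
    ]


# Made-up records, for illustration only.
MOVIES = [
    Movie("A", "drama", 1950, 6.5),
    Movie("B", "comedy", 1975, 8.0),
    Movie("C", "drama", 1990, 4.0),
    Movie("D", "action", 1992, 9.0),
]

# Each slider or checkbox change would call filter_items again and redraw.
visible = filter_items(
    MOVIES,
    year_range=(1970, 1995),
    min_popularity=5.0,
    genres={"comedy", "action"},
)
print([m.title for m in visible])
```

Keeping the filter a pure function of the widget state makes the view easy to recompute on every interaction, which is what gives the user immediate feedback.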
– Are rarely good solutions!
– Is an aggregation idiom that shows a derived table that is more concise than the original dataset
– Compute the number of bins from the dataset characteristics
– User-controlled, with interactivity to see how the histogram changes
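Both ways of choosing the bin count can be sketched in a few lines. The example below assumes Sturges' rule as one common data-driven choice (the slides do not say which rule is meant), and the `histogram` and `sturges_bins` helpers and the sample values are hypothetical.

```python
import math


def sturges_bins(n_items):
    """Sturges' rule: one common data-driven choice for the number of bins."""
    return math.ceil(math.log2(n_items)) + 1


def histogram(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all values equal: everything lands in the first bin
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than past it.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts


values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]  # made-up data

auto_counts = histogram(values, sturges_bins(len(values)))  # computed bin count
coarse_counts = histogram(values, 2)                        # user-chosen bin count
print(auto_counts, coarse_counts)
```

In an interactive tool, the second argument would be bound to a slider so that the derived table is recomputed as the user drags it.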
» Size coding and text labels exacerbate the problem
– Use an aggregate derived attribute: overplot density
– Horizontal axis: magnitude of the velocity
– Vertical axis: z-direction velocity
– The density is shown with a log-scale sequential colormap with monotonically increasing luminance
  » Starts with dark blue at the low end, continues through reds, and ends with yellows and whites at the high end
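The derived overplot-density attribute and its log scaling can be sketched as follows. This is a minimal illustration under stated assumptions: the grid resolution, the `density_grid` and `log_scaled` helpers, and the sample points are hypothetical, and `log1p` is used here simply so that a count of 1 maps to a value above zero.

```python
import math
from collections import Counter


def density_grid(points, x_range, y_range, nx, ny):
    """Count points per cell of an nx-by-ny grid (the derived density attribute)."""
    (x0, x1), (y0, y1) = x_range, y_range
    counts = Counter()
    for x, y in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # ignore points outside the plotted region
        col = min(int((x - x0) / (x1 - x0) * nx), nx - 1)
        row = min(int((y - y0) / (y1 - y0) * ny), ny - 1)
        counts[(row, col)] += 1
    return counts


def log_scaled(counts):
    """Map counts to (0, 1] on a log scale, ready to index a sequential colormap."""
    peak = max(counts.values())
    return {cell: math.log1p(c) / math.log1p(peak) for cell, c in counts.items()}


# Made-up points: one heavily overplotted cell and one sparse cell.
points = [(0.1, 0.1)] * 100 + [(0.9, 0.9)]
counts = density_grid(points, (0.0, 1.0), (0.0, 1.0), nx=2, ny=2)
scaled = log_scaled(counts)
print(counts[(0, 0)], counts[(1, 1)], round(scaled[(1, 1)], 3))
```

On a linear scale the sparse cell would sit at 1% of the peak and be nearly invisible; the log scale lifts it to a clearly distinguishable level.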
– Median (50% point)
– Lower and upper quartiles (25% and 75%)
– Upper and lower fences
  » Values beyond the fences should be counted as outliers
– Glyph that relies on vertical spatial position
– Core box stretches between the lower and upper quartiles
– A horizontal line at the median
– Whiskers: vertical lines that extend from the core box to the fences
– Outliers: discrete dots
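The summary statistics behind the glyph can be computed with the standard library. The sketch below assumes the common convention of placing the fences 1.5 interquartile ranges beyond the quartiles; the slides do not state which fence rule is used. The `boxplot_stats` helper and the sample data are hypothetical.

```python
import statistics


def boxplot_stats(values, fence_factor=1.5):
    """Return the five numbers a boxplot encodes, plus the outliers.

    fence_factor=1.5 is an assumed convention, not taken from the slides.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - fence_factor * iqr
    upper_fence = q3 + fence_factor * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # made-up sample with one extreme value
stats = boxplot_stats(data)
print(stats)
```

Ten raw values are reduced to a handful of derived numbers, which is why the boxplot scales to distributions with far more items than could be drawn individually.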
The boxplot is an idiom presenting summary statistics for the distribution
Four kinds of distributions: normal (n), skewed (s), peaked (k), and multimodal (mm). (a) Standard box plots. (b) Vase plots, which use horizontal spatial position to show density directly.
– allow users to explicitly request aggregation and deaggregation of item sets
– perform these operations automatically as a result of higher-level interaction and navigation
The SolarPlot circular histogram idiom provides indirect control of the aggregation level by changing the circle size. (a) The small circle shows the increase in ticket sales over time. (b) Enlarging the circle shows seasonal patterns in addition to the gradual increase. (Dataset: ticket sales over time, 30 years in total)
» # of items, mean, minimum, maximum, depth
– Clusters are color coded according to their proximity in the hierarchy
Hierarchical parallel coordinates provide multiple levels of detail. (a) The single top cluster has large extent. (b) When several clusters are shown, each has a smaller extent. (c) When many clusters are shown, the proximity-based coloring helps them remain distinguishable from each other.
– Hinges on the assumption that there is hidden structure and significant redundancy in the dataset because the underlying latent variables could not be measured directly
– Multidimensional scaling (MDS), PCA
– Different from the general scatterplot
– connect the point and its nearest neighbors
– Each coefficient of y_i is the dot product of x_i with the corresponding row of P
– The j-th coefficient of y_i is the projection of x_i onto p_j
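The change of basis can be checked numerically. The sketch below assumes the convention that data points x_i are the columns of a matrix X and the basis vectors p_j are the rows of P, so that Y = PX; the particular basis (a 45-degree rotation) and the data values are made up for illustration.

```python
import numpy as np

# Rows of P are the basis vectors p_1, p_2 (an orthonormal basis rotated by 45 degrees).
s = 1 / np.sqrt(2)
P = np.array([[s, s],
              [-s, s]])

# Columns of X are the data points x_i (made-up values).
X = np.array([[1.0, 2.0, 3.0],
              [1.0, 2.0, 3.5]])

Y = P @ X  # column i of Y is y_i = P x_i

# The j-th coefficient of y_i equals the dot product of p_j with x_i.
i, j = 2, 0
assert np.isclose(Y[j, i], P[j] @ X[:, i])
print(Y)
```

The first two points lie exactly on the direction p_1, so their second coefficient is zero in the new basis: one number is enough to describe them.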
– A high SNR (>>1) indicates a high-precision measurement
– A low SNR indicates very noisy data
– We should ask whether it was really necessary to record two variables
– Measures the degree of the linear relationship between two variables
– A large positive covariance value indicates positively correlated data
– A large negative covariance value indicates negatively correlated data
– The absolute magnitude of the covariance measures the degree of redundancy
– So the covariance values reflect the noise and redundancy in the measurement!
  » Large diagonal elements correspond to interesting structure!
  » Large off-diagonal magnitudes correspond to high redundancy!
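A small numerical example shows how redundancy appears in the covariance matrix. The sketch below assumes one row per measurement type, one column per sample, and normalization by n - 1; the slides may use a different normalization. The synthetic data is made up: the second measurement is deliberately a noisy near-copy of the first.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
signal = rng.normal(size=n)

# Two made-up measurement types; the second nearly duplicates the first.
X = np.vstack([signal, 0.9 * signal + 0.1 * rng.normal(size=n)])

# Subtract each row's mean, then form the covariance matrix.
Xc = X - X.mean(axis=1, keepdims=True)
C_X = (Xc @ Xc.T) / (n - 1)  # assumed normalization: n - 1

# Diagonal entries are variances; off-diagonal entries are covariances.
print(np.round(C_X, 3))
```

The off-diagonal entry is almost as large as the diagonal ones, which is the numerical signature of two measurements recording nearly the same thing.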
– To minimize redundancy, measured by the magnitude of covariance
– To maximize the signal, measured by the variance
– All off-diagonal elements should be 0. Thus, C_Y must be a diagonal matrix; equivalently, Y is decorrelated
– Each successive dimension in Y should be rank-ordered according to variance!
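One standard way to meet both requirements is to take the eigenvectors of the covariance matrix C_X as the rows of P. The sketch below illustrates that approach with NumPy; it is not necessarily the derivation used in the slides. The `pca` helper, the mixing matrix `A`, and the synthetic data are hypothetical.

```python
import numpy as np


def pca(X):
    """X has one row per measurement type and one column per sample."""
    Xc = X - X.mean(axis=1, keepdims=True)
    C_X = (Xc @ Xc.T) / (X.shape[1] - 1)
    # eigh handles symmetric matrices: eigenvalues come back in ascending
    # order, with orthonormal eigenvectors as columns.
    eigvals, eigvecs = np.linalg.eigh(C_X)
    order = np.argsort(eigvals)[::-1]  # re-sort to descending variance
    P = eigvecs[:, order].T            # rows of P are the principal components
    Y = P @ Xc
    return P, Y, eigvals[order]


# Made-up correlated data: three measurement types mixed from three sources.
rng = np.random.default_rng(1)
A = np.array([[2.0, 0.0, 0.0],
              [1.5, 0.5, 0.0],
              [0.2, 0.1, 0.3]])
X = A @ rng.normal(size=(3, 400))

P, Y, variances = pca(X)
C_Y = (Y @ Y.T) / (Y.shape[1] - 1)
print(np.round(C_Y, 6))
```

The printed C_Y is diagonal up to rounding error, and its diagonal entries appear in decreasing order, matching the two goals listed above.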