EE 6882 Statistical Methods for Video Indexing and Analysis
Fall 2004
- Prof. Shih-Fu Chang
http://www.ee.columbia.edu/~sfchang
Lecture 1 part A (9/8/04)
EE E6882 SVIA Lecture #1, Part I
Introduction
Course Syllabus
Readings
2 EE6882-Chang
Analysis and Machine Intelligence, vol 22, No 1, Jan. 2000.
(Chapter 12, Object recognition)
(Chapter 9.14)
Multimedia Magazine, Summer, Vol. 4 No. 3, pp.12-20, 1997.
Image Query System,” In ACM Multimedia, Boston, MA, November 1996.
See Columbia’s WebSEEk and EdSearch demos
Google image search?
“find video clips of basketball going through the hoop”
“find images containing shape shown in the sketch”
(e.g., recognition of text, face, scene, vehicle, location, etc)
(e.g., break videos into shots, scenes, and stories)
(e.g., sports events, human activities, meetings, medical, and
e.g., topic clustering, highlight generation
See Columbia’s sports highlight, news topic clustering demo
shot story anchor shot
How to detect and recognize the characters and words? (Demo)
How to detect the boundaries
stories, and commercials?
Many problems can be posed as pattern
(e.g., Matlab statistical classification demo)
Statistical models to handle uncertainty
Rich tools for learning and prediction
Image processing toolkits available
Increasing benchmark data
(e.g., NIST TREC Video)
(From Jain, Duin, and Mao, SPR Review, ’99)
Color, texture, motion, shape, layout, regions, parts, etc
Discrete vs. continuous, vectorization, dimension
Invariance to scale, rotation, translation …
PCA, MDS, Kernel PCA, etc
Generative vs. discriminative
Multi-modal fusion, early fusion vs. late fusion
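The early vs. late fusion distinction can be illustrated with a minimal Python sketch. The two modalities (color, motion), the class means, the nearest-mean scorer, and the equal fusion weights are all hypothetical choices made for illustration; they are not taken from the lecture.

```python
# Early fusion: concatenate per-modality features, then score once.
# Late fusion: score each modality separately, then combine the scores.
# All data, the scorer, and the weights below are hypothetical.

def nearest_mean_score(x, mean_pos, mean_neg):
    """Return a positive score when x is closer to the positive-class mean."""
    d_pos = sum((a - b) ** 2 for a, b in zip(x, mean_pos))
    d_neg = sum((a - b) ** 2 for a, b in zip(x, mean_neg))
    return d_neg - d_pos


def early_fusion_score(color, motion, means):
    # Build one joint feature vector before any classification happens.
    x = color + motion
    mean_pos = means["pos"]["color"] + means["pos"]["motion"]
    mean_neg = means["neg"]["color"] + means["neg"]["motion"]
    return nearest_mean_score(x, mean_pos, mean_neg)


def late_fusion_score(color, motion, means, weights=(0.5, 0.5)):
    # Classify each modality on its own, then take a weighted sum of scores.
    s_color = nearest_mean_score(color, means["pos"]["color"], means["neg"]["color"])
    s_motion = nearest_mean_score(motion, means["pos"]["motion"], means["neg"]["motion"])
    return weights[0] * s_color + weights[1] * s_motion


means = {
    "pos": {"color": [0.8, 0.2], "motion": [0.9]},
    "neg": {"color": [0.2, 0.7], "motion": [0.1]},
}
color, motion = [0.7, 0.3], [0.6]

early = early_fusion_score(color, motion, means)
late = late_fusion_score(color, motion, means)
print("early fusion score:", early)
print("late fusion score:", late)
```

In practice the late-fusion weights would be tuned on held-out data rather than fixed at 0.5 each.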
shapes
[Figure: likelihood curves for Class 1 and Class 2 over a feature axis (height, income, …), with sample points and the discriminant function f(x)]
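As a rough illustration of a likelihood-based discriminant function f(x), the sketch below models each class with a one-dimensional Gaussian and decides Class 1 when f(x) > 0. The Gaussian assumption, the class means and standard deviations (loosely read as heights in cm), and the equal priors are assumptions made for this example only.

```python
import math


def gaussian_log_likelihood(x, mean, std):
    """Log of the 1-D Gaussian density N(x; mean, std^2)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def discriminant(x, class1, class2):
    """f(x) = log p(x|C1)P(C1) - log p(x|C2)P(C2); f(x) > 0 favors Class 1."""
    g1 = gaussian_log_likelihood(x, class1["mean"], class1["std"]) + math.log(class1["prior"])
    g2 = gaussian_log_likelihood(x, class2["mean"], class2["std"]) + math.log(class2["prior"])
    return g1 - g2


# Hypothetical class models for a single feature such as height.
class1 = {"mean": 160.0, "std": 8.0, "prior": 0.5}
class2 = {"mean": 180.0, "std": 8.0, "prior": 0.5}


def classify(x):
    return 1 if discriminant(x, class1, class2) > 0 else 2


for x in (165.0, 170.0, 175.0):
    print(x, round(discriminant(x, class1, class2), 4), classify(x))
```

With equal variances and equal priors, the decision boundary f(x) = 0 falls midway between the two class means.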
Assume the same distribution in the different sets
Training: models, parameters
Validation: hypothesis through validation
Testing: performance
[Figures: labeled sample points in the (x(1), x(2)) feature plane for the training, validation, and testing sets]
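The three-way split can be sketched as follows. The 60/20/20 proportions, the fixed seed, and the function name `split_dataset` are illustrative assumptions; the slide does not prescribe them.

```python
import random


def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle once, then cut into disjoint training, validation, and testing sets.

    Shuffling before cutting is what makes it reasonable to treat the three
    sets as draws from the same distribution.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


samples = list(range(100))  # stand-in for 100 labeled examples
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))
```

The training set is used to fit models and parameters, the validation set to choose among candidate hypotheses, and the testing set is touched only once, to report performance.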
Multiple validation sets can be used for different
Val-1: optimal classifier using feature 1
Val-1: optimal classifier using feature 2
Val-2: optimal classifier fusing multiple features
…
Cross validation, leave-one-out
[Figure: data partitioned into folds 1, 2, …, K, divided between training and testing]
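Cross validation and leave-one-out can be sketched with a small index generator. The function name `k_fold_indices` and the contiguous, unshuffled fold layout are assumptions made to keep the example short.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs, one per fold.

    Each sample lands in the testing fold exactly once. Setting k equal to
    n_samples gives leave-one-out.
    """
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, 5))   # 5-fold cross validation on 10 samples
loo = list(k_fold_indices(4, 4))      # leave-one-out on 4 samples

for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

A real experiment would usually shuffle the samples first and average the per-fold test scores.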
Rule of thumb – (# of training patterns per class) / (# of features) > 10
[Figure: overtraining example in the (x(1), x(2)) feature plane]
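The rule of thumb (training patterns per class divided by number of features should exceed 10) is easy to check before training. In the sketch below, using the smallest class count as the numerator is a conservative assumption of this example, and the label names and counts are made up.

```python
from collections import Counter


def samples_per_feature_ratio(labels, n_features):
    """Ratio of the smallest per-class sample count to the number of features."""
    counts = Counter(labels)
    return min(counts.values()) / n_features


def satisfies_rule_of_thumb(labels, n_features, threshold=10):
    """True when (# training patterns per class) / (# features) > threshold."""
    return samples_per_feature_ratio(labels, n_features) > threshold


# Hypothetical training labels: 300 'anchor' shots and 120 'sports' shots.
labels = ["anchor"] * 300 + ["sports"] * 120

print(samples_per_feature_ratio(labels, 10))   # smallest class has 120 samples
print(satisfies_rule_of_thumb(labels, 10))
print(satisfies_rule_of_thumb(labels, 20))
```

When the check fails, the usual options are to collect more data or to reduce the feature dimension.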
Learn how to formulate and solve problems in this field
Feature extraction, object/event recognition, structure detection, video search and retrieval
Get insights and experience of recent machine learning
Statistical, Bayesian, Neural Network, PCA, HMM, SVM
Have fun in experimenting with actual visual
Beginning graduate students or professionals
familiar with signal/image processing
comfortable with probability, statistics, linear algebra, and
Exam 30%
Final Project 40%
Each student discusses the paper and demos with me
Week 1: review and research
Week 2: simulate a toy problem using available
Week 3: prepare presentation
Upload the slides and code to the class wiki site
Presentation
30 minutes per paper (including demo)
I will provide additional materials about the
Background review and examples
Problem addressed and main ideas
Insights about why it works
Limitation, generality, and repeatability
Alternatives and comparisons
Software and data available and repeatable?
Reconstruct the method and try on toy data set?
Analysis of results (not just accuracy numbers, offer
Demo code archived on class site and shared with others
Tutorials on paper writing, Matlab, etc
Benchmark data set, a few thousand images from
Extracted features and labels
Will distribute on a DVD for class project use only
Accessible in Mudd 251 Computer Lab
Need CU ACIS account
Very brief introduction next week
Feature extraction and image search
Image/video classification
Interactive image retrieval
Video structure parsing
Multimedia information retrieval
Bayesian, factor graph, graphical model
SVM and variations
Language model, relevance model from IR
HMM and variations
A few papers reviewed last year
[Figure: observations over time in {video, audio}: a static face? motion energy changes? change from music to speech? speech segment? {cue words}_j appear, {cue words}_i appear]
(Hsu and Chang)
(Vailaya et al 98 and 01)
and tree?
distributions of features for each class?
(Naphade et al)
Extract > 45K selective efficient features by multi-scale filtering
Classifier combination and sample re-weighting (Tieu and Viola)
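Classifier combination with sample re-weighting is the core idea of boosting. The sketch below is a generic discrete AdaBoost with single-threshold stumps on a toy one-dimensional data set; it is not Tieu and Viola's implementation, and the data, helper names, and number of rounds are assumptions for illustration.

```python
import math


def stump_predict(x, feature, threshold, polarity):
    """Weak classifier: vote `polarity` if x[feature] > threshold, else the opposite."""
    return polarity if x[feature] > threshold else -polarity


def best_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, feature, threshold, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost(X, y, n_rounds=3):
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        err, feature, threshold, polarity = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, feature, threshold, polarity))
        # Re-weight: misclassified samples gain weight, correct ones lose it.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, feature, threshold, polarity))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    """Combined classifier: sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, f, t, p) for a, f, t, p in ensemble)
    return 1 if score > 0 else -1


# Toy data: no single threshold separates the classes, but three stumps do.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]

ensemble = adaboost(X, y, n_rounds=3)
print([predict(ensemble, x) for x in X])
```

Each round focuses the next weak classifier on the samples the current combination still gets wrong, which is why three stumps suffice here even though each is individually weak.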
User selected examples
20 retrieval results
Negative images in the training set close to the decision boundary
Images in the testing set close to the decision boundary
(Duygulu et al)
between words and blobs
and retrieval
Learning Multi-Level Markovian Temporal Dependence
Baseball Example
(Xie et al)
[Figure: two-level state diagram over time. Top-level states: running, pitching, break. Bottom-level states: bench, close up, batter, audience, field, bird view, pitcher, 1st base]