EE 6882 Statistical Methods for Video Indexing and Analysis
Fall 2004
Prof. Shih-Fu Chang
http://www.ee.columbia.edu/~sfchang
Lecture 1 part A (9/8/04)

EE E6882 SVIA Lecture #1
Part I
- Introduction
- Course Syllabus
- Readings
  - A. Jain et al., "Statistical Pattern Recognition: A Review," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, Jan. 2000.
  - Gonzalez and Woods, Digital Image Processing, 2nd edition, Prentice Hall, 2001 (Chapter 12, Object Recognition).
  - Anil K. Jain, Fundamentals of Digital Image Processing, Prentice Hall, 1989 (Chapter 9.14).
Part II
- Introduction of a simple image search system
- Image feature extraction
- Similarity matching, performance metrics
- Readings
  - J. R. Smith and S.-F. Chang, "Visually Searching the Web for Content," IEEE Multimedia Magazine, Summer, vol. 4, no. 3, pp. 12-20, 1997.
  - John R. Smith and Shih-Fu Chang, "VisualSEEk: a Fully Automated Content-Based Image Query System," in ACM Multimedia, Boston, MA, November 1996.
EE6882-Chang

Problems in Video Indexing and Analysis
- Indexing, search, and retrieval for images and videos
  - See Columbia's WebSEEk and EdSearch demos
  - Google image search?
  - "find video clips of basketball going through the hoop"
  - "find images containing shape shown in the sketch"
- Automatic annotation of visual content
  - (e.g., recognition of text, face, scene, vehicle, location, etc.)
- Automatic parsing of video programs into structures
  - (e.g., break videos into shots, scenes, and stories)
- Event detection
  - (e.g., sports events, human activities, meetings, medical, and other spatio-temporal patterns)
- Summary
  - e.g., topic clustering, highlight generation
  - See Columbia's sports highlight, news topic clustering demo

Examples of object recognition and structure parsing problems
- How to detect and recognize the characters and words? (Demo)
- How to detect the boundaries of programs, stories, and commercials?
[Figure: example video frames labeled "story shot" and "anchor shot"]

Statistical Paradigm
- Many problems can be posed as pattern recognition
  - (e.g., Matlab statistical classification demo)
- Statistical models to handle uncertainty and provide flexibility
- Rich tools for learning and prediction
- Image processing toolkits available
- Increasing benchmark data
  - (e.g., NIST TREC Video)

A Very High-Level Stat. Pattern Recog. Architecture (From Jain, Duin, and Mao, SPR Review, '99)

Important issues
- Image/video pre-processing: quality, resolution, etc.
- Feature extraction
  - Color, texture, motion, shape, layout, regions, parts, etc.
- Feature representation
  - Discrete vs. continuous, vectorization, dimension
  - Invariance to scale, rotation, translation, ...
- Feature selection
  - PCA, MDS, Kernel PCA, etc.
- Classification models
  - Generative vs. discriminative
  - Multi-modal fusion, early fusion vs. late fusion
- Size of training/test data and manual supervision efforts
- Validation and evaluation processes
- Complexity

Some examples of feature representation
- Features determine the patterns and their separability
- E.g.,
  - Angular distance for closed shapes
  - Part features for iris flowers

Another example of feature
- Bankers Association font used on personal checks
- Use magnetic ink and reader to simplify segmentation
- Feature: the horizontal scan of the rate of increase/decrease of the character area
- Peaks and zeros are arranged to be located at the vertical grid lines, so they can be sampled accurately
- Patterns can be easily distinguished

Classification Paradigms
[Figure: two scatter plots of Class 1 (+) and Class 2 (-) samples over features x1 and x2 (height, income, ...), with a query point x0 and the question C(x0) = ?]
- Discriminative: a discriminant function f(x) defines the decision boundary, with f(x) > 0 on one side and f(x) < 0 on the other
- Probabilistic: compare the likelihoods, P(x|C=1) > or < P(x|C=2)
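
The two paradigms on this slide can be contrasted with a small Python sketch. It is only an illustration under stated assumptions: the sample values, the axis-aligned Gaussian model for P(x|C), and the nearest-mean linear discriminant are all choices made here, not taken from the lecture.

```python
import math

# Toy 2-D training samples (hypothetical values, e.g. scaled height and income).
class1 = [(2.0, 2.5), (2.5, 3.0), (3.0, 2.0), (3.5, 3.5)]
class2 = [(0.0, 0.5), (0.5, 0.0), (1.0, 1.0), (-0.5, 0.5)]


def mean(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(2))


def variance(points, mu):
    n = len(points)
    return tuple(sum((p[d] - mu[d]) ** 2 for p in points) / n for d in range(2))


def log_likelihood(x, mu, var):
    """Log of an axis-aligned Gaussian density, one simple model for P(x|C)."""
    return sum(
        -0.5 * math.log(2 * math.pi * var[d]) - (x[d] - mu[d]) ** 2 / (2 * var[d])
        for d in range(2)
    )


mu1, mu2 = mean(class1), mean(class2)
var1, var2 = variance(class1, mu1), variance(class2, mu2)


def classify_probabilistic(x):
    """Probabilistic view: pick the class whose likelihood P(x|C) is larger."""
    return 1 if log_likelihood(x, mu1, var1) > log_likelihood(x, mu2, var2) else 2


# Discriminative view: a linear discriminant f(x) = w.x + b whose sign gives the class.
# w points from the class-2 mean to the class-1 mean and the boundary passes through
# their midpoint (a nearest-mean rule, chosen only for illustration).
w = tuple(mu1[d] - mu2[d] for d in range(2))
b = -sum(w[d] * (mu1[d] + mu2[d]) / 2 for d in range(2))


def f(x):
    return sum(w[d] * x[d] for d in range(2)) + b


def classify_discriminative(x):
    return 1 if f(x) > 0 else 2


x0 = (2.8, 2.6)  # query point, playing the role of x0 on the slide
print(classify_probabilistic(x0), classify_discriminative(x0))
```

The probabilistic classifier needs a density model per class, while the discriminative one only needs the boundary; on well-separated data such as this both tend to agree.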

Training / Validation / Testing
[Figure: three scatter plots of + and - samples over x(1) and x(2), one each for the training, validation, and testing sets]
- Validation: select optimal features, models, and parameters through validation
- Testing: evaluate the performance of the optimal hypothesis over test data
- Assume the same distribution across the different sets; otherwise the optimal solution from validation may not be optimal on the test data
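
A minimal Python sketch of this three-way split follows. The 1-D Gaussian data, the k-nearest-neighbor classifier, and the candidate values of k are assumptions made for illustration; the slide does not prescribe a particular model.

```python
import random

random.seed(0)


def sample(n):
    """Draw n labeled 1-D points: class +1 centered at 2.0, class -1 centered at 0.0."""
    data = []
    for _ in range(n):
        label = random.choice([1, -1])
        center = 2.0 if label == 1 else 0.0
        data.append((random.gauss(center, 1.0), label))
    return data


# All three sets come from the same distribution, as the slide assumes.
train, val, test = sample(200), sample(100), sample(100)


def knn_predict(x, k):
    """Majority vote among the k nearest training points (odd k avoids ties)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return 1 if sum(label for _, label in nearest) > 0 else -1


def accuracy(data, k):
    return sum(knn_predict(x, k) == y for x, y in data) / len(data)


# Model selection: choose k using the validation set only.
candidates = [1, 3, 5, 9, 15, 25]
best_k = max(candidates, key=lambda k: accuracy(val, k))

# Final evaluation: touch the test set once, with the selected k.
test_acc = accuracy(test, best_k)
print("selected k:", best_k, "test accuracy:", round(test_acc, 3))
```

Keeping the test set out of the selection step is what makes the final accuracy an honest estimate.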

Training / Validation / Testing (cont.)
- Multiple validation sets can be used for different optimization steps.
[Figure: the optimal classifiers using feature 1, feature 2, ... are each selected on Val-1; the optimal classifier fusing multiple features is selected on Val-2]
- Cross validation, leave-one-out
  - Split the data into parts 1, 2, ..., K, used for training and testing
  - Rotate the choice of the test set and average the performance over runs
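
The rotation described for cross validation can be sketched in Python as below. The synthetic 1-D data and the nearest-mean classifier are hypothetical stand-ins; any classifier could be dropped in.

```python
import random

random.seed(1)

# Hypothetical 1-D data set of (feature value, label) pairs.
data = [(random.gauss(2.0, 1.0), 1) for _ in range(30)] + \
       [(random.gauss(0.0, 1.0), -1) for _ in range(30)]
random.shuffle(data)


def fit_nearest_mean(train):
    """Return a classifier that assigns x to the class with the closer mean."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == -1]
    mu_pos, mu_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    return lambda x: 1 if abs(x - mu_pos) < abs(x - mu_neg) else -1


def cross_validate(data, k):
    """Rotate the held-out fold over k splits and average the accuracy."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        held_out = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        predict = fit_nearest_mean(train)
        scores.append(sum(predict(x) == y for x, y in held_out) / len(held_out))
    return sum(scores) / k


print("5-fold accuracy:", round(cross_validate(data, 5), 3))
# Leave-one-out is the special case k = len(data).
print("leave-one-out accuracy:", round(cross_validate(data, len(data)), 3))
```

Leave-one-out uses the most training data per run but needs one training pass per sample, so a smaller K is the usual compromise when training is expensive.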

Curse of Dimensionality and Overtraining
[Figure: scatter plot of + and - samples over x(1) and x(2), showing a case of overtraining]
- Rule of thumb: (# of training patterns per class) / (# of features) > 10
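
The rule of thumb is simple arithmetic, and a small Python helper makes it easy to check a planned experiment. The function names and the example numbers (500 images per class, 64 or 32 features) are hypothetical.

```python
def samples_per_feature(patterns_per_class, num_features):
    """Ratio used by the rule of thumb: training patterns per class over features."""
    return patterns_per_class / num_features


def satisfies_rule_of_thumb(patterns_per_class, num_features, min_ratio=10):
    """True when the ratio exceeds the suggested minimum of 10."""
    return samples_per_feature(patterns_per_class, num_features) > min_ratio


def max_features(patterns_per_class, min_ratio=10):
    """Largest feature count that keeps the ratio strictly above min_ratio."""
    return (patterns_per_class - 1) // min_ratio


# Hypothetical setting: 500 labeled images per class.
print(satisfies_rule_of_thumb(500, 64))  # 500 / 64 is about 7.8
print(satisfies_rule_of_thumb(500, 32))  # 500 / 32 is about 15.6
print(max_features(500))
```

When the check fails, the options are to collect more labeled data or to reduce the feature dimension, for example with the feature selection methods listed under "Important issues".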

About the course
- Objectives:
  - Learn how to formulate and solve problems in this field
    - Feature extraction, object/event recognition, structure detection, video search and retrieval
  - Get insights and experience of recent machine learning techniques
    - Statistical, Bayesian, Neural Network, PCA, HMM, SVM
  - Have fun in experimenting with actual visual classification/indexing problems
- Intended Audience
  - Beginning graduate students or professionals
    - familiar with signal/image processing
    - comfortable with probability, statistics, linear algebra, and some machine learning

Course Format Overview
- Lectures + student presentations + final projects
- I will give several overview lectures at the beginning.
- Student paper presentation
  - One paper assigned to each student
  - assignments determined 3 weeks in advance
  - CVN students present over the phone
  - Everyone writes comments before and after class on the class wiki site (starting the 3rd week)
- One written exam after all presentations
  - test understanding of concepts discussed throughout the course
- One term project at the end of the course
- Grading
  - Paper presentation/demo 30%
  - Exam 30%
  - Final Project 40%

Paper review and demo
- Each student discusses paper and demos with me and TA 2 weeks before class
  - Week 1: review and research
  - Week 2: simulate a toy problem using available data set and tools
  - Week 3: prepare presentation
- Upload the slides and code to the class wiki site before class
- Presentation
  - 30 mins each paper (including demo)
- I will provide additional materials about the subject.

Paper Review and Demo (2)
- Review
  - Background review and examples
  - Problem addressed and main ideas
  - Insights about why it works
  - Limitation, generality, and repeatability
  - Alternatives and comparisons
- Demo
  - Software and data available and repeatable?
  - Reconstruct the method and try on toy data set? (from some available generic toolkit)
  - Analysis of results (not just accuracy numbers, offer explanations and verifiable theories about observations)
  - Demo code archived on class site and shared with others

Resources and Matlab
- Links on the class web site
  - Tutorials on paper writing, Matlab, etc.
- Software links on web site to
  - Matlab, Neural Network, HMM, Netlab, SVM
- SVIA EE6882 Class Dataset
  - Benchmark data set, a few thousand images from broadcast news and stock photos
  - Extracted features and labels
  - Will distribute on a DVD for class project use only
- Matlab is recommended for programming
  - Accessible in Mudd 251 Computer Lab
  - Need CU ACIS account
  - Very brief introduction next week

Paper categories
- Problems
  - Feature extraction and image search
  - Image/video classification
  - Interactive image retrieval
  - Video structure parsing
  - Multimedia information retrieval
- Statistical Techniques
  - Bayesian, factor graph, graphical model
  - SVM and variations
  - Language model, relevance model from IR
  - HMM and variations
  - others

A few papers reviewed last year

Maximum Entropy Fusing (Hsu and Chang)
- Objective: a story boundary at time τ_k?
- τ_k = {shot boundaries or significant pauses}
[Figure: observations along a time axis marked τ_{k-1}, τ_k, τ_{k+1}. {video, audio} cues: a static face? motion energy changes? change from music to speech? speech segment? Text cues: {cue words}_i appear, {cue words}_j appear]

Bayesian Image Classification (Vailaya et al., 98 and 01)
- How to select the categories and tree?
- How to estimate the distributions of features for each class?

Concept (In)Dependence (Naphade et al.)

Boosting (Tieu and Viola)
- Extract > 45K selective efficient features by multi-scale filtering
- Classifier combination and sample re-weighting
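
The two ingredients named on this slide, classifier combination and sample re-weighting, can be seen in a generic AdaBoost loop. The Python sketch below is not the exact procedure of the paper: the 1-D toy data, the threshold "stump" weak learners, and the number of rounds are all assumptions made for illustration.

```python
import math

# Hypothetical 1-D training set; no single threshold separates the labels.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]


def stump(threshold, sign):
    """Weak classifier: predicts `sign` when x < threshold, otherwise -sign."""
    return lambda x: sign if x < threshold else -sign


def best_stump(weights):
    """Pick the stump with the lowest weighted error on the training set."""
    best = None
    for t in [0.5 + i for i in range(7)]:
        for sign in (1, -1):
            h = stump(t, sign)
            err = sum(w for w, x, y in zip(weights, xs, ys) if h(x) != y)
            if best is None or err < best[0]:
                best = (err, h)
    return best


def adaboost(rounds):
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        err, h = best_stump(weights)
        err = max(err, 1e-10)  # guard against log(0) for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Re-weighting: misclassified samples gain weight, correct ones lose it.
        weights = [w * math.exp(-alpha * y * h(x)) for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    # Combination: sign of the alpha-weighted vote of the weak classifiers.
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) > 0 else -1


strong = adaboost(3)
print([strong(x) for x in xs])
```

Each round focuses the next weak classifier on the samples the current ensemble gets wrong, which is how a vote over very simple classifiers can fit a pattern none of them fits alone.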
