EE E6882 SVIA Lecture #1: Introduction, Course Syllabus, Readings

EE 6882 Statistical Methods for Video Indexing and Analysis
Instructors: Prof. Shih-Fu Chang, Columbia University; Dr. Lexing Xie, IBM T.J. Watson Research
TA: Eric Zavesky
Fall 2007, Lecture 1
Course web site: http://www.ee.columbia.edu/~sfchang/course/svia

EE E6882 SVIA Lecture #1
- Introduction, Course Syllabus
- Readings (available on course site)
  - Rui et al., Content-Based Image Retrieval review paper
  - A. Jain et al., "Statistical Pattern Recognition: A Review," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, Jan. 2000.
  - Gonzalez and Woods, Digital Image Processing, 2nd edition, Prentice Hall, 2001 (Chapter 12, Object Recognition)
- Next week: Sept. 17, 2007 (Prof. Xie)
  - Topic: Content-Based Image Retrieval

Topics: Image/Video Search
- Explosive growth of online image/video data, personal media, broadcast news videos, etc.
- 5 billion images on the Web, 31 million hours of TV programs each year
- Successful services like YouTube and Flickr
- Others: blinkx.com, like.com, etc.
- Image/video search is an exciting opportunity

Different Visual Search Models
- Browsing and Grouping
  - Subject listing (e.g., WebSeek, http://www.ee.columbia.edu/webseek)
  - Animation summary (e.g., http://www.blinkx.com)
- Keyword Search
- Content-Based Search
  - E.g., VisualSEEk, like.com

User Expectation in Practice
"…type in a few words at most, then expect the engine to bring back the perfect results. More than 95 percent of us never use the advanced search features most engines include, …" (The Search, J. Battelle, 2003)
- Keyword search is the primary search method.
- Google Zeitgeist publishes top keywords monthly.

Examples of Keyword Image Search
- Query: "sunset" (1st and 2nd result pages shown)
- Reasonable keyword search results
- Content analysis may help correct mistakes…

Example Search
- Text query on Google: "Manhattan Cruise"
- Image content analysis may help refine results

How about Social-Net Tagging?
- Yahoo-Flickr: millions of users, extensive social tags
- Example photo, uploaded by gdanny
  - Tags: outdoor, nyc, bridges, water, boat, cruise
  - Camera: Canon PowerShot SD 400
  - Date: Sept. 17 2006
- Labels may be subjective and incomplete.

Insufficient Precision of Social Tags
Test: New York City landmark labels

  Label                     Precision
  Bronx-Whitestone Br.      1.00
  Brooklyn Br.              0.38
  Chrysler Building         0.65
  Columbia University       0.30
  Empire State Building     0.18
  Flatiron Building         0.70
  George Washington Br.     0.48
  Grand Central             0.37
  Guggenheim                0.21
  Met. Museum of Art        0.02
  Queensboro Br.            0.38
  Statue of Liberty         0.49
  Times Square              0.56
  Verrazano Narrows Br.     0.66
  World Trade Center        0.13

- Many tags from social networks are of low precision (due to batch uploading?)
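The precision figures above can be summarized with a few lines of code. This is a minimal sketch, assuming precision means the fraction of images carrying a tag that actually show the landmark (the slide does not spell out the definition). The function name `tag_precision` and the example counts are hypothetical; the per-landmark values are copied from the slide.

```python
from statistics import mean


def tag_precision(num_correct: int, num_tagged: int) -> float:
    """Fraction of tagged images that truly show the tagged landmark.

    Assumed definition; the slide only reports the resulting numbers.
    """
    if num_tagged == 0:
        raise ValueError("no images carry this tag")
    return num_correct / num_tagged


# Per-landmark precision values as listed on the slide.
LANDMARK_PRECISION = {
    "Bronx-Whitestone Br.": 1.00,
    "Brooklyn Br.": 0.38,
    "Chrysler Building": 0.65,
    "Columbia University": 0.30,
    "Empire State Building": 0.18,
    "Flatiron Building": 0.70,
    "George Washington Br.": 0.48,
    "Grand Central": 0.37,
    "Guggenheim": 0.21,
    "Met. Museum of Art": 0.02,
    "Queensboro Br.": 0.38,
    "Statue of Liberty": 0.49,
    "Times Square": 0.56,
    "Verrazano Narrows Br.": 0.66,
    "World Trade Center": 0.13,
}

# Average precision over all landmarks, and the landmarks below 0.5.
mean_precision = mean(LANDMARK_PRECISION.values())
below_half = sorted(name for name, p in LANDMARK_PRECISION.items() if p < 0.5)

print(f"mean precision: {mean_precision:.3f}")
print(f"{len(below_half)} of {len(LANDMARK_PRECISION)} landmarks fall below 0.5")
```

Averaging treats every landmark equally regardless of how many photos carry its tag; a count-weighted average would need the per-landmark image counts, which the slide does not give.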

An Interesting Paradigm: Image Tagging via Game Playing (Von Ahn & Dabbish, CHI 04)
- Used in Google Image Labeler (http://images.google.com/imagelabeler/)
- Use competitive games to motivate users
- Has attracted many participants for free!
- Some users spent hours in a day
- Claims the potential of annotating the whole Web in just a few months!
  - 5 billion images

Seeking the image search tools: Content-Based Image Retrieval (CBIR)
- IBM QBIC '95, Columbia VisualSEEk '96
- [Figure: query by sketch and the retrieved results]

Issues
- What image features to extract?
- How to match images and videos?
- How to make it fast?

Opportunity for Content Analysis: Large-Scale Auto. Image Tagging Framework
- Rich semantic description based on content analysis
- SVM or graph models
- [Diagram: audio-visual features, surrounding text, and context fusion feed statistical models that perform semantic tagging, e.g., Anchor (+), Snow (-), Soccer, Building, ..., Outdoor]

Large-Scale Concept Detectors from Research Community
- Columbia374
  - 374 baseline detectors for LSCOM multimedia ontology
- MediaMill
  - 491 concept detectors for LSCOM and MediaMill 101 lexicons
- IBM MARVEL Search System
  - Trials with BBC, CNN
  - Real-time standalone detectors from IBM alphaWorks
- Others …

What Concept to Detect?
- One effort: Large Scale Concept Ontology for Multimedia (LSCOM)
- Joint effort by news/intelligence analysts, librarians, researchers
- Broadcast news domain
- Selection criteria: useful, detectable, observable
- 834 concepts defined, 449 concepts annotated
- Labeled over 61,000 shots of TRECVID 2005 data set
- 33 million judgments collected, 100 person-months of labor
- Downloaded by 170+ groups so far
- http://www.ee.columbia.edu/dvmm/lscom/

LSCOM Concepts (449)
- Event/Activity (56, 13%): airplane taking off, car crash, explosion, etc.
- People (113, 25%): person, male/female, firefighter, etc.
- Location (89, 20%): cityscape, hospital, airfield, etc.
- Object (135, 30%): vehicle, map, tank, power plant, etc.
- Scene (49, 10%): vegetation, urban, interview, etc.
- Program (7, 2%): entertainment, weather, finance, etc.

Consumer Video Ontology (Kodak-Columbia, 2007)
- Activity (6): dancing, singing, sitting, walking, running, talking
- Occasion (16): wedding, birthday, graduation, Christmas, ski, picnic, show, meeting, parade, sports, playground, theme-park, park, dining, museum
- Scene (15): sunset, beach, waterscape/waterfront, mountain, (back) yard, field, desert, urban, suburban, night, home, kitchen, office, lab, public building
- Object (25): people, animal, boat, and others
- People (11): crowd, baby, youth, adult, and others
- Sound (14): music, cheer, and others
- Camera Motion (5): pan, tilt, zoom, fix, track
- Object Motion (3): entity, speed, direction
- Social (4): friend, family, classmate, colleague

Research Issues
- How to develop automatic tagging tools?
- Train automatic recognition models
  - What image features?
  - What statistical models?
- Explore surrounding information
  - Time, location (e.g., Yahoo! ZoneTag, http://zonetag.research.yahoo.com/)
  - Text and metadata

Building Image Classifiers: Basic
- Detector for each concept
- General for all concepts, easy to implement
- 374 baseline detectors (Columbia374) released

Examples of Basic Image Features
- Grid layout + color moment (moments μ, σ, γ): 225 dimensions
- Gabor texture: 48 dimensions
- Edge direction histogram: 73 dimensions

Text search vs. visual classification
- Keyword search: "boat"
- Automatic classification: "boat" (images from TRECVID)
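The grid color moment feature listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the released Columbia374 code: it assumes a 5 x 5 grid, three color channels, and three moments per channel (mean μ, standard deviation σ, and skewness γ taken as the signed cube root of the third central moment), which gives 5 x 5 x 3 x 3 = 225 values. The function names, the grid size, and the synthetic image are hypothetical, and no color-space conversion is applied.

```python
import math


def color_moments(values):
    """Return (mean, std, skewness) for one channel of one grid cell.

    Skewness is taken as the signed cube root of the third central
    moment, a common convention that is assumed here.
    """
    n = len(values)
    mu = sum(values) / n
    variance = sum((v - mu) ** 2 for v in values) / n
    third = sum((v - mu) ** 3 for v in values) / n
    sigma = math.sqrt(variance)
    gamma = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mu, sigma, gamma


def grid_color_moments(image, grid=5):
    """Compute grid color moments for an image.

    image: list of rows, each row a list of (c1, c2, c3) pixel tuples.
    Returns a flat list of grid * grid * 3 * 3 numbers.
    """
    height, width = len(image), len(image[0])
    if height < grid or width < grid:
        raise ValueError("image is smaller than the grid")
    features = []
    for gy in range(grid):
        # Integer cell boundaries that cover every row exactly once.
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            for channel in range(3):
                cell = [image[y][x][channel]
                        for y in range(y0, y1)
                        for x in range(x0, x1)]
                features.extend(color_moments(cell))
    return features


# Synthetic 20 x 30 image standing in for real pixel data.
image = [[(x % 256, y % 256, (x + y) % 256) for x in range(30)]
         for y in range(20)]
features = grid_color_moments(image)
print(len(features))
```

Keeping the moments per grid cell, rather than over the whole image, preserves a coarse spatial layout of color, which is what the "grid layout" part of the feature name refers to.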

Text search vs. visual classification
- Keyword search: "car"
- Automatic classification: "car"

Example: good detectors for LSCOM concepts
- waterfront, bridge, crowd, explosion fire, US flag, military personnel

Power of Concept-based Representation
- Semantic index over concepts such as large building, people, outdoor, ...
- New applications: search, filtering, pattern mining

Mapping search topics to concepts (TRECVID search topics; Lyndon Kennedy, DVMM Lab, Columbia University)
- Topic: Find shots with one or more emergency vehicles in motion (e.g., ambulance, police car, fire truck, etc.)
  Matched concepts: Emergency_Room, Vehicle
- Topic: Find shots with a view of one or more tall buildings (more than 4 stories) and the top story visible.
  Matched concepts: Building
- Topic: Find shots with one or more people leaving or entering a vehicle.
  Matched concepts: Person, Vehicle
- Topic: Find shots with one or more soldiers, police, or guards escorting a prisoner.
  Matched concepts: Guard, Police_Security, Prisoner, Soldier
- Research issue: what concept to use? How to fuse multiple concepts?
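One simple way to picture the fusion question raised above is to average the detector scores of the matched concepts and rank shots by the result. The sketch below assumes a semantic index stored as a dictionary from shot ID to per-concept scores in [0, 1]. The shot IDs, the scores, and the function names are hypothetical, and weighted averaging is only one possible fusion rule, not a method the slides prescribe.

```python
def fuse_concept_scores(shot_scores, matched_concepts, weights=None):
    """Weighted average of one shot's detector scores over the matched concepts.

    shot_scores: dict mapping concept name to detector score.
    weights: optional dict mapping concept name to weight (default 1.0).
    Concepts without a score for this shot count as 0.0.
    """
    if not matched_concepts:
        raise ValueError("need at least one matched concept")
    weights = weights or {}
    weighted_sum = 0.0
    total_weight = 0.0
    for concept in matched_concepts:
        w = weights.get(concept, 1.0)
        weighted_sum += w * shot_scores.get(concept, 0.0)
        total_weight += w
    if total_weight == 0.0:
        raise ValueError("weights sum to zero")
    return weighted_sum / total_weight


def rank_shots(index, matched_concepts, weights=None):
    """Return shot IDs sorted from highest to lowest fused score."""
    return sorted(
        index,
        key=lambda shot: fuse_concept_scores(index[shot], matched_concepts, weights),
        reverse=True,
    )


# Hypothetical semantic index: detector scores per shot.
index = {
    "shot_001": {"Person": 0.9, "Vehicle": 0.8, "Building": 0.1},
    "shot_002": {"Person": 0.7, "Vehicle": 0.1, "Building": 0.6},
    "shot_003": {"Person": 0.2, "Vehicle": 0.9, "Building": 0.3},
}

# Topic "people leaving or entering a vehicle" matched to Person and Vehicle.
ranked = rank_shots(index, ["Person", "Vehicle"])
print(ranked)
```

Passing a `weights` dictionary lets a more reliable detector count for more, which is one way to approach the "what concept to use" half of the research issue.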

Concept Search Demo
- Interactive demos available at http://apollo.ee.columbia.edu/vace/newSearch/
  - Concept search case 1 (link)
  - Concept search case 2 (link)
  - Multimodal search (link)
- Demos prepared by Eric Zavesky

CuVid: Columbia Video Search System
http://www.ee.columbia.edu/cuvidsearch
- Multi-modal search tool suite
- Automatic query expansions
- Beyond keywords: search by example image
- Automatically detected story segments
- Customizable XML output
- Search result folder
- Prototype includes 160 hours, 3 languages (English, Arabic, Chinese), 6 channels
