High-Dimensional Signature Compression for Large-Scale Image Classification

Jorge Sánchez and Florent Perronnin
Textual and Visual Pattern Analysis (TVPA) group
Xerox Research Centre Europe (XRCE)

Abstract

We address image classification on a large scale, i.e. when a large number of images and classes are involved. First, we study classification accuracy as a function of the image signature dimensionality and the training set size. We show experimentally that the larger the training set, the higher the impact of the dimensionality on the accuracy. In other words, high-dimensional signatures are important to obtain state-of-the-art results on large datasets. Second, we tackle the problem of data compression on very large signatures (on the order of 10^5 dimensions) using two lossy compression strategies: a dimensionality reduction technique known as the hash kernel and an encoding technique based on product quantizers. We explain how the gain in storage can be traded against a loss in accuracy and/or an increase in CPU cost. We report results on two large databases, ImageNet and a dataset of 1M Flickr images, showing that we can reduce the storage of our signatures by a factor of 64 to 128 with little loss in accuracy. Integrating the decompression in the classifier learning yields an efficient and scalable training algorithm. On ILSVRC2010 we report a 74.3% accuracy at top-5, which corresponds to a 2.5% absolute improvement with respect to the state-of-the-art. On a subset of 10K classes of ImageNet we report a top-1 accuracy of 16.7%, a relative improvement of 160% with respect to the state-of-the-art.

1. Introduction

Scaling up image classification systems is a problem which is receiving increasing attention as larger labeled image datasets are becoming available. For instance, ImageNet (www.image-net.org) consists of more than 12M images of 17K concepts [7] and Flickr contains thousands of groups (www.flickr.com/groups), some of them with hundreds of thousands of pictures, which can be readily used to learn object classifiers [31, 22].

The focus in the image classification community was initially on developing systems which would yield the best possible accuracy fairly independently of their cost. The winners of the PASCAL VOC 2007 [8] and 2008 [9] competitions used a similar paradigm: many types of low-level local features are extracted (referred to as "channels"), one bag-of-visual-words (BOV) histogram is computed for each channel and non-linear kernel classifiers such as SVMs are used to perform classification [38, 29]. The use of many channels and costly non-linear SVMs was made possible by the modest size of the available databases.

Only in recent years has the computational cost become a central issue in image classification / object detection. In [19], Maji et al. showed that the runtime cost of an intersection kernel (IK) SVM could be made independent of the number of support vectors. Maji and Berg [18] and Wang et al. [31] then proposed efficient algorithms to learn IKSVMs. Vedaldi and Zisserman [30] and Perronnin et al. [21] subsequently generalized this principle to any additive classifier. Another line of research consists in computing image representations which are directly amenable to costless linear classification. Yang et al. [36], Wang et al. [32] and Boureau et al. [4] showed that replacing the average pooling stage in the BOV computation by a max-pooling yielded excellent results. To go beyond the BOV, i.e. beyond counting, it has been proposed to include higher order statistics in the image signature. This includes modeling an image by a probability distribution [17, 35] or using the Fisher kernel framework [20]. In particular, it was shown that the Fisher Vector (FV) could yield high accuracy with linear classifiers [22].

If one wants to stick to efficient linear classifiers, the image representations should be high-dimensional to ensure linear separability of the classes. Therefore, we argue that the storage/memory cost is becoming a central issue in large-scale image classification. As an example, in this paper we consider almost dense image representations, based on the improved FV of [22], with up to 524K dimensions. Using a 4 byte floating point representation, a single signature requires 2MB of storage. Storing the ILSVRC2010 dataset [2] would take approximately 2.8TB and storing the full ImageNet dataset around 23TB. Obviously, these numbers have to be multiplied by the number of channels, i.e. feature types. As another example, the
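
The storage figures quoted in the introduction can be sanity-checked with a short calculation. The sketch below is only a back-of-the-envelope estimate, not the authors' code. It assumes that "524K" means 2^19 = 524,288 dimensions and that ILSVRC2010 contains roughly 1.4M images (a figure not stated in the text); the 12M image count for ImageNet and the compression factor of 64 are taken from the text. The helper names `signature_bytes` and `dataset_bytes` are hypothetical. The results land in the same range as the quoted 2MB, 2.8TB and 23TB; the exact values depend on the image counts and on whether decimal or binary units are used.

```python
# Back-of-the-envelope storage estimate for uncompressed image signatures.
BYTES_PER_FLOAT = 4             # 4-byte floating point, as stated in the text
DIMS = 2 ** 19                  # assumption: "524K" read as 524,288 dimensions

MIB = 2 ** 20
TIB = 2 ** 40

ILSVRC2010_IMAGES = 1_400_000   # assumption: image count is not given in the text
IMAGENET_IMAGES = 12_000_000    # "more than 12M images" in the text


def signature_bytes(dims=DIMS, bytes_per_value=BYTES_PER_FLOAT):
    """Size in bytes of one dense signature."""
    return dims * bytes_per_value


def dataset_bytes(num_images, channels=1, compression=1):
    """Size in bytes of all signatures of a dataset.

    channels: number of feature types (one signature per channel).
    compression: storage reduction factor (1 means uncompressed).
    """
    return num_images * channels * signature_bytes() // compression


per_signature_mib = signature_bytes() / MIB
ilsvrc_tib = dataset_bytes(ILSVRC2010_IMAGES) / TIB
imagenet_tib = dataset_bytes(IMAGENET_IMAGES) / TIB
imagenet_tib_64 = dataset_bytes(IMAGENET_IMAGES, compression=64) / TIB

if __name__ == "__main__":
    print(f"one signature:        {per_signature_mib:.2f} MiB")
    print(f"ILSVRC2010:           {ilsvrc_tib:.2f} TiB")
    print(f"ImageNet:             {imagenet_tib:.2f} TiB")
    print(f"ImageNet, factor 64:  {imagenet_tib_64:.3f} TiB")
```

Passing `channels=3`, for instance, triples every figure, which is the multiplication by the number of feature types that the text points out.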
