Introduction to Pattern Recognition
Selim Aksoy
Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr
CS 551, Fall 2015
CS 551, Fall 2015 c 2015, Selim Aksoy (Bilkent University) 1 / 40
Human Perception
◮ Humans have developed highly sophisticated skills for
  ◮ recognizing a face,
  ◮ understanding spoken words,
  ◮ reading handwriting,
  ◮ distinguishing fresh food from its smell.
◮ We would like to give similar capabilities to machines.
◮ A pattern is an entity, vaguely defined, that could be given a name, e.g.,
  ◮ fingerprint image,
  ◮ handwritten word,
  ◮ human face,
  ◮ speech signal,
  ◮ DNA sequence,
  ◮ ...
◮ Pattern recognition is the study of how machines can
  ◮ observe the environment,
  ◮ learn to distinguish patterns of interest,
  ◮ make sound and reasonable decisions about the categories of the patterns.
◮ We are often influenced by the knowledge of how patterns
◮ Research on machine perception also helps us gain deeper
◮ Yet, we also apply many techniques that are purely
Problem Domain | Application | Input Pattern | Pattern Classes
Document image analysis | Optical character recognition | Document image | Characters, words
Document classification | Internet search | Text document | Semantic categories
Document classification | Junk mail filtering | Email | Junk/non-junk
Multimedia database retrieval | Internet search | Video clip | Video genres
Speech recognition | Telephone directory assistance | Speech waveform | Spoken words
Natural language processing | Information extraction | Sentences | Parts of speech
Biometric recognition | Personal identification | Face, iris, fingerprint | Authorized users for access control
Medical | Computer aided diagnosis | Microscopic image | Cancerous/healthy cell
Military | Automatic target recognition | Optical or infrared image | Target type
Industrial automation | Printed circuit board inspection | Intensity or range image | Defective/non-defective product
Industrial automation | Fruit sorting | Images taken on a conveyor belt | Grade of quality
Remote sensing | Forecasting crop yield | Multispectral image | Land use categories
Bioinformatics | Sequence analysis | DNA sequence | Known types of genes
Data mining | Searching for meaningful patterns | Points in multidimensional space | Compact and well-separated clusters
◮ Problem: Sorting incoming fish on a conveyor belt.
◮ Assume that we have only
  ◮ sea bass,
  ◮ salmon.
◮ What kind of information can distinguish one species from
  ◮ length, width, weight, number and shape of fins, tail shape,
◮ What can cause problems during sensing?
  ◮ lighting conditions, position of fish on the conveyor belt,
◮ What are the steps in the process?
  ◮ capture image → isolate fish → take measurements → make
◮ Assume a fisherman told us that a sea bass is generally
◮ We can use length as a feature and decide between sea
◮ How can we choose this threshold?
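One simple answer is to try candidate thresholds on labeled training samples and keep the one that makes the fewest mistakes. The sketch below assumes the rule "decide sea bass if length > threshold"; the length values and the function names are hypothetical and were made up for illustration, they are not measurements from the lecture.

```python
# Hypothetical training lengths (arbitrary units); made up for illustration.
salmon_lengths = [4.0, 5.5, 6.0, 7.0, 8.5, 9.0]
sea_bass_lengths = [6.5, 8.0, 9.5, 10.0, 11.5, 12.0]


def training_errors(threshold, salmon, sea_bass):
    """Count mistakes of the rule: decide sea bass if length > threshold."""
    salmon_wrong = sum(1 for x in salmon if x > threshold)
    sea_bass_wrong = sum(1 for x in sea_bass if x <= threshold)
    return salmon_wrong + sea_bass_wrong


def best_threshold(salmon, sea_bass):
    """Try midpoints between consecutive sorted values; keep the fewest errors."""
    values = sorted(set(salmon + sea_bass))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return min(candidates, key=lambda t: training_errors(t, salmon, sea_bass))


threshold = best_threshold(salmon_lengths, sea_bass_lengths)
errors = training_errors(threshold, salmon_lengths, sea_bass_lengths)
print(threshold, errors)
```

With these made-up numbers the two classes overlap, so even the best threshold leaves some training errors. That is the usual situation when a single feature is not discriminative enough.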
◮ Even though sea bass is longer than salmon on the
◮ Try another feature: average lightness of the fish scales.
◮ We should also consider costs of different errors we make
◮ For example, if the fish packing company knows that:
  ◮ Customers who buy salmon will object vigorously if they see
  ◮ Customers who buy sea bass will not be unhappy if they
◮ How does this knowledge affect our decision?
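Unequal costs move the decision threshold away from the value that minimizes the plain error count. The sketch below assumes the rule "decide sea bass if lightness > threshold" and, hypothetically, that calling a sea bass a salmon costs five times as much as the reverse. The lightness values, the cost values and the function names are all made up for illustration.

```python
# Hypothetical lightness readings; made up for illustration.
salmon = [1.0, 2.0, 3.0, 4.0, 5.0]
sea_bass = [2.5, 6.0, 7.0, 8.0, 9.0]


def total_cost(threshold, cost_bass_as_salmon, cost_salmon_as_bass):
    """Total cost of the rule: decide sea bass if lightness > threshold."""
    bass_as_salmon = sum(1 for x in sea_bass if x <= threshold)
    salmon_as_bass = sum(1 for x in salmon if x > threshold)
    return (cost_bass_as_salmon * bass_as_salmon
            + cost_salmon_as_bass * salmon_as_bass)


def cheapest_threshold(cost_bass_as_salmon, cost_salmon_as_bass):
    """Try midpoints between consecutive sorted values; keep the lowest cost."""
    values = sorted(set(salmon + sea_bass))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return min(
        candidates,
        key=lambda t: total_cost(t, cost_bass_as_salmon, cost_salmon_as_bass),
    )


equal_costs = cheapest_threshold(1.0, 1.0)    # both errors cost the same
unequal_costs = cheapest_threshold(5.0, 1.0)  # assumed 5:1 cost ratio
print(equal_costs, unequal_costs)
```

Under the assumed 5:1 ratio the chosen threshold drops, so fewer sea bass are labeled as salmon at the price of more salmon labeled as sea bass.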
◮ Assume we also observed that sea bass are typically wider
◮ We can use two features in our decision:
  ◮ lightness: x1
  ◮ width: x2
◮ Each fish image is now represented as a point (feature vector) (x1, x2) in a two-dimensional feature space.
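With two features, a decision rule partitions the (x1, x2) plane. As a minimal sketch, the code below uses a nearest-class-mean rule, which yields a straight-line boundary. This is one simple choice for illustration and not necessarily the rule used in the lecture; the measurements and function names are hypothetical.

```python
# Hypothetical (lightness x1, width x2) measurements; made up for illustration.
salmon = [(2.0, 10.0), (3.0, 11.0), (2.5, 12.0), (4.0, 10.5)]
sea_bass = [(6.0, 15.0), (7.0, 16.0), (5.5, 14.0), (6.5, 17.0)]


def mean_vector(points):
    """Component-wise mean of a list of 2-D feature vectors."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def squared_distance(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def classify(x, salmon_mean, sea_bass_mean):
    """Assign x to the class whose mean is closer (a linear boundary)."""
    if squared_distance(x, salmon_mean) <= squared_distance(x, sea_bass_mean):
        return "salmon"
    return "sea bass"


salmon_mean = mean_vector(salmon)
sea_bass_mean = mean_vector(sea_bass)
print(classify((3.0, 11.0), salmon_mean, sea_bass_mean))
print(classify((6.0, 16.0), salmon_mean, sea_bass_mean))
```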
◮ Does adding more features always improve the results?
  ◮ Avoid unreliable features.
  ◮ Be careful about correlations with existing features.
  ◮ Be careful about measurement costs.
  ◮ Be careful about noise in the measurements.
◮ Is there some curse for working in very high dimensions?
◮ Can we do better with another decision rule?
◮ More complex models result in more complex boundaries.
◮ How can we manage the tradeoff between complexity of
[Figure panels: (a) 0th order polynomial, (b) 1st order polynomial, (c) 3rd order polynomial, (d) 9th order polynomial]
[Figure panels: (a) 15 sample points, (b) 100 sample points]
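The polynomial orders (0, 1, 3, 9) and sample sizes (15 and 100) named in the panel captions above can be explored numerically. The sketch below fits least-squares polynomials with NumPy and compares the error on the training samples with the error on fresh samples. The target function sin(2πx), the noise level, the random seed and the function names are assumptions made for illustration; the lecture's actual data are not available here.

```python
import numpy as np


def make_samples(n, rng, noise_std=0.2):
    """Noisy samples of sin(2*pi*x) on [0, 1]; the target is an assumption."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, noise_std, size=n)
    return x, y


def rms_error(coeffs, x, y):
    """Root-mean-square difference between the polynomial and the samples."""
    return float(np.sqrt(np.mean((np.polyval(coeffs, x) - y) ** 2)))


rng = np.random.default_rng(0)
x_test, y_test = make_samples(1000, rng)  # stands in for "future data"

results = {}
for n_train in (15, 100):
    x_train, y_train = make_samples(n_train, rng)
    for order in (0, 1, 3, 9):
        coeffs = np.polyfit(x_train, y_train, deg=order)
        results[(n_train, order)] = (
            rms_error(coeffs, x_train, y_train),  # error on training samples
            rms_error(coeffs, x_test, y_test),    # error on fresh samples
        )

for key in sorted(results):
    train_rms, test_rms = results[key]
    print(key, round(train_rms, 3), round(test_rms, 3))
```

The training error cannot increase as the order grows, because each higher-order model contains the lower-order ones. The error on fresh samples is the quantity to watch when judging whether a complex model is justified by the amount of data.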
Classification path: Physical environment → Data acquisition/sensing → Pre-processing → Feature extraction → Features → Classification → Post-processing → Decision
Training path: Training data → Pre-processing → Feature extraction/selection → Features → Model learning/estimation → Model
◮ Data acquisition and sensing:
  ◮ Measurements of physical variables.
  ◮ Important issues: bandwidth, resolution, sensitivity,
◮ Pre-processing:
  ◮ Removal of noise in data.
  ◮ Isolation of patterns of interest from the background.
◮ Feature extraction:
  ◮ Finding a new representation in terms of features.
◮ Model learning and estimation:
  ◮ Learning a mapping between features and pattern groups
◮ Classification:
  ◮ Using features and learned models to assign a pattern to a
◮ Post-processing:
  ◮ Evaluation of confidence in decisions.
  ◮ Exploitation of context to improve performance.
  ◮ Combination of experts.
◮ Data collection:
  ◮ Collecting training and testing data.
  ◮ How can we know when we have adequately large and
◮ Feature selection:
  ◮ Domain dependence and prior information.
  ◮ Computational cost and feasibility.
  ◮ Discriminative features.
  ◮ Similar values for similar patterns.
  ◮ Different values for different patterns.
  ◮ Invariant features with respect to translation, rotation and
  ◮ Robust features with respect to occlusion, distortion,
◮ Model selection:
  ◮ Domain dependence and prior information.
  ◮ Definition of design criteria.
  ◮ Parametric vs. non-parametric models.
  ◮ Handling of missing features.
  ◮ Computational complexity.
  ◮ Types of models: templates, decision-theoretic or statistical,
  ◮ How can we know how close we are to the true model
◮ Training:
  ◮ How can we learn the rule from data?
  ◮ Supervised learning: a teacher provides a category label or
  ◮ Unsupervised learning: the system forms clusters or natural
  ◮ Reinforcement learning: no desired category is given but the
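The difference between the first two training settings can be sketched on a single feature: with labels, one parameter per category is estimated directly; without labels, the data must be grouped first. The code below uses class means for the supervised case and a two-cluster k-means for the unsupervised case. Both are simple stand-ins chosen for illustration, and the feature values, labels and function names are hypothetical.

```python
# Hypothetical 1-D feature values; made up for illustration.
values = [1.0, 1.5, 2.0, 2.5, 7.0, 7.5, 8.0, 8.5]
labels = ["salmon"] * 4 + ["sea bass"] * 4  # labels supplied by a "teacher"


def supervised_means(values, labels):
    """Supervised: use the given labels to estimate one mean per category."""
    means = {}
    for category in set(labels):
        members = [v for v, lab in zip(values, labels) if lab == category]
        means[category] = sum(members) / len(members)
    return means


def two_means(values, iterations=10):
    """Unsupervised: 2-means clustering on the values alone; no labels used."""
    centers = [min(values), max(values)]  # simple deterministic start
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Keep the old center if a group happens to be empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


class_means = supervised_means(values, labels)
cluster_centers = two_means(values)
print(class_means, cluster_centers)
```

Here the clusters coincide with the labeled categories only because the made-up values are well separated; in general the groupings found without labels need not match the categories of interest.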
◮ Evaluation:
  ◮ How can we estimate the performance with training
  ◮ How can we predict the performance with future data?
  ◮ Problems of overfitting and generalization.
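A common way to approximate performance on future data is to hold part of the labeled data out of training and measure the error there. The sketch below compares the error measured on the training data itself with a k-fold cross-validation estimate. The threshold classifier, the synthetic Gaussian data, the seed and the function names are assumptions made for illustration.

```python
import random


def fit_threshold(train):
    """Midpoint between the two class means (a deliberately simple model)."""
    class0 = [x for x, y in train if y == 0]
    class1 = [x for x, y in train if y == 1]
    return (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2


def error_rate(threshold, data):
    """Fraction of samples misclassified by: decide class 1 if x > threshold."""
    wrong = sum(1 for x, y in data if (x > threshold) != (y == 1))
    return wrong / len(data)


def cross_validation_error(data, k=5):
    """Average held-out error over k folds."""
    folds = [data[i::k] for i in range(k)]
    errors = []
    for i in range(k):
        held_out = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        errors.append(error_rate(fit_threshold(train), held_out))
    return sum(errors) / k


rng = random.Random(0)
# Hypothetical data: class 0 centered at 5.0, class 1 centered at 8.0.
data = [(rng.gauss(5.0, 1.5), 0) for _ in range(50)]
data += [(rng.gauss(8.0, 1.5), 1) for _ in range(50)]
rng.shuffle(data)

training_error = error_rate(fit_threshold(data), data)
cv_error = cross_validation_error(data, k=5)
print(training_error, cv_error)
```

For a model this simple the two numbers tend to be close. The gap typically widens for more flexible models, which is the overfitting problem named above.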
◮ Pattern recognition techniques find applications in many
◮ There are many sub-problems in the design process.
◮ Many of these problems can indeed be solved.
◮ More complex learning, searching and optimization
◮ There remain many fascinating unsolved problems.