

SLIDE 1

Intro to Pattern Recognition

CSCI 8260 – Spring 2016: Computer Network Attacks and Defenses

SLIDE 2

What’s Pattern Recognition?

  • Target problem: build a system that automatically recognizes and categorizes objects into classes
    – Measure a number of features that characterize the object (e.g., color, size, shape, weight, etc.)
    – Input these measurements into the system
    – Obtain the correct class label

[Diagram: a Pattern Recognition System takes the measured features as input and outputs a class label, e.g., “apple” or “pear”]
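As a concrete sketch of this pipeline (the feature values, feature names, and choice of scikit-learn classifier below are illustrative assumptions, not part of the slides):

# Minimal sketch of the apples-vs-pears pipeline described above.
from sklearn.neighbors import KNeighborsClassifier

# Each object is measured as [weight_grams, redness_0_to_1, diameter_cm]
X_train = [
    [150, 0.9, 7.5],   # apple
    [170, 0.8, 8.0],   # apple
    [180, 0.2, 6.5],   # pear
    [200, 0.1, 7.0],   # pear
]
y_train = ["apple", "apple", "pear", "pear"]

# The "pattern recognition system": learn from labeled examples, then map
# a new measurement vector to a class label.
clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

new_fruit = [[165, 0.85, 7.8]]   # measurements of an unseen object
print(clf.predict(new_fruit))    # -> ['apple']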

SLIDE 3

Types of Features

  • Quantitative
    – Weight (continuous)
    – Darkness level, 1-255 (discrete)
  • Qualitative
    – Color {red, yellow, green} (nominal)
    – Sweetness level {sour, sweet, very-sweet, extremely-sweet} (ordinal)

Example features for apples
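A small sketch of how these feature types might be encoded before learning; the specific encodings (one-hot for nominal, integer ranks for ordinal) are common choices, not something prescribed by the slides, and the values are made up.

# Hypothetical encoding of the apple features listed above.
apple = {
    "weight": 152.3,           # continuous
    "darkness": 87,            # discrete, 1-255
    "color": "red",            # nominal
    "sweetness": "very-sweet"  # ordinal
}

# Nominal: one-hot encode (there is no order among red/yellow/green).
colors = ["red", "yellow", "green"]
color_onehot = [1 if apple["color"] == c else 0 for c in colors]

# Ordinal: map to integer ranks (order matters, spacing is arbitrary).
sweetness_rank = ["sour", "sweet", "very-sweet", "extremely-sweet"].index(apple["sweetness"])

feature_vector = [apple["weight"], apple["darkness"], *color_onehot, sweetness_rank]
print(feature_vector)  # -> [152.3, 87, 1, 0, 0, 2]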

SLIDE 4

Real-World Application Example

SLIDE 5

Bayesian Decision Theory

  • Notation
    – X = [length, lightness]
    – W1 = salmon
    – W2 = sea bass
  • Assume we know the following
    – P(X|W1) = P(length, lightness | salmon)
    – P(X|W2) = P(length, lightness | sea bass)
  • Bayes Rule
    – P(W1|X) = P(X|W1)P(W1)/P(X)
    – P(W2|X) = P(X|W2)P(W2)/P(X)
  • Optimum decision rule for a new input X’
    – Decide W1 if P(W1|X’) > P(W2|X’), otherwise decide W2
    – Equivalently (likelihood-ratio form): if P(X’|W1)/P(X’|W2) > P(W2)/P(W1), decide W1, else decide W2

In reality, we do not know the true P(X|W1) and P(X|W2)!
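A numerical sketch of the decision rule above, assuming (purely for illustration) that the class-conditional densities are Gaussians with invented parameters and priors; the slides do not specify these distributions.

# Illustrative Bayes decision rule for the salmon vs. sea bass example.
from scipy.stats import multivariate_normal

# Assumed class-conditional densities P(X|Wi) over X = [length, lightness]
p_x_given_salmon  = multivariate_normal(mean=[30, 4], cov=[[16, 0], [0, 1]])
p_x_given_seabass = multivariate_normal(mean=[45, 7], cov=[[25, 0], [0, 2]])
p_salmon, p_seabass = 0.6, 0.4   # assumed priors P(W1), P(W2)

def decide(x):
    # Likelihood-ratio test: decide salmon (W1) if
    # P(x|W1)/P(x|W2) > P(W2)/P(W1), else sea bass (W2).
    ratio = p_x_given_salmon.pdf(x) / p_x_given_seabass.pdf(x)
    return "salmon" if ratio > p_seabass / p_salmon else "sea bass"

print(decide([32, 4.5]))  # -> salmon
print(decide([48, 7.2]))  # -> sea bass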

SLIDE 6

Approximate Class-conditional Distributions

  • ~P(length|Wi)
    – E.g., estimated from examples of labeled fish provided by a fisherman
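One common way to approximate P(length|Wi) from labeled samples is a parametric fit. The sketch below fits a Gaussian to hypothetical salmon lengths; both the Gaussian assumption and the sample values are illustrative, not stated on the slide.

# Estimate ~P(length | salmon) from labeled examples (values are made up).
import numpy as np
from scipy.stats import norm

salmon_lengths = np.array([28.0, 31.5, 29.2, 33.0, 30.1, 27.8, 32.4])

# Parametric approximation: assume the class-conditional density is Gaussian.
mu, sigma = salmon_lengths.mean(), salmon_lengths.std(ddof=1)
p_length_given_salmon = norm(loc=mu, scale=sigma)

print(p_length_given_salmon.pdf(30.0))  # estimated density at length = 30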

SLIDE 7

Approximate Class-conditional Distributions

  • ~P(lightness|Wi)
    – Estimated from examples

SLIDE 8

Learning a Decision Surface

  • Linear classifier
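A sketch of learning a linear decision surface from labeled 2-D points; the synthetic data and the choice of logistic regression (one linear classifier among many) are assumptions for illustration.

# Learn a linear decision surface on 2-D points (synthetic data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
class0 = rng.normal(loc=[2, 2], scale=1.0, size=(50, 2))
class1 = rng.normal(loc=[6, 6], scale=1.0, size=(50, 2))
X = np.vstack([class0, class1])
y = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression().fit(X, y)
# The learned decision surface is the line w.x + b = 0
print(clf.coef_, clf.intercept_)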
SLIDE 9

“Flexible” Classifier

  • Very “flexible” classifier (e.g., Artificial Neural Nets)
SLIDE 10

Learning a Decision Surface

  • Quadratic Classifier
SLIDE 11

Decision Surface

  • Different classifiers learn different models (see the sketch below)
    – Different generalization ability
    – Different accuracy when testing on a separate dataset
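A hedged sketch of this comparison: train a linear, a quadratic, and a very flexible classifier on the same data and compare their accuracy on a held-out test set. The dataset and the specific model choices are assumptions for illustration.

# Compare how different classifiers generalize to a held-out test set.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=400, noise=0.25, random_state=0)  # synthetic 2-D data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "linear":    LogisticRegression(),
    "quadratic": QuadraticDiscriminantAnalysis(),
    "flexible":  MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, model.score(X_te, y_te))   # accuracy on never-before-seen data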

SLIDE 12

Supervised Learning in Practice

  • Assume you have a large dataset of labeled examples
    – Each entry represents an object (its features)
    – Each object is assigned a “ground-truth” label
      • e.g., labeled fruits, labeled fish
  • Split the dataset in two parts (as in the sketch below)
    – Use a training set to automatically learn an object model
    – Use a test set to evaluate how your model is going to perform on never-before-seen data
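A minimal sketch of that split, assuming scikit-learn and its built-in iris dataset purely as a stand-in for “a large dataset of labeled examples”:

# Split a labeled dataset into a training set and a test set.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)   # features and ground-truth labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

model = DecisionTreeClassifier().fit(X_tr, y_tr)    # learn the object model on training data
print("test accuracy:", model.score(X_te, y_te))    # estimate performance on unseen data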

SLIDE 13

Another example

SLIDE 14

Learned Decision Tree

SLIDE 15

Evaluation Metrics

  • Test results help estimate accuracy
SLIDE 16

False Positives vs. True Positives

  • Let N be the total number of test instances (or patterns) in the test dataset
  • Instances can belong to two possible classes
    – Positive (or target) class
    – Negative class
  • TP = Number of correctly classified positive samples
  • FP = Number of misclassified negative samples
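A small sketch of computing these counts (and the corresponding rates) from ground-truth and predicted labels; the label vectors are made up.

# Count true positives and false positives from test predictions (toy labels).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]   # 1 = positive (target) class, 0 = negative class
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

TP = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # correctly classified positives
FP = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # misclassified negatives

P = sum(y_true)            # number of positive instances
N_neg = len(y_true) - P    # number of negative instances
print("TP =", TP, "FP =", FP)
print("TP rate =", TP / P, "FP rate =", FP / N_neg)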
SLIDE 17

ROC and AUC

  • ROC = Receiver Operating Characteristic Curve
    – Plots the trade-off between FPs and TPs for varying detection thresholds
  • AUC = Area under the ROC curve (the larger the better)
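A sketch of computing an ROC curve and its AUC from classifier scores, using scikit-learn's roc_curve/auc helpers on made-up labels and scores:

# Compute the ROC curve (FP rate vs. TP rate over thresholds) and its AUC.
from sklearn.metrics import roc_curve, auc

y_true   = [1, 1, 1, 1, 0, 0, 0, 0]                    # ground-truth labels (toy data)
y_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]    # detector scores, higher = more "positive"

fpr, tpr, thresholds = roc_curve(y_true, y_scores)     # one (FPR, TPR) point per threshold
print("AUC =", auc(fpr, tpr))                          # area under the curve, 1.0 is perfect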

SLIDE 18

Unsupervised Learning

  • Learn from unlabeled examples
    – Seriously???
    – Yes!
  • Discover groups of similar objects in a multi-dimensional feature space
    – Provides new useful information
    – Discovers new “concepts” or previously unknown classes

SLIDE 19

Clustering

  • Different clustering algorithms find different data clusters
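A sketch illustrating this point on synthetic data: k-means and DBSCAN often group the same unlabeled points differently. The dataset and parameters are chosen purely for demonstration.

# Two clustering algorithms, same unlabeled data, different clusters.
from sklearn.datasets import make_moons
from sklearn.cluster import KMeans, DBSCAN

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)  # labels are ignored

kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
dbscan_labels = DBSCAN(eps=0.3).fit_predict(X)

# k-means tends to split the two moons with a straight boundary,
# while DBSCAN follows their curved shapes.
print("k-means clusters:", set(kmeans_labels))
print("DBSCAN clusters: ", set(dbscan_labels))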

SLIDE 20

Hierarchical Clustering
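A minimal sketch of hierarchical (agglomerative) clustering, assuming SciPy's linkage/fcluster utilities; the data points and the cut threshold are made up.

# Agglomerative hierarchical clustering: build a merge tree, then cut it.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[1.0, 1.0], [1.2, 0.9], [5.0, 5.1], [5.2, 4.9], [9.0, 0.5]])

Z = linkage(X, method="average")                    # iteratively merge the two closest clusters
labels = fcluster(Z, t=2.0, criterion="distance")   # cut the tree at distance 2.0
print(labels)                                       # cluster assignment for each point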

SLIDE 21

Pattern Recognition Process

SLIDE 22

Security Applications

  • Network/Host-based Intrusion Detection
  • Malware Detection
  • Detecting Search Poisoning
  • Detecting Malicious Domain Names
  • Etc.
SLIDE 23

Example Security Application

  • Given a PE file (e.g., an MS-Windows .exe file)

– Decide if the file is “packed” without running it [1]

[1] Roberto Perdisci, Andrea Lanzi, Wenke Lee. "Classification of Packed Executables for Accurate Computer Virus Detection." Pattern Recognition Letters, 29(14), 2008, pp. 1941-1946.
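As a hedged illustration of the kind of static feature such a system can measure without running the file, the sketch below computes a file's byte entropy, a value that tends to be high for packed or encrypted executables; this is not a reimplementation of the method in [1], and the file path is hypothetical.

# Illustrative only: byte entropy of a file as one static "packed?" feature.
import math
from collections import Counter

def byte_entropy(path):
    data = open(path, "rb").read()
    counts = Counter(data)
    n = len(data)
    # Shannon entropy in bits per byte; packed/encrypted data tends toward 8.0
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Example (hypothetical path): a high value suggests compression or packing.
# print(byte_entropy("sample.exe"))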