
Intro to Pattern Recognition CSCI 8260 Spring 2016 Computer Network - PowerPoint PPT Presentation



  1. Intro to Pattern Recognition CSCI 8260 – Spring 2016 Computer Network Attacks and Defenses

  2. What’s Pattern Recognition?
     • Target problem: build a system that automatically recognizes and categorizes objects into classes
       – Measure a number of features that characterize the object (e.g., color, size, shape, weight, etc.)
       – Input these measurements into the system
       – Obtain the correct class label
     [Figure: a Pattern Recognition System sorting objects into Apples and Pears]

  3. Types of Features
     Example features for apples
     • Quantitative
       – Weight (continuous)
       – Darkness level, 1-255 (discrete)
     • Qualitative
       – Color {red, yellow, green} (nominal)
       – Sweetness level {sour, sweet, very-sweet, extremely-sweet} (ordinal)
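As a rough illustration of how these four feature types might be turned into numbers for a classifier, here is a minimal Python sketch. The function name `encode_apple`, the one-hot and rank encodings, and the sample measurements are assumptions made for this example; they do not come from the slides.

```python
# Hypothetical encoding of one apple's measurements into a numeric vector.
COLORS = ["red", "yellow", "green"]  # nominal: categories have no order
SWEETNESS = ["sour", "sweet", "very-sweet", "extremely-sweet"]  # ordinal: ordered


def encode_apple(weight_g, darkness, color, sweetness):
    """Return [weight, darkness, one-hot color (3 values), sweetness rank]."""
    # Nominal feature: one-hot encoding avoids implying an order among colors.
    one_hot = [1.0 if color == c else 0.0 for c in COLORS]
    # Ordinal feature: the position in the ordered list preserves the ranking.
    rank = float(SWEETNESS.index(sweetness))
    # Quantitative features (continuous weight, discrete darkness) are used as-is.
    return [float(weight_g), float(darkness)] + one_hot + [rank]


# Made-up measurements for a single apple.
vector = encode_apple(182.5, 120, "yellow", "very-sweet")
```

One-hot encoding is only one option for nominal features; it is chosen here because it does not impose an artificial order on the colors.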

  4. Real-World Application Example

  5. Bayesian Decision Theory
     • Notation
       – X = [length, lightness]
       – W1 = salmon
       – W2 = sea bass
     • Assume we know the following
       – P(X|W1) = P(length, lightness | salmon)
       – P(X|W2) = P(length, lightness | sea bass)
     • Bayes Rule
       – P(W1|X) = P(X|W1)P(W1)/P(X)
       – P(W2|X) = P(X|W2)P(W2)/P(X)
     • Optimum decision rule for a new input X’
       – Decide W1 if P(W1|X’) > P(W2|X’), otherwise decide W2
       – In other words: if P(X’|W1)/P(X’|W2) > P(W2)/P(W1), then W1, else W2
     • In reality, we do not know the true P(X|W1) and P(X|W2)!
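The decision rule above can be sketched in a few lines of Python. This is only an illustration under strong assumptions: a single feature (length), Gaussian class-conditional densities, and made-up priors and parameters. None of these numbers come from the slides.

```python
import math

# Assumed priors P(W1), P(W2) and Gaussian parameters (mean, std) for length.
PRIORS = {"salmon": 0.6, "sea bass": 0.4}
PARAMS = {"salmon": (60.0, 8.0), "sea bass": (80.0, 10.0)}


def likelihood(x, label):
    """Assumed class-conditional density P(x | label): a 1-D Gaussian."""
    mean, std = PARAMS[label]
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior(x, label):
    """Bayes rule: P(label | x) = P(x | label) P(label) / P(x)."""
    evidence = sum(likelihood(x, w) * PRIORS[w] for w in PRIORS)
    return likelihood(x, label) * PRIORS[label] / evidence


def decide(x):
    """Decide W1 (salmon) if P(x|W1) P(W1) > P(x|W2) P(W2), else W2.

    This is the likelihood-ratio test from the slide, rearranged as a
    product comparison so that no division by a tiny density is needed.
    """
    w1 = likelihood(x, "salmon") * PRIORS["salmon"]
    w2 = likelihood(x, "sea bass") * PRIORS["sea bass"]
    return "salmon" if w1 > w2 else "sea bass"


short_fish = decide(55.0)
long_fish = decide(95.0)
```

Because P(X) is the same for both classes, comparing the posteriors and comparing the products P(X|W)P(W) give the same decision.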

  6. Approximate Class-conditional distributions • ~P(length|W_i) – E.g., estimated from examples of labeled fish provided by a fisherman

  7. Approximate Class-conditional distributions • ~P(lightness|W_i) – Estimated from examples
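One simple way to approximate a class-conditional distribution from labeled examples is a normalized histogram. The sketch below assumes a fixed bin width and uses made-up lightness measurements for salmon. The slides do not say which estimator was actually used, so treat this as one possible choice.

```python
from collections import Counter


def histogram_estimate(values, bin_width):
    """Estimate the probability of each bin as the fraction of examples in it."""
    counts = Counter(int(v // bin_width) for v in values)
    total = len(values)
    return {bin_index: count / total for bin_index, count in counts.items()}


def estimated_probability(estimate, value, bin_width):
    """Look up the estimated probability of the bin that contains value."""
    return estimate.get(int(value // bin_width), 0.0)


# Hypothetical lightness measurements taken from fish labeled "salmon".
salmon_lightness = [2.1, 2.7, 3.3, 3.8, 4.2, 2.9, 3.1, 3.6]
estimate = histogram_estimate(salmon_lightness, bin_width=1.0)
```

Bins that contain no training example get probability 0.0, which is one reason smoother estimators are often preferred when examples are scarce.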

  8. Learning a Decision Surface • Linear classifier
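A linear decision surface can be learned in many ways; the slide does not name a specific algorithm. As one possibility, here is a minimal perceptron sketch. The training points ([length, lightness] pairs), the labels, and the function names are all invented for illustration.

```python
def train_perceptron(samples, labels, epochs=1000, learning_rate=1.0):
    """Learn weights w and bias b so that sign(w . x + b) matches the labels.

    Labels must be +1 or -1. Training stops early once an entire pass
    over the data produces no misclassification.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:  # wrong side of (or on) the surface
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on the side of the line x falls on."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


# Made-up, linearly separable [length, lightness] points.
samples = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]]
labels = [-1, -1, -1, 1, 1, 1]
weights, bias = train_perceptron(samples, labels)
```

The perceptron only converges when the two classes are linearly separable, as they are in this toy data.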

  9. “Flexible” Classifier • Very “flexible” classifier (e.g., Artificial Neural Nets)

  10. Learning a Decision Surface • Quadratic Classifier

  11. Decision Surface
     • Different classifiers learn different models
       – Different generalization ability
       – Different accuracy when testing on a separate dataset

  12. Supervised Learning in practice
     • Assume you have a large dataset of labeled examples
       – Each entry represents an object (its features)
       – Each object is assigned a “ground-truth” label
         • e.g., labeled fruits, labeled
     • Split the dataset in two parts
       – Use a training set to automatically learn an object model
       – Use a test set to evaluate how your model is going to perform on never-before-seen data
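The split-train-evaluate workflow above might look like the following Python sketch. The dataset, the 70/30 split, and the deliberately simple "model" (a weight threshold halfway between the two class means) are assumptions chosen to keep the example short.

```python
import random

# Hypothetical labeled dataset: (weight in grams, ground-truth label).
DATASET = [
    (182.0, "apple"), (190.0, "apple"), (185.0, "apple"),
    (198.0, "apple"), (181.0, "apple"),
    (105.0, "pear"), (118.0, "pear"), (110.0, "pear"),
    (101.0, "pear"), (115.0, "pear"),
]


def split_dataset(examples, test_fraction=0.3, seed=0):
    """Shuffle a copy of the examples and split it into (training, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def learn_threshold(training_set):
    """Learn a toy model: the midpoint between the two class mean weights."""
    apples = [w for w, label in training_set if label == "apple"]
    pears = [w for w, label in training_set if label == "pear"]
    return (sum(apples) / len(apples) + sum(pears) / len(pears)) / 2.0


def accuracy(threshold, test_set):
    """Fraction of never-before-seen examples the model labels correctly."""
    correct = sum(
        1 for w, label in test_set
        if ("apple" if w > threshold else "pear") == label
    )
    return correct / len(test_set)


training_set, test_set = split_dataset(DATASET)
threshold = learn_threshold(training_set)
test_accuracy = accuracy(threshold, test_set)
```

The key point is that `learn_threshold` never sees `test_set`, so `test_accuracy` estimates performance on unseen data.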

  13. Another example

  14. Learned Decision Tree

  15. Evaluation Metrics • Test results help estimate accuracy

  16. False Positives vs. True Positives
     • Let N be the total number of test instances (or patterns) in the test dataset
     • Instances can belong to two possible classes
       – Positive (or target) class
       – Negative class
     • TP = number of correctly classified positive samples
     • FP = number of misclassified negative samples
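The counts defined above can be computed directly from the ground-truth labels and the classifier's predictions. In this sketch, labels are encoded as 1 (positive) and 0 (negative), and the example vectors are made up. The true-positive and false-positive rates are the standard normalized versions of TP and FP and are included for convenience.

```python
def confusion_counts(true_labels, predicted_labels):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == 1 and predicted == 1:
            counts["TP"] += 1  # positive sample, correctly classified
        elif truth == 0 and predicted == 1:
            counts["FP"] += 1  # negative sample, misclassified as positive
        elif truth == 0 and predicted == 0:
            counts["TN"] += 1  # negative sample, correctly classified
        else:
            counts["FN"] += 1  # positive sample, misclassified as negative
    return counts


def rates(counts):
    """Return (true positive rate, false positive rate)."""
    tpr = counts["TP"] / (counts["TP"] + counts["FN"])
    fpr = counts["FP"] / (counts["FP"] + counts["TN"])
    return tpr, fpr


# Made-up test results for N = 8 instances.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(truth, predicted)
tpr, fpr = rates(counts)
```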

  17. ROC and AUC
     • ROC = Receiver Operating Characteristic Curve
       – Plots trade-off between FPs and TPs for varying detection thresholds
     • AUC = Area under the ROC (the larger the better)
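To make the threshold sweep concrete, the following sketch builds ROC points from classifier scores and integrates them with the trapezoidal rule. The scores and labels are invented, and the sketch assumes a higher score means "more likely positive" and that both classes appear in the test set.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, one per detection threshold, from (0, 0) up.

    labels use 1 for the positive class and 0 for the negative class.
    A sample is flagged as positive when its score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up detector scores and ground-truth labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
area = auc(curve)
```

An AUC of 1.0 would mean every positive sample scores higher than every negative one; 0.5 corresponds to random guessing.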

  18. Unsupervised Learning
     • Learn from unlabeled examples
       – Seriously???
       – Yes!
     • Discover groups of similar objects in a multi-dimensional feature space
       – Provides new useful information
       – Discovers new “concepts” or previously unknown classes

  19. Clustering • Different clustering algorithms find different data clusters
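The slide does not name a particular clustering algorithm. As one widely used example, here is a small k-means sketch. The points, the initial centroids, and the iteration limit are assumptions, and a different algorithm or a different initialization could group the same data differently.

```python
def kmeans(points, centroids, iterations=20):
    """Group points around k centroids by alternating assignment and update.

    points and centroids are lists of equal-length tuples of floats.
    Returns the final centroids and the list of clusters (lists of points).
    """
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Made-up unlabeled 2-D points and hand-picked starting centroids.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, [(1.0, 1.0), (8.0, 8.0)])
```

No labels are used anywhere: the two groups emerge only from distances in feature space.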

  20. Hierarchical Clustering

  21. Pattern Recognition Process

  22. Security Applications
     • Network/Host-based Intrusion Detection
     • Malware Detection
     • Detecting Search Poisoning
     • Detecting Malicious Domain Names
     • Etc.

  23. Example Security Application
     • Given a PE file (e.g., an MS-Windows .exe file)
       – Decide if the file is “packed” without running it [1]
     [1] Roberto Perdisci, Andrea Lanzi, Wenke Lee. "Classification of Packed Executables for Accurate Computer Virus Detection." Pattern Recognition Letters, 29(14), 2008, pp. 1941-1946.
