
Computational Forensics: Machine Learning and Predictive Analytics

Fundamentals of Computational Forensics: Machine Learning and Predictive Analytics. Carl Stuart Leichter, PhD (carl.leichter@ntnu.no), NTNU Testimon Digital Forensics Group.


  1. (Internal) Model Complexity 47 MC-0

  2. 0th Order Polynomial Regression (estimated model) MC-2 48

  3. 1st Order Polynomial 49 MC-2

  4. 3rd Order Polynomial 50 MC-3

  5. 9th Order: What Happened?! 51 MC-6

  6. Model Complexity • Curse of Dimensionality (Too Much Complexity) • Overfitting 52 MC-7

  7. Training Performance Evaluation 53 MC-8

  8. The Machine Learning Process (pipeline diagram): Training Data → Preprocessing → Feature Extraction/Selection → Learning/Adaptation → Internal Model, with Output Evaluation; Testing Data → Preprocessing → Feature Extraction/Selection → Classification/Regression → Application T&T-1 54

  9. Training Data, Testing Data & Over-fitting 55 MC-9

  10. A Central Principle in ML • The model complexity drives the training data requirements! 56 MC-10

  11. More Data Can Fix the Overfitting Problem • N = 10 Data Points • N = 15 Data Points • N = 100 Data Points 57 MC-11
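The overfitting story on these slides can be reproduced in a few lines: a 9th-order polynomial fit to N = 10 noisy samples interpolates the noise, while the same model fit to N = 100 samples is forced to follow the underlying curve. A minimal sketch — the sine target, noise level, and seed are illustrative choices, not from the slides:

```python
import numpy as np

def fit_and_test(n_train, degree=9, seed=0):
    """Fit a degree-9 polynomial to noisy sine samples and return
    its RMS error against the noise-free target."""
    rng = np.random.default_rng(seed)
    x_train = np.sort(rng.uniform(0.0, 1.0, n_train))
    y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.2, n_train)
    coeffs = np.polyfit(x_train, y_train, degree)
    # Score against the true curve on a dense grid (away from the edges,
    # where high-order polynomials extrapolate wildly)
    x_test = np.linspace(0.05, 0.95, 200)
    resid = np.polyval(coeffs, x_test) - np.sin(2 * np.pi * x_test)
    return float(np.sqrt(np.mean(resid ** 2)))

# With N = 10 the 9th-order model interpolates the noise (overfitting);
# with N = 100 the same model complexity generalizes far better.
print(fit_and_test(10), fit_and_test(100))
```

This is the central principle of slide 10 in code form: the same model complexity that fails on 10 points behaves well once the training data requirement is met.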

  12. Curse of Dimensionality (Model Complexity) 58 MC-12

  13. • More complex problems, require more complex models • More complex models, require more complex feature spaces – Need higher dimensionality to get good class separation Wood classifier with 1D feature space? Grain Prominence Wood Brightness 59 MC-13

  14. Distance Metrics 60 DM-0

  15. The Distance Metric • How the similarity of two elements in a set is determined, e.g. – Euclidean Distance – Inner Product (Vector Spaces) – Manhattan Distance – Maximum Norm – Mahalanobis Distance – Hamming Distance – Or any metric you define over the space … 61 DM-1
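Several of the listed metrics can be sketched on toy vectors; the small data set used to estimate the Mahalanobis covariance is made up for illustration:

```python
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([4.0, 6.0])

euclidean = np.linalg.norm(x - y)        # sqrt(3^2 + 4^2) = 5.0
manhattan = np.sum(np.abs(x - y))        # |3| + |4| = 7.0
max_norm = np.max(np.abs(x - y))         # max(3, 4) = 4.0

# Mahalanobis distance rescales by a data set's covariance, so spread
# along a direction where the data varies strongly counts for less
data = np.array([[0, 0], [2, 1], [4, 2], [6, 3], [1, 2], [3, 0]], dtype=float)
cov_inv = np.linalg.inv(np.cov(data.T))
diff = x - y
mahalanobis = float(np.sqrt(diff @ cov_inv @ diff))

# Hamming distance: number of positions where two strings differ
hamming = sum(a != b for a, b in zip("murder", "merder"))  # 1
```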

  16. Manhattan Distance https://www.quora.com/What-is-the-difference-between-Manhattan-and-Euclidean-distance-measures 62 DM-2

  17. Far From Normal? (scatter plot: Center = Mean, Spread = Variance) 63 DM-3

  18. Mahalanobis Distance http://www.jennessent.com/arcview/mahalanobis_description.htm 64 DM-4

  19. Mahalanobis Distance http://stats.stackexchange.com/questions/62092/bottom-to-top-explanation-of-the-mahalanobis-distance 65 DM-5

  20. Unsupervised Learning 66 U-0

  21. Clustering • Partitional • Hierarchical 67 U-C-1

  22. Anomaly Detection with Unlabelled Data (scatter plot: Packet Size vs Packet Data Size) 68 U-C-1

  23. Recap of Wood Classification – 2 Optical Attributes or Features • Brightness • Grain prominence – Yielded a 2-Dimensional Feature Space – We had SUPERVISED learning: • We started with known pieces of wood • Gave each plotted training example its class LABEL – Because we chose our features well, we saw good clustering/separation of the different classes in the feature space. 69 U-C-2

  24. Unlabelled Data (scatter plot: Brightness vs Grain Prominence) 70 U-C-3

  25. Partitional Clustering U-C-3 71

  26. Hierarchical Clustering: Corpus Browsing (example topic tree from www.yahoo.com/Science: agriculture, biology, physics, CS, space, with subtopics such as agronomy, botany, cell, magnetism, AI, and missions) U-C-3

  27. Essentials of Clustering • Similarities – Natural Associations – Proximate* • Differences – Distant* *Implies a distance metric 73 U-C-3

  28. Essentials of Clustering • What is a “Good” Cluster? – Members are very “similar” to each other • Within-Cluster Divergence Metric σᵢ – Variance also works • Relative Cluster Sizes versus Data Spread 74 U-C-4

  29. Partitional Clustering Methods • K-Means Clustering • Gaussian Mixture Models • Canopy Clustering • Vector Quantization 75 U-C-5
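K-Means, the first partitional method listed, can be sketched as plain Lloyd's algorithm: alternately assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. The two-blob data below is an illustrative stand-in for two well-separated wood classes:

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm: alternate nearest-centroid assignment
    and centroid recomputation."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every centroid, then assign
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members
        for j in range(k):
            if np.any(labels == j):          # skip empty clusters
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids

# Two well-separated blobs, standing in for two wood classes
blob_a = np.random.default_rng(1).normal([1.0, 1.0], 0.1, (20, 2))
blob_b = np.random.default_rng(2).normal([5.0, 5.0], 0.1, (20, 2))
labels, centroids = kmeans(np.vstack([blob_a, blob_b]), k=2)
```

Note the connection to the distance-metric slides: the whole algorithm hinges on the Euclidean metric in the assignment step; swapping the metric changes which clusterings look "good".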

  30. Unsupervised Learning/Clustering Self Organizing Maps (SOM) 76 U-C-7

  31. SOMs Topology Preserving Projections http://www.cita.utoronto.ca/~murray/GLG130/Exercises/F2.gif 77 U-C-8

  32. http://www.cita.utoronto.ca/~murray/GLG130/Exercises/F2.gif 78 U-C-9

  33. Topology Preserving Projections http://www.cita.utoronto.ca/~murray/GLG130/Exercises/F2.gif 79 U-C-10

  34. Topology Preserving Projections • How will the distance metric handle polymorphous data? – Units of time (different units of time?) • Sprint performance data: years of age and seconds to finish – Units of space • (meters, lightyears) • Surface area • Volumetric – Units of mass (grams, kilograms, tonnes) – Units of $$$ • NOK • USD 80 U-C-11
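One common answer to the mixed-units question above is to standardize each feature (z-scores) before applying the distance metric, so that years and seconds contribute on comparable scales. A sketch with hypothetical sprint-performance values:

```python
import numpy as np

# Hypothetical sprint data: (age in years, 100 m time in seconds)
athletes = np.array([
    [18.0, 11.2],
    [24.0, 10.4],
    [31.0, 10.9],
    [38.0, 11.8],
])

# Raw Euclidean distance is dominated by age (spread ~20 years)
# and almost ignores finishing time (spread ~1.4 s)
raw = float(np.linalg.norm(athletes[0] - athletes[3]))

# Z-scoring each column puts both features on a comparable scale
z = (athletes - athletes.mean(axis=0)) / athletes.std(axis=0)
scaled = float(np.linalg.norm(z[0] - z[3]))
```

The same issue applies to meters vs lightyears or NOK vs USD: without rescaling (or a covariance-aware metric such as Mahalanobis), the feature with the largest raw units silently dominates the map.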

  35. Proximity By Colour and Location Poverty Map of the World (1997) http://www.cis.hut.fi/research/som-research/worldmap.html 81 U-C-12

  36. Map of Labels in Titles From comp.ai.neural-nets-news newsgroup www.cs.hmc.edu/courses/2003/ fall/cs152/slides/som.pdf 82 U-C-13

  37. Learning As Search 83 LAS-0

  38. • Exhaustive search – DFS – BFS • Gradient search – Can Get Stuck in Local Optimal Solution • Simulated annealing – Avoids Local Optima • Genetic algorithms 84 LAS-1
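Simulated annealing's escape from local optima can be sketched in a few lines: unlike pure gradient search, it sometimes accepts uphill moves with probability exp(−Δ/T) under a decreasing temperature T. The objective function, step size, and cooling schedule below are illustrative choices:

```python
import math
import random

def simulated_annealing(f, x0, steps=20000, t0=2.0, seed=0):
    """Minimize f(x) over the reals. Uphill moves are accepted with
    probability exp(-delta/T), letting the search escape local minima
    that would trap a pure gradient search."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    for i in range(steps):
        t = t0 * (1 - i / steps) + 1e-9       # linear cooling schedule
        cand = x + rng.gauss(0.0, 0.5)        # random neighbor
        fc = f(cand)
        # Always accept improvements; sometimes accept worse moves
        if fc < fx or rng.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f

# A bumpy objective: global minimum near x = -0.31, several local minima
bumpy = lambda x: x * x + 3 * math.sin(5 * x) + 3
x, fx = simulated_annealing(bumpy, x0=4.0)
```

Started from x = 4, a gradient search would settle into the nearest dip; the annealed search wanders through several basins while the temperature is high and only commits once it has cooled.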

  39. Exact vs Approximate Search • Exact: – Hashing techniques – String matching (“Murder”) • Approximate: – Approximate Hashing – Partial strings – Elastic Search • “murder” • “merder” 85 LAS-7
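Approximate string matching of the kind shown (“murder” vs the misspelling “merder”) is typically scored with an edit distance; a minimal Levenshtein implementation:

```python
def levenshtein(a, b):
    """Edit distance: minimum number of insertions, deletions, and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

print(levenshtein("murder", "murder"))  # 0 (exact match)
print(levenshtein("murder", "merder"))  # 1 (one substitution)
```

An approximate search engine keeps all candidates within a small edit-distance threshold instead of demanding distance zero, which is how the misspelled query still finds the intended term.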

  40. Artificial Neural Networks (ANN) 86 ANN-0

  41. Inspired by Natural Neural Nets 87 ANN-1

  42. Perceptron (1950s) 88 ANN-2

  43. Perceptron Can Learn Simple Boolean Logic Single Boundary, Linearly Separable 89 ANN-03

  44. Perceptron Cannot Learn XOR 90 ANN-4
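The XOR limitation is easy to demonstrate: the same error-correction rule that learns AND perfectly never reproduces XOR, because a single linear boundary cannot separate its classes. A sketch (the learning rate and epoch count are arbitrary choices):

```python
import itertools

def train_perceptron(samples, epochs=100, lr=0.1):
    """Single-layer perceptron with a step activation, trained with
    the classic error-correction rule."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

inputs = list(itertools.product([0, 1], repeat=2))

# AND is linearly separable, so the perceptron learns it
and_net = train_perceptron([(x, int(all(x))) for x in inputs])
print([and_net(*p) for p in inputs])   # [0, 0, 0, 1]

# XOR is not: no single line separates {(0,1),(1,0)} from {(0,0),(1,1)}
xor_net = train_perceptron([(x, x[0] ^ x[1]) for x in inputs])
print([xor_net(*p) for p in inputs])   # never equals [0, 1, 1, 0]
```

Adding a hidden layer (the MLP of the next slide) removes the restriction to a single linear boundary, which is exactly why XOR becomes learnable there.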

  45. Multi-Layer Perceptron Error Back-Propagation Network MLP-BP 91 ANN-5

  46. MLP-BP Internal Model Building Block: 5 MLP-BP Neurons 92 ANN-7

  47. MLP-BP “Universal Voxel” 93 ANN-8

  48. NeuroFuzzy Methods 94 NF-0

  49. Neuro-Fuzzy Overview • Neuro-Fuzzy (NF) is a hybrid intelligence / soft computing approach – (*Soft?) • A combination of Artificial Neural Networks (ANNs) and Fuzzy Logic (FL) • The opposite of fuzzy logic is – Crisp – Sharp • ANNs are black-box statistical models, built to simulate the activity of biological neurons • FL extracts human-explainable linguistic fuzzy rules • Applications in Decision Support Systems and Expert Systems 95 NF-1

  50. Fuzzy Basics • FL uses linguistic variables that can contain several linguistic terms • Temperature (linguistic variable) – Hot (linguistic terms) – Warm – Cold • Consistency (linguistic variable) – Watery (linguistic terms) – Gooey – Soft – Firm – Hard – Crunchy – Crispy 96 NF-2

  51. Triangular Fuzzy Membership Functions http://sci2s.ugr.es/keel/links.php 97 NF-3

  52. Fuzzy Inference ● Sharp antecedent: “If the tomato is red, then it is sweet” ● Fuzzy antecedent: “If the piece of wood is more or less dark (μ dark = 0.7)” ● Fuzzy consequent(s): ● “The piece of wood is more or less pine (μ pine = 0.64)” ● “The piece of wood is more or less birch (μ birch = 0.36)” http://ispac.diet.uniroma1.it/scarpiniti/files/NNs/Less9.pdf 98 NF-4
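Membership degrees like the μ dark = 0.7 in the example come from membership functions such as the triangular ones shown two slides back. A sketch with hypothetical breakpoints for a wood-darkness linguistic variable on a 0–1 scale:

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 outside (a, c), peak 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical terms for a "darkness" linguistic variable on a 0..1 scale
def mu_light(x):
    return triangular(x, -0.01, 0.0, 0.5)

def mu_medium(x):
    return triangular(x, 0.2, 0.5, 0.8)

def mu_dark(x):
    return triangular(x, 0.5, 1.0, 1.01)

darkness = 0.85
print(mu_dark(darkness))    # ~0.7: "the wood is more or less dark"
```

Unlike a crisp threshold, a measurement can belong to several terms at once with partial degrees, which is what the fuzzy consequents (μ pine = 0.64, μ birch = 0.36) then propagate.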

  53. Combining ANN/FL ● The ANN black-box approach requires sufficient data to find the structure (generalization learning) ● NO PRIORS required ● But linguistically meaningful rules cannot be extracted from a trained ANN ● Fuzzy rules require prior knowledge ● Based on linguistically meaningful rules http://www.scholarpedia.org/article/Fuzzy_neural_network 99 NF-5

  54. Combining ANN/FL ● Combining the two gives us a higher level of system intelligence (?) ● Can handle the usual ML tasks (regression, classification, etc.) http://www.scholarpedia.org/article/Fuzzy_neural_network 100 NF-6
