Applications of Machine Learning to Performance Evaluation Daniel - PowerPoint PPT Presentation

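Random walks on graphs Hidden Markov Models and prediction
Simple example
Random walker:
Page   # of visits   Fraction of visits   Probability of visit
1      200           0.20                 0.20
2      67            0.06                 0.06
3      400           0.40                 0.40
4      333           0.33                 0.33
Total  1000
Power method:
$$P = \begin{pmatrix} 0 & 0.33 & 0.33 & 0.33 \\ 0 & 0 & 0 & 1 \\ 0.5 & 0 & 0 & 0.5 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad P^n \to \begin{pmatrix} 0.2 & 0.06 & 0.4 & 0.33 \\ 0.2 & 0.06 & 0.4 & 0.33 \\ 0.2 & 0.06 & 0.4 & 0.33 \\ 0.2 & 0.06 & 0.4 & 0.33 \end{pmatrix}$$
[Figure: four-page graph. p1 links to p2, p3 and p4 with probability 1/3 each; p2 links to p4 with probability 1; p3 links to p1 and p4 with probability 1/2 each; p4 links to p3 with probability 1.]
25/82 D. Menasche, E. de Souza e Silva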
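
The power method on this slide can be sketched in a few lines. This is a minimal illustration, assuming the matrix entries shown as 0.33 stand for 1/3; the function name `power_method` and its tolerance are choices made here, not taken from the slides.

```python
# Transition matrix of the four-page example (rows and columns: p1, p2, p3, p4).
# Assumption: the 0.33 entries on the slide are rounded values of 1/3.
P = [
    [0.0, 1/3, 1/3, 1/3],   # p1 -> p2, p3, p4
    [0.0, 0.0, 0.0, 1.0],   # p2 -> p4
    [0.5, 0.0, 0.0, 0.5],   # p3 -> p1, p4
    [0.0, 0.0, 1.0, 0.0],   # p4 -> p3
]

def power_method(P, tol=1e-12, max_iter=10_000):
    """Iterate pi <- pi * P from a uniform start until successive iterates agree."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

pi = power_method(P)
print(pi)  # close to (1/5, 1/15, 2/5, 1/3), in line with the visit fractions above
```

The exact stationary vector of this chain is (1/5, 1/15, 2/5, 1/3), which the slide shows to two decimals.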

PageRank Centrality
An important vertex is connected to many vertices or to a few important vertices.
- $A$ = adjacency matrix: $A_{ij} = 1$ if $i$ is connected to $j$, $A_{ij} = 0$ otherwise
- $d_i$ = out-degree of vertex $i$
- $\pi_i$ = centrality of vertex $i$:
$$\pi_i = \sum_j A_{ij} \frac{\pi_j}{d_j}$$

PageRank Centrality
- $\pi_i$ = centrality of vertex $i$:
$$\pi_i = \sum_j A_{ij} \frac{\pi_j}{d_j}$$
- source = node with in-degree zero
- sink = node with out-degree zero (dangling node)
What is the centrality of sources and sinks?

What is the centrality of sources and sinks?
Recall that
- citation networks are usually acyclic
- web page networks may have sources and sinks (e.g., PDF files)
[Figure: the four-page graph (p1 to p4) with edge probabilities 1/3, 1/2 and 1, containing a dangling page.]
The random surfer is stuck.

Possible solutions to dangling nodes
- After entering a dangling node, follow a hyperlink to any page (with equal probability).
- There is a set of trusted web pages. With probability $1 - d$ the random surfer does not follow hyperlinks and instead jumps to a trusted page (with equal probability).
[Figure: the four-page graph with modified edge probabilities $d$, $d/2$, $d/3$, $(1-d)/2$ and $(1-d)/2 + d/3$.]

Notation
- $\pi$: vector of page ranks for the entire web
- $\mathcal{T}$: set of trusted pages
- $n_T = |\mathcal{T}|$: cardinality of the set of trusted pages
- $T$: vector that is zero everywhere except in the positions corresponding to the trusted pages, where the value is $1/n_T$
[Figure: the four-page graph with modified edge probabilities $d$, $d/2$, $d/3$, $(1-d)/2$ and $(1-d)/2 + d/3$.]

Google's equation
$$\pi_i = d \sum_j \frac{1}{c_j}\, \pi_j + (1 - d)\, T[i] \qquad \forall i$$
Every day, new applications of the random surfer abstraction:
- Rank web pages [Brin and Page, 1998]
- Recommend movies [Bogers et al., CARS 2010]
- Control and planning [Mahadevan, AAAI 2010]
- Cure cancer [Winter et al., PLOS 2012]
Heinrich Hertz (regarding Maxwell's equations): "One cannot escape the feeling that these mathematical formulas have an independent existence and an intelligence of their own, that they are wiser than we are, wiser even than their discoverers, that we get more out of them than was originally put into them."
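
Google's equation can be iterated directly as a fixed point. The sketch below assumes that the sum runs over the pages $j$ that link to page $i$ and that $c_j$ is the out-degree of page $j$ (the slide does not define $c_j$). The trusted set {p1, p2}, the value d = 0.85 and the function name `google_rank` are illustrative assumptions, applied to the four-page example graph.

```python
def google_rank(out_links, trusted, d=0.85, iters=200):
    """Iterate pi_i = d * sum_j pi_j / c_j + (1 - d) * T[i].

    Assumptions: j ranges over pages linking to i, c_j is the out-degree of j,
    and every page has at least one outgoing link (no dangling pages).
    """
    pages = sorted(out_links)
    # T is 1 / n_T on trusted pages and 0 elsewhere.
    T = {p: (1.0 / len(trusted) if p in trusted else 0.0) for p in pages}
    pi = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        nxt = {p: (1.0 - d) * T[p] for p in pages}
        for j in pages:
            c_j = len(out_links[j])
            for i in out_links[j]:
                nxt[i] += d * pi[j] / c_j
        pi = nxt
    return pi

# Four-page example graph; the trusted set is a hypothetical choice.
out_links = {"p1": ["p2", "p3", "p4"], "p2": ["p4"], "p3": ["p1", "p4"], "p4": ["p3"]}
pi = google_rank(out_links, trusted={"p1", "p2"})
print(pi)
```

Because each update shrinks the distance between iterates by a factor of d, a fixed number of iterations is enough here; a production implementation would normally test for convergence instead.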

Random walks on graphs
Until now: steady-state analysis of Markov chains.
Applications of transient analysis:
- Movement models for mobile computing
- Obtaining a relevance score between two nodes (one of the fundamental problems in data mining)
- Spectral partitioning of a network into clusters
Given a node A, how closely related are B and C to A?

Relevance scores
Initial page: p1
[Figure: the four-page chain and two modified copies, one in which p3 returns to p1 with probability 1 and one in which p2 returns to p1 with probability 1.]
- mean number of transitions to reach p3 $= 1/\pi_3$
- mean number of transitions to reach p2 $= 1/\pi_2$

Relevance scores
Initial page: p1
[Figure: the four-page chain and two modified copies, one in which p3 returns to p1 with probability 1 and one in which p2 returns to p1 with probability 1.]
- mean number of transitions to reach p3 $= 1/\pi_3 = 3$
- mean number of transitions to reach p2 $= 1/\pi_2 = 11$
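
The values 3 and 11 can be checked numerically. The sketch below assumes the four-page transition matrix from the simple example and assumes that each modified chain sends the target page back to p1 with probability 1. It computes the mean hitting time from p1 to the target, and separately $1/\pi_{\text{target}}$ in the modified chain. The function names are hypothetical.

```python
# Four-page chain (rows and columns: p1, p2, p3, p4); 0.33 is taken to mean 1/3.
P = [
    [0.0, 1/3, 1/3, 1/3],
    [0.0, 0.0, 0.0, 1.0],
    [0.5, 0.0, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]

def mean_hitting_times(P, target, sweeps=5000):
    """Solve h_i = 1 + sum_{j != target} P[i][j] * h_j by repeated substitution."""
    n = len(P)
    h = [0.0] * n
    for _ in range(sweeps):
        h = [0.0 if i == target else
             1.0 + sum(P[i][j] * h[j] for j in range(n) if j != target)
             for i in range(n)]
    return h

def redirect_to_start(P, target, start=0):
    """Copy P, but make the target page jump to the start page with probability 1."""
    Q = [row[:] for row in P]
    Q[target] = [1.0 if j == start else 0.0 for j in range(len(P))]
    return Q

def stationary(P, sweeps=5000):
    """Stationary distribution by power iteration (chain assumed ergodic)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(sweeps):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

results = {}
for name, target in (("p3", 2), ("p2", 1)):
    hitting = mean_hitting_times(P, target)[0]            # starting from p1
    cycle = 1.0 / stationary(redirect_to_start(P, target))[target]
    results[name] = (hitting, cycle)
    print(name, hitting, cycle)
```

In this sketch $1/\pi$ in the modified chain is the mean length of a full cycle, that is, the hitting time from p1 plus the single return step to p1, which is why the hitting times come out one less than the slide's 3 and 11.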

Summary: random walks on graphs
- Abstract model: graph model, probabilities associated with edges
- Useful theory: reversible Markov chain theory, aggregation/disaggregation theory, transient analysis
- Markov chains applied to new problems

Dynamics
How to account for the time component?
- Graph is discovered online over time
- Graph changes over time

Dynamics: online graph discovery
In real networks:
- The network might be too large to compute exact centralities
- The network structure might be hidden from public view
Question: how to efficiently identify the top k most central nodes without complete access to the entire network?
Local versus global computations:
- Degree centrality can be obtained locally
- Other centrality metrics require global knowledge

Methodology
[Figure: the original network is sampled to obtain a sampled network, on which identification is then performed.]

Methodology
[Figure: the original network is sampled to obtain a sampled network, on which identification is then performed.]
- Sampling: a node among the k most central nodes might not be sampled by the sampling algorithm.
- Identification: the estimated k most central nodes might not be the true top k in the actual network.

Devices, collaborations and social networks
Rank correlation between degree and other centralities
Set          Type           # of nodes  # of edges  Description
AS-Snapshot  Device         22,963      48,436      Snapshot of the Internet at the AS level
ca-CondMat   Collaboration  23,133      186,936     arXiv Condensed Matter
ca-HepPh     Collaboration  12,008      237,010     arXiv High Energy Physics
email-Enron  Social         36,692      367,662     Email network from Enron

Sampling and identification options
- Sampling phase: random-walk sampling
- Identification phase: recalculation in the sampled network, or degree as an alias for other centralities

Random-walk sampling
- Start from a randomly selected node
- Visit the next node, chosen uniformly at random among the neighbors of the current node
- Assume the random walker can query the degrees of visited nodes
Sampling information:
Node    Queried degree
node1   5
node2   6
node3   4
...
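
The sampling phase described above can be sketched as follows. This is an illustration only: the toy graph, the function name `random_walk_sample` and the fixed seed are assumptions, and the walker is assumed to move to a uniformly chosen neighbor at every step.

```python
import random

def random_walk_sample(neighbors, steps, seed=0):
    """Walk `steps` hops and return {visited node: queried degree}."""
    rng = random.Random(seed)
    node = rng.choice(sorted(neighbors))        # start from a randomly selected node
    sampled = {node: len(neighbors[node])}
    for _ in range(steps):
        node = rng.choice(neighbors[node])      # next node: uniform over neighbors
        sampled[node] = len(neighbors[node])    # query the degree of the visited node
    return sampled

# Hypothetical undirected toy graph given as adjacency lists.
neighbors = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["a", "c", "e"],
    "e": ["d"],
}
sampled = random_walk_sample(neighbors, steps=20)
print(sampled)
```

A random walk on an undirected graph visits nodes with probability proportional to their degree, which is one reason high-degree nodes tend to enter the sampled set early.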

Performance of random-walk sampling
The random walker quickly includes the desired top k nodes in the sampled set.
[Figure: fraction of top k nodes in the sampled set, for ca-CondMat.]
The random walk efficiently collects the most central nodes with a small sampled fraction.

Sampling and identification options
- Sampling phase: random-walk sampling
- Identification phase: recalculation in the sampled network, or degree as an alias for other centralities

Identification strategy: recalculation
- Centralities are recalculated on the sampled network
- The recalculated centralities approximate the original centralities
[Figure: centrality calculation on the sampled network.]

Identification strategy: degree as alias
- Visited nodes are sorted according to their queried degrees
- The top k nodes in the sorted list are taken as the top k nodes for the other centralities
- Side information is used to identify the most central nodes

Performance of identification strategies
How accurately can the desired top k nodes be identified in the sampled network?
Metric: overlap ratio between the identified top k nodes and the original top k nodes.
[Figure: overlap ratio for ca-CondMat, comparing Deg-alias and Recalc.]
Facebook: more than 500 million users.
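
The degree-as-alias strategy and the overlap ratio can be sketched together. The degrees below are the three values from the sampling table (node1: 5, node2: 6, node3: 4). The reference top-k list is a made-up placeholder, and the overlap ratio is assumed to be the size of the intersection divided by k.

```python
def degree_alias_top_k(sampled_degrees, k):
    """Sort visited nodes by queried degree (ties broken by name) and keep the top k."""
    ranked = sorted(sampled_degrees, key=lambda v: (-sampled_degrees[v], v))
    return ranked[:k]

def overlap_ratio(identified, reference):
    """Assumed definition: |identified ∩ reference| / k, with k = len(reference)."""
    return len(set(identified) & set(reference)) / len(reference)

# Queried degrees from the sampling table.
sampled_degrees = {"node1": 5, "node2": 6, "node3": 4}

identified = degree_alias_top_k(sampled_degrees, k=2)

# Hypothetical "true" top 2 for some other centrality, for illustration only.
reference_top_k = ["node2", "node3"]
ratio = overlap_ratio(identified, reference_top_k)
print(identified, ratio)
```

No centrality is recomputed here, which is the point of the strategy: only the degrees already collected by the walker are used.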

For further details
Online Estimating the k Central Nodes of a Network, Y. Lim, D. Menasche, B. Ribeiro, D. Towsley, P. Basu, IEEE Network Science Workshop, West Point, 2011

Dynamics: graph changes over time
How to infer the most influential nodes in a changing world?
Break for advertisement from authors. Related work (main conference):
Characterizing Continuous Time Random Walks on Time Varying Graphs, Daniel Figueiredo (Federal University of Rio de Janeiro - UFRJ), Philippe Nain (INRIA), Bruno Ribeiro (UMass Amherst), Edmundo de Souza e Silva (UFRJ) and Don Towsley (UMass Amherst)

Outline
1. Random walks on graphs
2. Hidden Markov Models and prediction

Hidden Markov Models and prediction
Use of HMMs:
- speech recognition
- signal processing
- artificial intelligence
- computational biology
- image processing
- finance
- medical diagnosis
- ...

Hidden Markov Models
Use of HMMs: given a time series, how to parameterize a model to predict future values?
- inferring customer behavior
- modeling network channel losses
- modeling traffic
- generating workload
- ...
Note: we have traces of time series of one or more variables. Is there a structure behind the data?

Foundation
Performance/availability: the analyst understands how the system works.
Markovian models:
- Important issue: choice of state variables
- Models are parameterized from some prior knowledge of the system behavior
- Key point: the system state is directly observable

Foundation
Hidden Markov Models: the system state is assumed not to be directly observable. However, we can observe values that are a probabilistic function of the state of the underlying Markov process.

Hidden Markov Models: Example
A customer is browsing through the web pages of an online bookstore.
Problem: determine the probability that a specific customer is ready to order an item, based on their past behavior.
It is assumed that we have access to data that includes the intention of a customer.

Hidden Markov Models: Example
[Figure: Markov chain over four states of intent, with edge probabilities 0.2, 0.18, 0.3, 0.4, 0.1, 0.3, 0.6, 0.02, 0.2, 0.3, 0.5 and 1.0.]
- JB: just browsing
- IP: interested in product
- RO: ready to order
- LV: leaving

Hidden Markov Models: Example
However, the user's state of intent is not directly observable. Suppose only the types of pages a customer visits are observable:
- O: product overview
- D: product details
- C: set of products within a category
- S: shopping cart
- P: purchase
- E: exit
Trace: O C D C D C D O C D C D S O C D S P E

Hidden Markov Models: Example
States are correlated with the sequence of observations.
[Figure: each hidden state (JB: just browsing, IP: interested in product, RO: ready to order, LV: leaving) emits the 6 observable symbols (page types) O, D, C, S, P, E.]

Hidden Markov Models: Example
Note: in most problems we do not know
- how many states the hidden chain contains
- the interpretation of the hidden states (e.g., the "states of intent")
[Figure: hidden states (JB: just browsing, IP: interested in product, RO: ready to order, LV: leaving), each emitting the observable symbols (page types) O, D, C, S, P, E.]

Hidden Markov Models: HMM Elements
The HMM elements are:
- a set of hidden states
- a set of symbols
- the state transition probability matrix
- the probabilities of emitting each symbol at each state
[Figure: hidden states (JB: just browsing, IP: interested in product, RO: ready to order, LV: leaving), each emitting the symbols O, D, C, S, P, E.]
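
The four elements listed above map naturally onto a small data structure. The sketch below uses the state and symbol names from the bookstore example, but every probability in it is a made-up placeholder, since the slides do not give the emission probabilities or an unambiguous transition matrix. The initial state distribution is an extra assumption not listed among the elements.

```python
import random
from dataclasses import dataclass

@dataclass
class HMM:
    states: list   # hidden states
    symbols: list  # observable symbols
    trans: list    # trans[i][j] = P[next state j | current state i]
    emit: list     # emit[i][k] = P[symbol k | state i]
    init: list     # initial state distribution (assumption, not on the slide)

    def validate(self):
        """Check that every probability row sums to one."""
        rows = self.trans + self.emit + [self.init]
        return all(abs(sum(r) - 1.0) < 1e-9 for r in rows)

    def sample(self, length, seed=0):
        """Generate a trace of `length` symbols from the model."""
        rng = random.Random(seed)
        s = rng.choices(range(len(self.states)), weights=self.init)[0]
        trace = []
        for _ in range(length):
            k = rng.choices(range(len(self.symbols)), weights=self.emit[s])[0]
            trace.append(self.symbols[k])
            s = rng.choices(range(len(self.states)), weights=self.trans[s])[0]
        return trace

# All numbers below are hypothetical placeholders.
hmm = HMM(
    states=["JB", "IP", "RO", "LV"],
    symbols=["O", "D", "C", "S", "P", "E"],
    trans=[[0.5, 0.3, 0.0, 0.2],
           [0.2, 0.4, 0.3, 0.1],
           [0.1, 0.2, 0.4, 0.3],
           [0.0, 0.0, 0.0, 1.0]],
    emit=[[0.4, 0.1, 0.4, 0.0, 0.0, 0.1],
          [0.1, 0.5, 0.3, 0.1, 0.0, 0.0],
          [0.0, 0.1, 0.0, 0.5, 0.4, 0.0],
          [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]],
    init=[1.0, 0.0, 0.0, 0.0],
)
trace = hmm.sample(10)
print(hmm.validate(), trace)
```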

Questions
- What is the most probable hidden state (the customer intent) given the observed sequence of pages visited?
- What are the model parameters that maximize the probability that the observed sequence is generated by the model?
- Assume 2 types of customers: young and mature. Given a user session and the sequence of page clicks (but not the customer type), what is the probability that the customer is young versus mature?

Problems: Problem 1
Given the observation sequence $\mathcal{O}_T = O_1 O_2 \ldots O_T$ and a model $\mathcal{M}$, compute the probability of observing the output sequence $\mathcal{O}_T$ given the underlying model $\mathcal{M}$, i.e., $P[\mathcal{O}_T \mid \mathcal{M}]$.
[Figure: a Hidden Markov Model and an observed trace are combined to answer "what is the probability?"; the illustrated trace probability is 0.85.]

Problems: Problem 2
Given the observation sequence $\mathcal{O}_T = O_1 O_2 \ldots O_T$ and a model $\mathcal{M}$, determine a state sequence $Q_T = q_1, \ldots, q_T$ that best explains the output sequence $\mathcal{O}_T$.
[Figure: a Hidden Markov Model and an observed trace are combined to answer "what is the best sequence of states?"; the output is a sequence of states such as JB, JB, IP, RO, LV.]

Problems: two versions of Problem 2
Given the observation sequence $\mathcal{O}_T = O_1 O_2 \ldots O_T$:
1. For each time $t$, what is the most probable state of the underlying MC?
2. What is the most probable sequence of states?
If $p_{IP,LV} = 0$:
- in (1), we could have $s_t = IP$ and $s_{t+1} = LV$
- in (2), the sequence of states $\langle IP, LV \rangle$ could never occur
[Figure: a Hidden Markov Model and an observed trace are combined to answer "for each time t, what is the most probable state?"; the output is a sequence of states.]

Problems: two versions of Problem 2
Given the observation sequence $\mathcal{O}_T = O_1 O_2 \ldots O_T$:
1. For each time $t$, what is the most probable state of the underlying MC?
2. What is the most probable sequence of states?
[Figure: a Hidden Markov Model and an observed trace are combined to answer "what is the most probable sequence of states?"; the output is a sequence of states.]
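
Version 2 (the most probable sequence of states) is usually solved with the Viterbi algorithm, which the slides do not name; it is used here as the standard technique for this question. The two-state, three-symbol model below is a hypothetical toy, and `joint_probability` is a helper added so the result can be compared with exhaustive enumeration.

```python
from itertools import product

def viterbi(init, A, B, obs):
    """Return the most probable state sequence and its joint probability with obs."""
    n = len(init)
    delta = [init[i] * B[i][obs[0]] for i in range(n)]
    back = []
    for o in obs[1:]:
        # Best predecessor of each state j, then the best score ending in j.
        prev = [max(range(n), key=lambda i: delta[i] * A[i][j]) for j in range(n)]
        delta = [delta[prev[j]] * A[prev[j]][j] * B[j][o] for j in range(n)]
        back.append(prev)
    state = max(range(n), key=lambda i: delta[i])
    best_prob = delta[state]
    path = [state]
    for prev in reversed(back):      # follow the back-pointers to the start
        state = prev[state]
        path.append(state)
    path.reverse()
    return path, best_prob

def joint_probability(path, init, A, B, obs):
    """P[states = path, observations = obs | model], computed directly."""
    p = init[path[0]] * B[path[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
    return p

# Hypothetical toy model: 2 hidden states, 3 symbols.
init = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2, 2]

path, best_prob = viterbi(init, A, B, obs)
brute = max(joint_probability(list(q), init, A, B, obs)
            for q in product(range(2), repeat=len(obs)))
print(path, best_prob, brute)
```

For long traces the products underflow, so practical implementations typically work with log-probabilities.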

Problems: Problem 3
Given the observation sequence $\mathcal{O}_T = O_1 O_2 \ldots O_T$, construct an underlying model $\mathcal{M}$ such that $P[\mathcal{O}_T \mid \mathcal{M}]$ is maximized.
[Figure: an observed trace and a Hidden Markov Model whose transition and emission probabilities are all unknown ("?"): what are the best model parameters?]

Problems: Problem 4
Given several possible models $\mathcal{M}_i$, $1 \le i \le K$, and a sequence of observed values, what is the probability that $\mathcal{M}_j$ is the actual model, i.e., $P[\mathcal{M}_j \mid \mathcal{O}]$?
[Figure: an observed trace and two candidate Hidden Markov Models: what is the most likely model?]

Algorithms: Problem 1
There is an iterative algorithm for computing $P[\mathcal{O}_T \mid \mathcal{M}]$ which has complexity $O(N^2 T)$.
[Figure: a Hidden Markov Model and an observed trace are combined to answer "what is the probability?"; the illustrated trace probability is 0.85.]
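
The iterative algorithm with $O(N^2 T)$ complexity is commonly known as the forward algorithm, where $N$ is taken here to be the number of hidden states and $T$ the length of the trace. The sketch below is a minimal version on a hypothetical two-state, three-symbol model; none of the numbers come from the slides.

```python
from itertools import product

def forward_likelihood(init, A, B, obs):
    """Return P[obs | model] with the forward recursion: T steps of N^2 work each."""
    n = len(init)
    # alpha[i] = P[O_1..O_t, q_t = i | model], initialised at t = 1.
    alpha = [init[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)

# Hypothetical toy model: 2 hidden states, 3 symbols.
init = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]

likelihood = forward_likelihood(init, A, B, [0, 1, 2, 2])

# Sanity check: the probabilities of all traces of a fixed length sum to one.
total = sum(forward_likelihood(init, A, B, list(s))
            for s in product(range(3), repeat=3))
print(likelihood, total)
```

A naive sum over all state sequences would cost on the order of $N^T$ operations; the recursion avoids this by reusing the partial sums stored in `alpha`.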

Algorithms: Problem 2
The forward-backward algorithm solves a specific version of Problem 2: it computes $P[q_T = s_i \mid \mathcal{M}, \mathcal{O}_T]$ and has complexity $O(N^2 T)$.
[Figure: a Hidden Markov Model and an observed trace are combined to answer "at time T, what is the most probable state?"; the illustrated answer is JB.]
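
A forward-backward pass can be sketched as below. It returns, for every time $t$, the posterior distribution over hidden states given the whole trace; the last row is the quantity $P[q_T = s_i \mid \mathcal{M}, \mathcal{O}_T]$ from the slide. The toy model and the function name `state_posteriors` are assumptions.

```python
def state_posteriors(init, A, B, obs):
    """Return gamma[t][i] = P[q_t = i | obs, model] via forward and backward passes."""
    n, T = len(init), len(obs)
    # Forward pass: alpha[t][i] = P[O_1..O_t, q_t = i].
    alpha = [[init[i] * B[i][obs[0]] for i in range(n)]]
    for t in range(1, T):
        alpha.append([sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                      for j in range(n)])
    # Backward pass: beta[t][i] = P[O_{t+1}..O_T | q_t = i].
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
                   for i in range(n)]
    likelihood = sum(alpha[-1])
    return [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
            for t in range(T)]

# Hypothetical toy model: 2 hidden states, 3 symbols.
init = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]

gamma = state_posteriors(init, A, B, [0, 1])
most_probable = [max(range(2), key=lambda i: row[i]) for row in gamma]
print(gamma, most_probable)
```

Taking the most probable state at each time separately answers version 1 of Problem 2; the resulting sequence is not guaranteed to be a feasible path of the chain.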

Algorithms: Problem 3
Procedure for adjusting the model parameters, based on:
- the maximum likelihood estimation (MLE) method
- a technique derived from the Expectation-Maximization (EM) algorithm, known as Baum-Welch
- an iterative procedure that obtains a local maximum of the likelihood function
[Figure: an observed trace and a Hidden Markov Model whose transition and emission probabilities are all unknown ("?"): what are the best model parameters?]
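
One Baum-Welch re-estimation step can be sketched as follows. The observation sequence is the bookstore trace shown earlier in the slides; the choice of two hidden states and all starting probabilities are arbitrary assumptions. The sketch works with unscaled probabilities, which is adequate only for short traces like this one.

```python
def forward_backward(init, A, B, obs):
    """Return the forward table, the backward table and P[obs | model]."""
    n, T = len(init), len(obs)
    alpha = [[init[i] * B[i][obs[0]] for i in range(n)]]
    for t in range(1, T):
        alpha.append([sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                      for j in range(n)])
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
                   for i in range(n)]
    return alpha, beta, sum(alpha[-1])

def baum_welch_step(init, A, B, obs):
    """One EM update; also returns the likelihood of the parameters passed in."""
    n, m, T = len(init), len(B[0]), len(obs)
    alpha, beta, like = forward_backward(init, A, B, obs)
    # gamma[t][i]: posterior probability of being in state i at time t.
    gamma = [[alpha[t][i] * beta[t][i] / like for i in range(n)] for t in range(T)]
    new_A = []
    for i in range(n):
        visits = sum(gamma[t][i] for t in range(T - 1))
        # Expected number of i -> j transitions divided by expected visits to i.
        new_A.append([
            sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                for t in range(T - 1)) / like / visits
            for j in range(n)])
    # Expected emissions of symbol k from state i divided by expected time in i.
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k) /
              sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return gamma[0], new_A, new_B, like

symbols = "ODCSPE"
trace = "O C D C D C D O C D C D S O C D S P E".split()   # trace from the slides
obs = [symbols.index(s) for s in trace]

# Hypothetical starting point with 2 hidden states.
init = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.3, 0.2, 0.2, 0.1, 0.1, 0.1], [0.1, 0.1, 0.2, 0.2, 0.2, 0.2]]

likelihoods = []
for _ in range(10):
    init, A, B, like = baum_welch_step(init, A, B, obs)
    likelihoods.append(like)
print(likelihoods)
```

Each EM step cannot decrease the likelihood, so the recorded values form a non-decreasing sequence. The limit is a local maximum that depends on the starting point.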

Algorithms: Problem 4
Basically this is a classification problem, and one simply applies Bayes' theorem.
[Figure: an observed trace and two candidate Hidden Markov Models: what is the most likely model?]
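
Applying Bayes' theorem here means weighting each model's likelihood by a prior: $P[\mathcal{M}_j \mid \mathcal{O}] = P[\mathcal{O} \mid \mathcal{M}_j]\, P[\mathcal{M}_j] \,/\, \sum_i P[\mathcal{O} \mid \mathcal{M}_i]\, P[\mathcal{M}_i]$. The sketch below uses the "young" and "mature" customer types from the Questions slide as model names. Both models, their parameters and the equal priors are made-up assumptions.

```python
def forward_likelihood(init, A, B, obs):
    """P[obs | model], computed with the forward recursion."""
    n = len(init)
    alpha = [init[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)

def model_posteriors(models, priors, obs):
    """Bayes' theorem over a finite set of candidate models."""
    joint = {name: priors[name] * forward_likelihood(*params, obs)
             for name, params in models.items()}
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}

# Two hypothetical models, each given as (init, A, B) with 2 states and 3 symbols.
models = {
    "young": ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]),
    "mature": ([0.5, 0.5], [[0.9, 0.1], [0.2, 0.8]], [[0.2, 0.2, 0.6], [0.6, 0.3, 0.1]]),
}
priors = {"young": 0.5, "mature": 0.5}   # assumed equal prior probabilities

posterior = model_posteriors(models, priors, [0, 1, 2, 2])
print(posterior)
```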

Examples in Performance Evaluation
Traffic modeling:
- Build accurate traffic models for the packet flow generated by different applications such as SMTP, HTTP, network games and instant messaging. Observable data: packet inter-arrival time and packet size. Application: traffic generators, capacity planning.
- Traffic models for aggregate traffic. Application: traffic generators, capacity planning.
- Traffic classification. Application: identifying distinct applications by observing traffic. This is an example of Problem 4.

Random walks on graphs Hidden Markov Models and prediction Applications of Machine Learning to Performance Evaluation Daniel Sadoc Menasche 1 Edmundo de Souza e Silva 2 Federal University of Rio de Janeiro 1 Computer Science Department,