CS 147: Computer Systems Performance Analysis Workload - PowerPoint PPT Presentation

CS147 2015-06-15 CS 147: Computer Systems Performance Analysis Workload Characterization CS 147: Computer Systems Performance Analysis Workload Characterization 1 / 31

Overview CS147 Overview 2015-06-15 Terminology Specifying Parameters Identifying Parameters Histograms Principal-Component Analysis Markov Models Overview Clustering Clustering Steps Clustering Methods Terminology Using Clustering Specifying Parameters Identifying Parameters Histograms Principal-Component Analysis Markov Models Clustering Clustering Steps Clustering Methods Using Clustering 2 / 31

Terminology Workload Characterization Terminology CS147 Workload Characterization Terminology 2015-06-15 Terminology ◮ User (maybe nonhuman) requests service ◮ Also called workload component or workload unit ◮ Workload parameters or workload features model or characterize the workload Workload Characterization Terminology ◮ User (maybe nonhuman) requests service ◮ Also called workload component or workload unit ◮ Workload parameters or workload features model or characterize the workload 3 / 31

Terminology Selecting Workload Components CS147 Selecting Workload Components 2015-06-15 Terminology ◮ Most important: components should be external : at interface of SUT ◮ Components should be homogeneous ◮ Should characterize activities of interest to the study Selecting Workload Components ◮ Most important: components should be external : at interface of SUT ◮ Components should be homogeneous ◮ Should characterize activities of interest to the study 4 / 31

Terminology Choosing Workload Parameters CS147 Choosing Workload Parameters 2015-06-15 Terminology ◮ Select parameters that depend only on workload (not on SUT) ◮ Prefer controllable parameters ◮ Omit parameters that have no effect on system, even if important in real world Choosing Workload Parameters ◮ Select parameters that depend only on workload (not on SUT) ◮ Prefer controllable parameters ◮ Omit parameters that have no effect on system, even if important in real world 5 / 31

Specifying Parameters Averaging CS147 Averaging 2015-06-15 Specifying Parameters ◮ Basic character of a parameter is its average value ◮ Not just arithmetic mean ◮ Good for uniform distributions or gross studies Averaging ◮ Basic character of a parameter is its average value ◮ Not just arithmetic mean ◮ Good for uniform distributions or gross studies 6 / 31

Specifying Parameters Specifying Dispersion CS147 Specifying Dispersion 2015-06-15 Specifying Parameters ◮ Most parameters are non-uniform ◮ Specifying variance or standard deviation brings major improvement over average ◮ Average and s.d. (or C.O.V.) together allow workloads to be grouped into classes Specifying Dispersion ◮ Still ignores exact distribution ◮ Most parameters are non-uniform ◮ Specifying variance or standard deviation brings major improvement over average ◮ Average and s.d. (or C.O.V.) together allow workloads to be grouped into classes ◮ Still ignores exact distribution 7 / 31

Identifying Parameters Histograms Single-Parameter Histograms CS147 Single-Parameter Histograms 2015-06-15 Identifying Parameters ◮ Make histogram or kernel density estimate Histograms ◮ Fit probability distribution to shape of histogram ◮ Chapter 27 (not covered in course) lists many useful shapes ◮ Ignores multiple-parameter correlations Single-Parameter Histograms ◮ Make histogram or kernel density estimate ◮ Fit probability distribution to shape of histogram ◮ Chapter 27 (not covered in course) lists many useful shapes ◮ Ignores multiple-parameter correlations 8 / 31

Identifying Parameters Histograms Multi-Parameter Histograms CS147 Multi-Parameter Histograms 2015-06-15 Identifying Parameters ◮ Use 3-D plotting package to show 2 parameters ◮ Or plot each datum as 2-D point and look for “black spots” Histograms ◮ Shows correlations ◮ Allows identification of important parameters ◮ Not practical for 3 or more parameters Multi-Parameter Histograms ◮ Use 3-D plotting package to show 2 parameters ◮ Or plot each datum as 2-D point and look for “black spots” ◮ Shows correlations ◮ Allows identification of important parameters ◮ Not practical for 3 or more parameters 9 / 31

Identifying Parameters Principal-Component Analysis Principal-Component Analysis (PCA) CS147 Principal-Component Analysis (PCA) 2015-06-15 Identifying Parameters ◮ How to analyze more than 2 parameters? ◮ Could plot endless pairs Principal-Component Analysis ◮ Still might not show complex relationships ◮ Principal-component analysis solves problem mathematically ◮ Rotates parameter set to align with axes Principal-Component Analysis (PCA) ◮ Sorts axes by importance ◮ How to analyze more than 2 parameters? ◮ Could plot endless pairs ◮ Still might not show complex relationships ◮ Principal-component analysis solves problem mathematically ◮ Rotates parameter set to align with axes ◮ Sorts axes by importance 10 / 31

Identifying Parameters Principal-Component Analysis Advantages of PCA CS147 Advantages of PCA 2015-06-15 Identifying Parameters ◮ Handles more than two parameters ◮ Insensitive to scale of original data Principal-Component Analysis ◮ Detects dispersion ◮ Combines correlated parameters into single variable ◮ Identifies variables by importance Advantages of PCA ◮ Handles more than two parameters ◮ Insensitive to scale of original data ◮ Detects dispersion ◮ Combines correlated parameters into single variable ◮ Identifies variables by importance 11 / 31

Identifying Parameters Principal-Component Analysis Disadvantages of PCA CS147 Disadvantages of PCA 2015-06-15 Identifying Parameters ◮ Tedious computation (if no software) Principal-Component Analysis ◮ Still requires hand analysis of final plotted results ◮ Often difficult to relate results back to original parameters Disadvantages of PCA ◮ Tedious computation (if no software) ◮ Still requires hand analysis of final plotted results ◮ Often difficult to relate results back to original parameters 12 / 31

Identifying Parameters Markov Models Markov Models CS147 Markov Models 2015-06-15 Identifying Parameters ◮ Sometimes, distribution isn’t enough ◮ Requests come in sequences Markov Models ◮ Sequencing affects performance ◮ Example: disk bottleneck ◮ Suppose jobs need 1 disk access per CPU slice ◮ CPU slice is much faster than disk Markov Models ◮ Strict alternation uses CPU better ◮ Long disk-access strings slow system ◮ Sometimes, distribution isn’t enough ◮ Requests come in sequences ◮ Sequencing affects performance ◮ Example: disk bottleneck ◮ Suppose jobs need 1 disk access per CPU slice ◮ CPU slice is much faster than disk ◮ Strict alternation uses CPU better ◮ Long disk-access strings slow system 13 / 31

Identifying Parameters Markov Models Introduction to Markov Models CS147 Introduction to Markov Models 2015-06-15 ◮ Represent model as state diagram Identifying Parameters ◮ Probabilistic transitions between states ◮ Requests generated on transitions Markov Models 0.4 Network Introduction to Markov Models 0.6 0.3 0.3 ◮ Represent model as state diagram 0.4 0.2 CPU Disk 0.8 ◮ Probabilistic transitions between states ◮ Requests generated on transitions 0.4 Network 0.6 0.3 0.3 0.4 0.2 CPU Disk 0.8 14 / 31

Identifying Parameters Markov Models Creating a Markov Model CS147 Creating a Markov Model 2015-06-15 Identifying Parameters ◮ Observe long string of activity ◮ Use matrix to count pairs of states Markov Models ◮ Normalize rows to sum to 1.0 CPU Network Disk CPU 0.6 0.4 Creating a Markov Model Network 0.3 0.4 0.3 Disk 0.8 0.2 ◮ Observe long string of activity ◮ Use matrix to count pairs of states ◮ Normalize rows to sum to 1.0 CPU Network Disk CPU 0.6 0.4 Network 0.3 0.4 0.3 Disk 0.8 0.2 15 / 31

Identifying Parameters Markov Models Example Markov Model CS147 Example Markov Model 2015-06-15 Identifying Parameters ◮ Reference string of opens, reads, closes: ORORRCOORCRRRRCC ◮ Pairwise frequency matrix: Markov Models Open Read Close Sum Open 1 3 4 Example Markov Model Read 1 4 3 8 Close 1 1 1 3 ◮ Reference string of opens, reads, closes: ORORRCOORCRRRRCC ◮ Pairwise frequency matrix: Open Read Close Sum Open 1 3 4 Read 1 4 3 8 Close 1 1 1 3 16 / 31

Identifying Parameters Markov Models Markov Model for I/O String CS147 Markov Model for I/O String 2015-06-15 ◮ Divide each row by its sum to get transition matrix: Identifying Parameters Open Read Close Open 0.25 0.75 Read 0.13 0.50 0.37 Close 0.33 0.33 0.34 Markov Models ◮ Model: Read 0.50 Markov Model for I/O String 0.33 0.75 ◮ Divide each row by its sum to get transition matrix: 0.37 0.13 0.4 0.25 0.34 Open Close 0.33 Open Read Close Open 0.25 0.75 Read 0.13 0.50 0.37 Close 0.33 0.33 0.34 ◮ Model: 0.50 Read 0.33 0.75 0.37 0.13 0.4 0.25 0.34 Open Close 0.33 17 / 31

CS 147: Computer Systems Performance Analysis Workload - PowerPoint PPT Presentation

CS147 2015-06-15 CS 147: Computer Systems Performance Analysis Workload Characterization CS 147: Computer Systems Performance Analysis Workload Characterization 1 / 31 Overview CS147 Overview 2015-06-15 Terminology Specifying

CS 147: Computer Systems Performance Analysis Approaching Performance Projects 1 / 35 Overview

CS 147: Computer Systems Performance Analysis Comparing Systems and Analyzing Alternatives 1 /

CS 147: Computer Systems Performance Analysis Networks of Queues 1 / 18 Overview CS147

CS 147: Computer Systems Performance Analysis Selecting Techniques 1 / 37 Overview CS147

CS 147: Computer Systems Performance Analysis Advanced Regression Techniques 1 / 31 Overview

CS 147: Computer Systems Performance Analysis Introduction to Queueing Theory 1 / 27 Overview

CS 147: Computer Systems Performance Analysis Higher Designs and Other Considerations 1 / 25

CS 147: Computer Systems Performance Analysis Fractional Factorial Designs 1 / 26 Overview

CS 147: Computer Systems Performance Analysis Multiple and Categorical Regression 1 / 36

CS 147: Computer Systems Performance Analysis Examples Using a Distributed File System 1 / 37

CS 147: Computer Systems Performance Analysis Course Introduction 1 / 35 Overview CS147

CS 147: Computer Systems Performance Analysis Mistakes in Graphical Presentation 1 / 45

CS 147: Computer Systems Performance Analysis Specifics of Graphical Presentation 1 / 35

CS 147: Computer Systems Performance Analysis Workload Selection 1 / 39 Overview CS147

CS 147: Computer Systems Performance Analysis Summarizing Data 1 / 30 Overview CS147 Overview

CS 147: Computer Systems Performance Analysis Introduction to Graphical Presentation 1 / 25

Introduction to Machine Learning CMU-10701 19. Clustering and EM Barnabs Pczos Contents

Efficient Diameter Approximation for Large Graphs in MapReduce Geppino Pucci - Universit` a di

NVIDIA GPU Architecture for General Purpose Computing Anthony Lippert 4/27/09 1 Outline

Automatic clustering of similar VM to improve the scalability of monitoring and management in

FAQs Quiz #3 Scores will be available by 3/6 Programming Assignment #2 March 10

Chapter 5: Cluster ering ing Jilles Vreeken IRDM 15/16 10 Nov 2015 Qu Question o of f

Security in SESAR 2020 Ruben Flohr ATM Expert, SESAR JU GAMMA final event 15 November 2017

Cybersecurity and Africa Benot MOREL Carnegie Mellon University Afrinic Cyberization of