Basic Concepts G. Urvoy-Keller urvoy@unice.fr Probabilty and - PowerPoint PPT Presentation

Basic Concepts G. Urvoy-Keller urvoy@unice.fr Probabilty and Statistics

Outline Basic concepts � Probability � Conditional Probability � Moments � Common Distributions � Binomial � Zipf � Poisson � Uniform � Normal � Beta � Gamma � 2

Basic Concepts A random experiment is an experiment whose outcome � cannot be predicted with certainty The sample space is the set of all possible outcomes from an � experiment The outcomes from random experiments are called random � variables and often represented as uppercase variables (e.g. X) Random variables can be discrete or continuous � An event is a subset of outcomes in the sample space � Mutually exclusive events: 2 events that cannot occur � altogether Extension: n events that taken in every possible pairs are � 3 mutually exclusive

Probability Probability is the measure of the likelihood that some event will occur � Historically, there are two ways of computing probabilities � Equal likelihood model (classical theory): � For an event E we count (no experiment) the number n of favorable outcomes � We also know the total number of possible outcomes N � We then set P=n/N � We thus assume that all outcomes are equally likely • Works well for coin and die tossing, cards. � Relative frequency methods: � Can be used when all outcomes are not equally likely � “Active method” where the experiment is carried out n times � If the event E occurred f times, then P=f/n � Modern theory of probabilities is based on axiomatic theory � The probability of an event is computed based on: � Probability density function in the case of a continuous random variable � Probability mass function in the case of a discrete random variable � Common convention: use density (or pdf) for discrete and continuous rv � 4

Probability in the case of a continuous random variable Let f(x)=P(x<X<x+dx)/dx be the probability density function � (pdf) b P ( a ≤ X ≤ b )= ∫ f ( x ) dx a f(x) x) 5 x

Probability in the case of a discrete random variable Let f(x) be the probability mass function (pmf) � b P ( a ≤ X ≤ b )= ∑ f ( x ) a f(x) x) b a 6

Cumulative Distribution Function The cdf F(x) is the probability that the random variable X is � less than or equal to x: x F ( x )= ∫ f ( u ) du ( continuous case ) −∞ Cdf dfs s conver converge ge to o 1 F ( x )= ∑ f ( x i ) ( discrete case ) x i ≤ x 7

Axioms of Probability Let S be the sample space and E be an event (i.e., subset of S) � Axiom 1: The probability of event E must be between 0 and 1: � 0≤P(E)≤1 Axiom 2: � P(S)=1 Axiom 3: for mutually exclusive events E 1 ,E 2 ,…,E n � n P ( E 1 ∪ E 2 ∪ ... ∪ E n )= ∑ P ( E i ) 1 8

Axioms of Probability Axiom 1 states that a probability must be between 0 and 1. � This means that pdf and pmf must be positive and sum to 1 Axiom 2 says that an outcome must occur and the sample � space cover all possible outcomes Axiom 3 enables to compute the probability that at least one � of the mutually exclusive events occur by summing their individual probabilities. 9

Conditional Probability and Independence The conditional probability of event E given event F is defined � as: P ( E ∣ F )= P ( E ∩ F ) P ( F ) P(E∩F) represents the probability that E and F occur together � P(F) appears as a “re-normalization” factor � Example: for mutually exclusive events E and F, P(E∩F) =0 � and thus P(E|F)=0. The latter denotes a very strong dependence between the two events! 10

Conditional Probability and Independence Independence: two events E and F are said to be independent � if: P ( E ∣ F ) =P ( E ) which is equivalent to: P ( E ∩ F ) =P ( E ) P ( F ) Definition for the case of n events: E 1 ,…E n are said to be � independent if any subset E (1) , E (2) ,.. E (k) , is independent P ( E ( 1 ) ∩ E ( 2 ) ... ∩ E ( k ) ) =P ( E ( 1 ) )× P ( E ( 2 ) )× .... × P ( E ( k ) ) Independence is not transitive!!!! � If E1 is independent from E2 and E2 from E3, E1 might depend on E3 Independence is reflexive: if E is independent from F, F is � independent from E since P ( F ∣ E ) =P ( E ∣ F ) P ( F ) 11 P ( E )

Conditional Probability- Illustration It has been demonstrated that there was a lot free riders in � Gnutella networks. Free-riders: clients that retrieve documents but do not provide � any data to other peers. A natural question that may arise when studying such � systems is: “How many files does a client share with its peers?” Due to free-riding, you will find very low figures. It is thus � better to split the above question into two sub-questions: What is the probability that a client is a free-rider? � What is the probability that a non free-rider shares n files? � 12

Conditional Probability- Illustration Let: � Q be the random variable that denotes the number of files offered � by a client S be the random variable that denotes the type of client � F: free-rider � Non-F: not a free rider � The previous questions can be formulated as follows: � P ( S = F ) P ( Q=n ∣ S=non − F ) 13

Independence - Illustration A die is tossed twice. Consider the following events: � A: the first toss gives an odd number � B: the second toss gives an odd number � C: the sum of the two tosses is an odd number � Any pair of the previous events are independent. Indeed � P(A)=P(B)=P(C)=1/2 � P(A∩B)=P(A∩C)=P(B∩C)=1/4 � Since to obtain an odd number, you need one odd number � Still, P(A∩B∩C)=0. Hence (A,B,C) are not independent � 14

Total Probability Theorem Theorem: Let E 1 ,E 2 ,…E n be n mutually exclusive events such � that U i E i =S (S is the sample space) and P(E i ) ≠ 0. Let B be an event. Then: n P ( B )= ∑ P ( B ∣ E i ) P ( E i ) i= 1 Proof: � 15

Bayes Theorem Bayes theorem allows to estimate a “posteriori” probabilities � from “a priori” probabilities. Consider the following problem: one wants to evaluate the � efficiency of a test for a disease. Let: A= event that the test states that the person is infected � B=event that the person is infected � A c =event that the test states that the person is not infected � B c =event that the person is not infected � Suppose we have the following a-priori information: � P(A|B)=P(A c |B c )=0.95 - obtained from tests on well defined � populations P(B)=0.005 � A good measure of the efficiency of the test is the “a � posteriori” probability P(B|A) 16

Bayes Theorem Theorem: given a event F and a set of mutually exclusive � events E 1 ,E 2 ,…E n whose union makes up the entire sample space: P ( E i ) P ( F ∣ E i ) P ( E i ∣ F )= n ∑ P ( F ∣ E k ) P ( E k ) k= 1 A post poster erio o inf nfor ormat ation on A pr prior ori inf nfor ormat ation on Derivation of the theorem is straightforward using the � definition of conditional probabilities 17

Bayes Theorem Applied to the “disease test” problem stated before, we � obtain: P ( B ∣ A )= P ( B ) P ( A ∣ B ) P ( B ) P ( A ∣ B ) +P ( B c ) P ( A ∣ B c ) 0.005 × 0.95 0.005 × 0.95 + 0.995 ( 1 − 0.95 ) = 0.087 Thus, when the test is positive, the person is in fact infected in � only 8.7% of the cases! Very bad!!! Conclusion: even if “a priori” tests were correct for 95% of the � cases, this was not enough due to the scarcity of the disease For example, with P(B)=0.1 and , we would have obtained: � 18 P(B|A)=68% (not that good either…)

Mean and Variance The mean or average value E[X]= µ of a distribution provides a � measure of the tendency of a distribution. +∞ E [ X ]= ∫ xf ( x ) dx ( continuous case ) −∞ +∞ ∑ x i f ( x i ) ( discrete case ) i= 1 The variance V(X)= σ 2 of a random variable (r.v.) X measures the average dispersion around the mean μ +∞ 2 f ( x ) dx ( continuous case ) 2 ]= ∫ 2 =V ( X ) =E [( X − μ ) σ ( x − μ ) −∞ +∞ ( x i − μ ) 2 f ( x i ) ( dis crete case ) ∑ i= 1 19

Mean and Variance E[] is a linear � function E [ αX ] =αE [ X ] α is a scalar � E [ X+Y ] =E [ X ] +E [ Y ] X,Y r.v. � Practical formula: � V ( X ) =E [( X − μ ) 2 ] =E [ X 2 − 2μX +μ 2 ] E [ X 2 ]− 2μE [ X ] +μ 2 E [ X 2 ]− μ 2 20

Coefficient of Variation σ= √ V ( X ) is called the standard deviation of the r.v. X � C= σ is called the coefficient of variation of the r.v. X � μ Interpretation: � “C measures the level of divergence of X with respect to its mean” � or “C measures the variation of X in units of its mean” � C allows to compare two distributions with different means � C is independent of the chosen unity � 21

Coefficient of Variation To illustrate C, let us consider two sets of values drawn from normal distributions (defined later): Distribution 1 with µ =1, σ =10 => => C=10 Distribution 2 with μ=100,σ=10 => C=0.1 Looking at the pdfs, you might miss how values can be close or Set et 1 Set et 2 far away from the means: -11 11 106. 106.2 -9. 9.6 108 108 16 16 109. 109.4 1. 1.6 90. 90.08 08 -11 11 102. 102.1 0. 0.59 59 102.4 102. -10 10 89. 89.92 92 -12 12 92.58 92. 58 -1. 1.6 110.8 110. 22 11 11 98.69 98. 69

Basic Concepts G. Urvoy-Keller urvoy@unice.fr Probabilty and - PowerPoint PPT Presentation

Basic Concepts G. Urvoy-Keller urvoy@unice.fr Probabilty and Statistics Outline Basic concepts Probability Conditional Probability Moments Common Distributions Binomial Zipf Poisson Uniform

Basic Concepts of I R: Outline Basic Concepts of Information Retrieval: Task definition of

CONCEPTS AND CONCEPTS AND CONCEPTS AND CONCEPTS AND PR PR PRINC PRINC NCIPLES OF NCIPLES

Current C Current C Current C Current C Concepts of Concepts of Concepts of Concepts of

Luo Si Department of Computer Science Purdue University Basic Concepts of IR: Outline Basic

Basic Experimental Design Basic Concepts in Experimental Design Prof. Dr. Luc Duchateau Ghent

Important Concepts Important Concepts Some important concepts in financial and derivative

Nucleic Acids Basic Concepts Basic Concepts Nucleic Acids David Murray PhD UCD|Mater

Part I - Basic concepts of thermochronology Basic concepts of thermochronology

Survival analysis : from basic concepts to open research questions Ecole dt,

CS6220: DATA MINING TECHNIQUES Chapter 10: Cluster Analysis: Basic Concepts and Methods

Classification 1 Classification: Basic Concepts and Methods Classification: Basic Concepts

Lecture Lecture 3 3 Basic Concepts Basic Concepts Dr. Hazim Dwairi Dr Hazim Dwairi

Part I - Basic concepts of thermochronology Basic concepts of thermochronology

Part I - Basic concepts of thermochronology Basic concepts of thermochronology

Real Time Scheduling Basic Concepts Radek Pel anek Basic Elements Model of RT System

UML State Models U State ode s Basic State Model Concepts/Notations Basic State Model

Networks - Fall 2005 Chapter 2 Play on networks 1: Strategic substitutes Bramoull e and

Environmental Jus0ce and the SDWA Agenda Who is dispropor/onately affected? Why are they

CS145: INTRODUCTION TO DATA MINING Course Project Overview Instructor: Yizhou Sun

Your First MongoDB Environment: What You Should Know Before Choosing MongoDB as Your Database Me

Graph Theory and Network Measurment Social and Economic Networks Jafar Habibi MohammadAmin

Cloud Co-opetition A Digital Transformation Download from http://bit.ly/170411icf2017my Malaysia

MATH 105: Finite Mathematics 9-3: Organizing Data Prof. Jonathan Duncan Walla Walla College

Session 3: Summarizing data Stats 60/Psych 10 Ismael Lemhadri Summer 2020 This time