Probability and Statistics for Computer Science - PowerPoint PPT Presentation



SLIDE 1

Probability and Statistics for Computer Science

"Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write." - H. G. Wells

Hongye Liu, Teaching Assistant Prof, CS361, UIUC, 03.12.2020
Credit: wikipedia

SLIDE 2

Midterm 1

* Grading is done.
* Grades will be published today.
* Points will be curved given it's relatively harder than last semester's.
* You're welcome to come to office hour today to discuss it.

SLIDE 3

Midterm 1

[Histogram of Midterm 1 scores (score on the x-axis, 60-140; frequency on the y-axis, 10-30), annotated with this semester's mean, standard deviation = 25.413, and median = 124, alongside last semester's figures.]
slide-4
SLIDE 4

Last time

* Review of variance, sample mean
* Sum and difference between variables of normal distributions
* Hypothesis test of equality of two sample means
* Chi-square test

SLIDE 5

Contents

* Review of statistical inference
* Inferring probability model from data
* Maximum likelihood estimate
* Confidence interval for MLE
* Bayesian inference

SLIDE 6

Categories of statistical inference

Statistical inference includes:
* Drawing conclusions from samples
* Assessing the significance of evidence for a hypothesis
* Inferring the parameters of a probabilistic model from data

SLIDE 7

Contents

* Review of statistical inference
* Inferring probability model from data
* Maximum likelihood estimate
* Confidence interval for MLE
* Bayesian inference

SLIDE 8

Motivation: binomial example

* Suppose we have a coin with unknown probability of coming up heads
* We toss it N times and observe k heads
* We know that this data comes from a binomial distribution
* What is your best estimate of the probability of coming up heads?

Credit: David Varodayan

SLIDE 9

Motivation: geometric example

* Suppose we have a die with unknown probability of coming up six
* We roll it and it comes up six for the first time on the kth roll
* We know that this data comes from a geometric distribution
* What is your best estimate of the probability of coming up six?

Credit: David Varodayan

SLIDE 10

Motivation: Poisson example

* Suppose we have data on the number of babies born each hour in a large hospital
* We can assume the data comes from a Poisson distribution
* What is your best estimate of the intensity λ?

hour:         1    2    ...  N
# of babies:  k1   k2   ...  kN

Credit: David Varodayan

SLIDE 11

The parameter estimation problem

* Suppose we have a dataset that we know comes from a distribution (i.e. binomial, geometric, Poisson, etc.)
* What is the best estimate of the parameters (θ or θs) of the distribution?

Examples:
* For binomial and geometric distributions, θ = p (probability of success)
* For Poisson and exponential distributions, θ = λ (intensity)
* For normal distributions, θ could be μ or σ².

SLIDE 12

Maximum likelihood estimation (MLE)

* We write the probability of seeing the data D given parameter θ as L(θ) = P(D|θ)
* The likelihood function L(θ) is not a probability distribution
* The maximum likelihood estimate (MLE) of θ is:

  θ̂ = arg max_θ L(θ)

SLIDE 13

Why is L(θ) not a probability distribution?

A. It doesn't give the probability of all the possible θ values.
B. Don't know whether the sum or integral of L(θ) for all possible θ values is one or not.
C. Both.

(Board note: θ is not a random variable.)

SLIDE 14

Why is L(θ) not a probability distribution?

A. It doesn't give the probability of all the possible θ values.
B. Don't know whether the sum or integral of L(θ) for all possible θ values is one or not.
C. Both.

SLIDE 15

Likelihood function: binomial example

* Suppose we have a coin with unknown probability of coming up heads
* We toss it N times and observe k heads
* We know that this data comes from a binomial distribution
* What is the likelihood function L(θ) = P(D|θ)?

SLIDE 16

Likelihood function: binomial example

* Suppose we have a coin with unknown probability of coming up heads
* We toss it N times and observe k heads
* We know that this data comes from a binomial distribution
* What is the likelihood function L(θ) = P(D|θ)?

L(θ) = (N choose k) θ^k (1 − θ)^(N−k)

(replace p with θ)

SLIDE 17

MLE derivation: binomial example

L(θ) = (N choose k) θ^k (1 − θ)^(N−k)

In order to find θ̂ = arg max_θ L(θ), we set:

dL(θ)/dθ = 0

SLIDE 18

MLE derivation: binomial example

L(θ) = (N choose k) θ^k (1 − θ)^(N−k)
slide-19
SLIDE 19

MLE derivation: binomial example

L(θ) = (N choose k) θ^k (1 − θ)^(N−k)

dL(θ)/dθ = (N choose k) (k θ^(k−1) (1 − θ)^(N−k) − θ^k (N − k)(1 − θ)^(N−k−1)) = 0

SLIDE 20

MLE derivation: binomial example

L(θ) = (N choose k) θ^k (1 − θ)^(N−k)

dL(θ)/dθ = (N choose k) (k θ^(k−1) (1 − θ)^(N−k) − θ^k (N − k)(1 − θ)^(N−k−1)) = 0

k θ^(k−1) (1 − θ)^(N−k) = θ^k (N − k)(1 − θ)^(N−k−1)

SLIDE 21

MLE derivation: binomial example

L(θ) = (N choose k) θ^k (1 − θ)^(N−k)

dL(θ)/dθ = (N choose k) (k θ^(k−1) (1 − θ)^(N−k) − θ^k (N − k)(1 − θ)^(N−k−1)) = 0

k θ^(k−1) (1 − θ)^(N−k) = θ^k (N − k)(1 − θ)^(N−k−1)

k − kθ = Nθ − kθ

SLIDE 22

MLE derivation: binomial example

L(θ) = (N choose k) θ^k (1 − θ)^(N−k)

dL(θ)/dθ = (N choose k) (k θ^(k−1) (1 − θ)^(N−k) − θ^k (N − k)(1 − θ)^(N−k−1)) = 0

k θ^(k−1) (1 − θ)^(N−k) = θ^k (N − k)(1 − θ)^(N−k−1)

k − kθ = Nθ − kθ

θ̂ = k/N    (the MLE of p)
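The closed form θ̂ = k/N is easy to verify numerically. A minimal sketch in Python (the grid search and the example values N = 10, k = 3 are illustrative, not from the slides):

```python
from math import comb

def binomial_likelihood(theta, N, k):
    # L(theta) = (N choose k) * theta^k * (1 - theta)^(N - k)
    return comb(N, k) * theta**k * (1 - theta)**(N - k)

def binomial_mle_grid(N, k, steps=10000):
    # Brute-force the maximizer of L over an evenly spaced grid on [0, 1].
    thetas = [i / steps for i in range(steps + 1)]
    return max(thetas, key=lambda t: binomial_likelihood(t, N, k))

# The grid maximum agrees with the closed-form MLE k/N
print(binomial_mle_grid(10, 3))  # 0.3
```

The grid search is only a sanity check; the derivative argument on the preceding slides gives the exact answer.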

SLIDE 23

Likelihood function: geometric example

* Suppose we have a die with unknown probability of coming up six
* We roll it and it comes up six for the first time on the kth roll
* We know that this data comes from a geometric distribution
* What is the likelihood function L(θ) = P(D|θ)? Assume θ is p.

SLIDE 24

MLE derivation: geometric example

L(θ) = (1 − θ)^(k−1) θ
slide-25
SLIDE 25

MLE derivation: geometric example

L(θ) = (1 − θ)^(k−1) θ

dL(θ)/dθ = (1 − θ)^(k−1) − (k − 1)(1 − θ)^(k−2) θ = 0

SLIDE 26

MLE derivation: geometric example

L(θ) = (1 − θ)^(k−1) θ

dL(θ)/dθ = (1 − θ)^(k−1) − (k − 1)(1 − θ)^(k−2) θ = 0

(1 − θ)^(k−1) = (k − 1)(1 − θ)^(k−2) θ

SLIDE 27

MLE derivation: geometric example

L(θ) = (1 − θ)^(k−1) θ

dL(θ)/dθ = (1 − θ)^(k−1) − (k − 1)(1 − θ)^(k−2) θ = 0

(1 − θ)^(k−1) = (k − 1)(1 − θ)^(k−2) θ

1 − θ = kθ − θ

SLIDE 28

MLE derivation: geometric example

L(θ) = (1 − θ)^(k−1) θ

dL(θ)/dθ = (1 − θ)^(k−1) − (k − 1)(1 − θ)^(k−2) θ = 0

(1 − θ)^(k−1) = (k − 1)(1 − θ)^(k−2) θ

1 − θ = kθ − θ

θ̂ = 1/k    (the MLE of p)
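As with the binomial case, θ̂ = 1/k can be sanity-checked by brute force. A sketch in Python (the observation k = 6, a six first appearing on the sixth roll, is an illustrative choice):

```python
def geometric_likelihood(theta, k):
    # L(theta) = (1 - theta)^(k - 1) * theta: first success on the k-th trial
    return (1 - theta)**(k - 1) * theta

def geometric_mle_grid(k, steps=10000):
    # Brute-force the maximizer of L over an evenly spaced grid on [0, 1].
    thetas = [i / steps for i in range(steps + 1)]
    return max(thetas, key=lambda t: geometric_likelihood(t, k))

# First six on roll 6: the grid maximum is close to the closed form 1/6
print(geometric_mle_grid(6))
```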

SLIDE 29

MLE with data from IID trials

* If the dataset D = {x} comes from IID trials
* Each xi is one observed result from an IID trial

L(θ) = P(D|θ) = ∏_{xi ∈ D} P(xi|θ)
slide-30
SLIDE 30

Q: MLE with data from IID trials

* If the dataset D = {x} comes from IID trials
* Why is the above function defined by the product?

L(θ) = P(D|θ) = ∏_{xi ∈ D} P(xi|θ)

A. IID samples are independent
B. Each trial has identical probability function
C. Both.

SLIDE 31

Q: MLE with data from IID trials

* If the dataset D = {x} comes from IID trials
* Why is the above function defined by the product?

L(θ) = P(D|θ) = ∏_{xi ∈ D} P(xi|θ)

A. IID samples are independent
B. Each trial has identical probability function
C. Both.

SLIDE 32

MLE with data from IID trials

* If the dataset D = {x} comes from IID trials
* The likelihood function is hard to differentiate in general, except for the binomial and geometric cases.
* Clever trick: take the (natural) log

L(θ) = P(D|θ) = ∏_{xi ∈ D} P(xi|θ)

SLIDE 33

Log-likelihood function

* Since log is a strictly increasing function:

  θ̂ = arg max_θ L(θ) = arg max_θ log L(θ)

* So we can aim to maximize the log-likelihood function:

  log L(θ) = log P(D|θ) = log ∏_{xi ∈ D} P(xi|θ) = Σ_{xi ∈ D} log P(xi|θ)

* The log-likelihood function is usually much easier to differentiate
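Because log is strictly increasing, maximizing log L picks out exactly the same θ̂ as maximizing L. A small numerical check in Python (the binomial example N = 10, k = 3 is an illustrative choice):

```python
from math import comb, log

N, k = 10, 3

def L(theta):
    # Binomial likelihood
    return comb(N, k) * theta**k * (1 - theta)**(N - k)

def logL(theta):
    # log L expands to log C(N,k) + k log(theta) + (N - k) log(1 - theta)
    return log(comb(N, k)) + k * log(theta) + (N - k) * log(1 - theta)

# Exclude the endpoints 0 and 1 so log is defined everywhere on the grid
thetas = [i / 1000 for i in range(1, 1000)]
assert max(thetas, key=L) == max(thetas, key=logL)  # same argmax
```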
slide-34
SLIDE 34

Log-likelihood function: Poisson example

* Suppose we have data on the number of babies born each hour in a large hospital
* We can assume the data comes from a Poisson distribution with intensity λ
* What is the log likelihood function log L(θ)?

hour:         1    2    ...  N
# of babies:  k1   k2   ...  kN

SLIDE 35

Log-likelihood function: Poisson example

L(θ) = ∏_{i=1}^{N} e^(−θ) θ^(ki) / ki!

log L(θ) = log ( ∏_{i=1}^{N} e^(−θ) θ^(ki) / ki! ) = Σ_{i=1}^{N} log( e^(−θ) θ^(ki) / ki! )

         = Σ_{i=1}^{N} (−θ + ki log θ − log ki!)

SLIDE 36

MLE: Poisson example

log L(θ) = Σ_{i=1}^{N} (−θ + ki log θ − log ki!)

SLIDE 37

MLE: Poisson example

log L(θ) = Σ_{i=1}^{N} (−θ + ki log θ − log ki!)

d/dθ log L(θ) = 0 ⇒ Σ_{i=1}^{N} (−1 + ki/θ − 0) = 0

SLIDE 38

MLE: Poisson example

log L(θ) = Σ_{i=1}^{N} (−θ + ki log θ − log ki!)

d/dθ log L(θ) = 0 ⇒ Σ_{i=1}^{N} (−1 + ki/θ − 0) = 0

−N + (Σ_i ki)/θ = 0

SLIDE 39

Poisson

[Handwritten board work: the Poisson likelihood derivation; not legible in the transcript.]
slide-40
SLIDE 40

MLE: Poisson example

log L(θ) = Σ_{i=1}^{N} (−θ + ki log θ − log ki!)

d/dθ log L(θ) = 0 ⇒ Σ_{i=1}^{N} (−1 + ki/θ − 0) = 0

−N + (Σ_i ki)/θ = 0

θ̂ = (Σ_i ki)/N    (the MLE of λ)
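The result θ̂ = (Σ_i ki)/N is just the sample mean of the counts. A sketch in Python (the counts below are made-up babies-per-hour data, not from the slides):

```python
from math import log, factorial

def poisson_log_likelihood(theta, counts):
    # Sum over hours of (-theta + k_i log(theta) - log k_i!)
    return sum(-theta + k * log(theta) - log(factorial(k)) for k in counts)

counts = [2, 3, 1, 4, 0, 2]            # hypothetical data
theta_hat = sum(counts) / len(counts)  # closed-form MLE: the sample mean

# The closed form beats nearby candidate values of theta
for other in (theta_hat - 0.2, theta_hat + 0.2):
    assert poisson_log_likelihood(theta_hat, counts) > poisson_log_likelihood(other, counts)
print(theta_hat)  # 2.0
```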

SLIDE 41

MLE for normal distribution

* Suppose we model the dataset D = {x} as normally distributed
* What should be the likelihood function? Is the method of modeling the same as for the Poisson distribution?
  A. Yes    B. No

SLIDE 42

MLE for normal distribution

* Suppose we model the dataset D = {x} as normally distributed
* What should be the likelihood function? Is the method of modeling the same as for the Poisson distribution?
  Yes and No. The idea is similar, but the normal distribution is continuous, so we need to use the probability density instead.

slide-43
SLIDE 43

MLE for normal distribution

* Suppose we model the dataset D = {x} as normally distributed
* The likelihood function of a normal distribution:

L(μ, σ) = ∏_{i=1}^{n} (1/(√(2π) σ)) exp(−(xi − μ)² / (2σ²))

slide-44
SLIDE 44

MLE for normal distribution

* Suppose we model the dataset D = {x} as normally distributed
* There are two parameters to estimate: μ and σ

If we fix σ and set θ = μ:

  θ̂ = (1/N) Σ_{i=1}^{N} xi

If we fix μ and set θ = σ:

  θ̂ = √( (1/N) Σ_{i=1}^{N} (xi − μ)² )
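Both closed forms are easy to compute directly. A sketch in Python (the data points are illustrative); note that the σ estimate divides by N, which is the MLE, not the N − 1 of the unbiased sample variance:

```python
def normal_mle(xs):
    # Closed-form MLE for a normal model: sample mean and
    # the (1/N)-normalized standard deviation.
    n = len(xs)
    mu_hat = sum(xs) / n                               # (1/N) sum x_i
    var_hat = sum((x - mu_hat)**2 for x in xs) / n     # (1/N) sum (x_i - mu)^2
    return mu_hat, var_hat**0.5

mu, sigma = normal_mle([1.0, 2.0, 3.0, 4.0])
# mu is 2.5; sigma**2 is approximately 1.25
```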

slide-45
SLIDE 45

[Board work: E[x] = ∫ x f(x) dx = μ]

slide-46
SLIDE 46

[Board work: setting dL(μ, σ)/dμ = 0 with σ fixed, and dL(μ, σ)/dσ = 0 with μ fixed.]

Correction: in the case of fixing σ, using the log likelihood function, μ̂ doesn't depend on σ. In the case of fixing μ, we assume we know μ, using x̄ as the estimate of μ.

slide-47
SLIDE 47

Drawbacks of MLE

* Maximizing some likelihood or log-likelihood function is mathematically hard
* If there are very few data items, the MLE estimate may be very unreliable
  * If we observe 3 heads in 10 coin tosses, should we accept that p(heads) = 0.3?
  * If we observe 0 heads in 2 coin tosses, should we accept that p(heads) = 0?

slide-48
SLIDE 48

Confidence intervals for MLE estimates

* An MLE parameter estimate θ̂ depends on the data that was observed
* We can construct a confidence interval for θ̂ using the parametric bootstrap:
  * Use the distribution with parameter θ̂ to generate a large number of bootstrap samples
  * From each "synthetic" dataset, re-estimate the parameter using MLE
  * Use the histogram of these re-estimates to construct a confidence interval
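The three steps above can be sketched for the binomial case in Python (N = 100, k = 30, B = 2000 bootstrap samples, and the 95% level are all illustrative choices):

```python
import random

def parametric_bootstrap_ci(N, k, B=2000, alpha=0.05, seed=0):
    rng = random.Random(seed)
    theta_hat = k / N  # MLE from the observed data
    # Steps 1-2: simulate B synthetic datasets from Binomial(N, theta_hat)
    # and re-estimate theta on each one by its own MLE.
    re_estimates = sorted(
        sum(rng.random() < theta_hat for _ in range(N)) / N for _ in range(B)
    )
    # Step 3: read off empirical quantiles of the re-estimates.
    lo = re_estimates[int(B * alpha / 2)]
    hi = re_estimates[int(B * (1 - alpha / 2)) - 1]
    return lo, hi

lo, hi = parametric_bootstrap_ci(100, 30)
# The interval brackets the point estimate 0.3
assert lo <= 0.3 <= hi
```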
slide-49
SLIDE 49

Bayesian inference

* In MLE, we maximized the likelihood function L(θ) = P(D|θ)
* In Bayesian inference, we will maximize the posterior P(θ|D), which is the probability of the parameters θ given the observed data D.
* Unlike L(θ), the posterior is a probability distribution
* The value of θ that maximizes P(θ|D) is called the maximum a posteriori (MAP) estimate

slide-50
SLIDE 50

The prior

From Bayes rule:

P(θ|D) = P(D|θ)P(θ) / P(D) ∝ P(D|θ)P(θ)

slide-51
SLIDE 51

The prior

From Bayes rule:

P(θ|D) = P(D|θ)P(θ) / P(D) ∝ P(D|θ)P(θ)

slide-52
SLIDE 52

The prior

From Bayes rule:

P(θ|D) = P(D|θ)P(θ) / P(D) ∝ P(D|θ)P(θ)

* The probability of the data P(D) is a constant, which doesn't matter for differentiation.
* Bayesian inference allows us to include prior beliefs about θ in the prior P(θ), which is useful:
  * When we have reasonable beliefs, such as that a coin cannot have P(heads) = 0
  * When there isn't much data
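For the coin example this works out in closed form: a Beta(a, b) prior on θ combined with the binomial likelihood gives a Beta(a + k, b + N − k) posterior, whose mode is the MAP estimate. A sketch in Python (the Beta(2, 2) prior is an illustrative choice, not from the slides):

```python
def map_estimate(N, k, a=2.0, b=2.0):
    # Mode of the Beta(a + k, b + N - k) posterior; valid when both
    # posterior parameters exceed 1 (true here for a = b = 2).
    return (k + a - 1) / (N + a + b - 2)

# 0 heads in 2 tosses: the MLE says p(heads) = 0, but the MAP
# estimate under a Beta(2, 2) prior gives a more reasonable value
print(map_estimate(2, 0))  # 0.25
```

This addresses the small-sample drawback of MLE noted earlier: the prior keeps the estimate away from the extremes 0 and 1.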

slide-53
SLIDE 53

Q. Why is the posterior a probability distribution?

A. It is defined so that it gives the probability of θ for all possibilities, conditioned on the data
B. The sum or integral of such conditional probability should be 1
C. Both

slide-54
SLIDE 54

Q. Why is the posterior a probability distribution?

A. It is defined so that it gives the probability of θ for all possibilities, conditioned on the data
B. The sum or integral of such conditional probability should be 1
C. Both

slide-55
SLIDE 55

Assignments

* Read Chapter 9 of the textbook
* Next time: Bayesian inference

slide-56
SLIDE 56

Additional References

* Robert V. Hogg, Elliot A. Tanis and Dale L. Zimmerman. "Probability and Statistical Inference"
* Morris H. Degroot and Mark J. Schervish. "Probability and Statistics"

slide-57
SLIDE 57

See you next time