
SLIDE 1

Announcements

  • Piazza started.
  • MATLAB Grader homework: email sent Friday. Homework 2 (of 9) due 21 April, binary graded.
  • Jupyter homework? Translate the MATLAB assignments to Jupyter; contact TA Harshul (h6gupta@eng.ucsd.edu) or me. I would like this to happen.
  • “GPU” homework: NOAA climate data in Jupyter on datahub.ucsd.edu, 15 April.
  • Projects: any language. Podcast might work eventually.

Today:

  • Stanford CNN
  • Bernoulli
  • Gaussian 1.2
  • Gaussian 2.3
  • Decision theory 1.5
  • Information theory 1.6

Monday: Stanford CNN; linear models for regression (Bishop 3)

SLIDE 2

Non-parametric method

K-means

SLIDE 3

SLIDE 4

SLIDE 5

Coin estimate (Bishop 2.1)

  • Binary variables x={0,1}
  • Bernoulli distributed
  • N observations, Likelihood:
  • Max likelihood

p(x = 1 | µ) = µ

Bern(x | µ) = µ^x (1 − µ)^{1−x}   (2.2)   (the Bernoulli distribution), with

E[x] = µ,   var[x] = µ(1 − µ)

p(D | µ) = ∏_{n=1}^{N} p(x_n | µ) = ∏_{n=1}^{N} µ^{x_n} (1 − µ)^{1−x_n}   (2.5)

ln p(D | µ) = ∑_{n=1}^{N} ln p(x_n | µ) = ∑_{n=1}^{N} { x_n ln µ + (1 − x_n) ln(1 − µ) }   (2.6)

µ_ML = (1/N) ∑_{n=1}^{N} x_n

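As a quick illustration (not from the slides), here is a minimal Python sketch of the maximum-likelihood estimate µ_ML for a Bernoulli parameter; the simulated coin bias and sample size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true = 0.7                               # assumed coin bias, for simulation only
x = rng.binomial(1, mu_true, size=100)      # N binary observations x_n in {0, 1}

# Maximum-likelihood estimate: mu_ML = (1/N) * sum_n x_n
mu_ml = x.mean()
print(f"mu_ML = {mu_ml:.3f}  (true mu = {mu_true})")
```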

SLIDE 6

Coin estimate (Bishop 2.1)

  • Bayes: p(x|y) = p(y|x) p(x) / p(y)
  • Conjugate prior
[Figure: Beta(µ|a, b) for (a, b) = (0.1, 0.1), (1, 1), (2, 3), (8, 4)]

Beta(µ | a, b) = [Γ(a + b) / (Γ(a) Γ(b))] µ^{a−1} (1 − µ)^{b−1}

[Figure: prior, likelihood function, and posterior over µ]

Bayes: posterior ∝ likelihood × prior
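A minimal sketch (not from the slides) of the conjugate Beta-Bernoulli update behind these plots; the prior hyperparameters and the observed flips are illustrative assumptions.

```python
import numpy as np

a, b = 2.0, 2.0                             # assumed Beta(a, b) prior hyperparameters
x = np.array([1, 0, 1, 1, 0, 1, 1, 1])      # assumed observed coin flips

m, N = x.sum(), len(x)                      # m heads out of N flips

# Conjugacy: Beta(a, b) prior x Bernoulli likelihood -> Beta(a + m, b + N - m) posterior
a_post, b_post = a + m, b + (N - m)

print("posterior mean E[mu|D] =", a_post / (a_post + b_post))
print("posterior MAP estimate =", (a_post - 1) / (a_post + b_post - 2))
```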

SLIDE 7

ML, MAP, Bayes

  • ML: point estimate
  • MAP: point estimate (in the literature ML and MAP are often conflated)
  • Bayes => a full probability distribution, from which all information can be obtained:
    – MAP, median, error estimates
    – further analysis as sequential updating
    – disadvantage: not a point estimate

[Figure: prior, likelihood function, and posterior over µ]


SLIDE 8

Bayes Rule

P(hypothesis|data) = P(data|hypothesis) P(hypothesis) / P(data)

Rev’d Thomas Bayes (1702–1761)

  • Bayes rule tells us how to do inference about hypotheses from data.
  • Learning and prediction can be seen as forms of inference.
SLIDE 9

The Gaussian Distribution

Gaussian Mean and Variance

SLIDE 10

Gaussian Parameter Estimation

Likelihood function

Maximum (Log) Likelihood


SLIDE 11

Curve Fitting Re-visited, Bishop1.2.5

SLIDE 12

Maximum Likelihood

p(t | x, w, β) = ∏_{n=1}^{N} N(t_n | y(x_n, w), β^{−1})   (1.61)

As with the simple Gaussian distribution earlier, it is convenient to maximize the logarithm of the likelihood function. Substituting the form of the Gaussian distribution (1.46), the log likelihood is

ln p(t | x, w, β) = −(β/2) ∑_{n=1}^{N} { y(x_n, w) − t_n }^2 + (N/2) ln β − (N/2) ln(2π)   (1.62)

Consider first the maximum likelihood solution for the polynomial coefficients, w_ML; the noise precision then follows from

1/β_ML = (1/N) ∑_{n=1}^{N} { y(x_n, w_ML) − t_n }^2   (1.63)

p(t | x, w_ML, β_ML) = N(t | y(x, w_ML), β_ML^{−1})   (1.64)

We then take a step towards a more Bayesian approach and introduce a prior.

Given the estimates w_ML and β_ML, we can predict.

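A minimal Python sketch (my illustration, not the lecture's code) of maximum-likelihood curve fitting: w_ML from least squares and β_ML from the residuals as in (1.63); the sinusoidal data and polynomial order are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 20, 3                                         # sample size and polynomial order (assumed)
x = np.linspace(0, 1, N)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, N)    # noisy targets t_n

# Design matrix with columns x^0, ..., x^M
Phi = np.vander(x, M + 1, increasing=True)

# w_ML: least-squares solution, which maximizes the Gaussian likelihood (1.62)
w_ml, *_ = np.linalg.lstsq(Phi, t, rcond=None)

# 1/beta_ML = mean squared residual, eq. (1.63)
beta_ml = 1.0 / np.mean((Phi @ w_ml - t) ** 2)

print("w_ML    =", w_ml)
print("beta_ML =", beta_ml, " -> noise std ~", beta_ml ** -0.5)
```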

SLIDE 13

MAP: A Step towards Bayes 1.2.5

Determine w by minimizing the regularized sum-of-squares error

(β/2) ∑_{n=1}^{N} { y(x_n, w) − t_n }^2 + (α/2) wᵀw

where the regularization term arises from a Gaussian prior over w.

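Continuing the previous sketch (again my illustration), the MAP estimate under a Gaussian prior on w is the ridge-regularized least-squares solution; the regularization strength λ = α/β is an assumed value.

```python
# MAP / ridge solution: w_MAP = (lam * I + Phi^T Phi)^{-1} Phi^T t, with lam = alpha / beta
lam = 1e-3                                           # assumed regularization strength
w_map = np.linalg.solve(lam * np.eye(Phi.shape[1]) + Phi.T @ Phi, Phi.T @ t)
print("w_MAP =", w_map)
```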

SLIDE 14

Predictive Distribution

[Figure: true data vs. the estimated curve with a ±1 standard-deviation predictive band]

SLIDE 15

Parametric Distributions

Basic building blocks: p(x|θ). Need to determine θ given observations {x₁, …, x_N}. Representation: a full distribution over θ, or a point estimate? Recall curve fitting.

We focus on Gaussians!

SLIDE 16

The Gaussian Distribution


SLIDE 17

Central Limit Theorem

  • The distribution of the sum of N i.i.d. random variables becomes increasingly Gaussian as N grows.
  • Example: N uniform [0,1] random variables (see the sketch below).

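A quick Python sketch (my illustration) of this example: the mean of N uniform [0, 1] variables approaches a Gaussian with variance 1/(12N); the number of Monte Carlo draws is an assumption.

```python
import numpy as np

rng = np.random.default_rng(2)
draws = 100_000                                      # assumed number of Monte Carlo draws

for N in (1, 2, 10):
    # Mean of N i.i.d. uniform [0, 1] variables; the CLT predicts approx. N(0.5, 1/(12 N))
    means = rng.uniform(0, 1, size=(draws, N)).mean(axis=1)
    print(f"N = {N:2d}: sample var = {means.var():.5f}, CLT prediction = {1 / (12 * N):.5f}")
```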

SLIDE 18

Geometry of the Multivariate Gaussian

Δ² = (x − µ)ᵀ Σ⁻¹ (x − µ)   (the Mahalanobis distance from µ to x)

SLIDE 19

Moments of the Multivariate Gaussian (2)

A general D-dimensional Gaussian has D(D + 1)/2 covariance parameters plus D mean parameters, i.e. D(D + 3)/2 in total. Often we restrict the covariance: diagonal (D + D parameters in total) or isotropic Σ = σ²I (D + 1 parameters in total).


SLIDE 20

Partitioned Conditionals and Marginals, page 89

Conditional

marginal


SLIDE 21

ML for the Gaussian (1), Bishop 2.3.4

Given i.i.d. data X = (x₁, …, x_N)ᵀ, the log likelihood function is

ln p(X | µ, Σ) = −(ND/2) ln(2π) − (N/2) ln|Σ| − (1/2) ∑_{n=1}^{N} (x_n − µ)ᵀ Σ⁻¹ (x_n − µ)

Useful matrix identities (Bishop Appendix C):

∂/∂A ln|A| = (A⁻¹)ᵀ   (C.28)

∂/∂x (A⁻¹) = −A⁻¹ (∂A/∂x) A⁻¹   (C.21)

∂/∂A Tr(AB) = Bᵀ   (C.24)


SLIDE 22

Maximum Likelihood for the Gaussian

  • Set the derivative of the log likelihood function with respect to µ to zero and solve to obtain µ_ML = (1/N) ∑_{n=1}^{N} x_n.
  • Similarly, for the covariance, Σ_ML = (1/N) ∑_{n=1}^{N} (x_n − µ_ML)(x_n − µ_ML)ᵀ (see the sketch below).
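A minimal Python sketch (my illustration) of these two estimators for a multivariate Gaussian; the simulated 2-D data and their parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
mu_true = np.array([1.0, -2.0])                      # assumed parameters, for simulation only
Sigma_true = np.array([[1.0, 0.6], [0.6, 2.0]])
X = rng.multivariate_normal(mu_true, Sigma_true, size=500)   # rows are observations x_n

# mu_ML = (1/N) sum_n x_n
mu_ml = X.mean(axis=0)

# Sigma_ML = (1/N) sum_n (x_n - mu_ML)(x_n - mu_ML)^T
diff = X - mu_ml
Sigma_ml = diff.T @ diff / len(X)

print("mu_ML    =", mu_ml)
print("Sigma_ML =\n", Sigma_ml)
```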
SLIDE 23

Mixtures of Gaussians (Bishop 2.3.9)

Single Gaussian vs. a mixture of two Gaussians. Old Faithful geyser: the time between eruptions has a bimodal distribution, with the mean interval being either 65 or 91 minutes, dependent on the length of the prior eruption. Within a margin of error of ±10 minutes, Old Faithful will erupt either 65 minutes after an eruption lasting less than 2½ minutes, or 91 minutes after an eruption lasting more than 2½ minutes.

SLIDE 24

Mixtures of Gaussians (Bishop 2.3.9)

  • Combine simple models into a complex model:

p(x) = ∑_{k=1}^{K} π_k N(x | µ_k, Σ_k)

with components N(x | µ_k, Σ_k) and mixing coefficients π_k (here K = 3); see the sketch below.
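A short Python sketch (my illustration) of evaluating and sampling a K = 3 mixture in one dimension; all parameter values are assumptions.

```python
import numpy as np
from scipy.stats import norm

# Assumed mixture parameters: mixing coefficients pi_k, means mu_k, std devs sigma_k
pi = np.array([0.5, 0.3, 0.2])
mu = np.array([-2.0, 0.0, 3.0])
sigma = np.array([0.5, 1.0, 0.8])

def mixture_pdf(x):
    """p(x) = sum_k pi_k * N(x | mu_k, sigma_k^2)."""
    return sum(p * norm.pdf(x, m, s) for p, m, s in zip(pi, mu, sigma))

# Sampling: pick component k with probability pi_k, then draw from that Gaussian
rng = np.random.default_rng(4)
k = rng.choice(3, size=1000, p=pi)
samples = rng.normal(mu[k], sigma[k])

print("p(x = 0.5)  =", mixture_pdf(0.5))
print("sample mean =", samples.mean(), " (expected", float(pi @ mu), ")")
```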

SLIDE 25

Mixtures of Gaussians (Bishop 2.3.9)

SLIDE 26

Mixtures of Gaussians (Bishop 2.3.9)

  • Determining the parameters π, µ, and Σ using maximum log likelihood.
  • Solution: use standard iterative numerical optimization methods, or the expectation-maximization (EM) algorithm (Chapter 9); a minimal EM sketch follows below.

Log of a sum; no closed form maximum.

EM
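Because the log of a sum has no closed-form maximum, EM alternates between computing responsibilities and re-estimating the parameters. A compact Python sketch for a 1-D mixture (my illustration, not the lecture's code); initialization and iteration count are assumptions.

```python
import numpy as np
from scipy.stats import norm

def em_gmm_1d(x, K=2, iters=50, seed=0):
    """Minimal EM for a 1-D Gaussian mixture; returns (pi, mu, sigma)."""
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)
    mu = rng.choice(x, K, replace=False)             # initialize means from the data
    sigma = np.full(K, x.std())
    for _ in range(iters):
        # E step: responsibilities gamma_nk proportional to pi_k N(x_n | mu_k, sigma_k^2)
        gamma = pi * norm.pdf(x[:, None], mu, sigma)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M step: re-estimate pi, mu, sigma from the responsibility-weighted data
        Nk = gamma.sum(axis=0)
        pi = Nk / len(x)
        mu = (gamma * x[:, None]).sum(axis=0) / Nk
        sigma = np.sqrt((gamma * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)
    return pi, mu, sigma

# Example: bimodal data loosely mimicking the Old Faithful inter-eruption times
rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(65, 6, 150), rng.normal(91, 6, 250)])
print(em_gmm_1d(x, K=2))
```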

SLIDE 27

Entropy 1.6

Important quantity in

  • coding theory
  • statistical physics
  • machine learning
SLIDE 28

Differential Entropy

Put bins of width Δ along the real line. For fixed variance, the differential entropy is maximized by the Gaussian distribution, in which case H[x] = ½ { 1 + ln(2πσ²) }.

SLIDE 29

The Kullback-Leibler Divergence

p is the true distribution and q is the approximating distribution:

KL(p‖q) = − ∫ p(x) ln( q(x) / p(x) ) dx
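A small Python sketch (my illustration) of the discrete form of this divergence; the two example distributions are assumptions.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * ln(p_i / q_i) for discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0                                     # terms with p_i = 0 contribute nothing
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

p = [0.1, 0.4, 0.5]                                  # assumed true distribution p
q = [0.3, 0.3, 0.4]                                  # assumed approximating distribution q
print("KL(p||q) =", kl_divergence(p, q))             # note: KL(p||q) != KL(q||p)
print("KL(q||p) =", kl_divergence(q, p))
```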