Statistics for Machine Learning Prof. Seungchul Lee Industrial AI

Oct 16, 2023

Statistics for Machine Learning Prof. Seungchul Lee Industrial AI Lab.
Statistics and Probability
[Figure: diagram relating data and model through statistics and probability]
Populations and Samples
• A population includes all the elements from a set of data
• A parameter is a quantity computed from a population
  – mean, μ
  – variance, σ²
• A sample is a subset of the population
  – one or more observations
• A statistic is a quantity computed from a sample
  – sample mean, x̄
  – sample variance, s²
  – sample correlation, S_xy
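The distinction between a parameter and a statistic can be sketched in a few lines. This is a minimal illustration, not code from the slides: the Gaussian population, its size, the seed, and the names `population`, `sample`, `mu`, `sigma2`, `x_bar`, and `s2` are all assumptions made for the example.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 values drawn once from a Gaussian.
population = [rng.gauss(5.0, 2.0) for _ in range(10_000)]

# Parameters are computed from the whole population.
mu = statistics.fmean(population)
sigma2 = statistics.pvariance(population, mu)

# Statistics are computed from a sample, i.e. a subset of the population.
m = 100
sample = rng.sample(population, m)
x_bar = statistics.fmean(sample)
s2 = statistics.variance(sample, x_bar)  # divides by (m - 1)

print(f"population: mu = {mu:.3f}, sigma^2 = {sigma2:.3f}")
print(f"sample:     x_bar = {x_bar:.3f}, s^2 = {s2:.3f}")
```

The sample values are close to the population values but not equal to them, and a different sample would give different statistics.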
How to Generate Random Numbers
• Data sampled from population/process/generative model
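As a rough sketch of sampling from a generative model, the example below draws a few values with Python's standard `random` module. The choice of distributions, the seed, and the variable names are assumptions for illustration; the slides may well have used a different library.

```python
import random

rng = random.Random(42)  # seeded generator, so the run is reproducible

# Five draws from a uniform model on [0, 1) and five from a standard Gaussian.
uniform_draws = [rng.random() for _ in range(5)]
gaussian_draws = [rng.gauss(0.0, 1.0) for _ in range(5)]

# A generator with the same seed reproduces the same realizations.
rng_again = random.Random(42)
repeat = [rng_again.random() for _ in range(5)]

print(uniform_draws)
print(gaussian_draws)
print(uniform_draws == repeat)
```

Changing the seed gives a different set of realizations from the same model, which is the situation the inference slides describe.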
Histogram
• Graphical representation of data distribution ⇒ rough sense of density of data
[Figure: histogram with bins on the horizontal axis and counts/frequency on the vertical axis]
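A histogram is just a count of how many values land in each bin. The sketch below shows one way to compute and print it with the standard library only; the helper `histogram`, the bin range, and the Gaussian test data are hypothetical choices for this example.

```python
import random

def histogram(data, n_bins, lo, hi):
    """Count how many values fall in each of n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        if lo <= x <= hi:
            # A value exactly equal to hi is placed in the last bin.
            k = min(int((x - lo) / width), n_bins - 1)
            counts[k] += 1
    return counts

rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(1000)]
counts = histogram(data, n_bins=8, lo=-4.0, hi=4.0)

# Crude text rendering: one '#' per 10 observations in the bin.
for k, c in enumerate(counts):
    left = -4.0 + k
    print(f"[{left:+.0f}, {left + 1:+.0f}) {'#' * (c // 10)}")
```

The bar lengths give the rough sense of density mentioned above: tall near zero and short in the tails for this Gaussian data.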
Inference
• The true population or process is modeled probabilistically
• Sampling supplies us with realizations from the probability model
• We compute something from the sample, but recognize that we could just as easily have gotten a different set of realizations
• We want to infer the characteristics of the true probability model from our one sample
The Law of Large Numbers
• The sample mean converges to the population mean as the sample size gets large
• Holds for any probability density function with a finite mean
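The convergence can be checked numerically. This is a minimal sketch assuming draws from a uniform distribution on [0, 1), whose population mean is 0.5; the helper `sample_mean` and the sample sizes are illustrative choices.

```python
import random

def sample_mean(m, rng):
    """Mean of m independent draws from U(0, 1), whose population mean is 0.5."""
    return sum(rng.random() for _ in range(m)) / m

rng = random.Random(0)
for m in (10, 1_000, 100_000):
    print(f"m = {m:>7}: sample mean = {sample_mean(m, rng):.4f}")
```

The printed means should settle closer to 0.5 as m grows, although any single run can wobble at small m.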
Sample Mean and Sample Size
• Sample mean and sample variance
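The formulas on this slide did not survive extraction. The standard definitions for a sample x_1, …, x_m are given below as an assumption about what the slide showed; the symbols m, x̄, and s² follow the usual convention rather than anything recoverable from the text.

```latex
\bar{x} = \frac{1}{m}\sum_{i=1}^{m} x_i,
\qquad
s^2 = \frac{1}{m-1}\sum_{i=1}^{m}\left(x_i - \bar{x}\right)^2

% For independent draws from a population with mean \mu and variance \sigma^2:
\mathbb{E}\left[\bar{x}\right] = \mu,
\qquad
\operatorname{Var}\left[\bar{x}\right] = \frac{\sigma^2}{m}
```

The second line is the link to sample size: the spread of the sample mean shrinks in proportion to 1/m.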
The Central Limit Theorem
• The sample mean (not the individual samples) is approximately normally distributed as the sample size m → ∞
• More samples provide more confidence (or less uncertainty)
• Note: this holds regardless of the population distribution, provided its variance is finite
Uniform Distribution: x ~ U(0, 1)
Sample Size
Variance Gets Smaller as m Gets Larger
• The distribution of the sample mean appears approximately Gaussian
• Numerically demonstrates that the sample mean follows a Gaussian distribution
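A numerical demonstration along these lines can be sketched as follows. It assumes draws from U(0, 1), whose variance is 1/12, and compares the observed variance of the sample mean with the predicted value (1/12)/m. The helper `sample_means`, the trial count, and the one-standard-deviation check are illustrative assumptions, not the slides' own code.

```python
import math
import random
import statistics

def sample_means(m, n_trials, rng):
    """Return n_trials sample means, each computed from m draws of U(0, 1)."""
    return [sum(rng.random() for _ in range(m)) / m for _ in range(n_trials)]

rng = random.Random(0)
for m in (1, 10, 100):
    means = sample_means(m, 5_000, rng)
    observed_var = statistics.pvariance(means)
    predicted_var = (1 / 12) / m  # variance of U(0, 1) is 1/12
    sd = math.sqrt(predicted_var)
    # Fraction of sample means within one predicted standard deviation of 0.5.
    within_1sd = sum(abs(x - 0.5) <= sd for x in means) / len(means)
    print(f"m = {m:>3}: var = {observed_var:.5f} (predicted {predicted_var:.5f}), "
          f"within 1 sd = {within_1sd:.3f}")
```

For m = 1 the fraction within one standard deviation is near 0.58, the value for a uniform distribution; as m grows it moves toward roughly 0.68, the value for a Gaussian.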
Multivariate Statistics
• m observations x_1, x_2, ⋯, x_m
Correlation of Two Random Variables
• Correlation
  – Strength of the linear relationship between two variables, x and y
• Assume [the assumption was given as an equation on the slide and did not survive extraction]
Correlation Coefficient
• +1 → close to a straight line with positive slope
• −1 → close to a straight line with negative slope
• Indicates how close the data are to a straight line, but
  – gives no information about the magnitude of the slope
  – says nothing about causality
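The point about slope can be made concrete with a small sketch. The function `pearson_r` below implements the usual Pearson correlation coefficient, assumed here to be the coefficient the slides refer to; the data sets `steep`, `shallow`, and `falling` are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of xs and ys, normalized by their spreads."""
    m = len(xs)
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
steep = [10.0 * x + 1.0 for x in xs]    # exact line, large positive slope
shallow = [0.1 * x + 1.0 for x in xs]   # exact line, small positive slope
falling = [-2.0 * x + 1.0 for x in xs]  # exact line, negative slope

print(round(pearson_r(xs, steep), 6))
print(round(pearson_r(xs, shallow), 6))
print(round(pearson_r(xs, falling), 6))
```

Both rising lines give a coefficient of 1 even though their slopes differ by a factor of 100, and the falling line gives −1: the coefficient records direction and closeness to a line, not steepness.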
Correlation Coefficient Plot
• Plots correlation coefficients among pairs of variables
• http://rpsychologist.com/d3/correlation/
Covariance Matrix
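The content of this slide was not captured in the extracted text. As a minimal sketch of the idea, the example below computes a sample covariance matrix, assuming the common convention of m observations stored as rows and division by m − 1. The helper `covariance_matrix` and the toy data are hypothetical.

```python
def covariance_matrix(rows):
    """Sample covariance matrix of m observations (rows), each with d variables."""
    m = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / m for j in range(d)]
    return [
        [
            sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / (m - 1)
            for k in range(d)
        ]
        for j in range(d)
    ]

# Toy data: the second variable is exactly twice the first.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
cov = covariance_matrix(data)
print(cov)  # [[1.0, 2.0], [2.0, 4.0]]
```

The diagonal entries are the variances of each variable and the off-diagonal entries are the covariances between pairs, so the matrix is symmetric.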
