Machine Learning Introduction

Stanley Liang, PhD, York University
6/2/2017

What is Machine Learning?

Definition
• “The goal of machine learning is to program computers to use example data or past experience to solve a given problem. Many successful applications of machine learning exist already, including systems that analyze past sales data to predict customer behavior, optimize robot behavior so that a task can be completed using minimum resources, and extract knowledge from bioinformatics data.” – ACM (Association for Computing Machinery)
• “How do we create computer programs that improve with experience” – Tom Mitchell
• “Machine learning is the science of getting computers to act without being explicitly programed.” – Andrew Ng

Typical Machine Learning Topics

Supervised Learning
Supervised learning is a type of machine learning algorithm that uses a known dataset (called the training dataset) to make predictions. The training dataset includes input data and response values. From it, the supervised learning algorithm seeks to build a model that can predict the response values for a new dataset.

Classification / Regression
• Classification: for categorical response values, where the data can be separated into specific “classes”.
• Regression: for continuous response values.

Unsupervised Learning
Unsupervised learning is a type of machine learning algorithm used to draw inferences from datasets consisting of input data without labeled responses.

Clustering, semi-supervised learning
• Clustering: a method of exploratory data analysis that finds hidden patterns or groupings in data. The clusters are modeled using a measure of similarity defined on metrics such as Euclidean or probabilistic distance.
• Semi-supervised learning: a class of supervised learning tasks and techniques that also make use of unlabeled data for training, typically a small amount of labeled data with a large amount of unlabeled data.

Supervised Learning Algorithms
• Supervised learning builds a model that makes predictions based on evidence in the presence of uncertainty.
• As adaptive algorithms identify patterns in data, a computer "learns" from the observations. When exposed to more observations, the computer improves its predictive performance.
• Typical Algorithms
– Classification: Decision Trees, Discriminant Analysis, Naive Bayes, k Nearest Neighbors (kNN), Support Vector Machines (SVM), Classification Ensembles (a predictive model composed of a weighted combination of multiple classification models)
– Regression: Linear Regression, Generalized Linear Models, Nonlinear Regression, Support Vector Machines (SVM), Gaussian Process Regression Models, Decision Trees, Regression Tree Ensembles
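The two supervised tasks named above can be sketched in a few lines. This is a minimal illustration only, assuming made-up toy data and hypothetical helper names (`knn_predict`, `fit_line`); it uses k Nearest Neighbors for classification and one-variable least squares for regression, and is not taken from the slides.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classification: majority vote among the k nearest training points."""
    # Rank training samples by Euclidean distance to the query point
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Toy training dataset (made up): inputs with categorical responses
train_X = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b"]
pred_low = knn_predict(train_X, train_y, (1.1, 0.9))
pred_high = knn_predict(train_X, train_y, (4.9, 5.0))

# Toy training dataset (made up): inputs with continuous responses
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])

print(pred_low, pred_high, slope, intercept)
```

In both cases the model is built from the training dataset (inputs plus known responses) and then queried with inputs it has not seen.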

Unsupervised Learning Algorithms
• Unsupervised learning is a type of machine learning algorithm used to draw inferences from datasets consisting of input data without labeled responses.
• Typical Algorithms
– Hierarchical clustering: builds a multilevel hierarchy of clusters by creating a cluster tree
– k-Means clustering: partitions data into k distinct clusters based on distance to the centroid of a cluster
– Gaussian mixture models: models clusters as a mixture of multivariate normal density components
– Self-organizing maps: uses neural networks that learn the topology and distribution of the data
– Hidden Markov models: uses observed data to recover the sequence of states
• Unsupervised learning methods are used in bioinformatics for sequence analysis and genetic clustering; in data mining for sequence and pattern mining; in medical imaging for image segmentation; and in computer vision for object recognition.

Machine Learning Flowchart (figure)
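Of the unsupervised algorithms listed above, k-means is the easiest to sketch. The version below is a minimal illustration, assuming made-up 2-D points and caller-supplied starting centroids (the function name `kmeans` and its parameters are hypothetical); practical implementations usually add smarter initialization and multiple restarts.

```python
import math

def kmeans(points, centroids, iters=10):
    """Alternate assignment and update steps until the centroids stop moving."""
    for _ in range(iters):
        # Assignment step: each point joins the cluster of its nearest centroid
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Unlabeled toy data (made up): two visibly separate groups
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels, centroids = kmeans(points, [points[0], points[3]])
print(labels, centroids)
```

No response values are supplied; the grouping comes only from the distances between the inputs.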

Machine Learning Apps
• Classification Learner App
• Regression Learner App

Under-fitting vs. Over-fitting
• Underfitting occurs when a statistical model or machine learning algorithm cannot capture the underlying trend of the data.
• Underfitting would occur when fitting a linear model to non-linear data. Such a model would have poor predictive performance.
• Overfitting occurs when a model is excessively complex, such as having too many parameters relative to the number of observations.
• An overfitted model has poor predictive performance, as it overreacts to minor fluctuations in the training data.
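The contrast can be made concrete with a small experiment. The sketch below is illustrative only and rests on assumptions: the data are made up (a quadratic trend with a fixed ±3 perturbation standing in for noise), a straight line plays the underfitting model, and 1-nearest-neighbor regression is used as a stand-in for an overly flexible model that memorizes the training data. All function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def nearest_neighbor(xs, ys, x):
    """1-NN regression: return the response of the closest training input."""
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x))
    return ys[i]

def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

# Training data (made up): quadratic trend plus an alternating +3 / -3 perturbation
train_x = [float(i) for i in range(10)]
train_y = [x ** 2 + (3 if i % 2 == 0 else -3) for i, x in enumerate(train_x)]
# New observations from the same trend (kept noise-free for simplicity)
test_x = [i + 0.25 for i in range(9)]
test_y = [x ** 2 for x in test_x]

slope, intercept = fit_line(train_x, train_y)
line_train = mse([slope * x + intercept for x in train_x], train_y)
line_test = mse([slope * x + intercept for x in test_x], test_y)

nn_train = mse([nearest_neighbor(train_x, train_y, x) for x in train_x], train_y)
nn_test = mse([nearest_neighbor(train_x, train_y, x) for x in test_x], test_y)

print("line: train", line_train, "test", line_test)
print("1-NN: train", nn_train, "test", nn_test)
```

The line cannot follow the curved trend, so its error is nonzero even on the training data. The 1-NN model reproduces every training response exactly, perturbation included, yet its error on the new observations is no longer zero.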

Evaluate a Machine Learning Model
• Cross validation: a model validation technique for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used to estimate how accurately a predictive model will perform in practice.
• Confusion matrix: a specific table layout that allows visualization of the performance of an algorithm. Each column of the matrix represents the instances in a predicted class while each row represents the instances in an actual class.
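Both evaluation tools can be sketched briefly. The example below is a minimal illustration, assuming made-up labels and hypothetical helper names (`confusion_matrix`, `k_fold_indices`); it follows the layout described above, with rows for actual classes and columns for predicted classes, and shows one simple way to produce k-fold train/test splits.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def k_fold_indices(n, k):
    """Yield (train_idx, test_idx); every sample is held out exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train_idx, test_idx

# Made-up labels for six samples
actual = ["cat", "cat", "dog", "dog", "dog", "cat"]
predicted = ["cat", "dog", "dog", "dog", "cat", "cat"]

cm = confusion_matrix(actual, predicted, ["cat", "dog"])
# The diagonal holds the correctly classified samples
accuracy = sum(cm[i][i] for i in range(len(cm))) / len(actual)

# 3-fold cross validation splits for the same six samples
splits = list(k_fold_indices(len(actual), 3))

print(cm, accuracy)
for train_idx, test_idx in splits:
    print("train on", train_idx, "evaluate on", test_idx)
```

In a full cross validation run, a model would be fitted on each `train_idx` subset and scored on the matching `test_idx` subset, and the scores averaged to estimate performance on independent data.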
