Conclusions Larry Holder CptS 570 Machine Learning School of - - PowerPoint PPT Presentation




SLIDE 1

Conclusions

Larry Holder
CptS 570 – Machine Learning
School of Electrical Engineering and Computer Science
Washington State University

SLIDE 2

Outline

• Overview of machine learning
• Fundamental research issues
• Grand challenge problems

SLIDE 3

Overview of Machine Learning

• Supervised learning
• Evaluation of learning methods
• Learning theory
• Unsupervised learning
• Other learning methods
• Applications
• Related fields

SLIDE 4

Supervised Learning

• Traditional methods
  • Version space
  • Candidate elimination algorithm
  • Decision tree induction
  • Neural networks
  • Bayesian learning
  • Instance-based learning

SLIDE 5

Supervised Learning

Advanced methods

• Kernel methods
  • Support vector machines
• Ensembles
  • Bagging
  • Boosting
• Learning rule sets
  • Relational learning
  • Inductive logic programming (ILP)
  • Graph-based learning

SLIDE 6

Evaluation of Learning Methods

• True error vs. sample error
• Bounding true error
• Comparison of hypotheses
• Comparison of learners
• Significance testing
• ROC curves
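As a rough illustration of bounding true error from sample error, the sketch below computes the usual normal-approximation confidence interval for a hypothesis's true error, given its error count on n independently drawn test examples. The function name `true_error_interval` and the example numbers are assumptions made for illustration; the slides do not give a formula or data.

```python
import math
from statistics import NormalDist


def true_error_interval(num_errors, n, confidence=0.95):
    """Approximate two-sided confidence interval for the true error.

    Relies on the normal approximation to the binomial distribution,
    so it is only reasonable for fairly large n (a common rule of
    thumb is n >= 30) and test examples drawn independently of the
    hypothesis being evaluated.
    """
    if n <= 0 or not 0 <= num_errors <= n:
        raise ValueError("need n > 0 and 0 <= num_errors <= n")
    sample_error = num_errors / n
    # z-value that leaves (1 - confidence) / 2 probability in each tail
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(sample_error * (1 - sample_error) / n)
    return max(0.0, sample_error - half_width), min(1.0, sample_error + half_width)


# Hypothetical example: 12 misclassifications on 40 test examples.
low, high = true_error_interval(12, 40)
print(f"sample error 0.30, 95% interval roughly [{low:.3f}, {high:.3f}]")
```

The interval narrows as n grows, which is why a sample error measured on a small test set says little about the true error.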

SLIDE 7

Learning Theory

• Bayes optimal learning
• Sample complexity
• PAC learning framework
• VC dimension
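To make sample complexity concrete, here is a minimal sketch of one well-known PAC bound: a learner that outputs a hypothesis consistent with the training data, drawn from a finite hypothesis space H, needs at most (1/ε)(ln|H| + ln(1/δ)) examples to have error at most ε with probability at least 1 − δ. The slides do not state this bound; the function name `pac_sample_bound` and the example hypothesis space are assumptions for illustration.

```python
import math


def pac_sample_bound(hypothesis_space_size, epsilon, delta):
    """Sufficient number of training examples for a consistent learner
    over a finite hypothesis space: m >= (1/eps) * (ln|H| + ln(1/delta)).
    """
    if hypothesis_space_size < 1 or not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("need |H| >= 1, 0 < epsilon < 1, 0 < delta < 1")
    m = (math.log(hypothesis_space_size) + math.log(1 / delta)) / epsilon
    return math.ceil(m)


# Hypothetical example: conjunctions over 10 boolean attributes, where each
# attribute appears positively, negatively, or not at all, so |H| = 3**10.
print(pac_sample_bound(3 ** 10, epsilon=0.1, delta=0.05))
```

The bound grows only logarithmically in |H| and 1/δ but linearly in 1/ε, so demanding a lower error is what drives the number of examples up.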

SLIDE 8

Unsupervised Learning

• Non-linear regression
• Pattern discovery
• Clustering
• Grammar (language) learning
• EM algorithm

SLIDE 9

Other Learning Methods

• Genetic algorithms
• Analytical learning
• Reinforcement learning
• Integrated learning

SLIDE 10

Applications

• Classification and prediction
  • Chemical properties
  • Biometrics
  • Object recognition
  • Organizational and behavioral patterns
• Skill acquisition
  • Robot navigation
  • Control and optimization
  • Heuristic search

SLIDE 11

Related Fields

• Statistics
• Pattern recognition
• Control theory
• Cognitive science
• Psychology
• Neurophysiology

SLIDE 12

Fundamental Research Issues

• General learning methods
• Limits of general methods
• Theory and principles guiding development of domain-specific learning algorithms
• Multi-relational learning
• Learning in dynamic environments
• Incorporation of domain-specific background knowledge
• Ethical responsibility and privacy

SLIDE 13

Grand Challenge Problems

• “What are the Grand Challenges for Data Mining,” SIGKDD Explorations, 8(2):70-77, 2006.
• KDD 2006 conference panel
  • G. Piatetsky-Shapiro, C. Djeraba, L. Getoor, R. Grossman, R. Feldman, M. Zaki
• GC problems define directions for the field and motivate and excite researchers
  • E.g., Netflix Prize

SLIDE 14

Good Grand Challenge Problems

• Problem is hard – very difficult to solve given the current state of the art
• Based on a large, publicly available data set
• There is a specific goal – it is clear when the problem is solved
• Problem is interesting to researchers and understandable to the public; preferably stated in one sentence
• There is significant public benefit if it is solved

SLIDE 15

Grand Challenge Problem (1)

• Automatically annotate 1000 hours of digital video in 1 hour
  • E.g., “basketball game”, “Michael Jordan”
• General approach
  • Automatically extract primitive features
  • Manually annotate subset of videos
  • Learn to predict annotations based on features
  • Use learned classifiers to annotate subsequent videos
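The four steps of the general approach can be sketched end to end on toy data. This is only an illustration under loud assumptions: the two-number feature vectors, the video ids, the "news broadcast" label, and the nearest-centroid classifier are all made up here, and the slides do not say which features or learner to use. A real system would also need to run about 1000 times faster than real time to meet the stated goal.

```python
import math
from collections import defaultdict


def extract_features(video):
    # Stand-in for automatic primitive feature extraction (hypothetical):
    # here each toy "video" already carries a small feature vector.
    return video["features"]


def train_centroids(labeled_examples):
    """Learn one mean feature vector (centroid) per annotation label."""
    grouped = defaultdict(list)
    for features, label in labeled_examples:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def annotate(features, centroids):
    """Predict the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


# Toy collection; ids and feature values are invented for illustration.
videos = {
    "v1": {"features": [0.9, 0.1]},
    "v2": {"features": [0.8, 0.2]},
    "v3": {"features": [0.1, 0.9]},
    "v4": {"features": [0.2, 0.7]},
    "v5": {"features": [0.85, 0.15]},
    "v6": {"features": [0.15, 0.8]},
}

# Step 2: a manually annotated subset.
manual_annotations = {
    "v1": "basketball game",
    "v2": "basketball game",
    "v3": "news broadcast",
    "v4": "news broadcast",
}

# Steps 1 and 3: extract features and learn from the annotated subset.
labeled = [
    (extract_features(videos[vid]), label)
    for vid, label in manual_annotations.items()
]
centroids = train_centroids(labeled)

# Step 4: annotate the remaining videos with the learned classifier.
predictions = {
    vid: annotate(extract_features(video), centroids)
    for vid, video in videos.items()
    if vid not in manual_annotations
}
print(predictions)
```

Only the manually annotated subset requires human effort; everything after training is automatic, which is what makes the throughput target conceivable at all.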

SLIDE 16

Grand Challenge Problem (2)

• Functional annotation of the proteome, the set of proteins in the cell
  • What is the function of a protein (e.g., insulin production, metabolism)?
  • What other proteins does it interact with?
• 100,000+ proteins, some with multiple functions
• Approach: Link mining, “guilt” by association
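One simple reading of “guilt” by association is that an unannotated protein probably shares a function with the proteins it interacts with. The sketch below implements that idea as a majority vote over interaction partners. The protein ids (P1, P2, ...), the interaction map, and the voting rule are assumptions for illustration, not the method described in the panel report.

```python
from collections import Counter


def predict_function(protein, interactions, known_functions):
    """Guess a protein's function by majority vote over the known
    functions of its interaction partners.

    Returns None when no partner has a known function. Ties are broken
    by whichever function was counted first, which is arbitrary.
    """
    votes = Counter()
    for neighbor in interactions.get(protein, ()):
        # A partner may have several functions; each one gets a vote.
        for function in known_functions.get(neighbor, ()):
            votes[function] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical interaction network and annotations.
interactions = {
    "P5": ["P1", "P2", "P3"],
    "P6": ["P4"],
}
known_functions = {
    "P1": {"metabolism"},
    "P2": {"metabolism", "insulin production"},
    "P3": {"metabolism"},
}

print(predict_function("P5", interactions, known_functions))  # three votes for metabolism, one for insulin production
print(predict_function("P6", interactions, known_functions))  # P4 has no known function
```

Because some proteins have multiple functions, a fuller version would return every function above a vote threshold instead of a single winner.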

SLIDE 17

Grand Challenge Problem (3)

• System capable of passing SAT reading comprehension test given access to the World-Wide Web
• Approach
  • Entity and relation extraction
  • Natural language understanding
  • Relational rule learning
  • Reasoning
• Automated student

SLIDE 18

Conclusions

• Machine learning seeks to give computers the ability to improve their performance based on experience
• Many mature methods available and some theoretical results
• Basis of multi-billion dollar data mining industry
• Much research left to be done
