SLIDE 1

The Research Journey

CSCI 8901: Research & Evaluation Methods

Prof. Tim Wood, GWU

These slides include material from a similar course by David Jensen, UMass

SLIDE 2

Tim Wood - The George Washington University - Department of Computer Science

HMS Beagle

SLIDE 3

Charles Darwin

Born 1809

  • Went to college in 1828, but “preferred riding and shooting to studying”. Collected beetles.

Applied to be the naturalist on HMS Beagle in 1831

SLIDE 4

Charles Darwin

Born 1809

  • Went to college in 1828, but “preferred riding and shooting to studying”. Collected beetles.

Applied to be the naturalist on HMS Beagle in 1831

  • Captain almost rejected his application:

“He was… convinced that he could judge a man’s character by the outline of his features; and he doubted wheather [sic] anyone with my nose could possess sufficient energy and determination for the voyage.”

SLIDE 5

5 Year Journey

1831 - 1836 (a short PhD?)

SLIDE 6

Evolution

Ideas came together after his trip

In 1838 read work by Malthus on population growth

  • Should be exponential!

But populations tend to be stable…

  • Which ones will survive?
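
A minimal numerical sketch of the tension on this slide: unchecked growth is exponential, while growth limited by resources levels off. The function names, the discrete logistic model, and every number below (starting size, growth rate, carrying capacity) are assumptions added for illustration; they are not figures from Malthus or Darwin.

```python
def exponential(n0, rate, generations):
    """Unchecked growth: each generation multiplies the population by (1 + rate)."""
    sizes = [n0]
    for _ in range(generations):
        sizes.append(sizes[-1] * (1 + rate))
    return sizes


def resource_limited(n0, rate, capacity, generations):
    """Discrete logistic growth: the increase shrinks as the population nears capacity."""
    sizes = [n0]
    for _ in range(generations):
        n = sizes[-1]
        sizes.append(n + rate * n * (1 - n / capacity))
    return sizes


# Hypothetical values chosen only to make the contrast visible.
unchecked = exponential(10.0, 0.5, 30)
limited = resource_limited(10.0, 0.5, 1000.0, 30)

for generation in (0, 10, 20, 30):
    print(generation, round(unchecked[generation]), round(limited[generation]))
```

Under these assumptions the unchecked population keeps multiplying while the limited one settles near the capacity, so most potential offspring cannot survive. That gap is what raises the question of which ones do.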

SLIDE 7

Journey of Ideas

Fact 1: Potential exponential increase of population (Paley, Malthus, etc.)
Fact 2: Observed steady-state stability of populations (observations)
Fact 3: Limitation of resources (observations, Malthus)

Source: E. Mayr (1991). One Long Argument. Harvard. via David Jensen

SLIDE 8

Journey of Ideas

Fact 1: Potential exponential increase of population (Paley, Malthus, etc.)
Fact 2: Observed steady-state stability of populations (observations)
Fact 3: Limitation of resources (observations, Malthus)
Inference 1: Struggle for existence among individuals (Malthus)
Fact 4: Uniqueness of individuals (animal breeders, taxonomists)
Fact 5: Heritability of individual variation (animal breeders)

Source: E. Mayr (1991). One Long Argument. Harvard. via David Jensen

SLIDE 9

Journey of Ideas

Fact 1: Potential exponential increase of population (Paley, Malthus, etc.)
Fact 2: Observed steady-state stability of populations (observations)
Fact 3: Limitation of resources (observations, Malthus)
Inference 1: Struggle for existence among individuals (Malthus)
Fact 4: Uniqueness of individuals (animal breeders, taxonomists)
Fact 5: Heritability of individual variation (animal breeders)
Inference 2: Differential survival or Natural selection (Darwin)
Inference 3: Through many generations => evolution (Darwin)

Source: E. Mayr (1991). One Long Argument. Harvard. via David Jensen

SLIDE 10

Why is science hard?

Intrinsic:

  • Science is about discovery and thus inherently about something that is unknown

Personal:

  • We as scientists make mistakes, have biases, get distracted, etc.

Communal:

  • Progress depends on many researchers coming together, yet our communities don’t always recognize important work or share information

Following slides from D. Jensen, Research Methods, UMass

SLIDE 11

All possible theories
Theories that are actually true

SLIDE 12

Theories that are actually true
Theories we think are true

SLIDE 13

Theories that are actually true
Theories we think are true
Theories we think we have tested well

SLIDE 14

Theories that are actually true
Theories we think are true
Theories we think we have tested well
Theories we have even considered

SLIDE 15

Darwin’s Natural Selection

Took a 5-year journey around the world, plus 23 years of further study and data gathering

  • Broke from prior theories

Proposed a new theory based on extensive evidence

SLIDE 16

Darwin’s Intrinsic Challenges

Took a 5-year journey around the world, plus 23 years of further study and data gathering

Proposed a new theory based on extensive evidence

Theories that are actually true
Theories we think are true
Theories we think we have tested well
Theories we have even considered

SLIDE 17

Why is science hard?

Intrinsic:

  • Science is about discovery and thus inherently about something that is unknown

Personal:

  • We as scientists make mistakes, have biases, get distracted, etc.

Communal:

  • Progress depends on many researchers coming together, yet our communities don’t always recognize important work or share information

SLIDE 18

Researcher Sins

Slop:

  • Doing research in such a way that it is impossible to know for certain what was done or observed;
  • Confused or unclear procedures and data-recording techniques;
  • Imprecise theorizing, unexpressed assumptions, and informal derivation of predictions.

Sloth:

  • Doing too little;
  • Laziness such that important potential data are not obtained or recorded;
  • Partial or incomplete analysis of data.

From: Donald D. Jensen (circa 1995), Unpublished lecture notes. University of Nebraska - Lincoln via David Jensen, UMass

SLIDE 19

Researcher Sins

Precipitance:

  • Jumping to a conclusion;
  • Premature decision on an issue;
  • Accepting as established something that deserves further investigation.

Propaganda:

  • Biased presentation of a theory or data;
  • Also called "special pleading";
  • Acting as a proponent rather than a disinterested presenter of facts and interpretation;
  • Salesmanship rather than science.

From: Donald D. Jensen (circa 1995), Unpublished lecture notes. University of Nebraska - Lincoln via David Jensen, UMass

SLIDE 20

Researcher Sins

Prejudice:

  • Biased evaluation of theory and data;
  • Expecting more of other theories than of one's own;
  • "Tilting the playing field" in favor of one's own theory.

Perseveration:

  • Holding to a theory despite clear evidence that it is false.


From: Donald D. Jensen (circa 1995), Unpublished lecture notes. University of Nebraska - Lincoln via David Jensen, UMass

SLIDE 21

Researcher Sins

Finagle:

  • “Adjusting” data so that it fits a favored theory. Minor fraud.

Filch:

  • Stealing ideas or data without giving appropriate credit;
  • Plagiarism or other unauthorized use of the work of others.

Fraud:

  • Falsifying data and investigation


From: Donald D. Jensen (circa 1995), Unpublished lecture notes. University of Nebraska - Lincoln via David Jensen, UMass

SLIDE 22

Integrity and Ethics

These are incredibly important!

  • If people question your integrity, they will doubt all of your science! And rightfully so!

We can only make progress if we can trust each other and trust each other’s results!

  • Never violate this trust!

It is always better to be late/wrong/not the best than to lie/cheat for temporary success

It doesn’t matter if you get caught or not

  • Always try to do the right thing

SLIDE 23

Ethics and Integrity

We can only make progress if we can trust each other and trust each other’s results!

  • Never violate this trust!

These are incredibly important!

  • If people question your integrity, they will doubt all of your science! And rightfully so!

It is always better to be late/wrong/not the best than to lie/cheat for temporary success

It doesn’t matter if you get caught or not

  • Always try to do the right thing

SLIDE 24

Darwin’s Personal Challenges

Had many personal difficulties in his life

  • Recurrent illnesses, several of his children died

Lack of focus?

  • Hard to say if this was a strength or a weakness
  • Spent many years on less impactful work like barnacles

SLIDE 25

Darwin’s Personal Challenges

Had many personal difficulties in his life

  • Recurrent illnesses, several of his children died

Lack of focus?

  • Hard to say if this was a strength or a weakness
  • Spent many years on less impactful work like barnacles

But his barnacle classification schemes inspired all his later work!

Didn’t feel a rush to complete his work

  • Helps to be an independently wealthy English Gentleman
  • Delayed several years and only published when he realized Wallace was reaching similar conclusions!

SLIDE 26

Why is science hard?

Intrinsic:

  • Science is about discovery and thus inherently about something that is unknown

Personal:

  • We as scientists make mistakes, have biases, get distracted, etc.

Communal:

  • Progress depends on many researchers coming together, yet our communities don’t always recognize important work or share information

SLIDE 27

Research Communities

Historically were not broadly inclusive

Getting better, but still tends to be some bias towards past stars

Fast advances in CS make it difficult to keep up

  • Hard to judge what will have lasting effect
  • Good work can get ignored or overlooked

Fragmentation between communities limits progress and sharing of knowledge

SLIDE 28

Systemic Challenges

Research is primarily funded by grants

  • Most are short term, 3-5 years
  • Is that enough time to make a difference?

Race to get more publications

  • Hard to judge impact in short term; is being at a good conference all that matters?

Encourages Minimal Publishable Unit (MPU)

SLIDE 29

Darwin’s Community Challenges

Prejudice and skepticism were major deterrents

“Vestiges of the Natural History of Creation”

  • Published 1844 anonymously
  • Proposed transmutation: a single linear chain of evolution, culminating in the white, English man…
  • Suggested this was not necessarily guided by an active god!
  • Quickly became popular for its radical ideas, but…

Darwin delayed publication for years because of an unwelcome community!

"The Vestiges of the Natural History of Creation," has started into public favour with a fair chance of poisoning the fountains of science, and sapping the foundations of religion. (Sir David Brewster)

SLIDE 30

How to overcome?

Intrinsic:

  • Science is about discovery and thus inherently about something that is unknown

Personal:

  • We as scientists make mistakes, have biases, get distracted, etc.

Communal:

  • Progress depends on many researchers coming together, yet our communities don’t always recognize important work or share information

Solution:

  • Systematic research methodology that avoids bias and lets us empirically validate our ideas and their impact!

SLIDE 31

Project Report 1

Present the problem context

  • What is difficult? Why is it important to solve?

Describe the System, Task, Environment, and Behavior you will study

  • High level explanation, not enough detail to implement

~4 pages, double spaced

  • 1 page of problem statement
  • ~0.75 pages per section

Due: Tuesday 2/12, 11:59PM

System, Task, Environment, Behavior

SLIDE 32

Speech Practice

Timing: meeting time constraints (1:30)
Body language: posture and gestures
Eye contact:
Voice modulation
Bad Words: uh, ums, well, basically, like, so, and
Smiling / or happiness
Volume:
Quality/Content: (not a big deal now)
Knowing audience:
Engagement/Confidence: come from the above

SLIDE 33

Paper Reading

Use Zotero to track papers you are reading

How many papers do you currently read each week?

  • Try to skim 1-2 more!

Create your folders under People

  • YourName/Read and YourName/To Read

Import papers you read

Add a Note with a brief writeup (to help you remember)

SLIDE 34

Acknowledgements

Much of the slide content is derived from the Research Methods for Empirical Computer Science course taught by David Jensen

  • http://dx.doi.org/11084/10002
  • https://people.cs.umass.edu/~jensen/courses/index.html
  • https://people.cs.umass.edu/~jensen
  • Many thanks for allowing me to make use of his materials!
