  1. TDDD89 Lecture 4 - Research methods Ola Leifler

  2. 2 Literature • Cohen, Paul. Empirical Methods in Artificial Intelligence • Experimentation in Software Engineering • Case Study Research in Software Engineering • Weapons of Math Destruction

  3. 3 What is a scientific method? • Design, implement, test? • Acquire data, aggregate, visualise? • …

  4. 4 Different types of methods • Qualitative methods: establish concepts, describe a phenomenon, find a vocabulary, create a model • Quantitative methods: make statistical analyses, quantify correlations, …

  5. 5 Human-Centered methods • Surveys • Interviews • Observations • Think-aloud sessions • Competitor analysis • Usability evaluation • …

  6. 6 Method choice? • What do you want to find out more about? • Identify the stakeholders (users, customers, and purchasers) • Identify their needs

  7. 7 Interviews • Structured or unstructured? • Group interviews (focus groups) or individual interviews? • Telephone interviews

  8. 8 • Use open-ended questions: – ”Do you like your job?” vs ”What do you think about your job?” • Active listening • Record the interview – plan and schedule for that!

  9. 9 Interview analysis • Transcribe or not? • Categorize what has been said (encode)

  10. 10 Observations • Understand the context • Write down what you see, hear, and feel • Take pictures • Combine with interviews • Ask users to use systems if available

  11. 11 Usability evaluation • System usability scale (SUS) • Post-Study System Usability Questionnaire (PSSUQ) • Heuristic evaluations • Eye tracking • First-click testing • …

  12. 12 • System usability scale (SUS) Note the differences
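
The scoring of SUS is standardized regardless of wording variant: each odd-numbered (positively worded) item contributes (response - 1), each even-numbered (negatively worded) item contributes (5 - response), and the sum is multiplied by 2.5 to give a 0-100 score. A minimal sketch in Python (the example responses are made up):

```python
def sus_score(responses):
    """SUS score (0-100) from ten Likert responses (1-5), item 1 first."""
    assert len(responses) == 10
    total = 0
    for item, r in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5

# One (hypothetical) respondent's answers to items 1..10:
print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))  # -> 85.0
```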

  13. 13 Usability performance measurement • Task success • Time (time/task) • Effectiveness (errors/task) • Efficiency (operations/task) • Learnability (performance change)
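
These measures drop straight out of logged test sessions. A minimal sketch in Python; the session records and field layout are hypothetical, just to show the arithmetic:

```python
# Hypothetical log: (task_id, completed, seconds, errors, operations)
sessions = [
    ("t1", True, 42.0, 1, 12),
    ("t1", False, 90.0, 4, 25),
    ("t2", True, 30.5, 0, 9),
]

n = len(sessions)
print("task success:", sum(s[1] for s in sessions) / n)
print("time/task:", sum(s[2] for s in sessions) / n)
print("errors/task (effectiveness):", sum(s[3] for s in sessions) / n)
print("operations/task (efficiency):", sum(s[4] for s in sessions) / n)
# Learnability: compare these numbers between a first and a later session.
```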

  14. 14 Describing a method • Don’t write a diary: ”To implement a Flux controller, I first needed to learn about Flux” • Write what convinces the reader that you have done a good job: ”The Flux controller was evaluated using the Flux controller evaluation protocol [1]”

  15. 15 Engineering method vs scientific method
     Method question: Can I trust your work?
     – Engineering aspect: Have you properly tested your solution?
     – Scientific aspect: Have you verified that you obtain the same data in different settings/scenarios?
     Method question: Can I build on your work?
     – Engineering aspect: Can I run/create the same system somewhere else?
     – Scientific aspect: Can I replicate the results of the study?

  16. 16 Case Study • Investigates a phenomenon in a context, • with multiple sources of information, • where the boundary between context and phenomenon may be unclear – Uses predominantly qualitative methods to study a phenomenon P. Runeson and M. Höst, “Guidelines for conducting and reporting case study research in software engineering,” Empirical Softw. Engg., vol. 14, pp. 131–164, Apr. 2009.

  17. 17 Experimental study design
     Experiment idea → Experiment goal → Hypothesis → Experiment planning → Experiment operation → Experiment analysis
     C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, and A. Wesslén, Experimentation in Software Engineering. Springer Berlin Heidelberg, 2012.

  18. 18 Experiment goal
     Template: Analyze <Object> for the purpose of <Purpose> with respect to their <Quality> from the point of view of the <Perspective> in the context of <Context>
     Example values:
     – Object: product, process, resource, model, metric, …
     – Purpose: evaluate choice of technique, describe process, predict cost, …
     – Quality: effectiveness, cost, …
     – Perspective: developer, customer, manager
     – Context: subjects (personnel) and objects (artifacts under study)

  19. 19 Experiment analysis • Null hypothesis H0: there are no underlying differences between two sets of data • Type I error: rejecting H0 even though H0 is true • Type II error: accepting H0 even though it is false

  20. 20 Example
     H0: ”Data-corrupting faults are as common as non-corrupting faults”
     Observed: 11 non-corrupting faults and 4 corrupting faults (15 in total)
     What is the probability of up to four corrupting faults under H0?
     \sum_{i=0}^{4} \binom{15}{i} \left(\tfrac{1}{2}\right)^{i} \left(\tfrac{1}{2}\right)^{15-i}
     What is the risk of a type I error, given a probability a (≠ 1/2) of the outcome?
     \sum_{i=0}^{4} \binom{15}{i} a^{i} (1-a)^{15-i}
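
A numerical check of the sums above, using scipy (the library choice is ours, not from the slides):

```python
from scipy.stats import binom

# H0: corrupting and non-corrupting faults are equally likely (p = 1/2).
# Observed: 4 corrupting faults out of 15.
print(binom.cdf(4, 15, 0.5))   # P(at most 4 corrupting | H0) ~ 0.059

# The same sum evaluated at some other probability a != 1/2
# of a corrupting fault, as in the second formula:
a = 0.3
print(binom.cdf(4, 15, a))
```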

  21. 21 Parametric vs nonparametric tests Can your data be described by an underlying (normal) probability distribution? https://en.wikipedia.org/wiki/Normal_distribution#/media/File:Normal_Distribution_PDF.svg

  22. 22 Choosing a test (decision chart)
     Can you assume a (parametric) distribution? → parametric tests; otherwise non-parametric tests
     One factor, one treatment/sample? → e.g. Chi-2, binomial test
     One factor, two treatments (paired comparison or randomized design)? → e.g. Mann-Whitney
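
A sketch of that decision in Python with scipy (library choice and data are ours): first check whether a normal distribution is plausible, then pick a parametric or non-parametric test accordingly:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.exponential(2.0, size=25)   # skewed sample, treatment A
y = rng.exponential(3.0, size=25)   # skewed sample, treatment B

# Shapiro-Wilk: small p-value -> normality is doubtful.
print(stats.shapiro(x).pvalue)

# Non-parametric comparison of the two samples:
print(stats.mannwhitneyu(x, y))

# One factor, one sample of categorical counts -> chi-squared test
# against equal expected frequencies (e.g. the fault counts above):
print(stats.chisquare([11, 4]))
```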

  23. 23 Statistical power • Power = 1 - risk of a type II error
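
Power ties together significance level, effect size and sample size. A sketch using statsmodels (the library choice and the effect size d = 0.5 are our assumptions):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power achieved with 20 subjects per group at alpha = 0.05:
print(analysis.power(effect_size=0.5, nobs1=20, alpha=0.05))

# Subjects per group needed to reach a power of 0.8:
print(analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8))
```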

  24. 24 Classification problems
     ”Given luminosity, hue and saturation regional values, determine whether the picture contains a face” (factors 1–3 → variable)
     ”Given that an image contains a face, determine luminosity, hue and saturation regional values”
     (slide figure: brain-scan results – distribution of gray matter volume for the left hippocampus and the brain regions exhibiting the largest sex differences, with per-subject columns running from a ”male end” through intermediate to a ”female end” for the 33% most extreme males and females in the sample)

  25. 25 Data analysis
     Overall question: ”Can AI agents be useful for physicians in cancer diagnosis?”
     Exploration: Which tasks are relevant to automate? What data can we train agents on? ”How can we efficiently generate training data?”
     Validation: ”What is the accuracy when detecting oesophageal tumors in MRI scans?”

  26. 26 Data analysis, exploration

     Trial  Wind speed  RTK   First plan  Num plans  Fireline built  Area burned  Finish time  Outcome
     1      high        5     model       1          27056           23.81        27.8         Success
     2      high        1.67  shell       1          14537           9.6          20.82        Success
     3      high        1     mbia        3          0               42.21        150          Failure
     4      high        0.71  model       1          27055           40.21        44.12        Success
     5      high        0.56  shell       8          0               141.05       150          Failure
     6      high        0.45  model       3          0               82.48        150          Failure
     7      high        5     model       1          25056           25.82        29.41        Success
     8      high        1.67  model       1          27054           27.74        31.19        Success
     9      medium      0.71  model       1          0               63.86        150          Failure
     10     medium      0.56  mbia        7          0               68.39        150          Failure
     11     medium      0.45  mbia        5          0               55.12        150          Failure
     12     medium      0.71  model       1          0               13.48        150          Failure
     13     medium      0.56  shell      4          42286           10.9         75.62        Success
     14     low         0.71  model       1          11129           5.34         20.69        Success

     Paul R. Cohen, Empirical Methods in Artificial Intelligence. The MIT Press, 1995
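
Exploration of a table like this usually starts with frequency tables and group summaries. A sketch in Python with pandas, assuming the trials have been saved in a hypothetical trials.csv with the column names above:

```python
import pandas as pd

df = pd.read_csv("trials.csv")   # one row per trial, columns as above

# Does wind speed co-vary with the outcome?
print(pd.crosstab(df["Wind speed"], df["Outcome"]))

# How do the quantitative measures differ between outcomes?
print(df.groupby("Outcome")[["Area burned", "Finish time"]].describe())
```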

  27. 27 Data types • Categorical data (Outcome) => count frequencies • Ordinal values (Wind speed) => rank correlation coefficients • Interval or ratio scales (time to finish/best time to finish) => linear correlation coefficients
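
One analysis per data type, sketched in Python with scipy; the short data vectors are made-up illustrations loosely echoing the trial table:

```python
from collections import Counter
from scipy import stats

outcomes = ["Success", "Failure", "Failure", "Success", "Success"]
print(Counter(outcomes))                    # categorical -> frequencies

wind = [3, 1, 1, 2, 3]                      # ordinal: low=1 .. high=3
area = [27.8, 150.0, 150.0, 75.6, 29.4]
print(stats.spearmanr(wind, area))          # rank correlation for ordinals

ratio = [1.39, 1.26, 1.38, 1.25, 1.34]      # time to finish / best time
size = [10, 7, 11, 6, 9]                    # hypothetical problem size
print(stats.pearsonr(ratio, size))          # linear correlation for ratios
```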

  28. 28 Distributions of data
     Parametric distributions (assuming a probability distribution)
     (slide figure: example value-frequency tables for samples A, B and C)

  29. 29 Transformations of data
     Paired data: 1 4 5 7 45 and 2 5 4 8 35
     Transform to differences: 1 1 -1 1 -10, or keep only the signs: 1 1 -1 1 -1 (see the sketch below)
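
The trade-off is information versus assumptions: differences keep magnitudes, signs discard them. A sketch in Python with scipy (the test choices are our illustration, not from the slide):

```python
from scipy import stats

x = [1, 4, 5, 7, 45]
y = [2, 5, 4, 8, 35]

diffs = [b - a for a, b in zip(x, y)]        # [1, 1, -1, 1, -10]
signs = [(d > 0) - (d < 0) for d in diffs]   # [1, 1, -1, 1, -1]

# Differences feed a paired test that uses magnitudes:
print(stats.wilcoxon(x, y))

# Signs alone feed a sign test: count positives against Binomial(n, 1/2):
print(stats.binomtest(signs.count(1), n=len(signs), p=0.5))
```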

  30. 30 Quantitative studies • Use statistical analyses of some empirical data – Randomization of subjects – Blocking (grouping) subjects based on confounding factors

  31. 31 Factors • That which may correlate with (and possibly cause) an effect – ”How does SCRUM affect product quality as measured by the number of bugs?” – ”How is code quality affected by the choice of programming language?” – ”How understandable is a design document when creating procedural and OO designs, based on good/bad requirements?”

  32. 32 Analysis • There must be a null hypothesis against which we can test our data • One factor, two treatments: t-test, Mann-Whitney • One factor, several treatments: ANOVA • Two factors: ANOVA
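
A sketch of those cases in Python with scipy, on synthetic data (groups and numbers are made up):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
a = rng.normal(50, 5, 20)   # e.g. defect counts under treatment A
b = rng.normal(45, 5, 20)   # treatment B
c = rng.normal(47, 5, 20)   # treatment C

# One factor, two treatments:
print(stats.ttest_ind(a, b))      # parametric
print(stats.mannwhitneyu(a, b))   # non-parametric

# One factor, several treatments: one-way ANOVA
print(stats.f_oneway(a, b, c))
```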

  33. 33 Statistics • There are separate statistics courses, but… – Keep correlation and causality separate – By convention, claim a correlation only at ≥ 95% confidence (p < 0.05) – Confidence is only one part of statistical power (confidence + effect size + sample size)

  34. 34 Discussion, example
     Does agile development lead to higher quality code?
     Hypothesis (cause-effect construct): agile dev → fewer defects
     Operationalization (treatment-outcome construct): SCRUM / no SCRUM → bugs reported

  35. Your work in a wider context 35 Why do we as humans have to solve this problem?

  36. Your work in a wider context 36
     Direct effects and system effects in three dimensions:
     – Social effects: stress, awareness, trust, engagement
     – Economic effects: job opportunities, market dynamics
     – Ecological effects: emissions, resource use
     C. Becker, R. Chitchyan, L. Duboc, S. Easterbrook, B. Penzenstadler, N. Seyff, and C. C. Venters, “Sustainability design and software: the Karlskrona manifesto,” in IEEE International Conference on Software Engineering (ICSE), vol. 2, pp. 467–476, IEEE, 2015.

  37. 37 The effects of Big Data
     A level 1 non-linear, chaotic dynamic system: the climate system, turbulence, population dynamics
     A level 2 chaotic system: human activities, such as stock markets, that react to the predictions made about them
     (slide figure: a feedback loop between ”My inputs” and ”Stuff I like”)
