
“It is better to observe than to criticise.” – Bobby Wellins (Jazz Line-up, 13/2/2011) Teesside University, Social Futures Institute, seminar, 18/11/2015 1

“Best of all is to convey the magnitude of the effect and the degree of certainty explicitly.” – Pinker (2014, p. 45)

“Usually what one wants to know is not whether the change makes any difference, but to know how likely it is that the change will be big enough.” – Landauer (1997, p. 222)

Magnitude-based inference in behavioural research
Paul van Schaik
p.van-schaik@tees.ac.uk
http://sss-studnet.tees.ac.uk/psychology/staff/Paul_vs/index.htm

Outline
• Problem and proposed solution
• Quantification in behavioural research
• Statistical inference in behavioural research
• Magnitude-based inference
• The application of magnitude-based inference in behavioural research
• Other approaches
• Limitations
• Recommendations

The problem
• A researcher conducts a study comparing two software designs in terms of their usability
• She conducts usability tests with two groups, each using one of the designs, and collects various measures
• These include perceived usability, error rate and time-on-task
• She then compares the two groups in terms of their mean scores on the measures, using a t test
• She finds that, although differences in mean scores are apparent, the test results do not show statistical significance
• What should the researcher conclude about the difference in usability between the two designs?

A proposed solution
• As an alternative to null-hypothesis significance testing (NHST), use
  • information about uncertainty in the data,
  • the observed value of the effect and
  • the smallest substantial values for the effect
  to make two kinds of magnitude-based inference: mechanistic and practical
• Use the results of NHST as input
• Use spreadsheets available on the Internet to generate inferences
• The approach was developed, and is influential, in sport and exercise science

Quantification in user research
• “The systematic study of the goals, needs, and capabilities of users so as to specify design, construction, or improvement of tools to benefit how users work and live” (Schumacher, 2009, p. 6)
• Usability and user-experience data
  • E.g. psychometric data, error rate and time-on-task
• Formative research
  • users’ interaction with an artefact is studied to generate data that, when analysed, provide information to inform system improvement
• Summative research
  • establishes the quality of interaction with an artefact in comparison with another artefact or a benchmark

Statistical inference in user research
Usually, null-hypothesis significance testing (NHST) is used; limitations:
1. The null hypothesis of no effect is (almost) always false
2. NHST ignores the smallest important effect, which has no bearing on the inference that is made
3. NHST does not address practical relevance; it does not clearly define or distinguish practical and mechanistic significance
4. A non-significant result is inconclusive, and a crude classification of inference is used (reject or retain H0)
5. Sample size estimation is based on NHST

Merits of magnitude-based inference
1. Requires the researcher to define the smallest important effect, rather than the null effect
2. Uses the smallest important effect as an integral part of inference, so inferences are not an artefact of sample size
3. Provides a rigorous and principled approach to inferring practical significance; provides a rigorous distinction between practical and mechanistic significance

More merits
4. Provides a more refined classification of the inferences that can be made than merely rejecting or retaining the null hypothesis
5. Estimates of required sample size are based on practical significance or mechanistic significance and the researcher-defined smallest important effect

Inference of mechanistic significance (1)
• For descriptive purposes, an effect can be classified in terms of its size, in relation to the smallest important positive (+) and negative (-) effect sizes, as positive, trivial or negative
• For inference proper, the chances of an effect being positive, negative or trivial are used
  • The chances of the effect being positive: the chances of the effect falling above the threshold of the smallest important + effect
  • The chances of the effect being negative: the chances of the effect falling below the threshold of the smallest important - effect
  • The chances of a trivial effect: 100% minus the sum of the chances of a + effect and those of a - effect
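The three chances described above could be computed along the following lines. This is a minimal sketch, not the method implemented in the spreadsheets: the function name `effect_chances`, its parameters and the use of a normal approximation for the uncertainty in the effect are all assumptions made for illustration.

```python
from statistics import NormalDist


def effect_chances(observed, std_error, smallest_important):
    """Return (p_negative, p_trivial, p_positive) for an effect.

    Assumption (not from the slides): uncertainty about the true effect is
    modelled as a normal distribution centred on the observed effect, with
    the standard error as its standard deviation. The same threshold size
    is used for the smallest important + and - effects.
    """
    if std_error <= 0 or smallest_important <= 0:
        raise ValueError("std_error and smallest_important must be positive")

    uncertainty = NormalDist(mu=observed, sigma=std_error)

    # Chances of the effect falling above the smallest important + effect.
    p_positive = 1.0 - uncertainty.cdf(smallest_important)
    # Chances of the effect falling below the smallest important - effect.
    p_negative = uncertainty.cdf(-smallest_important)
    # Trivial: 100% minus the chances of a + effect and of a - effect.
    p_trivial = 1.0 - p_positive - p_negative

    return p_negative, p_trivial, p_positive


# Hypothetical numbers: observed effect 0.4, standard error 0.25,
# smallest important effect 0.2 (all in the same units).
p_neg, p_triv, p_pos = effect_chances(0.4, 0.25, 0.2)
print(f"negative {p_neg:.1%}, trivial {p_triv:.1%}, positive {p_pos:.1%}")
```

With small samples, a t distribution with the appropriate degrees of freedom would be a natural refinement of the normal approximation used here.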

Inference of mechanistic significance (2)
• An inference is then made from the chances of each of three ranges of outcome (positivity, triviality and negativity), as follows
  • Unclear effect: both the chances of the effect being + and the chances of the effect being - are too large (e.g., both greater than the default value of 0.05, or other appropriate cut-offs)
  • Otherwise, clear effect: seen as substantially +, - or trivial and considered to have the size of the observed value, with a qualification of probability
Proposed interpretation of probability ranges:

The effect …                            Probability     Chances         Odds
is almost certainly not …               <0; 0.005]      <0; 0.5%]       <0; 1:199]
is very unlikely to be …                <0.005; 0.05]   <0.5%; 5%]      <1:199; 1:19]
is unlikely to be …, is probably not …  <0.05; 0.25]    <5%; 25%]       <1:19; 1:3]
is possibly (not) …, may (not) be …     <0.25; 0.75]    <25%; 75%]      <1:3; 3:1]
is likely to be …, is probably …        <0.75; 0.95]    <75%; 95%]      <3:1; 19:1]
is very likely to be …                  <0.95; 0.995]   <95%; 99.5%]    <19:1; 199:1]
is almost certainly …                   <0.995; 1>      <99.5%; 100%>   <199:1; ∞>
(… stands for positive/trivial/negative or beneficial/negligible/harmful)
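The inference rule and the probability ranges could be combined as in the sketch below. The function names are hypothetical, and reporting the most probable of the three outcomes for a clear effect is a simplifying assumption; the slides state that a clear effect is considered to have the size of the observed value.

```python
# Upper bounds of the probability ranges (open on the left, closed on the
# right) paired with the wording proposed for each range.
BANDS = [
    (0.005, "is almost certainly not {}"),
    (0.05, "is very unlikely to be {}"),
    (0.25, "is unlikely to be {}"),
    (0.75, "is possibly {}"),
    (0.95, "is likely to be {}"),
    (0.995, "is very likely to be {}"),
]


def describe_probability(p, outcome):
    """Qualify an outcome (e.g. 'positive') according to its probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    for upper, template in BANDS:
        if p <= upper:
            return "The effect " + template.format(outcome)
    return "The effect is almost certainly {}".format(outcome)


def mechanistic_inference(p_negative, p_trivial, p_positive, cutoff=0.05):
    """Return 'unclear' or a qualified statement about the effect.

    The effect is unclear when the chances of a + effect and of a - effect
    both exceed the cut-off (default 0.05). Otherwise the most probable
    outcome is reported (an assumption of this sketch).
    """
    if p_positive > cutoff and p_negative > cutoff:
        return "unclear"
    outcome, p = max(
        [("negative", p_negative), ("trivial", p_trivial), ("positive", p_positive)],
        key=lambda pair: pair[1],
    )
    return describe_probability(p, outcome)


print(mechanistic_inference(0.20, 0.50, 0.30))
print(mechanistic_inference(0.01, 0.09, 0.90))
```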


Inference of practical significance (1)
• For descriptive purposes, an effect can be classified in terms of its size, in relation to the smallest important beneficial and harmful effect sizes, as beneficial, negligible or harmful
• For inference proper, the chances of an effect being beneficial, harmful or negligible are used
  • The chances of the effect being beneficial: the chances of the effect falling above the threshold of the smallest important beneficial effect
  • The chances of the effect being harmful: the chances of the effect falling below the threshold of the smallest important harmful effect
  • The chances of a negligible effect: 100% minus the sum of the chances of a beneficial effect and those of a harmful effect

Inference of practical significance (2)
• Type-1 practical error
  • analogous to Type-I error in NHST (rejecting the null hypothesis when it is true)
• Type-2 practical error
  • analogous to Type-II error in NHST (retaining the null hypothesis when it is false)
• In the practical (‘clinical’) application of effects
  • the chance of using a harmful effect (a Type-1 practical error) needs to be far smaller than
  • the chance of not using a beneficial effect (a Type-2 practical error)
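One way to express this asymmetry between the two practical errors is sketched below. The function name `practical_decision` and the two thresholds are hypothetical: the slides give no numerical values here, so `max_harm=0.005` and `min_benefit=0.25` are illustrative assumptions chosen to coincide with boundaries in the proposed probability ranges.

```python
def practical_decision(p_harmful, p_beneficial, max_harm=0.005, min_benefit=0.25):
    """Suggest whether to use an effect in practice.

    Assumed thresholds (not from the slides): the chance of harm must not
    exceed max_harm, which is far smaller than min_benefit, the chance of
    benefit above which the effect is worth using.
    """
    for p in (p_harmful, p_beneficial):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    if p_harmful + p_beneficial > 1.0 + 1e-9:
        raise ValueError("chances of harm and benefit cannot sum to more than 1")

    worth_using = p_beneficial > min_benefit
    too_risky = p_harmful > max_harm

    if worth_using and too_risky:
        # Using the effect risks a Type-1 practical error; not using it
        # risks a Type-2 practical error, so the data do not settle it.
        return "unclear"
    if worth_using:
        return "use"
    return "do not use"


print(practical_decision(p_harmful=0.001, p_beneficial=0.60))
print(practical_decision(p_harmful=0.030, p_beneficial=0.60))
print(practical_decision(p_harmful=0.001, p_beneficial=0.10))
```

Making `max_harm` much smaller than `min_benefit` encodes the requirement that the chance of using a harmful effect be far smaller than the chance of not using a beneficial one; other cut-offs may suit other applications.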
