Operational Research in Assessment Programs as a Window into Task and Item Design Principles: Examples from NAEP
Panel: Madeleine Keehner, Hilary Persky, and Luis Saldivia, Educational Testing Service
Discussant: Robin Hill, Kentucky
Overview of key aspects of human cognition that are relevant to item and task design
Research findings and theory from cognitive science
Madeleine Keehner
Design Decisions in Assessment Development
What we measure: Constructs – target KSAs
How we measure: Task structure, item types, response modes, interactive capabilities, design devices, graphics, text, media, layouts…
These Design Decisions Impact Key Processes
Task structure, item types, response modes, interactive capabilities, design devices, graphics, text, media, layouts…
Cognitive: perception and attention; WM load, executive functions; intrinsic/extraneous load; LTM schema activation; metacognition
Behavioral: affordances for action; embodiment
Social: collaborative; communicative
Affective: engagement, motivation, enjoyment, frustration, boredom
Zooming in on Cognition and Behavior
[Diagram: perception and attention · working memory · long-term schema · action planning and control]
How do external item and task design features influence these internal cognitive processes?
How External Design Features Affect Internal Processes
Attention can be captured by salient features; it can be directed through signaling. Perception can be overloaded by too much information.
How External Design Features Affect Internal Processes
Total processing load may exceed WM capacity. With good design, extraneous load can be minimized and intrinsic load can be optimized.
How External Design Features Affect Internal Processes
Familiar response modes, technology, or task types can activate learned schema and reduce WM load. Schema may be inappropriately triggered by familiar-feeling formats.
How External Design Features Affect Internal Processes
The affordances of a display can make some behaviors more likely. We may not know what behaviors we are ‘inviting’ with our design.
Conclusion: External Representations Affect Internal Processes
External item and task design features interact with internal cognitive processes
NAEP Reading Example: Insights from Pretesting an Innovative Interface Design
NAEP eReader design problem:
- How to present reading passages and items on tablet
- Allow students to interact fluently with them
- Gather evidence of reading processes
- Full-screen presentation would allow for the widest variety of passages
- Items presented in a separate window or panel would allow for a wide variety of item types
- Navigational aids provided to facilitate navigation between items and passage
Comparison of Different Layouts
[Screenshots: “Fish Fossils” and “Dinosaur Skeleton” passages shown in the compared layouts]
1- vs. 2-column passage; items swiped in from the right side
- WM load if items not always visible?
- How do interactive behaviors differ with visual occlusion?
Look-back buttons in items
- Schema for use?
- Sufficiently salient?
Interaction Behaviors: Swiping Items On and Off
- Swiping (L and R) happened more in layouts where items overlap text (two-column passages); see the tallying sketch after this list
– Where there was no overlap (one-column, blue), students still swiped L (on) but hardly ever swiped R (off)
- Item is visible all the time
- Is this too different from P&P?
- Does it change the way students read/search?
- Two-column layouts: 4th and 8th graders differed
– 4th graders: swiped on and then off
– 8th graders: swiped on, did other actions, then swiped off
- Some performance differences: G4 did a little better with 1-column, G8 had longer CRs with 1-column
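To make the swipe analysis concrete, here is a minimal sketch of how such process-data events might be tallied by layout and grade. The record structure and event labels ("swipe_on", "swipe_off") are assumptions for illustration; the actual NAEP log schema is not shown in this presentation.

```python
from collections import Counter

# Hypothetical event records, one dict per logged student action.
# Field names and event labels are assumed for illustration only.
events = [
    {"student": "s1", "grade": 4, "layout": "two-column", "event": "swipe_on"},
    {"student": "s1", "grade": 4, "layout": "two-column", "event": "swipe_off"},
    {"student": "s2", "grade": 8, "layout": "one-column", "event": "swipe_on"},
]

# Tally swipe events by layout, grade, and direction to compare
# behavioral affordances across the two passage layouts.
counts = Counter((e["layout"], e["grade"], e["event"]) for e in events)
for (layout, grade, event), n in sorted(counts.items()):
    print(f"{layout} grade {grade}: {event} x{n}")
```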
Overall Insights and Eventual Design Decisions
- Different behavioral affordances from 1- and 2-column layouts
– Students do not remove items if they are not occluding text
- Suggests less cognitive effort to leave on – only removed when in the way
– Performance similar but not identical (note: no P&P baseline)
– More process information when swiping on and off
– Always-visible items might change reading strategy/approach (different from P&P)
– Expert committee decision: two-column layout is an appropriate operational trade-off
– (Note: interface design still evolving)
- Use of look-back buttons in items hardly ever observed
– Interview questions indicated students had not noticed them
– Suggests no schema to look for them, and not salient enough to capture attention
– Design tweak: visual salience was enhanced; instruction added to tutorial
Take-Home 1: Design Decisions Impact Basic Processes, and the Reverse Should Also Be True
Task structure, item types, response modes, interactive capabilities, design devices, graphics, text, media, layouts…
Cognitive: perception and attention; WM load, executive functions; intrinsic/extraneous load; LTM schema activation; metacognition
Behavioral: affordances for action; embodiment
Social: collaborative; communicative
Affective: engagement, motivation, enjoyment, frustration, boredom
Knowledge of these basic processes should also impact our design decisions.
Take-Home 2: Interdisciplinary Collaboration Is Needed to Do Justice to Both What and How
- Assessment developers
– Subject-matter content expertise, item and task design experience
- Learning scientists
– Subject-relevant cognitive and learning expertise
- Cognitive scientists
– Expertise in general cognitive, metacognitive, behavioral, social, and affective processes; usability and cognitive research methods; human-computer interaction, etc.
– (And many others, of course…)
Take-Home 3: More and Better Research Needed
- Traditional items are supported by decades of psychometric research
– Empirical data: item response characteristics, validity studies, etc.
- Digital assessments allow many more options for:
– Varied stimuli and representations
– Different response modes and response behaviors
– Other kinds of behaviors and interactions
- Psychometric approach alone may not be enough
– Basic properties of cognition need to be examined and considered a priori
– Requires experimental cognitive research methods and analyses
– Meanwhile, let’s look at some insights from operational pretesting studies…
A Pretesting Study: Effects of Avatars (and Leveling) in SBTs on Students
Hilary Persky
Background
- The affordances of DBA allow assessments to better reflect authentic reading experiences, which are purpose-driven, at times collaborative, and involve various types and levels of support.
- Many believe the construct of reading comprehension has broadened with the advent of digital literacies.
- Purpose-driven tasks have been taken up by the next-generation state assessments as well as national and international assessments (PIRLS and PISA).
Why the study?
- Avatars used in new NAEP reading tasks to:
- introduce and reaffirm overall task and specific activity purposes
- simulate conversation/collaboration
- assist in task transitions
- reset student understanding (leveling)
- Some stakeholder concerns:
- Do avatars add cognitive load?
- Are avatars actually engaging?
- Does “leveling” negatively affect students?
Study Questions
- Main focus: Does having student avatars affect
- Test performance?
- Test-taking behaviors?
- Affective responses?
- Do we see any effects of leveling?
Study Design
- Two assessment tasks: literary and informational
- Two versions of each task
– Avatar vs. Non-avatar
- Leveling in both versions
- Student survey on
– Preferences and affective responses
– Background information (digital access; reading motivation)
Study Approach
- Tryout (like normal admin):
- 100 students recruited from the DC area
- Randomly assigned to the Avatar or Non-avatar conditions (each student took only one task)
- Cog labs (one-on-one; think-aloud, eye tracking, post-task interview):
- 12 students, recruited from Trenton, Ewing, Princeton
- Randomly assigned to the Avatar and Non-avatar conditions
Tryout Performance Results
- No significant effects on total task scores or item scores
- The number of high- and low-performing students was similarly distributed in the avatar and non-avatar conditions.
- No significant interactions with gender, race/ethnicity, SES, or digital access (based on survey items included in the tryout).
Tryout Process Data Results
- No significant effect of avatars on reading behaviors such as reading speed or the number of page turns.
- No significant effect of avatars on question-answering behaviors such as the number of times answers are changed, back navigation, or specific item behaviors, such as select-in-passage behavior.
- No significant effects of avatars on time use (that is, time on reading or items); a sketch of this kind of condition comparison follows below.
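As a hedged illustration of the kind of comparison behind these null results, the sketch below tests for a condition effect on a continuous process measure. The reading-speed values are invented, not the tryout data.

```python
from scipy import stats

# Illustrative reading speeds (words per minute) for the two conditions;
# these numbers are invented for demonstration only.
avatar = [182, 175, 190, 168, 201, 177, 185]
non_avatar = [179, 181, 173, 188, 170, 176, 192]

# Welch's independent-samples t-test, one plausible way to test for a
# condition effect on a process measure like reading speed.
t, p = stats.ttest_ind(avatar, non_avatar, equal_var=False)
print(f"t = {t:.2f}, p = {p:.3f}")  # a non-significant p would match the null results above
```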
BUT: Tryout survey affective results show differences…
[Bar charts: responses to “How easy or hard was this task?” for the literary and informational tasks, Avatar vs. Non-Avatar, on a five-point scale from Very easy (1) to Very hard (5).]
Results suggest students in the avatar conditions perceived the tasks as easier.
[Bar charts: agreement with “The pictures and conversations made the task more interesting” and “The conversations made the task less interesting,” for the literary task and the informational task (Avatar version), on a six-point scale from Strongly Disagree (1) to Strongly Agree (6).]
Leveling survey responses
- “I felt annoyed when the task gave me answers to questions I had just answered”: for both tasks, significantly more students disagreed.
- “Getting an answer to a question I had just completed made me more confident about answering the next question”: for both tasks, significantly more students agreed.
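Because survey ratings like these are ordinal, a rank-based test is one reasonable way to compare conditions. A minimal sketch, using invented ratings rather than the survey data above:

```python
from scipy import stats

# Hypothetical 5-point difficulty ratings (1 = Very easy ... 5 = Very hard);
# invented values, not the actual survey responses.
avatar = [2, 2, 3, 1, 2, 3, 2, 4, 2, 3]
non_avatar = [3, 4, 3, 2, 4, 3, 5, 3, 4, 3]

# Mann-Whitney U test: appropriate for ordinal Likert-type responses.
u, p = stats.mannwhitneyu(avatar, non_avatar, alternative="two-sided")
print(f"U = {u:.1f}, p = {p:.3f}")
```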
Tryout Survey Summary
- Students perceived the Avatar version as easier and equally or more interesting.
- On average, students in the Avatar condition were positive to neutral about the images and conversations.
- On average, students were positive to neutral about leveling.
Cog lab student comments
- They made me feel like I already knew the book and read it several times, like they would come to me for help if I was the teacher. It gave me specific parts of the book that I would read, and then the avatars would ask questions about it so I felt like I was explaining it to them.
- They made you look at things you wouldn’t think about if reading it by yourself.
- Seeing their interpretation helped me connect back to the story, made it a little easier. I think the avatars helped me to understand the story because they had similar questions that I had.
- Classmates (avatars) make it easier because they do the reading of what you would normally have to read and give your brain a rest. With avatars is more interactive because you don’t get bored and zone out as you would in normal reading tasks in school.
- They provided guidance and direction. It was more personal, not all directions given by the computer. It felt almost real and like I was working with them a little bit…
- When they communicate with each other it was like working with students in class, like when two other students are talking to each other and I am listening to them.
- It was different. Usually we just get the questions and multiple choice answers. It was cool but didn’t help or hurt me.
- I guess, kind of collaborating, but they couldn’t actually talk to me or respond to what I was saying. It didn’t feel like a real interaction, they can’t comment on my statements.
- Leveling by avatar: Some of it was funny because the answers were kind of obvious, but it was cool to feel like you were having a conversation with someone and see what they are thinking and where they are coming from and explain why. Good to hear opinions other than my own.
- Leveling by avatar: I guess it could have been that answer, but it doesn’t matter that much what she (avatar) said. Not annoying, just whatever. It had no effect and wouldn’t change my approach/answers.
Take-aways
- Avatars do not seem to add cognitive load, and students do seem to find them (mostly) engaging.
- On average, students perceived leveling as not annoying, and it gave them confidence to answer the next question in the task.
- Avatars and leveling are not just surface design features, but construct-relevant features afforded by DBA to measure reading comprehension.
New study to dig further
- Purpose: Study the effect of SBT features on students’ reading performance, reading behaviors, and engagement.
- In the context of a full NAEP pilot, examine students’ performance and processes on SBTs in comparison with discrete item (DI) blocks using the same texts and items as the SBTs, but without any of the SBT features (e.g., avatars, leveling, sequencing).
- Developed special study student questionnaire items from recent literature on student engagement, motivation, persistence, and self-efficacy (Guthrie & Klauda, 2014), as well as established NAEP survey design principles.
- Analysis about to begin!
NAEP Mathematics Pretesting Findings
Luis Saldivia
NCSA, June 27, 2018
NAEP Mathematics Item Types
2015 Operational: Multiple Choice, Constructed Response
2017 Operational: Multiple Choice, Constructed Response, Multiple Select, Matching Zones, Grids, In-line Choice (Drop-down), Bar Graph, Box Plot
Purpose of the Pretest Study
- The study consisted of small-scale tryouts of a selection of NAEP mathematics 2017 discrete items. In tryouts, students answer items in timed, assessment-like conditions. Goals:
- Gather data about item response times (RTs)
- Investigate item performance
- Systematically test the effects of presentation format and response mode by varying item type while holding constant the item content
Design
- Inline vs. SSMC: Compare response times and scores. Examine whether inline choice formats appear to produce greater usability or construct-irrelevant cognitive challenges, compared with traditional SSMC.
- MSMC: Compare two variants of MSMC items, with and without the number of selections specified. Compare number and range of selections made and resulting scores.
- Zone: Compare two variants of MS zone-selection items, with and without the number of selections specified. Compare number and range of selections made and resulting scores.
- Grid vs. MSMC: Compare selection behaviors, specifically number of choices selected and number of options left blank.
Results – Timing
Inline Choice vs. SSMC
- Six pairs of items were compared at each grade (a sketch of one such pairwise comparison follows below)
- The findings suggest that inline choice is equivalent to SSMC in terms of effects of presentation format and response selection mode on performance and speed.
- Conclusion: The item content should drive the selection of the best item format to meet the requirements of the item. For instance, inline choice can be used for content that requires linking ideas, such as claims with evidence.
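Because each pair holds item content constant and varies only the format, a paired comparison across item pairs is a natural analysis. A minimal sketch with invented response times:

```python
from scipy import stats

# Invented median response times (seconds) for six item pairs; each pair
# holds content constant and varies only the response format.
inline_rt = [41.2, 55.0, 38.7, 62.3, 47.5, 51.8]
ssmc_rt = [43.0, 53.4, 40.1, 60.9, 46.8, 52.5]

# Paired t-test across item pairs; a null result is consistent with the
# reported equivalence of the two formats in speed.
t, p = stats.ttest_rel(inline_rt, ssmc_rt)
print(f"t = {t:.2f}, p = {p:.3f}")
```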
Zone and MSMC With vs. Without Number of Selections Specified
- Two Zone and two MSMC versions were given at each grade
- It is clear from the data that students do understand the requirement to select more than one option
- There is some indication that specifying the number of selections reduces the variance in the number of selections made (see the sketch after this list)
- Students do not adhere to the instruction specifying the number
– It is not possible to know from the present data
- whether students do not notice (or forget) the instruction
- whether they do attend to it but deliberately choose a different number
- In two cases out of eight contrasts, scores were significantly higher in the number-specified variant
- Given that students did not adhere to the number specified, it is not clear whether giving this instruction is systematically beneficial
- Item contents were at least as important for raw score differences as the instruction to select a particular number of options
- Side notes:
– There does not appear to be a trend for MSMC items to be easier or harder than Zone items
– There is no difference in the response times for these item types, and overall there is no evidence from either scores or RTs that students have difficulty with the zone-selection response mode
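A hedged sketch of how the variance claim might be tested: a test for equality of variances across the two instruction variants, using invented selection counts rather than the study data.

```python
from scipy import stats

# Invented counts of options selected per student on a multiple-select item,
# with vs. without the number of selections specified in the stem.
specified = [3, 3, 3, 2, 3, 4, 3, 3]
not_specified = [2, 5, 3, 1, 4, 3, 6, 2]

# Levene's test for equality of variances; a significant result would support
# the observation that specifying the number reduces the spread of selections.
w, p = stats.levene(specified, not_specified)
print(f"W = {w:.2f}, p = {p:.3f}")
```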
MSMC vs. Grid Items
- Four pairs of variants per grade
- In Grid items, students were almost universally likely to fill all rows. By contrast, in MSMC items, the number of response selections varied considerably and tended to cluster around the middle of the available range, with very few students making the maximum number of selections.
- Original scoring rubrics tended to benefit the MSMC (a counting sketch follows below)
– Partial scores allowed answers with some blank responses, which rarely occurs in Grid items
– Penalizing incorrect selections is more likely in Grid items, since students rarely leave options blank even if they are unsure of the correct selection
– For these reasons, Grid items tended not to receive partial scoring
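To illustrate the blank-rate contrast, here is a minimal sketch that counts blank options per student under a hypothetical response encoding (the encoding is an assumption, not the study's data format).

```python
# Hypothetical response vectors over five statements. None = left blank;
# True/False = an explicit Grid (True/False) selection. In this invented
# encoding, an MSMC "selection" is True and an unselected option is None.
grid_responses = [
    [True, False, True, True, False],
    [True, True, False, True, True],
]
msmc_responses = [
    [True, None, True, None, None],
    [None, True, None, True, None],
]

def mean_blanks(responses):
    """Mean number of blank options per student response vector."""
    return sum(r.count(None) for r in responses) / len(responses)

print(f"Grid blanks per student: {mean_blanks(grid_responses):.1f}")  # near zero
print(f"MSMC blanks per student: {mean_blanks(msmc_responses):.1f}")  # often several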
MSMC vs. Grid Items
- Dichotomous rubrics also advantaged MSMC items
– Grid items tend to encourage attempts on all rows. Students may be more likely to guess when they are not sure, since they believe they must provide a response for all instances.
– We cannot assume that the unselected options in MSMC items are equivalent to False selections on Grid items – some may be equivalent to False, while others may be equivalent to Don’t Know, and in those cases students may not choose to guess
– In a Grid format, we assume that those same Don’t Know instances tend to get instantiated in a selection, which in those cases would be a guess or a less-than-certain selection
MSMC vs. Grid Items
- It is important to develop scoring rubrics for Grid items that take account of the affordances of this layout and produce equivalent scores for cognitively equivalent items (a hedged scoring sketch follows this list)
- Grid item rubrics should not assume that any options will be left blank, and the treatment of incorrect selections should take into account the greater likelihood of guessing
- Grid items have distinct measurement properties and are by design not analogous to MSMC items
- Grid items appear to have different cognitive and even metacognitive affordances
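As a hedged sketch of the rubric point (not the actual NAEP rubrics), the functions below score the two formats differently: MSMC blanks go unpenalized, since a blank may mean "False" or "Don't Know", while Grid responses are compared row by row because every row is answered.

```python
def score_msmc(selected, key):
    """Illustrative MSMC scoring: credit correct selections, subtract
    incorrect ones, and leave blanks unpenalized (a blank may mean
    'False' or 'Don't Know')."""
    hits = len(selected & key)
    false_alarms = len(selected - key)
    return max(hits - false_alarms, 0)

def score_grid(responses, key):
    """Illustrative Grid scoring: every row carries a True/False answer,
    so count matching rows; no separate blank category exists."""
    return sum(1 for row, ans in responses.items() if key[row] == ans)

# Hypothetical answer keys and responses.
key_msmc = {"A", "C"}
print(score_msmc({"A", "C", "D"}, key_msmc))  # 1: two hits, one false alarm

key_grid = {"A": True, "B": False, "C": True}
print(score_grid({"A": True, "B": True, "C": True}, key_grid))  # 2 matching rows
```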
MSMC vs. Grid Items
- One potential advantage of Grid items might come from their tendency to make students attempt all selections.
- If a rubric is designed carefully, this property might be helpful for distinguishing students who are leaving MSMC options blank to indicate ‘not True’ versus those whose blank responses indicate ‘Don’t Know’.
- Given the layout of Grid items, it might even be possible to incorporate a third column (e.g., “Cannot be determined”) that could make such a distinction explicit, which is something that would be difficult to do with MSMC formats.
Next Steps – Other Item Types
[Matching item layout: source options A B C D E; target slots 1 2 3]
Number of Actions
Sequence of Student Actions – Representing Numbers Symbolically
- Analyze patterns in the sequence of actions (exactly 3 choices)
- Do students work from graphical to symbolic representation (“target-focused”) or symbolic to graphical (“source-focused”)? A classification sketch follows below.
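One plausible way to operationalize this classification from action logs is a majority-region heuristic over the student's first actions. This is a sketch; the "region" tag per logged action and the heuristic itself are assumptions, not the study's actual coding scheme.

```python
def classify(actions):
    """Classify a matching-item action sequence as target- or source-focused
    by majority region among the student's first three actions. The 'region'
    field per logged action is hypothetical, assumed for illustration."""
    regions = [a["region"] for a in actions[:3]]
    if regions.count("target") >= 2:
        return "target-focused"
    if regions.count("source") >= 2:
        return "source-focused"
    return "no clear focus"

# Example: a student who starts from the symbolic targets.
sequence = [{"region": "target"}, {"region": "source"}, {"region": "target"}]
print(classify(sequence))  # target-focused
```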
Number of Actions
All students: 69% made exactly 3 actions; 31% made more than 3 actions
Target- vs. Source-focused: 44% target-focused, 13% source-focused, 12% no clear focus
Item Scores
Turn and Talk
Discussant/Reactant
- High-quality assessments: Can states have them without considering item and assessment constructs and structures, including the student experience (performance, reaction, engagement)?
- How should states begin to examine all of the underlying metadata regarding how students interact with the items and the construct of the assessment?
- What are the potential barriers for large-scale assessments?
– Lack of collaboration among cognitive scientists, content scientists, and assessment developers – from Keehner’s slide (17)
– Lack of funding to support this research
– Time to complete research and react to findings
Takeaways
- This is not just a “NAEP” issue.
- Design decisions matter beyond just having more robust items and assessment constructs.
- DBAs should not simply replicate P&P assessments. There is so much to gain from the underlying metadata that can be provided by a DBA.
- Refreshing to know this research is happening related to the NAEP items