Decisions to be made in developing an adaptive testing system for K–12 education: G. Gage Kingsbury (PowerPoint PPT Presentation)


SLIDE 1

Decisions to be made in developing an adaptive testing system for K–12 education

  • G. Gage Kingsbury

March 9, 2012

SLIDE 2

Welcome and Introduction

SLIDE 3

Presenter

  • G. Gage Kingsbury, Vice President of the International Association for Computerized Adaptive Testing (IACAT) and Senior Research Fellow at the Northwest Evaluation Association (NWEA)

SLIDE 4

Decisions to be made in developing an adaptive testing system for K–12 education

SLIDE 5

The Idea

  • An adaptive test is a test that adjusts its characteristics based on the performance of a test taker.

SLIDE 6

Questions and Answers

SLIDE 7

Computerized Adaptive Testing

[Figure: provisional achievement score after each of 20 test questions, rising from 175 and settling near 234 on a 150–250 scale, with Basic, Proficient, and Advanced bands marked. Labeled "20 Item Test".]

SLIDE 8

Pioneers of adaptive testing

  • Alfred Binet
  • Frederick Lord
  • David J. Weiss
  • Fumiko Samejima
  • Mark Reckase


SLIDE 9

First implementers

  • David Foster
  • Jim McBride
  • Tony Zara
  • Gage Kingsbury


SLIDE 10

You have chosen to use an adaptive test because …

  • It can be more efficient than a fixed-form test
  • It provides good information across a broader spectrum of student performance
  • It can provide immediate scoring and reporting
  • It can provide better security than a fixed-form test
  • It can be designed to measure growth

SLIDE 11

Since the first implementations

  • We have seen international growth in the use of CAT for
    – Educational testing
    – Medical outcomes assessment
    – Certification and licensure

SLIDE 12

Accuracy of adaptive tests

  • Compared to a fixed-form test
  • As a function of test length
  • Depending on termination procedure

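All three bullets on this slide come down to how quickly measurement error shrinks as items are administered. Under IRT, the standard error of measurement is roughly the inverse square root of the accumulated test information, so error falls with the square root of test length. A minimal Python sketch; the 0.25 information-per-item figure is an illustrative assumption, not a value from the talk:

```python
import math

def sem(total_information):
    """Standard error of measurement implied by accumulated test information."""
    return 1.0 / math.sqrt(total_information)

# Illustrative assumption: each well-targeted item contributes about 0.25
# units of Fisher information (a Rasch item answered near its difficulty).
INFO_PER_ITEM = 0.25

for length in (10, 20, 40):
    print(length, "items ->", round(sem(length * INFO_PER_ITEM), 3))
# Doubling test length shrinks the error by a factor of sqrt(2), not 2 --
# one reason termination rules matter as much as raw length.
```

A fixed-form test spends many items away from a given student's level, so its effective information per item is lower; that gap is the efficiency advantage claimed for CAT.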

SLIDE 13

Relationship between Spring and Fall Reading Scores

[Figure: scatter of Fall RIT against Spring RIT reading scores, both on a 150–250 scale, comparing paper-and-pencil-to-CAT (PP to CAT) with paper-and-pencil-to-paper-and-pencil (PP to PP) results]

SLIDE 14

Test Information Functions for Grade 4 Mathematics

[Figure: test information (about .00 to .12) plotted against the RIT scale from 165 to 245; students' mean = 211.7, with the Proficient cut score of 205 and the Basic cut annotated]

SLIDE 15

Choosing to use an adaptive test requires making a series of decisions in the areas of…

  • Psychometrics
  • Interface (including accommodations)
  • Item designs
  • Test designs
  • Test distribution
  • Item usage
  • Item and test security
  • Proctor training
  • Reporting


SLIDE 16

Basics of a theoretical CAT

  • IRT model
  • Item pool
  • Select first item
  • Select next item
  • Terminate test
  • Score

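The six components above are enough to drive a toy CAT. Below is a minimal sketch under the Rasch model, with a hypothetical item bank, simulated responses, and a coarse grid-search maximum-likelihood score; an operational system would add content constraints, exposure control, and finer estimation:

```python
import math
import random

def p_correct(theta, b):
    """Rasch model: probability that an examinee at ability theta
    answers an item of difficulty b correctly (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def mle_theta(responses):
    """Coarse grid-search maximum-likelihood ability estimate
    from a list of (difficulty, correct?) pairs."""
    grid = [g / 10.0 for g in range(-40, 41)]
    def loglik(t):
        return sum(math.log(p_correct(t, b)) if u else math.log(1.0 - p_correct(t, b))
                   for b, u in responses)
    return max(grid, key=loglik)

def run_cat(pool, true_theta, length=10, rng=None):
    """Fixed-length CAT: select the most informative remaining item,
    simulate a response, and re-estimate ability after each item."""
    rng = rng or random.Random(0)
    theta, responses, remaining = 0.0, [], list(pool)
    for _ in range(length):
        # Rasch information is maximal when difficulty equals ability,
        # so pick the remaining item closest to the current estimate.
        b = min(remaining, key=lambda x: abs(x - theta))
        remaining.remove(b)
        correct = rng.random() < p_correct(true_theta, b)  # simulated examinee
        responses.append((b, correct))
        theta = mle_theta(responses)  # provisional score
    return theta  # final score

pool = [i / 4.0 for i in range(-12, 13)]  # hypothetical 25-item bank, -3 to 3 logits
print(run_cat(pool, true_theta=1.0, length=15))
```

The fixed length here stands in for the "terminate test" decision; a standard-error-based stopping rule would end the loop once `sem` fell below a target instead.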

SLIDE 17

Decision areas for an operational CAT for measuring student achievement

  • Before the test (Test stuff)
    – How will we develop the measurement scale?
    – What mix of item styles will we need?
    – Which IRT model is appropriate?
    – What depth do we need in our item bank?
    – How will we choose an operational item pool?
    – What will our test blueprint include?
    – How will we QA everything involved?

SLIDE 18

Questions and Answers


SLIDE 19

Decision areas for an operational CAT for measuring student achievement

  • Before the test (School stuff)
    – School, teacher, and student identification
    – Establishing a testing environment
    – Teacher training
    – Software/hardware setup
    – Proctor training
    – Student familiarization
    – Student scheduling
    – QA

SLIDE 20

Decision areas for an operational CAT for measuring student achievement

  • Test administration
    – Student verification process
    – Test selection
    – Proctor throughout
    – Identify previously used items

SLIDE 21

Decision areas for an operational CAT for measuring student achievement

  • Test event
    – Apply test blueprint
    – Select first item or set of items
    – Check for effort
    – Update item selection theta hat
    – Update constraints
    – Select next item
    – Terminate test
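The "update item selection theta hat" step deserves a note: a pure maximum-likelihood estimate is undefined while every response so far is correct (or every one incorrect), which is the usual situation after the first few items. Operational CATs therefore often use a Bayesian update early in the test. A sketch of an EAP (expected a posteriori) estimate under the Rasch model with a normal prior; all values are hypothetical:

```python
import math

def p_correct(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap_theta(responses, prior_mean=0.0, prior_sd=1.0, n_points=81):
    """EAP (expected a posteriori) ability estimate over a normal prior.

    Unlike maximum likelihood, EAP stays finite for all-correct or
    all-incorrect response strings, which matters early in a CAT."""
    # Quadrature grid spanning prior_mean +/- 4 prior standard deviations.
    grid = [prior_mean + prior_sd * (-4.0 + 8.0 * i / (n_points - 1))
            for i in range(n_points)]
    def posterior(t):
        w = math.exp(-0.5 * ((t - prior_mean) / prior_sd) ** 2)  # normal prior
        for b, u in responses:
            p = p_correct(t, b)
            w *= p if u else (1.0 - p)
        return w
    weights = [posterior(t) for t in grid]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

# One correct answer on an average item pulls the estimate above the prior mean.
print(round(eap_theta([(0.0, True)]), 3))
```

The same posterior weights also yield a posterior standard deviation, which can serve as the standard error checked by a termination rule.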

SLIDE 22

Decision areas for an operational CAT for measuring student achievement

  • After the test
    – Calculate final score
    – Calculate growth
    – Terminate test session
    – Store data
    – Identify student as completing test
    – Compare to norms, growth norms, content, etc.
    – Create individual student report
    – Add information to teacher/administrator reports

SLIDE 23

Measuring growth and adaptive testing

  • Measuring at multiple points in time
  • The standard deviation of growth
  • The standard error of growth
  • Reduction of uncertainty
  • Growth and instruction

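The second and third bullets interact: observed growth is the difference of two noisy scores, so its standard error combines the measurement errors from both occasions. A small sketch; the RIT scores and 3-point SEMs are made-up values:

```python
import math

def growth_with_error(fall_score, fall_se, spring_score, spring_se):
    """Observed growth between two test events and its standard error.

    The two occasions have independent measurement errors, so their
    variances add: SE(growth) = sqrt(fall_se^2 + spring_se^2)."""
    return spring_score - fall_score, math.sqrt(fall_se ** 2 + spring_se ** 2)

# Hypothetical RIT scores with a 3-point SEM on each occasion.
gain, gain_se = growth_with_error(205.0, 3.0, 212.0, 3.0)
print(gain, round(gain_se, 2))  # 7.0 4.24
```

Because the error of a gain score is larger than either single-occasion error, a growth measure needs either more precise tests or a growth standard deviation large enough to stand out from that noise, which is the "reduction of uncertainty" point above.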

SLIDE 24

Adaptive testing and idiosyncratic knowledge patterns

  • Can there be multiple thetas without multidimensionality?
  • Selecting items to reveal knowledge patterns
  • A simple algorithm
  • The impact on instruction
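The slide does not spell out its "simple algorithm", so the sketch below is one plausible version, not necessarily the presenter's: estimate theta separately within each goal area and flag areas whose score departs from the overall estimate by more than a couple of combined standard errors. All names and numbers are hypothetical:

```python
import math

def flag_idiosyncratic(overall, overall_se, domain_estimates, z=2.0):
    """Flag goal areas whose estimate departs notably from the overall score.

    domain_estimates maps a goal-area name to (estimate, standard error).
    An area is flagged when its gap from the overall estimate exceeds
    z combined standard errors."""
    flags = {}
    for name, (est, se) in domain_estimates.items():
        combined = math.sqrt(overall_se ** 2 + se ** 2)
        if abs(est - overall) > z * combined:
            flags[name] = est - overall
    return flags

# Hypothetical Grade 4 mathematics goal areas on the RIT scale.
print(flag_idiosyncratic(210.0, 3.0, {
    "Number sense": (212.0, 5.0),   # within noise of the overall score
    "Geometry": (192.0, 5.0),       # 18 points low: flagged
}))  # -> {'Geometry': -18.0}
```

A flagged area is a reporting signal for instruction, not a second dimension: the model is still unidimensional, and the flag only marks a response pattern the single theta summarizes poorly.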

SLIDE 25

Field testing within an adaptive testing system

  • Calibration differences from paper to CAT
  • Random sampling for calibration in CAT
  • Using provisional calibrations in CAT field tests

SLIDE 26

Cautionary notes

  • Adaptive testing needs to be well tuned to avoid bad tests.
  • The item pool must support the stakes.
  • Adaptive testing changes, but doesn’t eliminate, security issues.
    – Brain dump sites
  • Limit desire. No test can do everything.
  • Adaptive test development is never done.

SLIDE 27

Have fun

  • The decisions to be made should consider the good of the students for whom the test is designed.
  • Don’t try to build the perfect test—it won’t be.
  • Consider a “dry eye” policy—making kids cry isn’t the purpose of the test.

SLIDE 28

Thank you Gage Kingsbury gagekingsbury@comcast.net
