Decisions to be made in developing an adaptive testing system for K–12 education: G. Gage Kingsbury (PowerPoint PPT Presentation)


SLIDE 1

Decisions to be made in developing an adaptive testing system for K–12 education

  • G. Gage Kingsbury

March 9, 2012

SLIDE 2

Welcome and Introduction

SLIDE 3

Presenter

  • G. Gage Kingsbury, Vice President of the International Association for Computerized Adaptive Testing (IACAT) and Senior Research Fellow at the Northwest Evaluation Association (NWEA)

SLIDE 4

Decisions to be made in developing an adaptive testing system for K–12 education

SLIDE 5

The Idea

  • An adaptive test is a test that adjusts its characteristics based on the performance of a test taker.

SLIDE 6

Questions and Answers

SLIDE 7

Computerized Adaptive Testing

[Figure: provisional achievement score after each of 20 test questions, rising from 175 and settling near 234 on a 150–250 scale, with Basic, Proficient, and Advanced bands marked. Labeled "20 Item Test".]

SLIDE 8

Pioneers of adaptive testing

  • Alfred Binet
  • Frederick Lord
  • David J. Weiss
  • Fumiko Samejima
  • Mark Reckase


SLIDE 9

First implementers

  • David Foster
  • Jim McBride
  • Tony Zara
  • Gage Kingsbury


SLIDE 10

You have chosen to use an adaptive test because …

  • It can be more efficient than a fixed-form test
  • It provides good information across a broader spectrum of student performance
  • It can provide immediate scoring and reporting
  • It can provide better security than a fixed-form test
  • It can be designed to measure growth

SLIDE 11

Since the first implementations

  • We have seen international growth in the use of CAT for
    – Educational testing
    – Medical outcomes assessment
    – Certification and licensure

SLIDE 12

Accuracy of adaptive tests

  • Compared to a fixed-form test
  • As a function of test length
  • Depending on termination procedure

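All three bullets on this slide come down to how quickly measurement error shrinks as items are administered. Under IRT, the standard error of measurement is roughly the inverse square root of the accumulated test information, so error falls with the square root of test length. A minimal Python sketch; the 0.25 information-per-item figure is an illustrative assumption, not a value from the talk:

```python
import math

def sem(total_information):
    """Standard error of measurement implied by accumulated test information."""
    return 1.0 / math.sqrt(total_information)

# Illustrative assumption: each well-targeted item contributes about 0.25
# units of Fisher information (a Rasch item answered near its difficulty).
INFO_PER_ITEM = 0.25

for length in (10, 20, 40):
    print(length, "items ->", round(sem(length * INFO_PER_ITEM), 3))
# Doubling test length shrinks the error by a factor of sqrt(2), not 2 --
# one reason termination rules matter as much as raw length.
```

A fixed-form test spends many items away from a given student's level, so its effective information per item is lower; that gap is the efficiency advantage claimed for CAT.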

SLIDE 13

Relationship between Spring and Fall Reading Scores

[Figure: scatter of Fall RIT against Spring RIT reading scores, both on a 150–250 scale, comparing paper-and-pencil-to-CAT (PP to CAT) with paper-and-pencil-to-paper-and-pencil (PP to PP) results]

SLIDE 14

Test Information Functions for Grade 4 Mathematics

[Figure: test information (about .00 to .12) plotted against the RIT scale from 165 to 245; students' mean = 211.7, with the Proficient cut score of 205 and the Basic cut annotated]

SLIDE 15

Choosing to use an adaptive test requires making a series of decisions in the areas of…

  • Psychometrics
  • Interface (including accommodations)
  • Item designs
  • Test designs
  • Test distribution
  • Item usage
  • Item and test security
  • Proctor training
  • Reporting


SLIDE 16

Basics of a theoretical CAT

  • IRT model
  • Item pool
  • Select first item
  • Select next item
  • Terminate test
  • Score

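The six components above are enough to drive a toy CAT. Below is a minimal sketch under the Rasch model, with a hypothetical item bank, simulated responses, and a coarse grid-search maximum-likelihood score; an operational system would add content constraints, exposure control, and finer estimation:

```python
import math
import random

def p_correct(theta, b):
    """Rasch model: probability that an examinee at ability theta
    answers an item of difficulty b correctly (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def mle_theta(responses):
    """Coarse grid-search maximum-likelihood ability estimate
    from a list of (difficulty, correct?) pairs."""
    grid = [g / 10.0 for g in range(-40, 41)]
    def loglik(t):
        return sum(math.log(p_correct(t, b)) if u else math.log(1.0 - p_correct(t, b))
                   for b, u in responses)
    return max(grid, key=loglik)

def run_cat(pool, true_theta, length=10, rng=None):
    """Fixed-length CAT: select the most informative remaining item,
    simulate a response, and re-estimate ability after each item."""
    rng = rng or random.Random(0)
    theta, responses, remaining = 0.0, [], list(pool)
    for _ in range(length):
        # Rasch information is maximal when difficulty equals ability,
        # so pick the remaining item closest to the current estimate.
        b = min(remaining, key=lambda x: abs(x - theta))
        remaining.remove(b)
        correct = rng.random() < p_correct(true_theta, b)  # simulated examinee
        responses.append((b, correct))
        theta = mle_theta(responses)  # provisional score
    return theta  # final score

pool = [i / 4.0 for i in range(-12, 13)]  # hypothetical 25-item bank, -3 to 3 logits
print(run_cat(pool, true_theta=1.0, length=15))
```

The fixed length here stands in for the "terminate test" decision; a standard-error-based stopping rule would end the loop once `sem` fell below a target instead.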

SLIDE 17

Decision areas for an operational CAT for measuring student achievement

  • Before the test (Test stuff)
    – How will we develop the measurement scale?
    – What mix of item styles will we need?
    – Which IRT model is appropriate?
    – What depth do we need in our item bank?
    – How will we choose an operational item pool?
    – What will our test blueprint include?
    – How will we QA everything involved?

SLIDE 18

Questions and Answers


SLIDE 19

Decision areas for an operational CAT for measuring student achievement

  • Before the test (School stuff)
    – School, teacher, and student identification
    – Establishing a testing environment
    – Teacher training
    – Software/hardware setup
    – Proctor training
    – Student familiarization
    – Student scheduling
    – QA

SLIDE 20

Decision areas for an operational CAT for measuring student achievement

  • Test administration
    – Student verification process
    – Test selection
    – Proctor throughout
    – Identify previously used items

SLIDE 21

Decision areas for an operational CAT for measuring student achievement

  • Test event
    – Apply test blueprint
    – Select first item or set of items
    – Check for effort
    – Update item selection theta hat
    – Update constraints
    – Select next item
    – Terminate test
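The "update item selection theta hat" step deserves a note: a pure maximum-likelihood estimate is undefined while every response so far is correct (or every one incorrect), which is the usual situation after the first few items. Operational CATs therefore often use a Bayesian update early in the test. A sketch of an EAP (expected a posteriori) estimate under the Rasch model with a normal prior; all values are hypothetical:

```python
import math

def p_correct(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap_theta(responses, prior_mean=0.0, prior_sd=1.0, n_points=81):
    """EAP (expected a posteriori) ability estimate over a normal prior.

    Unlike maximum likelihood, EAP stays finite for all-correct or
    all-incorrect response strings, which matters early in a CAT."""
    # Quadrature grid spanning prior_mean +/- 4 prior standard deviations.
    grid = [prior_mean + prior_sd * (-4.0 + 8.0 * i / (n_points - 1))
            for i in range(n_points)]
    def posterior(t):
        w = math.exp(-0.5 * ((t - prior_mean) / prior_sd) ** 2)  # normal prior
        for b, u in responses:
            p = p_correct(t, b)
            w *= p if u else (1.0 - p)
        return w
    weights = [posterior(t) for t in grid]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

# One correct answer on an average item pulls the estimate above the prior mean.
print(round(eap_theta([(0.0, True)]), 3))
```

The same posterior weights also yield a posterior standard deviation, which can serve as the standard error checked by a termination rule.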

SLIDE 22

Decision areas for an operational CAT for measuring student achievement

  • After the test
    – Calculate final score
    – Calculate growth
    – Terminate test session
    – Store data
    – Identify student as completing test
    – Compare to norms, growth norms, content, etc.
    – Create individual student report
    – Add information to teacher/administrator reports

SLIDE 23

Measuring growth and adaptive testing

  • Measuring at multiple points in time
  • The standard deviation of growth
  • The standard error of growth
  • Reduction of uncertainty
  • Growth and instruction

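The second and third bullets interact: observed growth is the difference of two noisy scores, so its standard error combines the measurement errors from both occasions. A small sketch; the RIT scores and 3-point SEMs are made-up values:

```python
import math

def growth_with_error(fall_score, fall_se, spring_score, spring_se):
    """Observed growth between two test events and its standard error.

    The two occasions have independent measurement errors, so their
    variances add: SE(growth) = sqrt(fall_se^2 + spring_se^2)."""
    return spring_score - fall_score, math.sqrt(fall_se ** 2 + spring_se ** 2)

# Hypothetical RIT scores with a 3-point SEM on each occasion.
gain, gain_se = growth_with_error(205.0, 3.0, 212.0, 3.0)
print(gain, round(gain_se, 2))  # 7.0 4.24
```

Because the error of a gain score is larger than either single-occasion error, a growth measure needs either more precise tests or a growth standard deviation large enough to stand out from that noise, which is the "reduction of uncertainty" point above.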

SLIDE 24

Adaptive testing and idiosyncratic knowledge patterns

  • Can there be multiple thetas without multidimensionality?
  • Selecting items to reveal knowledge patterns
  • A simple algorithm
  • The impact on instruction
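The slide does not spell out its "simple algorithm", so the sketch below is one plausible version, not necessarily the presenter's: estimate theta separately within each goal area and flag areas whose score departs from the overall estimate by more than a couple of combined standard errors. All names and numbers are hypothetical:

```python
import math

def flag_idiosyncratic(overall, overall_se, domain_estimates, z=2.0):
    """Flag goal areas whose estimate departs notably from the overall score.

    domain_estimates maps a goal-area name to (estimate, standard error).
    An area is flagged when its gap from the overall estimate exceeds
    z combined standard errors."""
    flags = {}
    for name, (est, se) in domain_estimates.items():
        combined = math.sqrt(overall_se ** 2 + se ** 2)
        if abs(est - overall) > z * combined:
            flags[name] = est - overall
    return flags

# Hypothetical Grade 4 mathematics goal areas on the RIT scale.
print(flag_idiosyncratic(210.0, 3.0, {
    "Number sense": (212.0, 5.0),   # within noise of the overall score
    "Geometry": (192.0, 5.0),       # 18 points low: flagged
}))  # -> {'Geometry': -18.0}
```

A flagged area is a reporting signal for instruction, not a second dimension: the model is still unidimensional, and the flag only marks a response pattern the single theta summarizes poorly.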

SLIDE 25

Field testing within an adaptive testing system

  • Calibration differences from paper to CAT
  • Random sampling for calibration in CAT
  • Using provisional calibrations in CAT field tests

SLIDE 26

Cautionary notes

  • Adaptive testing needs to be well tuned to avoid bad tests.
  • The item pool must support the stakes.
  • Adaptive testing changes, but doesn’t eliminate, security issues.
    – Brain dump sites
  • Limit desire. No test can do everything.
  • Adaptive test development is never done.

SLIDE 27

Have fun

  • The decisions to be made should consider the good of the students for whom the test is designed.
  • Don’t try to build the perfect test—it won’t be.
  • Consider a “dry eye” policy—making kids cry isn’t the purpose of the test.

SLIDE 28

Thank you Gage Kingsbury gagekingsbury@comcast.net
