The Role of Item Models in Automatic Item Generation

The Role of Item Models in Automatic Item Generation Mark J. Gierl Hollis Lai Centre for Research in Applied Measurement and Evaluation University of Alberta CCSSO Symposium—Orlando, FL June 22, 2011

CHANGING TIMES
• Developments in cognitive science, mathematical statistics, the learning sciences, computer technology, and educational psychology are creating profound changes in educational measurement
• Assessment engineering (AE; Luecht, 2006a, 2006b, 2011) is an innovative approach to measurement where engineering-like principles are used to direct the design and analysis of assessments as well as the scoring and reporting of the results
"Our vision of a 21st-century testing program capitalizes on modern technology and takes advantage of recent innovations in testing. Using an analogy from engineering, we envision a modern testing program as an integrated system of systems." (Drasgow, Luecht, & Bennett, 2006)

CHANGING TIMES
• Developing a test using AE requires three explicit steps:
STEP #1: An assessment begins with a specific, empirically derived cognitive model of task performance;
STEP #2: Item models are then created to produce replicable assessment tasks;
STEP #3: Psychometric methods are applied to the examinee response data, typically in a confirmatory mode, to produce scores that are both replicable and interpretable

CONVENTIONAL ITEM DEVELOPMENT

AUTOMATIC ITEM GENERATION
• By way of contrast, the idea of automatic item generation is seen as a dream come true by many testing agencies, given that large item banks are required for continuous testing and that item development with humans is both time consuming and expensive
• The first requirement is that an item class can be described sufficiently for a computer to create instances of that class automatically; the purpose of our study is to describe how item models can be used to specify the item class
• The second requirement is that the determinants of item difficulty be understood well enough so that each of the generated instances need not be calibrated individually

AUTOMATIC ITEM GENERATION
• STRONG THEORY: The goal of automatic item generation from strong theory is to generate calibrated items automatically from design principles using a theory of difficulty based on a cognitive model
• The theory needs to describe the cognitive mechanism required to solve the items and the features of items that cause difficulty levels to vary
• Item generation from strong theory, at least right now, is best suited to specific domains where cognitive analysis is more feasible and where well-developed theories are more likely to exist

AUTOMATIC ITEM GENERATION
• WEAK THEORY: The goal of automatic item generation from weak theory is to generate calibrated items automatically from design guidelines using a theory of invariance
• Often, the starting point is a parent item whose psychometric characteristics are known; then, through experience, intuition, theory, and luck, an item model is created by identifying the characteristics of the parent item that affect item difficulty; finally, those characteristics are varied to generate new items
• Weak theory has resulted in many operational examples of item generation; however, because the determinants of difficulty are not well understood, fewer item characteristics can be varied simultaneously and the items, as a result, may be more visibly similar than those generated by strong theory

ITEM MODELS
BASIC MATH ITEM AND ITEM MODEL
Ann has paid $1525 for planting her lawn. The cost of lawn is $45/m². Given the shape of her lawn is square, what is the side length of Ann’s lawn?
A. 5.8
B. 6.8
C. 4.8
D. 7.3

ITEM MODELS
STEM: Ann has paid $I1 for planting her lawn. The cost of lawn is $I2/m². Given the shape of her lawn is S1, what is the S2 of Ann’s lawn?
ELEMENTS:
I1 Value Range: 1525-1675 by 75
I2 Value Range: 45 or 30
S1 Range: "square" or "round"
S2 Range: "side length" or "radius"
Manipulating the integers can increase or decrease the range of generated items. Any geometric concept could be added for our string variables.
OPTIONS:
S1="square", S2="side length": A = sqrt(I1/I2); B = A+1; C = A-1; D = A+1.5
S1="round", S2="radius": A = sqrt(I1/(I2*3.14)); B = A+1; C = A-1; D = A+1.5
KEY: A
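The lawn item model above can be expanded mechanically. The following is a minimal Python sketch, not the authors' software: the function name `generate_lawn_items`, the dictionary layout, and the rounding to one decimal place are assumptions made for illustration. The square root in the key formula is also an assumption, inferred from the parent item, whose keyed answer 5.8 equals sqrt(1525/45).

```python
import math
from itertools import product

# Element ranges as stated in the item model.
I1_VALUES = range(1525, 1676, 75)   # 1525, 1600, 1675
I2_VALUES = (45, 30)
# S1 and S2 vary together: each shape has one matching dimension name.
SHAPES = {"square": "side length", "round": "radius"}

STEM = ("Ann has paid ${i1} for planting her lawn. The cost of lawn is ${i2}/m². "
        "Given the shape of her lawn is {s1}, what is the {s2} of Ann's lawn?")


def generate_lawn_items():
    """Return one item (stem, options, key) per combination of element values."""
    items = []
    for i1, i2, (s1, s2) in product(I1_VALUES, I2_VALUES, SHAPES.items()):
        area = i1 / i2
        # Assumed key formulas: side of a square, or radius of a circle with pi = 3.14.
        key = math.sqrt(area) if s1 == "square" else math.sqrt(area / 3.14)
        options = {"A": key, "B": key + 1, "C": key - 1, "D": key + 1.5}
        items.append({
            "stem": STEM.format(i1=i1, i2=i2, s1=s1, s2=s2),
            # Rounding to one decimal is an assumption matching the parent item.
            "options": {label: round(value, 1) for label, value in options.items()},
            "key": "A",
        })
    return items


items = generate_lawn_items()
print(len(items))           # 3 * 2 * 2 = 12 generated items
print(items[0]["options"])  # options for the parent item (I1=1525, I2=45, square)
```

With these ranges the model yields 12 items, and the first one reproduces the parent item's options (5.8, 6.8, 4.8, 7.3).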

ITEM MODELS
Model-based item development has many practical advantages:
• More strategic test construction where the purpose of development is, first, to create item models, and then to generate content for the models to populate an item bank
• Test assembly becomes model based, meaning that tests are composed of instances from the item bank
• The logic behind model-based item development can lead to more efficient test construction (i.e., a larger number of items and fewer discarded items after field testing) because it treats items as classes rather than as isolated entities that are individually authored, reviewed, and formatted

ITEM MODELS
• To create item models systematically and strategically, an item model taxonomy is required (Gierl, Zhou, & Alves, 2008)
• This type of taxonomy is a prerequisite for automatic item generation because it provides the guiding principles necessary for designing a large number of diverse item models by outlining their structure, function, similarities, differences, and limitations (i.e., a taxonomy helps us avoid creating item models that produce generated items that all look the same)
• A taxonomy for item model development must manipulate three variables: the stem, options, and auxiliary information

ITEM MODELS
• The stem is the section of the model used to formulate context, content, and/or questions
• Independent indicates that the n_i element(s) (n_i >= 1) in the stem are independent or unrelated to one another (that is, a change in one element will have no effect on the other stem elements)
• Dependent indicates that the n_d element(s) (n_d >= 2) in the stem are dependent or directly related to one another
• Mixed includes both independent (n_i >= 1) and dependent (n_d >= 1) elements in the stem
• Fixed represents a constant stem format with no variation or change
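The distinction between independent and dependent stem elements can be sketched in a few lines of Python. This is an illustrative reading, not part of the original slides: independent elements are crossed freely, while dependent elements change together as linked tuples. The helper `expand_stem` is hypothetical, and treating I1 and I2 of the lawn model as independent and S1 and S2 as dependent is an assumption.

```python
from itertools import product


def expand_stem(independent, dependent):
    """Enumerate stem element assignments.

    independent: dict mapping element name -> list of values (varied freely)
    dependent:   (names, rows) where each row is a tuple of linked values
    """
    names = list(independent)
    dep_names, dep_rows = dependent
    combos = []
    # Independent elements: every value of one is paired with every value of the others.
    for values in product(*(independent[name] for name in names)):
        # Dependent elements: only the listed rows are allowed, never a free cross.
        for row in dep_rows:
            combo = dict(zip(names, values))
            combo.update(zip(dep_names, row))
            combos.append(combo)
    return combos


# A mixed stem, using the element values from the lawn item model.
stems = expand_stem(
    independent={"I1": [1525, 1600, 1675], "I2": [45, 30]},
    dependent=(("S1", "S2"), [("square", "side length"), ("round", "radius")]),
)
print(len(stems))  # 3 * 2 * 2 = 12
```

Crossing S1 and S2 freely would instead give 24 stems, half of them nonsensical (for example, the "radius" of a "square" lawn), which is why the dependency matters.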

ITEM MODELS
• The options contain the alternatives for the item model
• Randomly selected options refers to the manner in which the distractors are selected from their corresponding content pools (the distractors are selected randomly)
• Constrained options means that the keyed option and the distractors are generated according to specific constraints, such as formulas, calculations, and/or context
• Fixed options occur when both the keyed option and the distractors are invariant or unchanged in the item model
• Auxiliary information includes any additional material, in either the stem or the options, required to generate an item, including texts, images, tables, and/or diagrams

ITEM MODELS
• By crossing the 4 stem and 3 options categories, a matrix of 12 item model types can be produced
• 10 functional combinations can be created from the matrix of 12 (the two remaining combinations are not applicable)

Table 1. Plausible Stem-by-Option Combinations in the Item Model Taxonomy

                       Stem
Options                Independent   Dependent   Mixed   Fixed
Randomly Selected      √             √           √       √
Constrained            √             √           √       N/A
Fixed                  √             √           √       N/A

ITEM MODELS
• We have also developed software for item generation, IGOR (Item GeneratOR), which is now operational
• IGOR was programmed in Java SE 6 (Sun Microsystems) and is available either as a desktop program or as a web-based application

IGOR
GRADE 3
Stem: Independent; Options: Constrained; Auxiliary Information: None
I have 13 tens, 2 hundreds, and 21 ones. What number am I?
A. 351
B. 324
C. 234
D. 213

IGOR
STEM: I have I1 tens, I2 hundreds, and I3 ones. What number am I?
ELEMENTS:
I1 Value range: 11 to 19 by 1
I2 Value range: 1 to 9 by 1
I3 Value range: 11 to 49 by 1
OPTIONS:
A. I1*10+I2*100+I3
B. I2*100+I3
C. I1+I2*100+I3
D. I1*10+I2*100
KEY: A
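To show how much content a single item model of this kind can yield, here is a hedged Python sketch that enumerates the place-value model above. It is not IGOR's implementation (IGOR is described as a Java program); the function name `generate_place_value_items` and the check that discards items with duplicate options are assumptions added for illustration.

```python
from itertools import product

# Element ranges as stated in the item model.
I1_VALUES = range(11, 20)   # tens: 11 to 19 by 1
I2_VALUES = range(1, 10)    # hundreds: 1 to 9 by 1
I3_VALUES = range(11, 50)   # ones: 11 to 49 by 1

STEM = "I have {i1} tens, {i2} hundreds, and {i3} ones. What number am I?"


def generate_place_value_items():
    """Return every item the model can produce with four distinct options."""
    items = []
    for i1, i2, i3 in product(I1_VALUES, I2_VALUES, I3_VALUES):
        options = {
            "A": i1 * 10 + i2 * 100 + i3,   # key: all three place values applied
            "B": i2 * 100 + i3,             # distractor: tens ignored
            "C": i1 + i2 * 100 + i3,        # distractor: tens counted as ones
            "D": i1 * 10 + i2 * 100,        # distractor: ones ignored
        }
        # Assumed safeguard: skip any item whose options would collide.
        if len(set(options.values())) < len(options):
            continue
        items.append({"stem": STEM.format(i1=i1, i2=i2, i3=i3),
                      "options": options, "key": "A"})
    return items


items = generate_place_value_items()
print(len(items))  # 9 * 9 * 39 = 3159 items from this single model
```

For these ranges the duplicate check never triggers, so all 3159 combinations survive; with wider ranges it would start to filter items.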

IGOR
• When IGOR was used with 10 item models in math, which represented each cell in our taxonomy, 331,371 unique items were generated
• We have also applied IGOR to 31 different item models using Grade 3, 6, and 9 content from Mathematics, Social Studies, Science, and Language Arts to generate tens of thousands of test items
• We have applied IGOR to 6 different item models in the College Board’s AP Biology program, producing 2,263 unique items
• Finally, we have archived our work over the past 4 years and, in the process, created an item model bank which currently contains 182 different item models across an array of content areas, grade levels, and testing programs (e.g., achievement and licensure testing)

CONCLUSION
Model-based item development has many practical advantages:
• More strategic test construction where the purpose of development is, first, to create item models, and then to generate items
• Test assembly becomes model based, meaning that tests are composed of generated items from the bank
• The logic behind model-based item development can lead to more efficient test construction because it treats items as classes rather than as isolated entities that are individually authored, reviewed, and formatted; you therefore have the potential to get more item development *bang* for your valuable item development dollars

THANK YOU If you have questions or comments, please contact me Dr. Mark J. Gierl (mark.gierl@ualberta.ca)
