Alignment in Validity Evaluation and Education Policy
Ellen Forte CEO & Chief Scientist edCount, LLC CCSSO 2018 National Conference on Student Assessment San Diego, CA
Since 1994, US educational policy has been based on Systemic Reform:
Content Standards → Performance Standards → Curriculum and Instruction → Assessment → Evaluation and Accountability
Without alignment among these components, the model falls apart.
Alignment is “the degree to which expectations and assessments are in agreement and serve in conjunction with one another to guide the system toward students learning what they are expected to know and do.” (Webb, 1997, p. 4)
Webb (1997) introduced a comprehensive framework for evaluating the alignment of a state’s assessment with its standards or curriculum. The original framework included five sets of criteria:
Webb (1999) used four criteria from the content-focus set in an exploratory study.
Common attributes of current alignment studies:
Alignment evaluation should:
“Alignment is about coherent connections across various aspects within and across a system and relates not simply to an assessment, but to the scores that assessment yields and their interpretations.” (Forte, 2017, p. 3)
1.0 – Clear articulation of each intended test score interpretation for a specified use should be set forth, and appropriate validity evidence in support of each intended interpretation should be provided.
4.0 – “Tests and testing programs should be designed and developed in a way that supports the validity of interpretations of the test scores for their intended uses. Test developers and publishers should document steps taken during the design and development process to provide evidence of fairness, reliability, and validity for intended uses for individuals in the intended examinee population” (p. 85).
4.12 – “Test developers should document the extent to which the content domain of a test represents the domain defined in the test specifications” (p. 89).
12.4 – “When a test is used as an indicator of achievement in an instructional domain or with respect to specified content standards, evidence of the extent to which the test samples the range of knowledge and elicits the processes reflected in the target domain should be provided” (p. 196).
Forte, 2013
1. The relationship between the measurement targets and the state’s academic content standards;
2. The relationship between the measurement targets and the item specifications and development guidelines;
3. The relationship between the measurement targets and the assessment blueprints;
4. The relationship between the measurement targets and the performance level descriptors (PLDs/ALDs);
5. The relationship between the measurement targets (via task models and item templates) and the assessment items; and
6. The relationship between the measurement targets and the items that contribute to students’ test scores.
1. How were the claims and measurement targets established to reflect the full depth and breadth of the standards? Is this method reasonable and sound?
2. How were the task models and item templates developed to reflect the claims and measurement targets? Is this method reasonable and sound?
3. How were the blueprints developed to reflect the claims and measurement targets? Is this method reasonable and sound?
4. How were the PLDs developed to reflect the claims and measurement targets? Is this method reasonable and sound?
5. How were the items developed to reflect the claims and measurement targets (via task models and item templates)? Is this method reasonable and sound?
6. How were the forms and scoring rules developed to reflect the claims and measurement targets? Is this system reasonable and sound?
1. How well do the claims and measurement targets address the full depth and breadth of the standards?
2. How well do the task models and item templates reflect the claims and measurement targets?
3. How well do the blueprints reflect the claims and measurement targets?
4. How well do the PLDs reflect the claims and measurement targets?
5. How well do the items reflect the claims and measurement targets?
6. How well do the sets of items that contribute to students’ scores reflect the claims and measurement targets?