Alignment in Validity Evaluation and Education Policy
SLIDE 1

Alignment in Validity Evaluation and Education Policy

Ellen Forte CEO & Chief Scientist edCount, LLC CCSSO 2018 National Conference on Student Assessment San Diego, CA

SLIDE 2

Since 1994, US educational policy has been based on Systemic Reform, which is the foundation of standards-based assessment and accountability

  • Standards define expectations for student learning
  • Curricula and assessments are interpretations of the standards
  • Without clear alignment among standards, curricula, and assessments, the model falls apart
  • Evaluation and accountability rely on the meaning of scores

[Diagram: Content Standards, Performance Standards, Curriculum and Instruction, Assessment, Evaluation and Accountability]

SLIDE 3

Alignment is “the degree to which expectations and assessments are in agreement and serve in conjunction with one another to guide the system toward students learning what they are expected to know and do.” (Webb, 1997, p. 4)

Webb Alignment Criteria, history

SLIDE 4

Webb (1997) introduced a comprehensive framework for evaluating the alignment of a state’s assessment with its standards or curriculum. The original framework included five sets of criteria:

  • Content focus
  • Articulation across grades and ages
  • Equity and fairness
  • Pedagogical implications
  • System applicability

Webb (1999) used four criteria from the content focus set in an exploratory study

Those four criteria became the de facto definition of alignment that has been driving large-scale educational test design and evaluation in the US ever since
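Webb's (1999) four content-focus criteria are typically computed from panelists' item-to-objective matches and depth-of-knowledge (DOK) ratings. A minimal sketch for a single standard, using hypothetical data structures and Webb's conventional acceptability thresholds (at least six items per standard; 50% of items at or above the objective's DOK; 50% of objectives hit; balance index of at least 0.7):

```python
from collections import Counter

def webb_criteria(item_ratings, objective_doks):
    """Evaluate Webb's four content-focus criteria for one standard.

    item_ratings: list of (objective_id, item_dok) pairs, one per item
        that panelists matched to this standard.
    objective_doks: {objective_id: dok} for every objective under the
        standard, with each objective's panel-assigned DOK level.
    Returns {criterion_name: bool} using Webb's conventional thresholds.
    """
    hits = Counter(obj for obj, _ in item_ratings)  # items per objective
    n_items = len(item_ratings)

    # Categorical concurrence: at least six items measure the standard.
    categorical = n_items >= 6

    # DOK consistency: >= 50% of items at or above the DOK level of the
    # objective they were matched to.
    at_or_above = sum(1 for obj, dok in item_ratings
                      if dok >= objective_doks[obj])
    dok_consistency = n_items > 0 and at_or_above / n_items >= 0.5

    # Range of knowledge: >= 50% of the standard's objectives are hit
    # by at least one item.
    range_ok = len(hits) / len(objective_doks) >= 0.5

    # Balance of representation: index = 1 - sum(|1/O - I_k/H|) / 2,
    # where O = objectives hit, I_k = items on objective k, H = total
    # item hits; an index of at least 0.7 is considered acceptable.
    O, H = len(hits), sum(hits.values())
    balance = O > 0 and 1 - sum(abs(1 / O - c / H)
                                for c in hits.values()) / 2 >= 0.7

    return {"categorical_concurrence": categorical,
            "dok_consistency": dok_consistency,
            "range_of_knowledge": range_ok,
            "balance_of_representation": balance}
```

In practice these values are averaged across multiple reviewers and reported per standard; the function above only illustrates the arithmetic behind a single standard's ratings.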


SLIDE 5

[Diagram: Standards, Items, Set of items that contribute to scores, DOK]

SLIDE 6

Stop rating DOK. Stop analyzing DOK as an independent indicator. The “thinking” and the content should not be separated. Stop.

SLIDE 7

Shifting our expectations for alignment evaluation…

Common attributes of current alignment studies:

  • Post hoc
  • Link items directly to content standards
  • Ignore blueprints
  • Ignore achievement/performance standards
  • Ignore principled-design philosophy and components
  • Ignore scores
  • Ignore interpretations and uses of scores

Alignment evaluation should:

  • Provide formative information
  • Consider all axis points in the path from standards to scores
  • Address scores as they are reported and meant to be interpreted
  • Yield critical information to support a validity argument
SLIDE 8

Alignment is “the degree to which expectations and assessments are in agreement and serve in conjunction with one another to guide the system toward students learning what they are expected to know and do.” (Webb, 1997, p. 4)

An Updated View of Alignment

“Alignment is about coherent connections across various aspects within and across a system and relates not simply to an assessment, but to the scores that assessment yields and their interpretations.” (Forte, 2017, p. 3)

SLIDE 9

Some Standards Relevant to Alignment

1.0 – Clear articulation of each intended test score interpretation for a specified use should be set forth, and appropriate validity evidence in support of each intended interpretation should be provided.

4.0 – “Tests and testing programs should be designed and developed in a way that supports the validity of interpretations of the test scores for their intended uses. Test developers and publishers should document steps taken during the design and development process to provide evidence of fairness, reliability, and validity for intended uses for individuals in the intended examinee population” (p. 85).

4.12 – “Test developers should document the extent to which the content domain of a test represents the domain defined in the test specifications” (p. 89).

12.4 – “When a test is used as an indicator of achievement in an instructional domain or with respect to specified content standards, evidence of the extent to which the test samples the range of knowledge and elicits the processes reflected in the target domain should be provided” (p. 196).

SLIDE 10

[Figure: Forte, 2013]

SLIDE 11

Six Key Relationships

1. The relationship between the measurement targets and the state’s academic content standards;
2. The relationship between the measurement targets and the item specifications and development guidelines;
3. The relationship between the measurement targets and the assessment blueprints;
4. The relationship between the measurement targets and the performance level descriptors (PLDs/ALDs);
5. The relationship between the measurement targets (via task models and item templates) and the assessment items; and
6. The relationship between the measurement targets and the items that contribute to students’ test scores.

SLIDE 12
  • A. System Design

1. How were the claims and measurement targets established to reflect the full depth and breadth of the standards? Is this method reasonable and sound?
2. How were the task models and item templates developed to reflect the claims and measurement targets? Is this method reasonable and sound?
3. How were the blueprints developed to reflect the claims and measurement targets? Is this method reasonable and sound?
4. How were the PLDs developed to reflect the claims and measurement targets? Is this method reasonable and sound?
5. How were the items developed to reflect the claims and measurement targets (via task models and item templates)? Is this method reasonable and sound?
6. How were the forms and scoring rules developed to reflect the claims and measurement targets? Is this system reasonable and sound?

SLIDE 13
  • B. System Implementation

1. How well do the claims and measurement targets address the full depth and breadth of the standards?
2. How well do the task models and item templates reflect the claims and measurement targets?
3. How well do the blueprints reflect the claims and measurement targets?
4. How well do the PLDs reflect the claims and measurement targets?
5. How well do the items reflect the claims and measurement targets?
6. How well do the sets of items that contribute to students’ scores reflect the claims and measurement targets?
1. How well do claims and measurement targets address the full depth and breadth of the standards? 2. How well do the task models and item templates reflect the claims and measurement targets? 3. How well do the blueprints reflect the claims and measurement targets? 4. How well do the PLDs reflect the claims and measurement targets? 5. How well do the items reflect the claims and measurement targets? 6. How well do the sets of items that contribute to students’ scores reflect the claims and measurement targets?