 
Advancing Multidimensional Science Assessment Design: A View through Two Lenses
I-SMART and SCILLSS
Friday, June 29, 2018, 11:00 AM – 12:00 PM
Aqua Salon D (Level 3)

Strengthening Claims-based Interpretations and Uses of Local and Large-scale Science Assessment Scores (SCILLSS)
Using Principled-Design to Support Coherence in State and Local Assessment Systems
Presentation for the National Conference on Student Assessment
June 29, 2018

Overview
➢ The SCILLSS Project: Purpose, Players, and Products
➢ Coherence-based Principled-design
  – Large-scale science assessments
  – Classroom-based, instructionally-embedded assessments (not a focus of this presentation)
  – Theory of Action
  – Self-evaluation Protocols: Reflecting on and evaluating assessment systems
  – Digital Workbook on Educational Assessment Design and Evaluation
➢ SCILLSS Implementation in Nebraska
➢ Project Impact

About SCILLSS
• One of two projects funded by the US Department of Education’s Enhanced Assessment Instruments Grant Program (EAG), announced in December 2016
• Collaborative partnership including three states, four organizations, and 10 expert panel members
• Nebraska is the grantee and lead state; Montana and Wyoming are partner states
• Four-year timeline (April 2017 – December 2020)

SCILLSS Project Goals
• Create a science assessment design model that establishes alignment with three-dimensional standards by eliciting common construct definitions that drive curriculum, instruction, and assessment
• Strengthen a shared knowledge base among instruction and assessment stakeholders for using principled-design approaches to create and evaluate science assessments that generate meaningful and useful scores
• Establish a means for state and local educators to connect statewide assessment results with local assessments and instruction in a coherent, standards-based system

SCILLSS Partner States, Organizations, and Staff
• Co-Principal Investigators: Ellen Forte and Chad Buckendahl
• Project Director: Liz Summers
• Deputy Project Director: Erin Buchanan
• Psychometric Leads: Andrew Wiley and Susan Davis-Becker
• Principled-Design Leads: Howard Everson and Daisy Rutstein

Project Deliverables
(Deliverables are arranged on the slide along a Year 1 to Year 4 timeline.)
1 - Project Foundations
• SCILLSS website
• Theory of Action for the project and for each state
• Local and state needs assessment tools
• Assessment literacy module 1
• Three prioritized science claims
• Assessment literacy modules 2-5
2 - Large-scale assessment resources
• Three sets of claim-specific resources:
  o PLDs
  o measurement targets, task models, and design patterns
  o sample items
3 - Classroom-based assessment resources
• Six task models
• Six tasks
• Six sets of student artifacts corresponding to the performance levels
4 - Reporting and Dissemination
• Database of student artifacts
• Post-project survey
• Post-project action plans for each state
• Final project report

Overview
➢ The SCILLSS Project: Purpose, Players, and Products
➢ Coherence-based Principled-design
  – Benefits and Phases of a Principled-design approach
  – Theory of Action
  – Self-evaluation Protocols: Reflecting on and evaluating assessment systems
  – Digital Workbook on Educational Assessment Design and Evaluation
➢ SCILLSS Implementation in Nebraska
➢ Project Impact

Benefits of a Principled-Design Approach
• Principled articulation and alignment of design components
• Articulation of a clear assessment argument
• Reuse of extensive libraries of design templates
• For accountability
  – Clear warrants for claims about what students know and can do
  – Build accessibility into design of tasks (not retrofitted into tasks)
  – Cost v. scale

Three Iterative Evidence-Centered Design Phases
• Phase 1: Domain Analysis
  – What do we intend to measure?
  – Representations of the three dimensions in the NGSS
• Phase 2: Domain Modeling
  – What does that look like in an assessment context?
  – Articulation of how the construct should manifest in the assessment
• Phase 3: Task Development and Implementation
  – Build and implement the assessment
  – Task models → items
  – Items → tests
Specificity increases from Phase 1 to Phase 3.
Adapted from Huff, Steinberg, & Matts, 2010

Theory of Action Purpose
The purpose of a Theory of Action is to:
• Articulate the claims and assumptions that must hold true to support the interpretation(s) and use(s) of assessment scores;
• Articulate how assessment claims connect with, and are supported by, test scores and other sources of evidence;
• Strengthen both the validity and coherence of an assessment system; and
• Provide stakeholders with ample documentation of design and development logic and decisions, which can be used for future learning, evaluations, and development projects.

Theory of Action Components
Statewide Assessment System Design
• What are the assessment system claims?
• How is the assessment system designed?
• How must the assessment system function to provide interpretable and usable scores?
System Setting and Use
• How are stakeholders meant to use assessment information?
• What are some of the conditions that must be in place for the assessment system to function as intended?
Teacher Actions
• What activities are expected of teachers?
• How do teachers interact with students in the classroom?
• How do teachers use student work to track progress?
Student Actions
• What activities are expected of students?
• How do students interact with teachers and other students?
• How do students track their progress?
Student Outcomes
• What are the intended student goals, outcomes, or consequences of the assessment system (e.g., for students, teachers, instruction)?

Self-Evaluation Protocol Purpose
The local and state self-evaluation tools are frameworks to support state and local educators in reflecting upon and evaluating the assessments they use.
Local or District
• Designed to focus on assessments that districts or schools require
• Usually used for lower-stakes decisions
  – Curriculum reviews
  – Malleable instructional decisions
  – Monitoring student progress proximally
State
• Focused on large-scale assessments required statewide
• Some assessments have high stakes
  – Accountability for students
  – Accountability for educators
  – Accountability for schools, districts, programs

Self-Evaluation Protocol Steps
• Articulate: Articulate the primary goals and objectives of your assessment program
• Identify: Identify all current and planned assessments
• Evaluate: Evaluate the data and evidence available for each assessment to support the program goals and objectives and to address four fundamental validity questions
• Synthesize: Synthesize results from the initial steps to determine an appropriate path forward

Validity Questions
1. Construct Coherence: To what extent has the assessment been designed and developed to yield scores that can be interpreted in relation to the target domain?
2. Comparability: To what extent does the assessment yield scores that are comparable across students, sites, time, forms?
3. Accessibility and Fairness: To what extent are students able to demonstrate what they know and can do in relation to the target knowledge and skills on the test in a manner that can be recognized and accurately scored?
4. Consequences: To what extent does the test yield information that can be and is used appropriately to achieve specific goals?

Digital Workbook Purpose
The digital workbook includes five assessment literacy modules designed to:
• Inform state and local educators and other stakeholders on the purposes of assessments;
• Ensure a common understanding of the purposes and uses of assessment scores, and how those purposes and uses guide decisions about test design and evaluation;
• Complement the needs assessment by providing background information and resources for educators to grow their knowledge about foundational assessment topics; and
• Address construct coherence, comparability, accessibility and fairness, and consequences.

Digital Workbook Module Topics
1. Validity, validity evidence, and the assessment life cycle (design and development, administration, scoring, analysis, reporting, score use)
2. Construct Coherence: To what extent do the test scores reflect the knowledge and skills we’re intending to measure, for example, those defined in the academic content standards?
3. Comparability: To what extent are the test scores reliable and consistent in meaning across all students, classes, and schools?
4. Accessibility and Fairness: To what extent does the test allow all students to demonstrate what they know and can do?
5. Consequences: To what extent are the test scores used appropriately to achieve specific goals?