
Growth in Student Achievement: Issues of Measurement, Longitudinal Analyses & Accountability



  1. Growth in Student Achievement: Issues of Measurement, Longitudinal Analyses & Accountability. Damian W. Betebenner, NCIEA. CCSSO NCSA, June 23, 2010.

  2. Overview: Discussions of student growth lie at the intersection of three topics:
     • Measurement/Psychometrics
     • Longitudinal Data Analysis/Applied Statistics
     • Accountability/Education Policy/Data Use

  3. Measurement/Psychometrics (Overview)
     Examining student growth requires multiple measurements of the same individual.
     • Growth in what?
     • How much growth? (How is scaling involved in answering this question?)
     • Is it enough growth?

  4. Longitudinal Data Analysis/Applied Statistics (Overview)
     There are many methods for the analysis of longitudinal data.
     • What are the relevant questions?
     • Are the analytic techniques capable of answering those questions?
     • Does the data possess properties sufficient for the analytic techniques employed (e.g., a vertical scale)?
     • Does the analysis sustain the inferences made from the data?

  5. Accountability/Education Policy/Data Use (Overview)
     Education policy and accountability have many goals and purposes.
     • Why growth in accountability?
     • What are the goals and purposes of accountability?
     • What is the theory of action behind accountability?
     • How can we judge the validity of the accountability system?
     • What about the current policy context?

  6. Measurement/Psychometric Issues (Technical Considerations)

  7. Measurement/Psychometric Issues (Technical Considerations)
     • Growth in what?
     • How much growth?
       - Scales for measuring growth: ordinal (within-year, across-year), interval (within-year, across-year), vertical
       - Growth magnitude versus growth norm
     • Is it enough growth? Norm- versus criterion-referencing (the intersection of Accountability and Measurement)

  8. Growth in what? (Technical Considerations)
     • Beneath any notion of change (i.e., growth) is a construct that is changing over time.
     • Height and weight are common points of reference.
     • Constructs in education are “slippery.”
     • At a minimum, an underlying semantic referent (e.g., reading or math) is needed.

  9. How much growth? (Technical Considerations)
     • Are growth magnitudes possible in education?
     • If calculable, are they interpretable absent some norm?
     • Approaches to growth magnitudes:
       - Performance standards
       - Vertical scale with interval properties
       - Learning progressions (qualitative growth)

  10. How much growth? (Technical Considerations): Performance Standards
     Strengths:
     • Anchor reference points for discussions about performance
     • Growth is embedded in the accountability metric
     Limitations:
     • Few levels mask the substantial range within levels, thus masking student growth within a level
     • Vary greatly in stringency from state to state, so that “proficient” performance lacks meaning

  11. How much growth? (Technical Considerations): Scale Scores
     Strengths:
     • Semi-continuous scores (many score points)
     • Can be used to create vertical scales across grade levels
     • Give the appearance of the interval scales needed by some analytical models
     Limitations:
     • Difficult to interpret or explain to users
     • Vertical scales are hard to defend
     • Claims of interval measurement properties don’t hold up to close scrutiny

  12. How much growth? (Technical Considerations): Vertical Scale
     Vertical and interval scales are required for some analytic techniques:
     • Gain score calculation (magnitude of growth)
     • Growth curve analysis (rate of growth) (e.g., Willett & Singer, 2003; see the sketch below)
     Vertical and interval scales are required for some questions:
     • Matthew effects: Do higher achievers grow faster than lower achievers?
     • Growth rates relative to student age: Do students grow more in later grades than in earlier grades?
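
The following is a minimal, self-contained sketch (not from the presentation) of the two techniques named above: gain scores and a per-student growth slope. It assumes a long-format table of vertically scaled scores with hypothetical columns student_id, grade, and scale_score; an operational growth-curve analysis would typically use a mixed-effects model rather than per-student line fits.

```python
import numpy as np
import pandas as pd

# Hypothetical vertically scaled scores for two students across grades 3-5.
scores = pd.DataFrame({
    "student_id":  [1, 1, 1, 2, 2, 2],
    "grade":       [3, 4, 5, 3, 4, 5],
    "scale_score": [410, 432, 447, 395, 404, 418],
})

# Gain score: difference between adjacent grades. The magnitude is only
# meaningful if the vertical scale has (approximately) interval properties.
scores = scores.sort_values(["student_id", "grade"])
scores["gain"] = scores.groupby("student_id")["scale_score"].diff()

# Rate of growth: slope of scale score on grade, fit separately per student.
slopes = scores.groupby("student_id").apply(
    lambda d: np.polyfit(d["grade"], d["scale_score"], deg=1)[0]
)

print(scores)
print(slopes)  # estimated scale-score points gained per grade
```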

  13. How much growth? (Technical Considerations): Vertical Scale
     Vertical and/or interval scales are NOT required for some analytic techniques:
     • Value-added analyses: most require an interval, but not a vertical, scale; see Ballou (2008), Briggs & Betebenner (2009)
     • Auto-regressive analyses, growth norms (see the sketch below)
     Vertical and/or interval scales are NOT required for some questions:
     • Is a student’s progress (ab)normal?
     • Is a student’s growth sufficient to put them on track to reach/maintain proficiency?
     • See Yen (2007) for an excellent list of questions
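
As a simplified illustration of the growth-norm idea (not the operational Student Growth Percentile methodology, which conditions on prior achievement via quantile regression), the sketch below expresses each student's current score as a percentile rank among students with the same prior score. Only ordinal, within-year comparisons are used, so no vertical scale is required. Column names and data are hypothetical.

```python
import pandas as pd

# Hypothetical prior- and current-year scores; no vertical scale is assumed.
df = pd.DataFrame({
    "student_id":    range(1, 9),
    "prior_score":   [300, 300, 300, 300, 320, 320, 320, 320],
    "current_score": [310, 325, 340, 355, 315, 330, 345, 360],
})

# Growth norm: percentile rank of the current score among students who started
# at the same prior score (operational methods smooth this conditioning,
# e.g., with quantile regression, rather than grouping on exact prior scores).
df["growth_percentile"] = (
    df.groupby("prior_score")["current_score"].rank(pct=True).mul(100).round()
)
print(df)
```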

  14. How much growth? (Technical Considerations): Magnitudes versus Norms
     Two growth quantities:
     • Magnitude of growth
     • Relative amount of growth
     Physical growth:
     • A 9-year-old boy grew 5 inches in the past year
     • The average increase in height for boys between years 8 and 9 is 4 inches
     Achievement growth:
     • A 4th grader grew 25 scale score points since 3rd grade
     • The average 4th grade scale score is 21 points higher than the average 3rd grade score
     How much growth?
     • People expect an answer of magnitude
     • People need the magnitude embedded within a norm

  15. How much growth? (Technical Considerations): Growth Norms
     Although normative comparisons are spurned by criterion-referenced and standards-based measurement advocates, norms can provide a useful interpretive framework, especially in the interpretation of student growth.
     “Scratch a criterion and you find a norm” (W. H. Angoff, 1974).

  16. Longitudinal Data Analysis Issues (Technical Considerations)

  17. Many Questions (Technical Considerations)
     • How much annual growth did this (these) student(s) make in reading?
     • Is (are) this (these) student(s) making sufficient growth to reach/maintain desired achievement targets? (Growth-to-standard & the Growth Model Pilot Program)
     • Are students in particular subgroups (e.g., minority students) making as much progress as other students?
     • How much did this teacher/school contribute to students’ growth over the last year? (Value-added)
     • Again, see Yen (2007) for an excellent list of questions

  18. Many Techniques (Technical Considerations)
     Numerous data analysis techniques exist for use with longitudinal data:
     • Gain scores (a suitable scale is required)
     • Cross-tabulation based upon prior and current categorical achievement-level attainment (e.g., value tables, transition matrices; see the sketch below)
     • Regression-based approaches: growth-curve analysis (HLM), fixed/mixed-effects models, growth norms
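
A minimal sketch (not from the presentation) of the cross-tabulation technique: a transition matrix of prior versus current achievement levels. The level labels and data are hypothetical.

```python
import pandas as pd

levels = ["Below Basic", "Basic", "Proficient", "Advanced"]  # hypothetical labels

df = pd.DataFrame({
    "prior_level":   ["Basic", "Basic", "Proficient", "Below Basic", "Proficient", "Advanced"],
    "current_level": ["Proficient", "Basic", "Proficient", "Basic", "Advanced", "Advanced"],
})

# Transition matrix: rows are prior levels, columns are current levels, cells
# are student counts. A value table attaches policy-chosen point values to
# each cell, rewarding movement into and toward higher levels.
transition = pd.crosstab(
    pd.Categorical(df["prior_level"], categories=levels, ordered=True),
    pd.Categorical(df["current_level"], categories=levels, ordered=True),
    rownames=["prior"], colnames=["current"], dropna=False,
)
print(transition)
```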

  19. Questions 1st, Analyses 2nd (Technical Considerations)
     • Different growth analysis techniques often address different questions.
     • Different questions lead to different conversations, which lead to different uses and outcomes.
     “It is better to have an approximate answer to the right question than a precise answer to the wrong question.” (J. W. Tukey)

  20. Model Purpose (Technical Considerations)
     Three general uses are associated with statistical models (Berk, 2004):
     • Description: an account of the data. The model is true to the extent that it is useful; model quality is judged by craftsmanship (de Leeuw, 2004).
     • Inference: sample to population. The model is true to the extent that the assumed chance process reflects reality (super-population fallacy).
     • Causality: A causes B to happen. The model is true to the extent that a plausible causal theory exists and design criteria are met.
     • Models are rarely descriptive despite the minimal requirements.
     • Inference and causality require information external to the data; they can’t be validated solely from the data.
     • Models are often causal in nature but rarely meet the rigorous criteria necessary for such inferences.

  21. Value-Added Models (Technical Considerations): Causality
     • Value-added models (e.g., EVAAS) are a frequently discussed type of growth model.
     • Value-added models attempt to quantify the portion of student progress attributable to an educational unit, usually a teacher or school.
     • Value-added is about the inferences made, not the actual model.
     • Causal attributions make value-added models well suited for accountability discussions.
     • In the absence of random assignment, causal attributions are always suspect and subject to challenges (see, for example, Raudenbush, 2004; Rubin, Stuart & Zanutto, 2004).

  22. Value-Added Models (Technical Considerations): Causality
     • Value-added models return norm-referenced effectiveness quantities.
     • With regard to schools, the quantities indicate whether a school is significantly more or less effective than the mean school effectiveness in the district or state (see the sketch below).
     • In a standards-based assessment environment, how much effectiveness is enough?
     • This is especially important in light of universal-proficiency policy mandates.
     • Growth-to-standard models were created to provide criterion-referenced growth models.
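
To make the norm-referenced character concrete, here is a deliberately simplified covariate-adjustment sketch (it is not EVAAS or any model discussed in the presentation): regress current scores on prior scores and average the residuals by school, so each school's estimate is read relative to the average school. Data and column names are hypothetical; a real analysis would adjust for more covariates and attach standard errors.

```python
import numpy as np
import pandas as pd

# Hypothetical student scores nested in two schools.
df = pd.DataFrame({
    "school":        ["A", "A", "A", "B", "B", "B"],
    "prior_score":   [300, 320, 340, 305, 325, 345],
    "current_score": [335, 350, 372, 320, 338, 355],
})

# Covariate adjustment: least-squares fit of current score on prior score.
slope, intercept = np.polyfit(df["prior_score"], df["current_score"], deg=1)
df["residual"] = df["current_score"] - (intercept + slope * df["prior_score"])

# Mean residual by school: a norm-referenced quantity centered near zero, so
# each school is read as above or below the average school, not against any
# absolute criterion.
print(df.groupby("school")["residual"].mean())
```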

  23. Growth Model Pilot Program (Technical Considerations): Growth-to-Standard
     • In response to requests for growth model use as part of AYP, USED allowed states to apply to use growth models.
     • Fifteen states had models accepted.
     • Models were required to adhere to the “bright line principle” of universal proficiency (growth-to-standard; see the sketch below).
     • Yen (2009) provides an excellent overview of the models.
     • Growth-to-standard models returned, in general, results that closely aligned with AYP status results.
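
A minimal sketch of the growth-to-standard logic (not any particular state's pilot model): project a student's recent annual gain forward and ask whether it reaches a proficiency cut score within the allowed number of years. The cut score and data below are hypothetical, and the accepted pilot models differed considerably in their details.

```python
def on_track(current_score: float, annual_gain: float,
             target_cut: float, years_remaining: int) -> bool:
    """Return True if sustaining annual_gain reaches target_cut in time."""
    projected = current_score + annual_gain * years_remaining
    return projected >= target_cut

# Hypothetical example: a grade 4 student scoring 410 who gained 22 points
# last year, against a hypothetical grade 7 proficiency cut of 470 (3 years away).
print(on_track(current_score=410, annual_gain=22, target_cut=470, years_remaining=3))  # True
```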

  24. Growth versus Value-Added Models (Technical Considerations): Description & Causality
     • Growth measures are descriptive.
     • Accountability has skewed discussions of growth from description toward responsibility (i.e., causality).
     • All measures (even VAM) are potentially descriptive; however, some measures are specially crafted for causal inference/attribution.
     • Good descriptive measures are interpretable, informative, and capable of multiple uses.
