http://www.springer.com/us/book/9783319673035
The Springer Series in Measurement Science and Technology
The Springer Series in Measurement Science and Technology comprehensively covers the science and technology of measurement, addressing all aspects of the subject from the fundamental principles through to the state-of-the-art in applied and industrial metrology, as well as in the social sciences. Volumes published in the series cover theoretical developments, experimental techniques and measurement best practice, devices and technology, data analysis, uncertainty, and standards, with application to physics, chemistry, materials science, engineering and the life and social sciences.
William P. Fisher, Jr.
University of California, Berkeley, CA, USA
BEAR Seminar UC Berkeley Graduate School of Education 30 January 2018
Discontinuous Levels of Complexity in Coherent Educational Measurement: The Roles of KidMaps, Wright Maps, and Construct Maps
Thanks to colleagues
- Emily Oon and Mei Zhou at University of Macau
- Fisher, W. P., Jr., Oon, E. P.-T., & Zhou, M. (2018). Assessment coherence across information complexity contexts: Coordinating classroom and international assessments. Journal of Educational Measurement, in review.
- Mark Wilson at University of California, Berkeley
- National Research Council. (2006). Systems for state science assessment (M. R.
Wilson & M. W. Bertenthal, Eds.). Washington, DC: The National Academies Press.
- Wilson, M. (Ed.). (2004). National Society for the Study of Education Yearbooks. Vol. 103, Part II: Towards coherence between classroom assessment and accountability. Chicago, Illinois: University of Chicago Press.
The problem of coherence in educational assessment
- Wilson (2004; NRC, 2006) asks
- What kind of information infrastructure is needed to coherently
coordinate meaningful and comparable formative, interim, and summative assessments within and across classrooms?
- Applications and reports would have to function within common
frames of reference across developmental, horizontal, and vertical comparisons.
Developmental, horizontal, and vertical forms of coherence
Coherence: Forced conformity, or an unexplored alternative?
- Moss (2004), in a chapter included in Wilson’s (2004) NSSE Yearbook,
fears that coherence in educational assessment will become another instance of a “high modern” scheme that systematically homogenizes human variation into bureaucratically manageable forms.
- She cites Scott’s (1998) account of the history of failed governmental
efforts at improving the human condition, but does not mention Scott’s concluding suggestion that language could provide a model for a new kind of standard that functions as a means of continually adapting broad principles to novel circumstances.
Multiple levels of complexity in language and information infrastructures
- Language is, Scott (1998, p. 357) says, “a structure of meaning and
continuity that is never still and ever open to the improvisations of its speakers.”
- Star and Ruhleder (1996) similarly point out that "The competing
requirements of openness and malleability, coupled with structure and navigability, create a fascinating design challenge—even a new science."
- The design of information infrastructures providing both structure
and openness "is highly challenging technically, requiring new forms of computability that are both socially situated and abstract enough to travel across time and space" (Star & Ruhleder, 1996, p. 132).
Levels of complexity in language
(Star & Ruhleder, 1996; following Bateson, 1972)
- Denotative: factual and local
- The cat is on the mat.
- You answered 28 of the 50 questions correctly.
- Metalinguistic: abstractly refers to words
- The word ‘cat’ has no fur.
- Your score of 28 means you fail the course.
- Metacommunicative: statements about statements
- My telling you where to find your cat was friendly.
- My giving you a failing grade based on your score of 28 is
justified given the content and difficulties of the questions.
Levels of complexity in language
- “The cat on the mat” points at something real and tangible.
- Pointing at the word ‘cat’ refers to an abstract concept.
- It applies to all small domesticated felines.
- It has an invariant meaning in the English language.
- It came into use via an evolutionary process not
controlled by any person or group.
Levels of complexity in language
- The score of 28 on the assessment points at something
real and tangible: questions answered correctly and incorrectly.
- The number word ‘28’ is supposed to be abstract.
- But the meaning of an assessment score of ‘28’ is tied to a
particular set of questions.
- It means something different across tests.
- It came into use via a process controlled by an individual
person or group.
- Used to indicate a learning outcome, ‘28’ does not have a
general and invariant meaning.
- The theoretical justification for failure based on the score is
contained in a privately organized information system not open to contestation or confirmation by others.
What happens when we ignore levels of complexity in language?
- We find ourselves:
- “…with organizations which are split and confused, systems which are
unused or circumvented, and a set of circumstances of our own creation which more deeply impress disparities on the organizational landscape” (Star & Ruhleder, 1996, p. 118).
- Sounds like Scott’s (1998) history of failed “high modern” schemes
- Also resonates with Ladd’s (2017) documentation of the flawed
U.S. NCLB proficiency standards.
- Ladd, H. F. (2017). No Child Left Behind: A deeply flawed federal policy. Journal of Policy Analysis and Management, 36(2), 461-469.
What’s the alternative?
- Can number words be connected with concrete observations and abstract meanings that remain
invariant throughout the language?
- Can number words emerge from a group-level process
not under the control of any individual?
- Can publicly reproducible justifications for uses of
number words provide independent validation of the inferences made?
Levels of complexity in education
- Denotative: statements about learning
- You answered these questions correctly and incorrectly.
- Your score on the test was a particular count of correct responses.
- Kidmap display
- Metalinguistic: learning about learning
- We observe a pattern of consistently increasing difficulty in items.
- Similar patterns of invariance emerge across assessments.
- Wright map display
- Metacommunicative: theories about learning
- We see item features that cause items to be easy or hard.
- We design tests from specifications, and they function as expected.
- Construct map display and construct specification equation
Levels of complexity in education
- Denotative: Concrete statements about learning
- You answered these questions correctly and incorrectly.
- Your score on the test was 28.
- Kidmaps
[Figure: kidmap plotting one student's responses in four regions: correct vs. incorrect by easy vs. hard items]
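The four regions of a kidmap can be sketched in code. This is a minimal illustration, assuming each item has a calibrated difficulty on the same scale as the student's measure. The function name `kidmap_quadrants`, the easy/hard cut at the student's own measure, and the sample values are all hypothetical and are not taken from any particular software.

```python
def kidmap_quadrants(student_measure, responses):
    """Sort items into the four kidmap regions.

    responses: list of (item_label, item_difficulty, answered_correctly).
    Assumed convention: an item is "easy" when its difficulty is at or
    below the student's measure, and "hard" otherwise.
    """
    quadrants = {
        ("correct", "easy"): [],
        ("correct", "hard"): [],
        ("incorrect", "easy"): [],
        ("incorrect", "hard"): [],
    }
    for label, difficulty, correct in responses:
        outcome = "correct" if correct else "incorrect"
        level = "easy" if difficulty <= student_measure else "hard"
        quadrants[(outcome, level)].append(label)
    return quadrants


# Hypothetical item difficulties and one student's responses.
responses = [
    ("Q1", -1.2, True),
    ("Q2", -0.3, False),
    ("Q3", 0.8, True),
    ("Q4", 1.5, False),
]
result = kidmap_quadrants(0.0, responses)
```

Under these assumptions, the "incorrect, easy" and "correct, hard" regions hold the unexpected responses, which are the ones a teacher would likely look at first.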
Levels of complexity in education
- Metalinguistic: Abstract learning
about learning
- We observe a pattern of self-organized
conjoint order:
- consistently increasing item difficulties, and
- consistently increasing student abilities.
- Similar patterns of spontaneous
invariance emerge across tests.
- Wright maps
- Equating
- Item banks
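The conjoint order described here can be illustrated with the dichotomous Rasch model, in which the probability of a correct response depends only on the difference between a person's measure and an item's difficulty. The sketch below uses the standard form of that model; the person and item values are hypothetical and chosen only to make the pattern visible.

```python
import math


def rasch_probability(ability, difficulty):
    """Dichotomous Rasch model: P(correct) = exp(b - d) / (1 + exp(b - d))."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


abilities = [-1.0, 0.0, 1.0]      # hypothetical person measures, in logits
difficulties = [-1.5, 0.0, 1.5]   # hypothetical item difficulties, in logits

# Rows are persons (low to high ability); columns are items (easy to hard).
table = [[rasch_probability(b, d) for d in difficulties] for b in abilities]

for b, row in zip(abilities, table):
    print(f"ability {b:+.1f}: " + "  ".join(f"{p:.2f}" for p in row))
```

Reading across any row, the probabilities fall as items get harder; reading down any column, they rise as ability increases. That double ordering is what a Wright map displays on a single scale.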
[Figure: Wright map showing the PERSON distribution (less to more) and the ITEM distribution (frequent to rare) on a common MEASURE scale running from -6 to 7, with mean (M), one-SD (S), and two-SD (T) markers on each side]
Levels of complexity in education
- Metacommunicative: theories about learning
- We see item features that cause items to be easy or hard.
- We can design tests from specifications, and they function as expected.
- Construct maps and specification equations
Wilson, M. (2014). BEAR Assessment System Software. BEAR Center, UC Berkeley Graduate School of Education.
Levels of complexity in education
- Metacommunicative: theories about learning
- Construct specification equation
Reading difficulty (or readability) = A*log(MSL) - B*log(WF) + C, where MSL is the mean sentence length, WF is the word frequency, and A, B, and C are constants.
Burdick, H., & Stenner, A. J. (1996). Theoretical prediction of test items. Rasch Measurement Transactions, 10(1), 475.
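The specification equation can be written as a short function. This is only a sketch: the constants A, B, and C below are placeholder values rather than calibrated ones, the natural logarithm is assumed because the slide does not state a base, and the function name `predicted_difficulty` is hypothetical.

```python
import math


def predicted_difficulty(msl, wf, a, b, c):
    """Specification equation: A*log(MSL) - B*log(WF) + C.

    msl: mean sentence length; wf: word frequency.
    Natural logarithm assumed; the source does not state the base.
    """
    return a * math.log(msl) - b * math.log(wf) + c


# Placeholder constants for illustration only; real values come from calibration.
A, B, C = 1.0, 1.0, 0.0

# Hypothetical texts: short sentences with common words vs. long sentences with rare words.
short_common = predicted_difficulty(msl=8.0, wf=200.0, a=A, b=B, c=C)
long_rare = predicted_difficulty(msl=20.0, wf=5.0, a=A, b=B, c=C)
```

With any positive A and B, longer sentences and rarer words both raise the predicted difficulty, so `long_rare` comes out higher than `short_common`.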
Theoretical vs Empirical Reading Item Estimates
Theoretical vs Empirical Mathematics Item Estimates
Fisher, W. P., Jr., Seeratan, K., Draney, K., Wilson, M., Murray, B., Saldarriaga, C. et al. (2012, April). Predicting mathematics test item difficulties: Results of a preliminary study. Presented at the Fifteenth International Objective Measurement Workshops, Vancouver, Canada.
“There is nothing so practical as a good theory.”
(Lewin, 1951, p. 169)
- Meaningful and practical explanatory power is obtained
when phenomena are understood well enough to predict their behaviors.
- Efficiencies at a new order of magnitude come to bear when
the analysis and reporting of response data from tests and assessments are integrated with learning materials in immediate formative feedback.
LO: Student articulates basic properties of matter
Score report for an individual student
Developmental Coherence
Measures over time
360 400 440 480 520 560 600 640 680 720 760  STUDENT MEASURES
|-----+-----+-----+-----+-----+-----+-----+-----+-----+-----|  DISTRIBUTIONS
P Y D F A M G LZQCERUC NW H J K  WEEK ONE
L Y D F G A M ZPUCQERU NW H J K  WEEK THREE
G P Y F A M EZQCLRUS NW D K J  WEEK FIVE
Q Y D W F A P M LZSRUCE J H N KJ  WEEK SEVEN
P Q D GA F LMSYCERU NJ WK H  WEEK NINE
|-----+-----+-----+-----+-----+-----+-----+-----+-----+-----|
360 400 440 480 520 560 600 640 680 720 760  MEASUREMENT SCALE
80 70 50 40 35 30 35 40 50 70  UNCERTAINTY
T S M S T  MEAN, SD, 2 SD (T)
0 10 20 30 40 50 70 80 90 99  OVERALL STUDENT PERCENTILE
Progress map for a classroom
School Term
Developmentally coherent
Horizontal Coherence
Score report for multiple classrooms, week xx
360 400 440 480 520 560 600 640 680 720 760  STUDENT MEASURES
|-----+-----+-----+-----+-----+-----+-----+-----+-----+-----|  DISTRIBUTIONS
P Y D F A M G LZQCERUC NW H J K  CLASSROOM ONE
L Y D F G A M ZPUCQERU NW H J K  CLASSROOM TWO
G P Y F A M EZQCLRUS NW D K J  CLASSROOM THREE
Q Y D W F A P M LZSRUCE J H N KJ  CLASSROOM FOUR
P Q D GA F LMSYCERU NJ WK H  CLASSROOM FIVE
|-----+-----+-----+-----+-----+-----+-----+-----+-----+-----|
360 400 440 480 520 560 600 640 680 720 760  MEASUREMENT SCALE
80 70 50 40 35 30 35 40 50 70  UNCERTAINTY
T S M S T  MEAN, SD, 2 SD (T)
0 10 20 30 40 50 70 80 90 99  OVERALL STUDENT PERCENTILE
Vertical Coherence
360 400 440 480 520 560 600 640 680 720 760  END OF SEMESTER
|-----+-----+-----+-----+-----+-----+-----+-----+-----+-----|  ELEVENTH GRADE
13 2  \ OVERALL
1 2142572791784251 1 1  \ STUDENT
1 1 1 3 22 4 62735893922385590827032906974 5821 1 1  > MEASURE
6756 7 3 7210 381110986347843441031745162392532192 6 89 4 46  / DISTRIBUTION
T S M S T  MEAN, SD, 2 SD (T)
0 10 20 30 40 50 70 80 90 99  PERCENTILE
110 70 50 35 30 35 50 70 110  UNCERTAINTY
1 T )+ 6 78 Q Y D W 4 3 %$ F^& AV*(P2M LZSRUCE J H N KJ!@# 0 5 9  DISTRICT-WIDE CLASSES
P Q D GA F LMSYCERU NJ WK H  DISTRICT-WIDE SCHOOLS
D | P | S  PROFICIENCY STDS
15% 53% 32%  PROFICIENCY %ILES
*  LAST YEAR’S MEAN
*  PISA/TIMSS/ICILS
400 500 600 700 800  SAT EQUIVALENTS
|-----+-----+-----+-----+-----+-----+-----+-----+-----+-----|
360 400 440 480 520 560 600 640 680 720 760  MEASUREMENT SCALE
4 2 1 1 .5 .3 .5 1 2 4  UNCERTAINTY
T S M S T  MEAN, SD, 2 SD (T)
0 10 20 30 40 50 70 80 90 99  OVERALL STUDENT %ILE
Vertical Coherence
360 400 440 480 520 560 600 640 680 720 760  END OF SEMESTER
|-----+-----+-----+-----+-----+-----+-----+-----+-----+-----|  FIFTH GRADE
13 2  \ OVERALL DISTRICT
1 2142572791784251 1 1  \ STUDENT
1 1 1 3 22 4 62735893922385590827032906974 5821 1 1  > MEASURE
6756 7 3 7210 381110986347843441031745162392532192 6 89 4 46  / DISTRIBUTION
T S M S T  MEAN, SD, 2 SD (T)
0 10 20 30 40 50 70 80 90 99  PERCENTILE
XXXXXXX  STUDENT X WEEK ONE
XXXXXXX  STUDENT X WEEK THREE
XXXXXXX  STUDENT X WEEK FIVE
XXXXXXX  STUDENT X WEEK SEVEN
XXXXXXX  STUDENT X WEEK NINE
D | P | S  PROFICIENCY STDS
15% 53% 32%  PROFICIENCY %ILES
*  LAST YEAR’S MEAN
*  PISA/TIMSS/ICILS
400 500 600 700 800  SAT EQUIVALENTS
|-----+-----+-----+-----+-----+-----+-----+-----+-----+-----|
360 400 440 480 520 560 600 640 680 720 760  MEASUREMENT SCALE
0 10 20 30 40 50 70 80 90 99  OVERALL STUD %ILE AT GRADE LEVEL
Taking language as a model
- Provides guidelines to the complex and discontinuous
denotative, metalinguistic, and metacommunicative structures we need to connect number words with formal theories, abstract concepts, and concrete things in the world.
- Rasch measurement theory’s kidmaps, Wright maps, and
construct maps and specification equations provide the tools we need for productively integrating data, instruments, and theory in a new art and science.
- Rasch provides the “new forms of computability” that are
“both socially situated and abstract enough to travel across time and space,” as called for by Star and Ruhleder (1996).
Alliances and Translations for Coherence
(Adapted from Star & Griesemer, 1989, p. 390)
- Metacommunicative (theory): Construct map and specification equation
- Metalinguistic (instrument): Different Wright maps showing separate samples of students and items in same unit
- Denotative (data): Unique kidmaps
Golinski (2012, p. 35):
"Practices of translation, replication, and metrology have taken the place of the universality that used to be assumed as an attribute of singular science."
Linguistic Complexity in Research & Practice
Level of Complexity | Bottom-Up Research | Top-Down Practice | Visual Display (Users)
Metacommunicative | Construct specification equations and explanatory theory | Metrological traceability to metric system standard units | Construct Map (Theoreticians)
Metalinguistic | Invariance scaling models | Applied research innovations and quality improvement applications | Wright Map (Psychometricians)
Denotative | Qualitative observations | Contextualized information supporting caring arts and sciences | KidMap (Teachers)
An English language reading measurement network
- 100+ English language reading tests across the world measure in a
common unit.
- Over 30 million student measures in the U.S. annually are interpreted
relative to 250,000 book measures and 200 million article measures, where matching student and text measures predict a 75 percent comprehension rate.
- Books, articles, assessments, and students have been brought into a
common frame of reference in a process now almost 30 years old and still accelerating.
- Text complexity corresponds with reading learning progressions,
enabling individualized instruction.
Developmental Coherence
Fisher, W. P., Jr., & Stenner, A. J. (2016). Theory-based metrological traceability in education: A reading measurement network. Measurement, 92, 489-496.
Coherence in reading measurement
- Student measures are tracked over time and across grade levels,
instantiating developmental coherence.
- Teachers are able to compare learning outcomes within their own and
across each other’s classes, realizing horizontal coherence.
- State end-of-year or graduation tests report in the common unit,
providing parents, students, teachers, principals, librarians, researchers, and the public with the vertical coherence needed for connecting classroom formative assessments with accountability standards.
Philosophically…
- …we are taking up the problem of how to realize a full and non-contradictory integration of global human identity and unique human singularity.
- “…a social ethic cannot spring from a system but from a paradox. It
aims at two opposed things: human totality and human singularity. I want both.” (Ricoeur, 1974, p. 166)
Ricoeur, P. (1974). The project of a social ethic. In D. Stewart & J. Bien (Eds.), Political and social essays (pp. 160-175). Athens, Ohio: Ohio University Press.
Towards a social ethic
- A social ethic capable of integrating human totality and
human singularity:
- will emerge only from resolution of the paradox of inclusively
addressing the needs of humanity as a whole
- while also vigorously personalizing to the maximum relationships
that tend to become anonymous and inhuman in the wake of the quest for a shared human identity.
Thank you!
- William P. Fisher, Jr.
- University of California, Berkeley
- wfisher@berkeley.edu