Assessing Outcomes and Processes of Student Collaboration
Peter F. Halpin April 19, 2016
Joint work with: Alina von Davier, Yoav Bergner, Jiangang Hao, Lei Liu (ETS); Jacqueline Gutman (NYU)
Outline
Part 1: Wherefore assessments
◮ Set up the current perspective: performance assessments
◮ Selective review of research on small group productivity
◮ Combining psychometric models with research on small group
◮ Testing models against observed team performance
◮ Focus on chat data (for now!)
◮ Modeling engagement among collaborators using temporal
¹ Halpin, von Davier, Hao, & Liu (under review). Journal of Educational Measurement.
◮ Theme: traditional educational tests target a relatively narrow
upenn.app.box.com/8itemgrit
ability for educational purposes. Educational Researcher, 44(4), 237-251.

◮ e.g., IRT models don’t use process data
◮ NY opt-out movement: 20% of students (parents) boycotted
³ www.wnyc.org/story/new-york-city-students-make-modest-gains-state-tests-opt-out-numbers-triple/
⁴ Davey, Ferrara, Holland, Shavelson, Webb, & Wise (2015). Psychometric Considerations for the Next Generation of Performance Assessment. Princeton, NJ. p. 10

◮ The Jigsaw Classroom (Aronson et al., 1978; jigsaw.org)
◮ Group-worthy tasks (Cohen et al., 1999)
◮ CSCL (e.g., Hmelo-Silver et al., 2013)
◮ e.g., Webb, 1995; 2015

McGrath’s (1984) group task circumplex

⁵ Two models of group behavior in the solution of Eureka-type problems. Psychometrika, 1955, 20 (2), p. 141
⁶ Two models of group behavior in the solution of Eureka-type problems. Psychometrika, 1955, 20 (2), p. 141
⁷ On the reliability of group judgements and decisions. In Mathematical methods for small group processes (Eds. Criswell, Solomon, Suppes), p. 322
⁸ Towards a general model of group productivity. Psychological Bulletin, 86 (1), pp. 67-68

◮ Intellective tasks (vs decision tasks)
◮ Cooperative group interactions (vs competitive or
◮ Describing group outcomes via decision / functions that
◮ Letting probability of success vary over individuals (e.g., via
◮ Describing relevant task characteristics (e.g., via difficulty)
◮ The performance of individual groups rather than groups in
◮ The conjunctive rule
◮ The disjunctive rule
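A minimal sketch of how the two rules can be turned into team success probabilities. It assumes the two partners' responses are independent given their abilities, which is an illustrative assumption and not a formula taken from the slides. The function names `conjunctive` and `disjunctive` are hypothetical.

```python
def conjunctive(p_j: float, p_k: float) -> float:
    """Team succeeds only if both partners would succeed.

    Assumption (not from the slides): the partners' responses are independent.
    """
    return p_j * p_k


def disjunctive(p_j: float, p_k: float) -> float:
    """Team succeeds if at least one partner would succeed.

    Assumption (not from the slides): the partners' responses are independent.
    """
    return 1.0 - (1.0 - p_j) * (1.0 - p_k)


if __name__ == "__main__":
    p_j, p_k = 0.6, 0.5  # illustrative individual success probabilities
    print("conjunctive:", conjunctive(p_j, p_k))
    print("disjunctive:", disjunctive(p_j, p_k))
```

Under this assumption the conjunctive probability can never exceed the weaker partner's probability, and the disjunctive probability can never fall below the stronger partner's.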
◮ Under control of the test designer¹⁰
◮ Under control of the team
¹⁰ Maris & van der Maas (2012). Speed-accuracy response models: scoring rules based on response time and
◮ Assume a certain scoring rule
◮ Consider plausible models for team strategies
◮ Test the models against data
¹¹ Holland & Rosenbaum (1986). Conditional Association and Unidimensionality in Monotone Latent Variable
[Figure: plot with labels theta1, p, r]
¹² Using 2PL model for individual IRFs with α = 1 and β = 0
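The 2PL item response function in the footnote can be sketched as follows, with α = 1 and β = 0 as defaults. How the individual IRFs are combined into a team IRF is an assumption here: the product and complement-product forms below presume independent responses, and the names `irf_2pl` and `team_irf` are hypothetical.

```python
import math


def irf_2pl(theta: float, alpha: float = 1.0, beta: float = 0.0) -> float:
    """2PL item response function: P(correct | theta)."""
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))


def team_irf(theta_j: float, theta_k: float, rule: str) -> float:
    """Team success probability under an assumed independence model."""
    p_j, p_k = irf_2pl(theta_j), irf_2pl(theta_k)
    if rule == "conjunctive":
        return p_j * p_k
    if rule == "disjunctive":
        return 1.0 - (1.0 - p_j) * (1.0 - p_k)
    raise ValueError(f"unknown rule: {rule!r}")


if __name__ == "__main__":
    theta2 = 0.0  # partner's ability held fixed for illustration
    for theta1 in (-2.0, -1.0, 0.0, 1.0, 2.0):
        conj = team_irf(theta1, theta2, "conjunctive")
        disj = team_irf(theta1, theta2, "disjunctive")
        print(f"theta1={theta1:+.1f}  conjunctive={conj:.3f}  disjunctive={disj:.3f}")
```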
◮ Observed collaborative responses $x_{jk} = (x_{1jk}, x_{2jk}, \dots, x_{mjk})$
◮ A model for individual performance on the m (conventional)
¹³ Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica, 57(2), 307-333.
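The likelihood ratio test for non-nested hypotheses cited in footnote 13 compares two models through their pointwise log-likelihood differences. The sketch below computes the basic, uncorrected statistic and a two-sided normal p-value, assuming per-observation log-likelihoods are already available. The function name is hypothetical, and the procedure used in the talk may differ in its details.

```python
import math
from typing import Sequence, Tuple


def vuong_statistic(loglik_a: Sequence[float],
                    loglik_b: Sequence[float]) -> Tuple[float, float]:
    """Return (z, two-sided p) for the uncorrected non-nested LR statistic.

    loglik_a[i] and loglik_b[i] are the log-likelihood contributions of
    observation i under models A and B. Positive z favors model A.
    """
    if len(loglik_a) != len(loglik_b) or len(loglik_a) < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    n = len(loglik_a)
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / n
    if var_d == 0.0:
        raise ValueError("log-likelihood differences have zero variance")
    z = math.sqrt(n) * mean_d / math.sqrt(var_d)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p


if __name__ == "__main__":
    a = [-1.0, -1.2, -0.8, -1.1]  # illustrative values only
    b = [-1.5, -1.4, -1.6, -1.3]
    print(vuong_statistic(a, b))
```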
[Procedure; equations lost in extraction. Surviving fragment: for each $x^{(r)}_{jk}$, save $L_0(x^{(r)}_{jk} \mid \theta_{jk})$; model or $D^{(r)}$]
◮ Pool of pre-calibrated math items (grade 12 NAEP, modified
◮ Individual “pre-test” → estimate individual abilities
◮ Collaborative “post-test” → evaluate models, estimate $\delta_{jk}$
◮ Modality of collaboration: online chat

◮ Small calibration sample; crowd workers
◮ Individual and collaborative forms were not counterbalanced
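To illustrate the pre-test step, here is a sketch of maximum likelihood ability estimation with pre-calibrated items. It assumes a 2PL model with known item parameters and uses a plain grid search; the estimator actually used in the study is not stated in the slides, and the item parameters and names below are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def irf_2pl(theta: float, a: float, b: float) -> float:
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta: float, responses: Sequence[int],
                   items: Sequence[Tuple[float, float]]) -> float:
    """Log-likelihood of a 0/1 response pattern given theta."""
    total = 0.0
    for x, (a, b) in zip(responses, items):
        p = irf_2pl(theta, a, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total


def estimate_theta(responses: Sequence[int],
                   items: Sequence[Tuple[float, float]],
                   lo: float = -4.0, hi: float = 4.0,
                   step: float = 0.01) -> float:
    """Grid-search ML estimate of theta on [lo, hi].

    All-correct or all-incorrect patterns have no finite MLE, so the
    estimate then sits at the edge of the grid.
    """
    n_steps = int(round((hi - lo) / step))
    grid: List[float] = [lo + i * step for i in range(n_steps + 1)]
    return max(grid, key=lambda t: log_likelihood(t, responses, items))


if __name__ == "__main__":
    items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]  # hypothetical (a, b) pairs
    print(estimate_theta([1, 1, 0], items))
```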
[Figure: Individual Theta and Collaborative Theta]
[Figure: Individual Theta and Collaborative Theta; pairs 3, 4, 12, 15, 16, 35]
[Figure: Individual Theta and Collaborative Theta; pairs 7, 8, 9, 13, 29, 36]
[Figure: Individual Theta and Collaborative Theta; pairs 2, 5, 10, 14, 17, 19, 22, 24, 25, 26, 27, 28, 30, 38, 44, 45]
[Figure: Individual Theta and Collaborative Theta; pairs 6, 18, 20, 21, 23, 32, 33, 37, 39, 40, 41, 42, 43]
[Figure: Individual Theta and Collaborative Theta; pairs 1, 11, 31, 34]
¹⁴ Halpin & von Davier 2013, Hao, & Liu (under review). Journal of Educational Measurement.
◮ In ed tech context, typically associated with time-stamped user
◮ ATC21S collaborative problem solving prototype items¹⁵
◮ CPS frame¹⁶
¹⁵ http://www.atc21s.org/uploads/3/7/0/0/37007163/pd_module_3_nonadmin.pdf
¹⁶ In alpha at Computational Psychometrics lab at ETS
◮ e.g., Howley, Mayfield, & Rosé
◮ e.g., Barabási
◮ Contrast events with states, regimes
◮ “Instantaneous probability” of an event, denoted p(t)
¹⁷ Daley, D. J., & Vere-Jones, D. (2003). An introduction to the theory of point processes: Elementary theory and methods (2nd ed., Vol. 1). New York: Springer.
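One way to make the "instantaneous probability" concrete is a conditional intensity that depends on past events. The sketch below assumes a two-person self- and mutually-exciting (Hawkes-type) process with an exponential kernel. That specific form, the parameter names, and the example values are assumptions for illustration and are not taken from the slides.

```python
import math
from typing import Dict, List, Tuple


def intensity(t: float, target: str,
              events: Dict[str, List[float]],
              mu: Dict[str, float],
              alpha: Dict[Tuple[str, str], float],
              beta: float) -> float:
    """Conditional intensity of `target` at time t (events per unit time).

    mu[target]       : baseline rate of the target's actions
    alpha[(src, tgt)]: how strongly an action by src excites tgt
    beta             : decay rate of the exponential kernel
    Only events strictly before t contribute.
    """
    rate = mu[target]
    for source, times in events.items():
        for s in times:
            if s < t:
                rate += alpha[(source, target)] * beta * math.exp(-beta * (t - s))
    return rate


if __name__ == "__main__":
    events = {"A": [1.0], "B": []}  # student A sends one chat at t = 1
    mu = {"A": 0.1, "B": 0.1}
    alpha = {("A", "A"): 0.2, ("A", "B"): 0.5,
             ("B", "A"): 0.5, ("B", "B"): 0.2}
    for t in (0.5, 1.5, 2.0, 5.0):
        print(t, round(intensity(t, "B", events, mu, alpha, beta=1.0), 4))
```

In this parameterization the cross terms such as `alpha[("A", "B")]` describe how strongly one partner's actions raise the other's rate of acting, which is one possible reading of a response intensity.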
◮ How the probability of each person’s actions changes in
◮ How this depends on their previous actions
◮ Emergent or group-level phenomena like coordination,
◮ ¯
◮ $n_k$ is the number of actions of student k (observed)
◮ Lower bound is tight in practice; not necessary for

◮ Have an IP address located in the United States
◮ Self-identify as speaking English as their primary language
◮ Self-identify as having at least one year of college education
[Figure, four panels: "Engagement Index" (labels: Alpha, count); "Standard Error Against Number of Chats" (labels: Number of Chats of Partner, Standard Error; Method: Hessian, Lower Bound); "Relation with Partner's Index" (labels: Alpha, Alpha); "Relation with Number of Chats" (labels: Alpha, Difference in Number of Chats)]
Note: Alpha denotes the estimated response intensities from Equation 6. Hessian denotes standard errors obtained via the Hessian of the log-likelihood. See appendix of Halpin et al. for Lower Bound. Difference in Number of Chats was scaled using the log of the absolute value of the difference.
[Figure: Mean Engagement for Alpha, Partner Alpha, and Team Alpha; groups: No Revisions, Revisions]
Note: Comparison of mean levels of engagement indices for individuals who either did or did not revise at least one response after discussion with their partners. Alpha denotes the estimated response intensities from Equation 6; Partner’s Alpha denotes the partner’s response intensity; Team Alpha denotes the team-level index in Equation 7. For the latter, the data are reported for dyads, not individuals, and no revisions means that both individuals on the team made no revisions. Error bars are 95% confidence intervals on the means.
Note: Alpha denotes the estimated response intensities from Equation 6; Partner’s Alpha denotes the engagement index of the individual’s partner; Team Alpha denotes the team-level index in Equation 7. Hedges’ g used the correction factor described by Hedges (1981) and r denotes the point-biserial correlation.
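The two effect sizes named in the note can be sketched as follows. The code uses the exact gamma-function form of the Hedges (1981) small-sample correction and computes the point-biserial correlation as a Pearson correlation with a 0/1 group indicator. The data and function names are hypothetical, and whether the talk used the exact or the approximate correction is not stated.

```python
import math
from statistics import mean, variance
from typing import Sequence


def correction_factor(df: int) -> float:
    """Exact small-sample correction J(df); requires df >= 2."""
    return math.exp(math.lgamma(df / 2) - math.lgamma((df - 1) / 2)) / math.sqrt(df / 2)


def cohen_d(x: Sequence[float], y: Sequence[float]) -> float:
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)


def hedges_g(x: Sequence[float], y: Sequence[float]) -> float:
    """Cohen's d multiplied by the small-sample correction factor."""
    return correction_factor(len(x) + len(y) - 2) * cohen_d(x, y)


def point_biserial(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between the outcome and a 0/1 group indicator."""
    values = list(x) + list(y)
    labels = [1.0] * len(x) + [0.0] * len(y)
    mv, ml = mean(values), mean(labels)
    cov = sum((v - mv) * (g - ml) for v, g in zip(values, labels))
    sv = math.sqrt(sum((v - mv) ** 2 for v in values))
    sl = math.sqrt(sum((g - ml) ** 2 for g in labels))
    return cov / (sv * sl)


if __name__ == "__main__":
    revised = [0.40, 0.50, 0.60, 0.50, 0.45]      # made-up engagement indices
    not_revised = [0.30, 0.35, 0.40, 0.30, 0.25]  # made-up engagement indices
    print("g =", hedges_g(revised, not_revised))
    print("r =", point_biserial(revised, not_revised))
```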
◮ Random effects models for simultaneous estimation over multiple groups
◮ Inclusion of model parameters describing task characteristics
◮ Analytic expressions for standard errors of model parameters
◮ Methods for improving optimization with relatively small numbers of
◮ Integration with text-based analyses (e.g., using marks / time-varying
Aronson, E., Blaney, N., Stephan, C., Sikes, J., & Snapp, M. (1978). The jigsaw classroom. Beverly Hills, CA: Sage.
Burrus, J., Carlson, J., Bridgeman, B., Golub-Smith, M., & Greenwood, R. (2013). Identifying the Most Important 21st Century Workforce Competencies: An Analysis of the Occupational Information Network (O*NET) (ETS RR-13-21). Princeton, NJ.
Cohen, E. G., Lotan, R. A., Scarloss, B. A., & Arellano, A. R. (1999). Complex instruction: Equity in cooperative learning classrooms. Theory Into Practice, 38, 80-86.
Davey, T., Ferrara, S., Holland, P. W., Shavelson, R. J., Webb, N. M., & Wise, L. L. (2015). Psychometric Considerations for the Next Generation of Performance Assessment. Princeton, NJ.
Deming, D. J. (2015). The Growing Importance of Social Skills in the Labor Market. National Bureau of Economic Research Working Paper Series, (21473).
Griffin, P., & Care, E. (2015). Assessment and teaching of 21st century skills: Methods and approach. New York: Springer.
Hmelo-Silver, C. E., Chinn, C. A., Chan, C. K., & O'Donnell, A. M. (2013). International handbook of collaborative learning. New York: Taylor and Francis.
McGrath, J. E. (1984). Groups: Interaction and performance. Englewood Cliffs, NJ: Prentice-Hall.
Organisation for Economic Co-operation and Development. (2013). PISA 2015 Draft Collaborative Problem Solving Framework. Retrieved from http://www.oecd.org/pisa/pisaproducts/DraftPISA2015CollaborativeProblemSolvingFramework.pdf
Webb, N. M. (1995). Group Collaboration in Assessment: Multiple Objectives, Processes, and Outcomes. Educational Evaluation and Policy Analysis, 17(2), 239-261.