
Selection of Linking Items

Select the subset of items that maximally reflects the scale information function, using a linear programming solver (lp_solve 5.5, in R).


  1. Selection of Linking Items • Subset of items that maximally reflect the scale information function – Denote the scale information as [symbol not recoverable] – Linear programming solver (in R, lp_solve 5.5) • $\min(y)$ • Subject to – a constraint on a sum over items, evaluated for all θs [remaining terms not recoverable], where $\theta \in \{-4, -3.95, \ldots, 3.95, 4\}$ – a second constraint on a sum over items [terms not recoverable] – a binary (0/1) selection variable for every item – $y \ge 0$
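The min(y) program on this slide can be illustrated with a small sketch. It is not the lp_solve model from the slides: it enumerates subsets by brute force instead of solving a linear program, uses hypothetical 2PL item parameters, and assumes that the constraint bounds the absolute gap between the subset's summed information and a target set to the full-scale information scaled by the share of items kept. All function names are made up for illustration.

```python
from itertools import combinations
from math import exp


def item_information(a, b, theta):
    """Fisher information of a 2PL item (discrimination a, difficulty b) at theta."""
    p = 1.0 / (1.0 + exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_linking_items(items, n_select, grid):
    """Return (max_gap, chosen_indices) minimizing the largest gap to the target.

    Assumption: target(theta) = scale information * n_select / bank size.
    """
    info = [[item_information(a, b, t) for t in grid] for a, b in items]
    scale = [sum(column) for column in zip(*info)]
    target = [s * n_select / len(items) for s in scale]
    best = None
    for subset in combinations(range(len(items)), n_select):
        # y in the min(y) program: the worst absolute deviation over the grid
        gap = max(
            abs(sum(info[i][k] for i in subset) - target[k])
            for k in range(len(grid))
        )
        if best is None or gap < best[0]:
            best = (gap, subset)
    return best


# Theta grid from the slide: -4, -3.95, ..., 3.95, 4
grid = [-4 + 0.05 * k for k in range(161)]

# Hypothetical (a, b) parameters for an 8-item scale
items = [(1.2, -1.5), (0.8, -0.5), (1.5, 0.0), (1.0, 0.4),
         (1.7, 1.0), (0.9, 1.8), (1.3, -0.8), (1.1, 0.9)]

max_gap, chosen = select_linking_items(items, 4, grid)
print(chosen, round(max_gap, 4))
```

Enumeration is only practical for small banks; an integer programming solver is the natural choice once the bank grows.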

  2. An example: Subscale 2 • Sum of information functions for 6-, 7-, and 8-item linking sets

  3. An example: Subscale 3

  4. Why is Fisher information useful? • In multidimensional CAT – The volume of the confidence ellipsoid around $\hat{\theta}$ is determined by the determinant of the inverse Fisher information matrix (Anderson, 1984) – Maximize the determinant of the Fisher information matrix (Segall, 1996; Wang & Chang, 2011): the D-optimal method

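  5. Fisher information vs. confidence ellipse • $I(\hat{\theta}) = \begin{pmatrix} 15 & 0 \\ 0 & 10 \end{pmatrix}$, $I^{-1}(\hat{\theta}) = \Sigma = \begin{pmatrix} 0.067 & 0 \\ 0 & 0.1 \end{pmatrix}$ (Wang, et al., 2013)

  6. Fisher information vs. confidence ellipse • $I(\hat{\theta}) = \begin{pmatrix} 50 & 0 \\ 0 & 25 \end{pmatrix}$, $I^{-1}(\hat{\theta}) = \Sigma = \begin{pmatrix} 0.02 & 0 \\ 0 & 0.04 \end{pmatrix}$ (Wang, et al., 2013)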
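As a worked check of the two slides above, the sketch below reads the information matrices as diagonal with entries (15, 10) and (50, 25) and computes the determinant, the implied variances, and the area of a 95% confidence ellipse. The area formula $\pi c^2 \sqrt{\det \Sigma}$ with $c^2 = 5.991$ (the 95% chi-square quantile with 2 degrees of freedom) is a standard result, not something stated on the slides, and the function name is illustrative.

```python
from math import pi, sqrt

CHI2_95_DF2 = 5.991  # 95% quantile of the chi-square distribution, 2 df


def summarize(info_diag):
    """Determinant, variances, and 95% ellipse area for a diagonal 2x2 information matrix."""
    det_info = info_diag[0] * info_diag[1]
    variances = [1.0 / v for v in info_diag]      # diagonal of the inverse
    area = pi * CHI2_95_DF2 / sqrt(det_info)      # pi * c^2 * sqrt(det(Sigma))
    return det_info, variances, area


det_a, var_a, area_a = summarize([15.0, 10.0])
det_b, var_b, area_b = summarize([50.0, 25.0])

print(det_a, [round(v, 3) for v in var_a], round(area_a, 3))
print(det_b, [round(v, 3) for v in var_b], round(area_b, 3))
```

The larger determinant in the second case goes with the smaller ellipse, which is the reason for maximizing the determinant during item selection.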

  7. Mini-max mechanism • [criterion formula not recoverable] – Assuming there are three dimensions, then [determinant expansion not recoverable] • This criterion tends to pick the items that minimize the variance of the estimator lagging behind most
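The closing claim of this slide can be checked numerically in a simplified setting. The sketch below assumes a simple-structure item that adds information to a single dimension $k$ only, in which case the matrix determinant lemma gives $\det(I + \delta\, e_k e_k^\top) = \det(I)\,\bigl(1 + \delta\,[I^{-1}]_{kk}\bigr)$. The accumulated information matrix and the item information value are hypothetical, and this is not the expansion shown on the slide.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def variance_of(m, k):
    """k-th diagonal entry of the inverse of m (principal minor over determinant)."""
    r0, r1 = [r for r in range(3) if r != k]
    minor = m[r0][r0] * m[r1][r1] - m[r0][r1] * m[r1][r0]
    return minor / det3(m)


def det_after_item(info, k, item_info):
    """Determinant after adding an item that informs dimension k only."""
    updated = [row[:] for row in info]
    updated[k][k] += item_info
    return det3(updated)


# Hypothetical accumulated information: the third dimension lags behind.
info = [[12.0, 2.0, 1.0],
        [2.0, 10.0, 1.5],
        [1.0, 1.5, 3.0]]

variances = [variance_of(info, k) for k in range(3)]
gains = [det_after_item(info, k, 1.0) for k in range(3)]
best_dimension = max(range(3), key=lambda k: gains[k])

print([round(v, 4) for v in variances])
print(gains, best_dimension)
```

With equal item information, the determinant grows most when the item targets the dimension whose estimator currently has the largest variance.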

  8. Item bank information

  9. Domain/Content balancing • Constraint-weighted D-optimal (Wang et al., 2017) – Suppose that for each domain k = 1, ..., D the maximum and minimum numbers of items are set in advance – Track the number of items belonging to domain k administered so far, the current test length n, and the maximum test length – An indicator records whether item j belongs to domain k – The D-optimal criterion is weighted by a priority index (Cheng, et al., 2009) [index and weight formulas not recoverable]

  10. A simulation study • Sample size N = 2,000 • Multivariate normal, with mean of 0's, and covariance matrix Σ [matrix values not recoverable] • Maximum a posteriori (MAP) estimation is used, and the prior is multivariate normal with mean of 0's and [covariance not recoverable] • Evaluation criterion: root mean squared error (RMSE), $\mathrm{RMSE}(\hat{\theta}_1) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{\theta}_{i1} - \theta_{i1}\right)^2}$
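The evaluation criterion can be computed per domain as in the sketch below. The function name and the toy values are illustrative only and are not results from the study.

```python
from math import sqrt


def rmse_per_domain(estimates, truths):
    """RMSE of the estimates for each domain, taken over all simulees."""
    n = len(truths)
    dims = len(truths[0])
    return [
        sqrt(sum((est[d] - tru[d]) ** 2 for est, tru in zip(estimates, truths)) / n)
        for d in range(dims)
    ]


# Toy data: two simulees, two domains
truths = [[0.0, 0.0], [0.0, 0.0]]
estimates = [[0.1, -0.2], [-0.1, 0.2]]

print(rmse_per_domain(estimates, truths))
```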

  11. Results: Domain-level recovery • D-optimal (-) vs. random selection (---)

  12. Results: Domain-level recovery • D-optimal (-) vs. constraint-weighted D-optimal (--)

  13. Results: Domain-level recovery • D-optimal (-) vs. constraint-weighted D-optimal (--)

  14. Reducing Test Length

  15. [Figure: confidence interval of θ against test length, θ = (0, 0, 0)]

  16. [Figure: confidence interval of θ against test length, θ = (2, 2, 2)]

  17. Variable-length CAT: Stopping rule • Start • 300+ items

  18. Stopping rule • Start • 300+ items • Stop when the measurement precision criterion is satisfied (Dodd, Koch & De Ayala, 1993; Boyd, Dodd, & Choi, 2010)

  19. Stopping rule • Start • 300+ items • (a) Volume of the confidence ellipsoid (D-rule) • (b) Sum of S.E. per domain θ • (c) Maximum axis of the confidence ellipsoid • (d) Kullback-Leibler divergence between two consecutive posteriors (Wang et al., 2013)

  20. Cumulated information growth • [Figure: determinant of the Fisher information matrix against test length]

  21. Stopping rule • Start • 300+ items

  22. Stopping rule • Start • 300+ items

  23. Stopping rule • Start • 300+ items • When $\hat{\theta}$ does not change much: theta-convergence rule (T-rule), [change in $\hat{\theta}$; exact expression not recoverable] < 0.01 (Babcock & Weiss, 2012; Wang et al., 2017+)

  24. Why is the T-rule secondary? • 2PL – [quantity not recoverable] is in the interval of ([bounds not recoverable]) (Chang & Ying, 2008)

  25. Why is the T-rule secondary? • 2PL – [quantity not recoverable] is in the interval of ([bounds not recoverable]) – It does not monotonically decrease when test length increases! • Terminates the test prematurely (Wang et al., 2017+)

  26. Why is the T-rule secondary? • 2PL – [quantity not recoverable] is in the interval of ([bounds not recoverable]) – Undermines test efficiency • Usually, SE($\hat{\theta}$) < .2 (Dodd, et al., 1993) [accompanying expression with the value 25 not recoverable] • If hypothetically [parameter not recoverable] = 1, then satisfying [T-rule expression] < .01 [consequence with the value 50 not recoverable] (Wang et al., 2017+)

  27. MGRM • Simple structure [formulas not recoverable] (Wang et al., 2017+)

  28. MGRM • Simple structure [formulas not recoverable] (Wang et al., 2017+)

  29. MGRM • Complex structure – If item j measures the p-th trait [formula not recoverable] (Wang et al., 2017+)

  30. MGRM • Complex structure – If item j measures the p-th trait [formula not recoverable; annotations: the p-th element of a vector, and the amount of information carried by item j] (Wang et al., 2017+)

  31. MGRM • Complex structure – If item j measures the p-th trait [formula not recoverable] (Wang et al., 2017+)

  32. MGRM • Complex structure – If item j measures the p-th trait [formula not recoverable] – If item j measures multiple traits [formula not recoverable] (Wang et al., 2017+)

  33. Primary vs. Secondary stopping rules • Start (300+ items) • Minimum test length (Babcock & Weiss, 2012; Wang et al., 2017+)

  34. Primary vs. Secondary stopping rules • Start (300+ items) • Minimum test length • Is the D-rule satisfied? (Wang et al., 2017+)

  35. Primary vs. Secondary stopping rules • Start (300+ items) • Minimum test length • Is the D-rule satisfied? (No / Yes) • Is the T-rule satisfied? (Wang et al., 2017+)

  36. Primary vs. Secondary stopping rules • Start (300+ items) • Minimum test length • Is the D-rule satisfied? (No / Yes) • Is the T-rule satisfied? (No / Yes) • Continue (Wang et al., 2017+)

  37. Primary vs. Secondary stopping rules • Start (300+ items) • Minimum test length • Maximum test length • Is the D-rule satisfied? (No / Yes) • Is the T-rule satisfied? (No / Yes) • Continue • Values shown on the chart: 94.9%, 5.1%; 28.5, 61.5 (Wang et al., 2017+)
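One possible reading of this flowchart is sketched below: no stopping before the minimum test length, the D-rule checked first, the T-rule checked only when the D-rule fails, and a forced stop at the maximum test length. The function name, the length limits, and the determinant threshold are hypothetical; only the 0.01 T-rule cutoff appears in the slides, and the D-rule is assumed here to be a lower bound on the determinant of the information matrix.

```python
def stopping_decision(n_items, det_info, theta_change,
                      min_length=10, max_length=100,
                      det_threshold=500.0, change_threshold=0.01):
    """Return the reason to stop, or None to administer another item.

    min_length, max_length, and det_threshold are hypothetical values.
    """
    if n_items >= max_length:
        return "max length"
    if n_items < min_length:
        return None
    if det_info >= det_threshold:          # primary rule: D-rule
        return "D-rule"
    if theta_change < change_threshold:    # secondary rule: T-rule
        return "T-rule"
    return None


print(stopping_decision(30, 600.0, 0.5))    # D-rule met
print(stopping_decision(30, 100.0, 0.005))  # only the T-rule met
print(stopping_decision(30, 100.0, 0.5))    # neither met, continue
```

Checking the D-rule first keeps the precision criterion primary, so the T-rule only ends tests that have stopped making progress.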

  38. Stopping rule results • [Figure: SE against θ for Applied Cognition, Daily Activity, and Mobility]

  39. 3D plot

  40. Stopping rule Cont. • Test length: mean 28.5, SD 13.3 • Overall precision: bias 0.005, RMSE 0.303, determinant 514.7 • Primary stop: actual 0.949, eventual 0.965 (1.6%)

  41. Stopping rule Cont. • Test length: mean 28.5, SD 13.3 • Overall precision: bias 0.005, RMSE 0.303, determinant 514.7 • Primary stop: actual 0.949, eventual 0.965 (1.6%) • N = 31: test length at stop 58.7 (SD 15.3), at end 72.2 (SD 15.5); bias 0.162 at stop, 0.136 at end; RMSE 0.430 at stop, 0.391 at end • N = 71: test length at stop 64.5 (SD 13.0), at end 120 (SD 0); bias 0.207 at stop, 0.204 at end; RMSE 0.592 at stop, 0.525 at end

  42. Outline • Brief introduction to computerized adaptive testing (CAT) • Multidimensional CAT • “Computerized Adaptive Testing to Direct Delivery of Hospital-Based Rehabilitation” (NIH R01HD079439, 2015-2020) – Item bank calibration – Item selection – Stopping rules • Ongoing projects
