Bayesian Networks in Educational Testing (Jiří Vomlel)

Bayesian Networks in Educational Testing
Jiří Vomlel
Laboratory for Intelligent Systems, Prague University of Economics
This presentation is available at: http://www.utia.cas.cz/vomlel/slides/lisp2002.pdf

Contents:
• Educational testing is a “big business”.
• What is a fixed test and an adaptive test?
• An example: a test of basic operations with fractions.
• Optimal and myopically optimal tests.
• Construction of a myopically optimal fixed test.
• Results of experiments.
• An example showing that modeling dependence between skills is important.
• Conclusions.

Educational Testing Service (ETS)
• Educational Testing Service is the world’s largest private educational testing organization, with 2,300 regular employees.
• Volumes for ETS’s largest exams in 2000-2001:
  3,185,000  SAT I Reasoning Test and SAT II: Subject Area Tests (the SAT is the standard college admission test in the US)
  2,293,000  PSAT: Preliminary SAT/National Merit Scholarship Qualifying Test
  1,421,000  AP: Advanced Placement Program
    801,000  The Praxis Series: Professional Assessments for Beginning Teachers and Pre-Professional Skills Tests
    787,000  TOEFL: Test of English as a Foreign Language
    449,000  GRE: Graduate Record Examinations General Test
  etc.

Fixed Test vs. Adaptive Test
[Figure: a fixed test asks questions Q1 to Q10 in one predetermined sequence; an adaptive test is a tree in which the next question depends on whether the previous answer was wrong or correct.]

Computerized Adaptive Testing (CAT)
Objective: an optimal test for each examinee.
Two basic steps:
(1) the examinee’s knowledge level is estimated,
(2) questions appropriate for the level are selected.
R. Almond and R. Mislevy from ETS proposed to use graphical models in CAT:
• one student model (relations between skills, abilities, etc.),
• several evidence models, one for each task or question.

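CAT for basic operations with fractions
Examples of tasks:
$T_1$: $\left(\frac{3}{4} \cdot \frac{5}{6}\right) - \frac{1}{8} = \frac{15}{24} - \frac{1}{8} = \frac{5}{8} - \frac{1}{8} = \frac{4}{8} = \frac{1}{2}$
$T_2$: $\frac{1}{6} + \frac{1}{12} = \frac{2}{12} + \frac{1}{12} = \frac{3}{12} = \frac{1}{4}$
$T_3$: $\frac{1}{4} \cdot 1\frac{1}{2} = \frac{1}{4} \cdot \frac{3}{2} = \frac{3}{8}$
$T_4$: $\left(\frac{1}{2} \cdot \frac{1}{2}\right) \cdot \left(\frac{1}{3} + \frac{1}{3}\right) = \frac{1}{4} \cdot \frac{2}{3} = \frac{2}{12} = \frac{1}{6}$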

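Elementary and operational skills
Label  Description  Example
CP   Comparison (common numerator or denominator)   $\frac{1}{2} > \frac{1}{3}$, $\frac{2}{3} > \frac{1}{3}$
AD   Addition (comm. denom.)   $\frac{1}{7} + \frac{2}{7} = \frac{1+2}{7} = \frac{3}{7}$
SB   Subtract. (comm. denom.)   $\frac{2}{5} - \frac{1}{5} = \frac{2-1}{5} = \frac{1}{5}$
MT   Multiplication   $\frac{1}{2} \cdot \frac{3}{5} = \frac{3}{10}$
CD   Common denominator   $\left(\frac{1}{2}, \frac{2}{3}\right) = \left(\frac{3}{6}, \frac{4}{6}\right)$
CL   Cancelling out   $\frac{4}{6} = \frac{2 \cdot 2}{2 \cdot 3} = \frac{2}{3}$
CIM  Conv. to mixed numbers   $\frac{7}{2} = \frac{3 \cdot 2 + 1}{2} = 3\frac{1}{2}$
CMI  Conv. to improp. fractions   $3\frac{1}{2} = \frac{3 \cdot 2 + 1}{2} = \frac{7}{2}$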

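Misconceptions
Label  Description  Occurrence
MAD    $\frac{a}{b} + \frac{c}{d} = \frac{a+c}{b+d}$   14.8%
MSB    $\frac{a}{b} - \frac{c}{d} = \frac{a-c}{b-d}$   9.4%
MMT1   $\frac{a}{b} \cdot \frac{c}{b} = \frac{a \cdot c}{b}$   14.1%
MMT2   $\frac{a}{b} \cdot \frac{c}{b} = \frac{a+c}{b \cdot b}$   8.1%
MMT3   $\frac{a}{b} \cdot \frac{c}{d} = \frac{a \cdot d}{b \cdot c}$   15.4%
MMT4   $\frac{a}{b} \cdot \frac{c}{d} = \frac{a \cdot c}{b+d}$   8.1%
MC     $a\frac{b}{c} = \frac{a \cdot b}{c}$   4.0%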

Process that led to the student model
• decision on what skills will be tested, preparation of paper tests,
• paper tests given to students at Brønderslev high school; 149 students did the test,
• analysis of results, finding misconceptions, summarizing results into a data file,
• learning a Bayesian network model using the PC-algorithm and the EM-algorithm,
• attempts to explain some relations between skills and misconceptions using hidden variables,
• a new learning phase with hidden variables included and certain edges required to be part of the learned model.

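Student model
[Figure: Bayesian network with hidden variables HV1, HV2; nodes ACMI, ACIM, ACL, ACD; skills AD, SB, CMI, CIM, CL, CD, MT, CP; and misconceptions MAD, MSB, MC, MMT1, MMT2, MMT3, MMT4.]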

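Evidence model for task $T_1$
$\left(\frac{3}{4} \cdot \frac{5}{6}\right) - \frac{1}{8} = \frac{15}{24} - \frac{1}{8} = \frac{5}{8} - \frac{1}{8} = \frac{4}{8} = \frac{1}{2}$
T1 ⇔ MT & CL & ACL & SB & ¬MMT3 & ¬MMT4 & ¬MSB
[Figure: nodes CL, ACL, MT, SB, MMT3, MMT4 and MSB point to T1; the observed answer X1 depends on T1 through $P(X_1 \mid T_1)$.]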
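The evidence model for T1 can be sketched in Python as a deterministic conjunction followed by a noisy observation. The slide gives only the logical condition for T1; the values in `P_CORRECT_GIVEN_T1` are hypothetical placeholders, not numbers from the study, and the function names are illustrative.

```python
# Skills required and misconceptions that must be absent for task T1,
# taken from the logical condition on the slide.
REQUIRED_SKILLS = ("MT", "CL", "ACL", "SB")
FORBIDDEN_MISCONCEPTIONS = ("MMT3", "MMT4", "MSB")

# Hypothetical values of P(X1 = correct | T1); the slides do not give them.
P_CORRECT_GIVEN_T1 = {True: 0.9, False: 0.05}


def t1_state(student):
    """Return True iff the student state satisfies the condition for T1."""
    has_skills = all(student[s] for s in REQUIRED_SKILLS)
    no_misconceptions = not any(student[m] for m in FORBIDDEN_MISCONCEPTIONS)
    return has_skills and no_misconceptions


def p_x1_correct(student):
    """P(X1 = correct | student state), routed through the T1 node."""
    return P_CORRECT_GIVEN_T1[t1_state(student)]


able = {"MT": True, "CL": True, "ACL": True, "SB": True,
        "MMT3": False, "MMT4": False, "MSB": False}
confused = dict(able, MSB=True)  # same skills, but holds misconception MSB

print(t1_state(able), p_x1_correct(able))
print(t1_state(confused), p_x1_correct(confused))
```

Separating the deterministic node T1 from the observed answer X1 keeps the logical condition readable and confines all uncertainty to the single table P(X1 | T1).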

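Student + Evidence model
[Figure: the student model network (HV1, HV2, ACMI, ACIM, ACL, ACD, AD, SB, CMI, CIM, CL, CD, MT, CP, MAD, MSB, MC, MMT1, MMT2, MMT3, MMT4) extended with the evidence model nodes T1 and X1.]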

Example of an adaptive test
[Figure: a test tree over three questions, $X_1$: $\frac{1}{5} < \frac{2}{5}$?, $X_2$: $\frac{1}{5} < \frac{1}{4}$?, $X_3$: $\frac{1}{4} < \frac{2}{5}$?, with branches labeled yes and no.]
Entropy of a probability distribution $P(S_i)$:
$$H(P(S_i)) = - \sum_{s_i \in S_i} P(S_i = s_i) \cdot \log P(S_i = s_i)$$
Total entropy in a node $n$:
$$H(e_n) = \sum_{S_i \in \mathcal{S}} H(P(S_i \mid e_n))$$
Expected entropy at the end of a test $t$:
$$EH(t) = \sum_{\ell \in \mathcal{L}(t)} P(e_\ell) \cdot H(e_\ell)$$
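The three entropy definitions above translate almost directly into code. The following is a minimal sketch; the natural logarithm is an assumption (the slide does not fix the base), and the two-leaf test with its probabilities is made up for illustration.

```python
import math


def entropy(dist):
    """Shannon entropy of one marginal P(S_i), given as {state: probability}."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def total_entropy(marginals):
    """H(e_n): sum of the entropies of the skill marginals P(S_i | e_n)."""
    return sum(entropy(dist) for dist in marginals.values())


def expected_entropy(leaves):
    """EH(t): sum over the leaves of P(e_l) * H(e_l).

    `leaves` is a list of (P(e_l), marginals given e_l) pairs.
    """
    return sum(p * total_entropy(marginals) for p, marginals in leaves)


# Hypothetical two-leaf test over two binary skills (illustrative numbers).
leaves = [
    (0.6, {"S1": {0: 0.1, 1: 0.9}, "S2": {0: 0.5, 1: 0.5}}),
    (0.4, {"S1": {0: 1.0, 1: 0.0}, "S2": {0: 0.8, 1: 0.2}}),
]
print(expected_entropy(leaves))
```

Note that the total entropy sums the entropies of the individual skill marginals rather than taking the entropy of the joint distribution, which follows the definition of $H(e_n)$ given above.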

Let $\mathcal{T}$ be the set of all possible tests. A test $t^\star$ is optimal iff
$$t^\star = \arg\min_{t \in \mathcal{T}} EH(t).$$
A myopically optimal test $t$ is a test where each question $X^\star$ of $t$ minimizes the expected value of entropy after the question is answered:
$$X^\star = \arg\min_{X \in \mathcal{X}} EH(t \downarrow X),$$
i.e. it works as if the test finished after the selected question $X^\star$.
[Figure: a tree of possible tests over questions X1, X2, X3 with one selected test highlighted.]
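Myopic selection can be illustrated on a tiny model evaluated by brute-force enumeration. This is only a sketch: the two-skill model, its probabilities, and all function names are hypothetical and are not the fractions network from the slides.

```python
import itertools
import math

# Hypothetical toy model: two independent binary skills with uniform priors
# and three questions with made-up P(X = 1 | s1, s2).
N_SKILLS = 2
P_CORRECT = {
    "X1": lambda s1, s2: 0.95 if s1 else 0.10,
    "X2": lambda s1, s2: 0.80 if s2 else 0.30,
    "X3": lambda s1, s2: 0.50,  # uninformative question
}


def posterior(evidence):
    """Return P(evidence) and the posterior {state: P(state | evidence)}."""
    weights = {}
    for state in itertools.product((0, 1), repeat=N_SKILLS):
        w = 0.5 ** N_SKILLS
        for q, answer in evidence.items():
            p1 = P_CORRECT[q](*state)
            w *= p1 if answer == 1 else 1.0 - p1
        weights[state] = w
    z = sum(weights.values())
    return z, {s: w / z for s, w in weights.items()}


def total_entropy(evidence):
    """H(e): sum of the entropies of the skill marginals given e."""
    _, post = posterior(evidence)
    h = 0.0
    for i in range(N_SKILLS):
        for value in (0, 1):
            p = sum(w for s, w in post.items() if s[i] == value)
            if p > 0:
                h -= p * math.log(p)
    return h


def expected_entropy_after(question, evidence):
    """Expected entropy if the test stopped right after `question`."""
    p_e, _ = posterior(evidence)
    eh = 0.0
    for answer in (0, 1):
        extended = {**evidence, question: answer}
        p_ext, _ = posterior(extended)
        eh += (p_ext / p_e) * total_entropy(extended)
    return eh


def myopic_choice(evidence):
    """X*: the unasked question that minimizes the expected entropy."""
    candidates = [q for q in P_CORRECT if q not in evidence]
    return min(candidates, key=lambda q: expected_entropy_after(q, evidence))


print(myopic_choice({}))          # first question
print(myopic_choice({"X1": 1}))   # next question after a correct answer to X1
```

Enumerating all skill states is feasible only for toy models; a real student model would rely on Bayesian network inference to obtain the same marginals.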

Myopic construction of a fixed test

    e_list := [∅];
    test := [];
    for i := 1 to |X| do counts[i] := 0;
    for position := 1 to test_length do
        new_e_list := [];
        for all e ∈ e_list do
            i := most_informative_X(e);
            counts[i] := counts[i] + P(e);
            for all x_i ∈ X_i do
                append(new_e_list, {e ∪ {X_i = x_i}});
        e_list := new_e_list;
        i⋆ := arg max_i counts[i];
        append(test, X_i⋆);
        counts[i⋆] := 0;
    return(test);

[Figure: example step with e_list = {{X2 = 0}, {X2 = 1}}, counts[3] = P(X2 = 0) = 0.7 and counts[1] = P(X2 = 1) = 0.3.]
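The construction above can be sketched in runnable form on a toy model. Everything model-specific here is hypothetical: the two-skill model, its probabilities, and the helper names. One further assumption goes beyond the pseudocode: questions already placed in the test are excluded from later selection, so that the fixed test never repeats a question.

```python
import itertools
import math

# Hypothetical toy model: two independent binary skills with uniform priors
# and three questions with made-up P(X = 1 | s1, s2).
N_SKILLS = 2
P_CORRECT = {
    "X1": lambda s1, s2: 0.95 if s1 else 0.10,
    "X2": lambda s1, s2: 0.80 if s2 else 0.30,
    "X3": lambda s1, s2: 0.90 if (s1 and s2) else 0.20,
}


def joint_weights(evidence):
    """P(state, evidence) for every joint skill state."""
    weights = {}
    for state in itertools.product((0, 1), repeat=N_SKILLS):
        w = 0.5 ** N_SKILLS
        for q, answer in evidence.items():
            p1 = P_CORRECT[q](*state)
            w *= p1 if answer == 1 else 1.0 - p1
        weights[state] = w
    return weights


def prob(evidence):
    """P(e), obtained by summing out the skills."""
    return sum(joint_weights(evidence).values())


def total_entropy(evidence):
    """Sum of the entropies of the skill marginals P(S_i | e)."""
    weights = joint_weights(evidence)
    z = sum(weights.values())
    h = 0.0
    for i in range(N_SKILLS):
        for value in (0, 1):
            p = sum(w for s, w in weights.items() if s[i] == value) / z
            if p > 0:
                h -= p * math.log(p)
    return h


def most_informative(evidence, excluded):
    """Myopically best question for `evidence`, or None if none is left."""
    candidates = [q for q in P_CORRECT
                  if q not in evidence and q not in excluded]
    if not candidates:
        return None

    def expected_entropy(q):
        return sum(prob({**evidence, q: a}) / prob(evidence)
                   * total_entropy({**evidence, q: a}) for a in (0, 1))

    return min(candidates, key=expected_entropy)


def build_fixed_test(test_length):
    """Myopic construction of a fixed test, following the pseudocode."""
    if test_length > len(P_CORRECT):
        raise ValueError("test_length exceeds the number of questions")
    e_list = [{}]                         # start from the empty evidence
    test = []
    counts = {q: 0.0 for q in P_CORRECT}
    for _ in range(test_length):
        new_e_list = []
        for e in e_list:
            q = most_informative(e, excluded=test)
            if q is None:                 # nothing left to ask on this branch
                new_e_list.append(e)
                continue
            counts[q] += prob(e)          # vote weighted by P(e)
            new_e_list.extend({**e, q: a} for a in (0, 1))
        e_list = new_e_list
        remaining = [q for q in counts if q not in test]
        best = max(remaining, key=counts.get)
        test.append(best)
        counts[best] = 0.0
    return test


print(build_fixed_test(3))
```

Each evidence branch votes for its own best next question with weight P(e), and the question with the largest accumulated weight takes the next position of the fixed test. The evidence list doubles at every position, so this direct form is practical only for short tests.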

Skill Prediction Quality
[Figure: quality of skill predictions (vertical axis, about 74 to 92) versus number of answered questions (horizontal axis, 0 to 20) for four tests: adaptive, average, descending, ascending.]

Total entropy of probability of skills
[Figure: entropy on skills (vertical axis, about 4 to 12) versus number of answered questions (horizontal axis, 0 to 20) for four tests: adaptive, average, descending, ascending.]

Question Prediction Quality
[Figure: quality of question predictions (vertical axis, about 70 to 100) versus number of answered questions (horizontal axis, 0 to 18) for four tests: adaptive, average, descending, ascending.]

An example of a simple diagnostic task
Diagnosis of the absence or the presence of three skills $S_1, S_2, S_3$ by use of a bank of three questions $X_{1,2}, X_{1,3}, X_{2,3}$ such that
$$P(X_{i,j} = 1 \mid S_i = s_i, S_j = s_j) = \begin{cases} 1 & \text{if } (s_i, s_j) = (1, 1) \\ 0 & \text{otherwise.} \end{cases}$$
Assume answers to all questions from the item bank are wrong, i.e. $X_{1,2} = 0$, $X_{1,3} = 0$, $X_{2,3} = 0$.

Reasoning assuming skill independence
All skills are independent,
$$P(S_1, S_2, S_3) = P(S_1) \cdot P(S_2) \cdot P(S_3),$$
and $P(S_i)$, $i = 1, 2, 3$, are uniform. Then the probabilities for $j = 1, 2, 3$ are
$$P(S_j = 0 \mid X_{1,2} = 0, X_{1,3} = 0, X_{2,3} = 0) = 0.75,$$
i.e. we cannot decide which skills are present and which are missing.
[Figure: network in which each question $X_{i,j}$ depends on the skills $S_i$ and $S_j$.]
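The value 0.75 can be checked by enumerating the eight joint skill states. The sketch below assumes the deterministic question model of the diagnostic task, in which a question on skills i and j is answered correctly iff both skills are present; the variable and function names are illustrative.

```python
import itertools
from fractions import Fraction

QUESTIONS = [(0, 1), (0, 2), (1, 2)]  # X_{1,2}, X_{1,3}, X_{2,3} (0-based skills)


def consistent(state, answers):
    """True iff `state` produces exactly the observed deterministic answers."""
    return all(int(state[i] == 1 and state[j] == 1) == a
               for (i, j), a in answers.items())


all_wrong = {q: 0 for q in QUESTIONS}

# Independent, uniform skills: all 8 joint states are equally likely, so the
# posterior is uniform over the states consistent with the answers.
states = list(itertools.product((0, 1), repeat=3))
survivors = [s for s in states if consistent(s, all_wrong)]

for j in range(3):
    p_missing = Fraction(sum(1 for s in survivors if s[j] == 0), len(survivors))
    print(f"P(S{j + 1} = 0 | all answers wrong) = {float(p_missing)}")
```

Four states survive (no skill present, or exactly one present), and each skill is missing in three of them, which gives 3/4 for every skill.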

Modeling dependence between skills
with deterministic hierarchy $S_1 \Rightarrow S_2$, $S_2 \Rightarrow S_3$:
$$P(S_1 = 0 \mid X_{1,2} = 0, X_{1,3} = 0, X_{2,3} = 0) = 1$$
$$P(S_2 = 0 \mid X_{1,2} = 0, X_{1,3} = 0, X_{2,3} = 0) = 1$$
$$P(S_3 = 0 \mid X_{1,2} = 0, X_{1,3} = 0, X_{2,3} = 0) = 0.5$$
Observe that for $i = 1, 2, 3$
$$P(S_i \mid X_{1,2} = 0, X_{1,3} = 0, X_{2,3} = 0) = P(S_i \mid X_{2,3} = 0),$$
i.e. $X_{2,3} = 0$ gives the same information as $X_{1,2} = 0$, $X_{1,3} = 0$, $X_{2,3} = 0$.
[Figure: the same network of skills $S_1, S_2, S_3$ and questions $X_{1,2}, X_{1,3}, X_{2,3}$, with the hierarchy between skills added.]
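The posteriors under the hierarchy can be reproduced by the same kind of enumeration. This sketch assumes a uniform prior over the skill states that respect the hierarchy, which the slide does not state explicitly; the value 0.5 for $S_3$ depends on that assumption. Function names are illustrative.

```python
import itertools
from fractions import Fraction

QUESTIONS = [(0, 1), (0, 2), (1, 2)]  # X_{1,2}, X_{1,3}, X_{2,3} (0-based skills)


def respects_hierarchy(state):
    """S1 => S2 and S2 => S3, i.e. s1 <= s2 <= s3."""
    s1, s2, s3 = state
    return s1 <= s2 <= s3


def consistent(state, answers):
    """True iff the deterministic questions would yield `answers`."""
    return all(int(state[i] == 1 and state[j] == 1) == a
               for (i, j), a in answers.items())


def p_missing(answers):
    """[P(S1 = 0 | answers), P(S2 = 0 | answers), P(S3 = 0 | answers)]."""
    # Assumption: uniform prior over the states allowed by the hierarchy.
    states = [s for s in itertools.product((0, 1), repeat=3)
              if respects_hierarchy(s) and consistent(s, answers)]
    return [Fraction(sum(1 for s in states if s[k] == 0), len(states))
            for k in range(3)]


all_wrong = {q: 0 for q in QUESTIONS}
only_x23_wrong = {(1, 2): 0}

print(p_missing(all_wrong))
print(p_missing(only_x23_wrong))
```

Both calls leave the same two states, (0, 0, 0) and (0, 0, 1), so a wrong answer to the single question on $S_2$ and $S_3$ already carries all the information of the three wrong answers.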

Conclusions
• Empirical evidence shows that educational testing can benefit from the application of Bayesian networks.
• Adaptive tests may substantially reduce the number of questions that need to be asked.
• The new method for the design of a fixed test provided good results on tested data. It may be regarded as a good, cheap alternative to computerized adaptive tests when they are not suitable.
• One theoretical problem related to the application of Bayesian networks to educational testing is efficient inference exploiting deterministic relations in the model. This problem was addressed in our UAI 2002 paper.

... and this is the END. It’s time to have a beer. ... or are there any questions?
