Question Difficulty Prediction for READING Problems in Standard Tests

The 31st Association for the Advancement of Artificial Intelligence (AAAI'17), 2017/02/04-02/09, San Francisco, CA. Question Difficulty Prediction for READING Problems in Standard Tests. Reporter: Zhenya Huang. Date: Feb. 7th, 2017


SLIDE 1

Question Difficulty Prediction for READING Problems in Standard Tests

The 31st Association for the Advancement of Artificial Intelligence (AAAI'17)

2017/02/04-02/09, San Francisco, CA

Reporter: Zhenya Huang
Date: Feb. 7th, 2017

SLIDE 2

Outline

SLIDE 3

Background

• In widely used standard tests, such as TOEFL, examinees are often allowed to retake tests and choose their higher scores for college admission.
• Fairness requirement: select test papers with consistent difficulties.
• Test measurements have attracted much attention.
• Crucial demand: question difficulty prediction (QDP)

SLIDE 4

What is question difficulty?

• Following Educational Psychology, question difficulty refers to the percentage of examinees who answer the question wrong.

(T1) Q1: (1+0)/2 = 0.5
(T1) Q2: 0
(T2) Q3: 0.33
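
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the log format (pairs of question id and correctness flag) and the function name `question_difficulty` are assumptions, and the number of examinees for Q3 is chosen only so that the result matches the 0.33 shown on the slide.

```python
from collections import defaultdict

def question_difficulty(logs):
    """Return {question_id: fraction of examinees who answered wrong}.

    `logs` is an iterable of (question_id, is_correct) pairs
    (a hypothetical log format, assumed for illustration).
    """
    wrong = defaultdict(int)
    total = defaultdict(int)
    for question_id, is_correct in logs:
        total[question_id] += 1
        if not is_correct:
            wrong[question_id] += 1
    return {qid: wrong[qid] / total[qid] for qid in total}

# Toy logs: Q1 has one wrong answer out of two, Q2 none, Q3 one out of three.
logs = [
    ("Q1", False), ("Q1", True),
    ("Q2", True), ("Q2", True),
    ("Q3", False), ("Q3", True), ("Q3", True),
]
difficulty = question_difficulty(logs)
print({q: round(d, 2) for q, d in difficulty.items()})
# {'Q1': 0.5, 'Q2': 0.0, 'Q3': 0.33}
```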

SLIDE 5

Background

• Traditional solutions resort to expertise
  • Expert labeling
    • Subjective
    • Biases differ across experts, so labels are sometimes misleading
  • Artificial test organization
    • Labor intensive
    • Confidentiality
• Human-based solutions cannot be applied to large-scale Question Difficulty Prediction (QDP)

SLIDE 6

Research Problem

• Urgent issue: Question Difficulty Prediction (QDP)
  • How to automatically predict question difficulty without manual intervention?
• Opportunity
  • Historical test logs of examinees
  • Text materials of questions
• This paper focuses on English Reading Problems

SLIDE 7

Challenge 1 for QDP

• Requires a unified way to understand and represent the question texts from a semantic perspective.
• Multiple parts of question texts
  • Document (TD)
  • Question (TQ)
  • Options (TO)

SLIDE 8

Challenge 2 for QDP

• It is necessary, but hard, to distinguish the importance of text materials to a specific question
• Different questions concern different parts of texts
  • Q1 concentrates more on the text highlighted in “blue”
  • Q2 focuses more on the “green”

SLIDE 9

Challenge 3 for QDP

• It is necessary to take test-dependent difficulty biases into consideration for question difficulty prediction
• Questions in different tests are not directly comparable
  • Q2 with difficulty 0.6 in T1
  • Q1 with difficulty 0.37 in T2

SLIDE 10

Related Work for QDP

• Education Psychology
  • Possible factors contributing to question difficulty
    • Question attributes, i.e., question types (structures)
    • Examinees' degree of knowledge mastery
  • Cognitive Diagnosis Assessment (CDA)
    • Question difficulty obtained from examinees’ responses
• Natural Language Processing
  • Understanding and representations of all text materials

• Takes a lot of human effort.
• Not an automatic solution.
• Machine abilities vs. question difficulty, e.g., word reasoning

SLIDE 11

Outline

SLIDE 12

Problem Definition

• Given: questions of READING problems with corresponding text materials
• Given: historical test logs of examinees
• Goal: automatically predict question difficulty in newly conducted tests

SLIDE 13

Outline

SLIDE 14

Study Overview

• Two-stage solution
  • Training stage
    • TACNN
    • Training strategy
  • Testing stage
    • Predict difficulty

SLIDE 15

Outline

SLIDE 16

TACNN Framework

• Test-dependent Attention-based Convolutional Neural Network (TACNN)
  • Learns all text materials of each question from a sentence semantic perspective
    → CNN-based architecture (Challenge 1: unified way)
  • Learns attention representations for each question by quantifying the contributions of its text materials
    → Attention strategy (Challenge 2: quantify contributions)
  • Wipes out the difficulty biases in different tests for training
    → Test-dependent strategy (Challenge 3: difficulty biases)

SLIDE 17

TACNN Framework

• Four layers: Input, Sentence CNN, Attention, Predict

SLIDE 18

TACNN Framework – Input

• Goal: learn sentence representations from a word perspective
• For each question (text materials)
  • Document (TD): a sequence of sentences
  • Question (TQ): one sentence
  • Options (TO): four sentences
• For each sentence
  • A sequence of words
• For each word
  • Embedding
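
The input layer described above can be sketched as a lookup from words to vectors. This is only an illustration under stated assumptions: the embedding size, the random initialisation, the whitespace tokenizer, the toy question, and the helper names `build_embeddings` and `sentence_to_matrix` are all hypothetical, since the slides do not say which embeddings are used.

```python
import random

EMBED_DIM = 4  # illustrative size; the slides do not state the embedding dimension

def build_embeddings(vocab, dim=EMBED_DIM, seed=0):
    """Assign each word a random vector (a stand-in for trained embeddings)."""
    rng = random.Random(seed)
    return {word: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for word in sorted(vocab)}

def sentence_to_matrix(sentence, embeddings, dim=EMBED_DIM):
    """Map a sentence to a list of word vectors; unknown words become zeros."""
    zero = [0.0] * dim
    return [embeddings.get(word, zero) for word in sentence.lower().split()]

# A toy question with the three text parts named on the slide.
question = {
    "TD": ["The cat sat on the mat", "It was warm"],                 # document sentences
    "TQ": "Where did the cat sit",                                   # one question sentence
    "TO": ["On the mat", "On the sofa", "In the box", "On the bed"], # four options
}

all_sentences = question["TD"] + [question["TQ"]] + question["TO"]
vocab = {w for s in all_sentences for w in s.lower().split()}
embeddings = build_embeddings(vocab)

td = [sentence_to_matrix(s, embeddings) for s in question["TD"]]
tq = sentence_to_matrix(question["TQ"], embeddings)
to = [sentence_to_matrix(s, embeddings) for s in question["TO"]]
print(len(td), len(tq), len(to))  # 2 5 4
```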

SLIDE 19

TACNN Framework – Sentence CNN

• Goal: learn sentence representations from a semantic perspective
• CNN-based architecture
  • Captures the dominant information, consistent with reading habits
  • Learns deep, comparable semantic representations
  • Reduces the model complexity

SLIDE 20

TACNN Framework – Sentence CNN

• A variant of traditional CNN
  • Four convolution layers (3 wide + 1 narrow)
  • Four pooling layers
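
The difference between a wide and a narrow convolution can be shown on a one-dimensional toy sequence. This sketch assumes the usual meaning of the terms (wide: zero-pad both ends so the output has length n + k - 1; narrow: no padding, length n - k + 1); the function names, the scalar inputs, and the non-overlapping max pooling are simplifications, not the layer configuration used in TACNN.

```python
def conv1d(seq, kernel, wide=False):
    """Slide `kernel` over `seq` (no kernel flip, as is common in deep learning).

    narrow: no padding, output length n - k + 1
    wide:   zero-pad k - 1 on both sides, output length n + k - 1
    """
    k = len(kernel)
    if wide:
        pad = [0.0] * (k - 1)
        seq = pad + list(seq) + pad
    return [sum(kernel[j] * seq[i + j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def max_pool(seq, size):
    """Non-overlapping max pooling over windows of `size` elements."""
    return [max(seq[i:i + size]) for i in range(0, len(seq), size)]

features = [1, 2, 3, 4]
kernel = [1, 1, 1]

narrow = conv1d(features, kernel)          # [6, 9]
wide = conv1d(features, kernel, wide=True) # [1.0, 3.0, 6.0, 9.0, 7.0, 4.0]
pooled = max_pool(wide, 2)                 # [3.0, 9.0, 7.0]
print(narrow, wide, pooled)
```

The wide variant lets words at the sentence boundary contribute to as many windows as words in the middle, which is one common reason to prefer it in early layers.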

SLIDE 21

TACNN Framework – Attention Layer

• Goal:
  • Quantify the contributions of text materials to a specific question
  • Learn the attention representations
• Considers both the document level and the option level

[Equations on slide: attention score, attention vector]
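
One way to picture an attention score and an attention vector is shown below. The slide's own formulas did not survive extraction, so everything here is an assumption made for illustration: the score is taken to be the cosine similarity between the question vector and each sentence vector, normalised with a softmax, and the attention vector is the score-weighted sum of sentence vectors. The actual TACNN definitions may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def attention(question_vec, sentence_vecs):
    """Return (scores, attention_vector) for one question.

    scores: softmax over cosine similarities (an assumed scoring rule)
    attention_vector: weighted sum of the sentence vectors
    """
    sims = [cosine(question_vec, s) for s in sentence_vecs]
    exps = [math.exp(x) for x in sims]
    z = sum(exps)
    scores = [e / z for e in exps]
    dim = len(question_vec)
    vector = [sum(a * s[d] for a, s in zip(scores, sentence_vecs))
              for d in range(dim)]
    return scores, vector

question_vec = [1.0, 0.0]
document_sentences = [[1.0, 0.0], [0.0, 1.0]]  # toy sentence representations
scores, doc_attention = attention(question_vec, document_sentences)
print([round(s, 3) for s in scores])  # [0.731, 0.269]
```

The same function could be applied to the four option sentences to obtain an option-level attention vector.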

SLIDE 22

TACNN Framework – Predict Layer

• Goal: predict question difficulty
  • Document attention vector
  • Option attention vector
  • Question vector
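
A rough sketch of how the three vectors could be turned into a single difficulty value follows. It assumes the vectors are concatenated and passed through one linear unit with a sigmoid so that the output lies in (0, 1); the weights, the bias, and the function name `predict_difficulty` are hypothetical, and the real predict layer may contain additional hidden layers.

```python
import math

def predict_difficulty(doc_att, opt_att, question_vec, weights, bias=0.0):
    """Concatenate the three vectors and squash a linear score into (0, 1)."""
    features = list(doc_att) + list(opt_att) + list(question_vec)
    if len(weights) != len(features):
        raise ValueError("weights must match the concatenated feature length")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

doc_att = [0.7, 0.3]
opt_att = [0.2, 0.8]
question_vec = [1.0, 0.0]

untrained = predict_difficulty(doc_att, opt_att, question_vec, [0.0] * 6)
toy = predict_difficulty(doc_att, opt_att, question_vec,
                         [0.5, -0.5, 1.0, 0.0, 0.25, 0.0])
print(untrained)  # 0.5
```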

SLIDE 23

TACNN – training strategy

• How to train?
  • Supervised way: leverage historical test logs of examinees

SLIDE 24

TACNN – training strategy

• Biases: question difficulties are test-dependent
  • Questions in different tests are incomparable, e.g., Q1 and Q3
  • Questions in the same test are comparable, e.g., Q1 and Q2

(T1) Q1: (1+0)/2 = 0.5
(T1) Q2: 0
(T2) Q3: 0.33
Which is more difficult?

SLIDE 25

TACNN – training strategy

• Test-dependent pairwise training objective
  • Train on the “gap” between two question difficulties
  • Minimize the objective function with AdaDelta

[Equation labels on slide: Qi, Qj in the same test Tt; prediction of Qi; prediction of Qj]
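
The pairwise idea can be sketched as a loss over question pairs drawn from the same test. This is a hedged reading of the slide rather than the exact objective: it assumes a squared error between the observed difficulty gap and the predicted gap, plus an optional L2 penalty, and the names `pairwise_loss`, `reg`, and `params` are illustrative. The optimizer (AdaDelta on the slide) is not shown.

```python
from itertools import combinations

def pairwise_loss(tests, predict, reg=0.0, params=()):
    """Sum of squared gap errors over question pairs within each test.

    tests:   {test_id: [(question_id, observed_difficulty), ...]}
    predict: callable mapping question_id -> predicted difficulty
    reg, params: optional L2 penalty weight and model parameters (assumed)
    """
    loss = 0.0
    for questions in tests.values():
        # Only pairs from the same test are compared.
        for (qi, di), (qj, dj) in combinations(questions, 2):
            observed_gap = di - dj
            predicted_gap = predict(qi) - predict(qj)
            loss += (observed_gap - predicted_gap) ** 2
    return loss + reg * sum(p * p for p in params)

tests = {
    "T1": [("Q1", 0.5), ("Q2", 0.0)],
    "T2": [("Q3", 0.33)],  # a single question forms no pair
}

shifted = {"Q1": 0.7, "Q2": 0.2, "Q3": 0.9}  # right gap, shifted level
poor = {"Q1": 0.4, "Q2": 0.2, "Q3": 0.9}     # gap of 0.2 instead of 0.5

loss_shifted = pairwise_loss(tests, shifted.get)
loss_poor = pairwise_loss(tests, poor.get)
print(round(loss_shifted, 6), round(loss_poor, 6))  # 0.0 0.09
```

Because only gaps are compared, a prediction that is uniformly shifted within a test incurs no loss, which is how the test-specific bias drops out during training.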

SLIDE 26

TACNN – testing stage

• After training, we can predict question difficulty from text perspectives, e.g., words or sentences
• More applications
  • Automatically label questions for large-scale systems
  • Help decide whether to include a question in a test paper

SLIDE 27

Outline

SLIDE 28

Experiments

• Experimental dataset
  • Supplied by IFLYTEK
  • Collected from real-world standard tests for READING problems in Chinese senior high schools from 2014 to 2016

SLIDE 29

Experiments

• Baseline methods
  • Variants of TACNN: CNN, ACNN, TCNN
    • To validate the contribution of each component in TACNN
  • Machine comprehension (MC) model: HABCNN
    • The network architecture most similar to ours
• Evaluation metrics
  • RMSE: Root Mean Squared Error
  • DOA: the percentage of question pairs whose difficulties are ranked correctly
  • PCC: Pearson Correlation Coefficient
  • PR: the percentage of tests that pass a t-test at the 0.05 significance level
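
Three of the metrics above can be computed for a single test as follows. This is a plain-Python sketch: the function names are illustrative, DOA is implemented from the one-line description on the slide (pairs with equal observed difficulty are skipped, which is an assumption), and PR is omitted because the slide does not say how the t-test is set up.

```python
import math
from itertools import combinations

def rmse(true, pred):
    """Root mean squared error between observed and predicted difficulties."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))

def doa(true, pred):
    """Fraction of question pairs whose difficulty order is predicted correctly."""
    pairs = [(i, j) for i, j in combinations(range(len(true)), 2)
             if true[i] != true[j]]
    if not pairs:
        raise ValueError("need at least one pair with different difficulties")
    correct = sum(1 for i, j in pairs
                  if (true[i] - true[j]) * (pred[i] - pred[j]) > 0)
    return correct / len(pairs)

def pcc(true, pred):
    """Pearson correlation coefficient."""
    n = len(true)
    mean_t, mean_p = sum(true) / n, sum(pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(true, pred))
    std_t = math.sqrt(sum((t - mean_t) ** 2 for t in true))
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (std_t * std_p)

observed = [0.2, 0.5, 0.8]   # toy difficulties of three questions in one test
predicted = [0.3, 0.4, 0.9]

print(round(rmse(observed, predicted), 3))  # 0.1
print(doa(observed, predicted))             # 1.0
print(round(pcc(observed, predicted), 3))
```

Lower RMSE is better, while higher DOA and PCC are better; DOA and PCC only make sense within one test, since difficulties from different tests are not comparable.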

SLIDE 30

Experiments

• Overall results
  • The attention strategy and the test-dependent training strategy are both effective
  • Solutions to the MC task are unsuitable for QDP
  • The results demonstrate the rationality of the pairwise training strategy

SLIDE 31

Experiments

• Expert comparisons
  • Predictions from experts are not always consistent
  • Expert predictions are subjective, so experts rarely agree
  • Expert predictions may sometimes be misleading

SLIDE 32

Experiments

• Model explanatory power (model visualization)
  • Document level (Q1)
  • A good way for a question to capture key information for model explanations

SLIDE 33

Outline

SLIDE 34

Conclusion

• Proposed a unified TACNN framework for the question difficulty prediction task.
• TACNN integrates two critical components, i.e., the Sentence CNN Layer and the Attention Layer, which learn question representations for reading problems from a semantic perspective.
• Proposed a test-dependent pairwise strategy for training TACNN and generating the difficulty prediction values.
• Experiments on a real-world dataset demonstrated both the effectiveness and the explanatory power of TACNN.

SLIDE 35

Future Work

• We will work on designing a more efficient learning algorithm for TACNN
• We also plan to extend TACNN to solve the QDP task in
  • Other types of problems in English tests, e.g., LISTENING, WRITING
  • Other subjects, e.g., MATH

SLIDE 36

Q & A