IMTKU Question Answering System for World History Exams at NTCIR-13

Tamkang University IMTKU Question Answering System for World History Exams at NTCIR-13 QALab-3 Department of Information Management Tamkang University, Taiwan Min-Yuh Day Chao-Yu Chen I-Hsuan Huang Tz-Rung Chen Min-Chun Kuo Yue-Da Lin


SLIDE 1

IMTKU Question Answering System for World History Exams at NTCIR-13 QALab-3

myday@mail.tku.edu.tw NTCIR-13 Conference, December 5-8, 2017, Tokyo, Japan

Department of Information Management Tamkang University, Taiwan

Tamkang University

Min-Yuh Day

Yue-Da Lin

Chao-Yu Chen

I-Hsuan Huang Wanchu Huang Shi-Ya Zheng Tz-Rung Chen Min-Chun Kuo

Yi-Jing Lin

SLIDE 2

IMTKU Question Answering System for World History Exams at NTCIR-13 QALab-3

SLIDE 3

IMTKU Textual Entailment System for Recognizing Inference in Text at NTCIR-9 RITE (2011)

Chun Tu, Min-Yuh Day

Department of Information Management, Tamkang University, Taiwan

myday@mail.tku.edu.tw

NTCIR-9 Workshop, December 6-9, 2011, Tokyo, Japan

SLIDE 4

IMTKU Textual Entailment System for Recognizing Inference in Text at NTCIR-10 RITE-2 (2013)

Chun Tu, Hou-Cheng Vong, Shih-Wei Wu, Shih-Jhen Huang, Min-Yuh Day

Department of Information Management, Tamkang University, Taiwan

myday@mail.tku.edu.tw

NTCIR-10 Conference, June 18-21, 2013, Tokyo, Japan

SLIDE 5

IMTKU Textual Entailment System for Recognizing Inference in Text at NTCIR-11 RITE-VAL (2014)

Ya-Jung Wang, Min-Yuh Day, Che-Wei Hsu, Huai-Wen Hsu, En-Chun Tu, Yu-Hsuan Tai, Shang-Yu Wu, Cheng-Chia Tsai, Yu-An Lin

Tamkang University

NTCIR-11 Conference, December 8-12, 2014, Tokyo, Japan

SLIDE 6

IMTKU Question Answering System for World History Exams at NTCIR-12 QA Lab2 (2016)

Min-Yuh Day, Cheng-Chia Tsai, Wei-Chun Chung, Hsiu-Yuan Chang, Yuan-Jie Tsai, Jin-Kun Lin, Yue-Da Lin, Wei-Ming Chen, Yun-Da Tsai, Cheng-Jhih Han, Yi-Jing Lin, Yu-Ming Guo, Tzu-Jui Sun, Yi-Heng Chiang, Ching-Yuan Chien, Cheng-Hung Lee

Department of Information Management, Tamkang University, Taiwan

Sagacity Technology

myday@mail.tku.edu.tw

NTCIR-12 Conference, June 7-10, 2016, Tokyo, Japan

SLIDE 7

IMTKU Question Answering System for World History Exams at NTCIR-13 QALab-3 (2017)

Yue-Da Lin, I-Hsuan Huang, Wanchu Huang, Shi-Ya Zheng, Tz-Rung Chen, Min-Chun Kuo, Yi-Jing Lin, Min-Yuh Day, Chao-Yu Chen

Department of Information Management, Tamkang University, Taiwan

myday@mail.tku.edu.tw

NTCIR-13 Conference, December 5-8, 2017, Tokyo, Japan

SLIDE 8

Outline

  • IMTKU Question Answering System Architecture
  • IMTKU System Description
  • Performance
  • Discussions and Conclusions

SLIDE 9

Highlights

  • IMTKU (Information Management at TamKang University) Question Answering System for World History Exams in Japanese university entrance exams at NTCIR-13 QALab-3.
  • IMTKU submitted runs for QALab-3 Phase-2:
    – 3 English End-to-End multiple-choice
    – 2 English and 2 Japanese End-to-End essay
    – 2 English and 2 Japanese extraction essay
    – 1 English and 1 Japanese summarization essay
  • IMTKU achieved the best passage precision and the best nugget recall in the English Extraction task.

SLIDE 10

IMTKU System Architecture for NTCIR-13 QALab-3

[Figure: system architecture]
Pipeline: Question (XML) -> Question Analysis -> Document Retrieval -> Answer Extraction -> Answer Generation -> Answer (XML)
Resources: Stanford CoreNLP, JA&EN Translator, Wikipedia, Word Embedding (Wiki Word2Vec)
Question types: Complex Essay, Simple Essay, True-or-False, Factoid, Slot-Filling, Unique
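The four stages named on this slide can be read as a sequential pipeline over a shared state. The sketch below is only an illustration of that control flow: every function body is a hypothetical placeholder (simple keyword matching), not the IMTKU implementation, and the `state` dictionary and its keys are assumptions introduced for the example.

```python
def question_analysis(state):
    # Placeholder: keep words longer than three characters as keywords.
    words = (w.strip(".,?").lower() for w in state["question"].split())
    state["keywords"] = [w for w in words if len(w) > 3]


def document_retrieval(state):
    # Placeholder: keep corpus entries that mention at least one keyword.
    state["documents"] = [
        doc for doc in state["corpus"]
        if any(k in doc.lower() for k in state["keywords"])
    ]


def answer_extraction(state):
    # Placeholder: rank retrieved documents by number of keyword hits.
    def hits(doc):
        return sum(k in doc.lower() for k in state["keywords"])
    state["candidates"] = sorted(state["documents"], key=hits, reverse=True)


def answer_generation(state):
    # Placeholder: return the top-ranked candidate as the answer.
    state["answer"] = state["candidates"][0] if state["candidates"] else ""


# Stage order follows the slide: analysis, retrieval, extraction, generation.
PIPELINE = [question_analysis, document_retrieval,
            answer_extraction, answer_generation]


def run_pipeline(question, corpus):
    state = {"question": question, "corpus": corpus}
    for stage in PIPELINE:
        stage(state)
    return state


state = run_pipeline(
    "Who carried out the New Policies reforms?",
    ["Wang Anshi carried out reforms called the New Policies.",
     "Ancient Egypt used a solar calendar."],
)
print(state["answer"])
```

Keeping each stage as a function over one state object mirrors the slide's XML hand-off between stages, where each step reads the previous step's result and writes its own.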

SLIDE 11

IMTKU System Description

SLIDE 12

Question Analysis (step 1)

[Figure: question analysis]
Input: Question (XML)
Components: JA & EN Translator, NER & POS Tagger, Question Type Identification, Keyword Extraction
Tools: Stanford CoreNLP, JA&EN Translator
Question types: Complex Essay, Simple Essay, True-or-False, Factoid, Slot-Filling, Unique
Output: Question Analysis Result (XML)

SLIDE 13

JA & EN Translator

Japanese: 古代メソポタミアと古代エジプトにおける暦とその発達の背景について,3行以内で説明しなさい。

English (JA & EN Translator by Google Translate): Explain the calendar in ancient Mesopotamia and ancient Egypt and the background of its development within 3 lines.

SLIDE 14

NER & POS tagger (Stanford CoreNLP)

Raw Data: Wang Anshi, who lived during the Song period, carried out reforms called the New Policies (xin fa).

POS tagger and NER: Wang/PERSON/NNP Anshi/PERSON/NNP ,/O/, who/O/WP lived/O/VBD during/O/IN the/O/DT Song/O/NN period/O/NN ,/O/, carried/O/VBD out/O/RP reforms/O/NNS called/O/VBD the/O/DT New/O/JJ Policies/O/NNS -LRB-/O/-LRB- xin/O/FW fa/O/FW -RRB-/O/-RRB- ./O/.
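The tagged output on this slide uses a `token/NER/POS` layout, which is straightforward to turn into keyword candidates. The sketch below parses that layout and keeps named entities and nouns. The selection rule (any non-`O` entity or any `NN*` tag) is an assumed heuristic for illustration; the slides do not state how IMTKU selects keywords.

```python
def parse_tagged(text):
    """Split whitespace-separated 'token/NER/POS' items into tuples."""
    triples = []
    for item in text.split():
        # rsplit from the right so a '/' inside the token itself is preserved.
        token, ner, pos = item.rsplit("/", 2)
        triples.append((token, ner, pos))
    return triples


def extract_keywords(triples):
    """Keep named entities and nouns (assumed heuristic, not the IMTKU rule)."""
    return [
        token for token, ner, pos in triples
        if ner != "O" or pos.startswith("NN")
    ]


tagged = (
    "Wang/PERSON/NNP Anshi/PERSON/NNP ,/O/, who/O/WP lived/O/VBD "
    "during/O/IN the/O/DT Song/O/NN period/O/NN ,/O/, carried/O/VBD "
    "out/O/RP reforms/O/NNS called/O/VBD the/O/DT New/O/JJ Policies/O/NNS "
    "-LRB-/O/-LRB- xin/O/FW fa/O/FW -RRB-/O/-RRB- ./O/."
)

triples = parse_tagged(tagged)
keywords = extract_keywords(triples)
print(keywords)
```

On the slide's example this keeps "Wang", "Anshi", "Song", "period", "reforms", and "Policies", and drops function words, punctuation, and the foreign-word tokens "xin" and "fa".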

SLIDE 15

Document Retrieval (step 2)

[Figure: document retrieval]
Elements: Question Analysis, Keyword list, Wikipedia, Y/N decision, Articles Extraction, Ambiguous word Extraction, Content list

SLIDE 16

Answer Extraction (step 3)

[Figure: answer extraction]
Inputs: Question Analysis Result (XML), Document Retrieval Result (XML)
Processing: TF-IDF Scoring, producing a TF-IDF Score List
Output: Answer Extraction Result (XML)
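TF-IDF scoring of retrieved passages against question keywords could look roughly like the following. This is a minimal sketch under stated assumptions: the tokenizer, the smoothed IDF formula (`log(N / df) + 1`), and the function names are illustrative choices, since the slides name the technique but not its exact variant.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer that strips simple punctuation."""
    tokens = (w.strip(".,;:()").lower() for w in text.split())
    return [t for t in tokens if t]


def tfidf_scores(keywords, passages):
    """Score each passage by the summed TF-IDF weight of the keywords."""
    docs = [Counter(tokenize(p)) for p in passages]
    n_docs = len(docs)
    scores = []
    for doc in docs:
        total = sum(doc.values())
        score = 0.0
        for kw in keywords:
            kw = kw.lower()
            df = sum(1 for d in docs if kw in d)
            if df == 0 or total == 0:
                continue  # keyword appears nowhere, or passage is empty
            tf = doc[kw] / total
            # Assumed smoothing: +1 keeps terms shared by all passages non-zero.
            idf = math.log(n_docs / df) + 1.0
            score += tf * idf
        scores.append(score)
    return scores


passages = [
    "Wang Anshi carried out the New Policies during the Song period.",
    "The Song dynasty ruled China.",
    "Ancient Egypt used a solar calendar.",
]
scores = tfidf_scores(["Wang", "Anshi", "Song", "reforms"], passages)
ranked = sorted(zip(scores, passages), reverse=True)
print(ranked[0][1])
```

The resulting list plays the role of the slide's "TF-IDF Score List": passages are ranked by score and the top ones are passed on to answer generation.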

SLIDE 17

Answer Generation

[Figure: answer generation for essay questions]
Inputs: Question Analysis Result (XML), Document Retrieval Result (XML), Answer Extraction Result (XML)
Processing: Combination and Matching Strategy
Output: Answer (XML) for Complex Essay and Simple Essay questions

SLIDE 18

Answer Generation (step 4)

[Figure: answer generation]
Data: Question (XML), DR Result, Content list, QA Result
Components: Token Extraction, Vector Similarity, TF-IDF and Cosine Similarity, Text Summarization
Tools: Wiki Word2Vec, Gensim, Stanford NER, Stanford POS tagger
Outputs: Answer for Essay, Answer for Multiple Choice
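For multiple-choice questions, vector similarity can be used to pick the choice closest to the retrieved evidence. The sketch below averages word vectors and compares them by cosine similarity. The three-dimensional `TOY_VECTORS` table is a made-up stand-in for the Wiki Word2Vec embeddings named on the slide, and `pick_choice` is a hypothetical helper; the actual IMTKU matching strategy is not described in the slides.

```python
import math

# Hypothetical 3-d embeddings standing in for a trained Wiki Word2Vec model.
TOY_VECTORS = {
    "calendar": [0.9, 0.1, 0.0],
    "egypt": [0.8, 0.2, 0.1],
    "nile": [0.7, 0.3, 0.1],
    "reforms": [0.0, 0.9, 0.2],
    "song": [0.1, 0.8, 0.3],
}


def average_vector(tokens, vectors):
    """Mean of the vectors of known tokens, or None if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_choice(evidence_tokens, choices, vectors):
    """Return (index, score) of the choice most similar to the evidence."""
    evidence = average_vector(evidence_tokens, vectors)
    best_index, best_score = None, -1.0
    for i, tokens in enumerate(choices):
        candidate = average_vector(tokens, vectors)
        if evidence is None or candidate is None:
            score = 0.0
        else:
            score = cosine(evidence, candidate)
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


index, score = pick_choice(
    ["egypt", "calendar"],
    [["song", "reforms"], ["nile", "calendar"]],
    TOY_VECTORS,
)
print(index, round(score, 3))
```

With a real model, the lookup table would be replaced by vectors loaded through Gensim; the averaging and cosine steps stay the same.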

SLIDE 19

IMTKU Phase-2 Official Runs

IMTKU Official Runs: Essay (English)

  • End-to-End (e2e): qalab3-en-phase2-answersheet-essay_QALabIMTKU_e2e_01, qalab3-en-phase2-answersheet-essay_QALabIMTKU_e2e_02
  • Extraction: qalab3-en-phase2-answersheet-essay_QALabIMTKU_extraction_01, qalab3-en-phase2-answersheet-essay_QALabIMTKU_extraction_02
  • Summarization: qalab3-en-phase2-answersheet-essay_QALabIMTKU_summarization_01

IMTKU Official Runs: Essay (Japanese)

  • End-to-End (e2e): qalab3-ja-phase2-answersheet-essay_QALabIMTKU_e2e_01, qalab3-ja-phase2-answersheet-essay_QALabIMTKU_e2e_02
  • Extraction: qalab3-ja-phase2-answersheet-essay_QALabIMTKU_extraction_01, qalab3-ja-phase2-answersheet-essay_QALabIMTKU_extraction_02
  • Summarization: qalab3-ja-phase2-answersheet-essay_QALabIMTKU_summarization_01, qalab3-ja-phase2-answersheet-essay_QALabIMTKU_summarization_02

IMTKU Official Runs: Multiple Choice (National Center Test)

  • Center-2014--Main-SekaishiB_QALabIMTKU_EN_01
  • Center-2014--Main-SekaishiB_QALabIMTKU_EN_02
  • Center-2014--Main-SekaishiB_QALabIMTKU_EN_03

SLIDE 20

IMTKU at NTCIR-13 QA-Lab3 Phase-3 Performance

Results of IMTKU multiple-choice questions in Phase-2

Run          Lang.  Correct rate  Total  Total score  Average score
IMTKU RUN01  EN     0.333         12/36  34           0.34
IMTKU RUN02  EN     0.389         14/36  40           0.40
IMTKU RUN03  EN     0.194         7/36   18           0.18
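The correct rates in this table follow directly from the "Total" column (correct answers over 36 questions). The short check below reproduces them. Reading the average score as the total score divided by a 100-point maximum is an assumption, labeled as such in the code, that happens to match the reported values.

```python
def correct_rate(correct, total):
    """Fraction of correctly answered questions, rounded to three decimals."""
    return round(correct / total, 3)


# (correct answers, questions, total score) as reported in the table.
runs = {
    "IMTKU RUN01": (12, 36, 34),
    "IMTKU RUN02": (14, 36, 40),
    "IMTKU RUN03": (7, 36, 18),
}

MAX_SCORE = 100  # assumption: the exam is scored out of 100 points

for name, (correct, total, score) in runs.items():
    print(name, correct_rate(correct, total), round(score / MAX_SCORE, 2))
```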

SLIDE 21

IMTKU at NTCIR-13 QA-Lab3 Phase-3 Performance

Results of IMTKU English essay questions in Phase-2

SYSTEM  IMTKU1QALab3                              IMTKU2QALab3
TYPE    SIMPLE               COMPLEX              SIMPLE               COMPLEX
METHOD  CASE   STEM   STOP   CASE   STEM   STOP   CASE   STEM   STOP   CASE   STEM   STOP
R-1     0.075  0.077  0.026  0.312  0.329  0.131  0.006  0.009  0.012  0.008  0.014  0.013
R-S*    0.056  0.057  0.023  0.164  0.167  0.063  0.006  0.009  0.012  0.007  0.012  0.012
R-S4    0.031  0.032  0.015  0.047  0.048  0.025  0.003  0.006  0.008  0.004  0.006  0.007

Rows with empty cells in the source (values in extraction order; their columns cannot be recovered):

R-2     0.005 0.007 0.052 0.054 0.007
R-S9    0.007 0.007 0.092 0.102 0.013
R-SU*   0.007 0.008 0.063 0.069 0.005
R-SU4   0.008 0.009 0.073 0.080 0.006
R-SU9   0.009 0.010 0.001 0.094 0.104 0.015
R-L     0.018 0.019 0.003 0.105 0.113 0.027 0.001 0.001 0.001 0.002 0.003
R-W1.2  0.015 0.015 0.002 0.095 0.103 0.018 0.001 0.002 0.002
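Assuming the R-1 and R-2 rows denote ROUGE-1 and ROUGE-2 (the row names follow the usual ROUGE naming), the recall variant of ROUGE-N can be sketched as clipped n-gram overlap between a system answer and a reference answer. This is a simplified illustration with whitespace tokenization; it does not reproduce the CASE, STEM, or STOP preprocessing used in the official evaluation.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


reference = "the calendar developed along the nile"
candidate = "the nile calendar developed"

r1 = rouge_n_recall(candidate, reference, n=1)
r2 = rouge_n_recall(candidate, reference, n=2)
print(round(r1, 3), round(r2, 3))
```

In this toy pair, four of the six reference unigrams and two of the five reference bigrams are matched, which illustrates why R-2 values are typically lower than R-1 values.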

SLIDE 22

IMTKU at NTCIR-13 QA-Lab3 Phase-3 Performance

Results of IMTKU Japanese essay questions in Phase-2

In the METHOD row, "stem" and "root" stand for "shortest unit (stem)" and "shortest unit (root)".

SYSTEM  IMTKU1QALab3
TYPE    SIMPLE                              COMPLEX
METHOD  content  text     stem     root     content  text     stem     root
R-1     0.014    0.185    0.175    0.180    0.098    0.408    0.347    0.352
R-S*    0.006    0.147    0.150    0.144    0.070    0.354    0.317    0.308
R-S4    0.005    0.075    0.082    0.079    0.038    0.129    0.119    0.117
R-SU*   0.001    0.043    0.041    0.042    0.003    0.144    0.122    0.128
R-SU9   0.001    0.043    0.041    0.042    0.007    0.140    0.106    0.108
R-L     0.003    0.066    0.062    0.064    0.019    0.188    0.160    0.165
R-W1.2  0.002    0.060    0.060    0.062    0.013    0.181    0.155    0.162

Rows with empty cells in the source (values in extraction order; their columns cannot be recovered):

R-2     0.052 0.040 0.041 0.002 0.164 0.109 0.113
R-S9    0.041 0.038 0.039 0.006 0.139 0.105 0.108
R-SU4   0.048 0.049 0.051 0.005 0.158 0.136 0.143

SLIDE 23

IMTKU at NTCIR-13 QA-Lab3 Phase-3 Performance

Results of IMTKU Japanese essay questions in Phase-2

In the METHOD row, "stem" and "root" stand for "shortest unit (stem)" and "shortest unit (root)".

SYSTEM  IMTKU2QALab3
TYPE    SIMPLE                              COMPLEX
METHOD  content  text     stem     root     content  text     stem     root
R-1     0.004    0.067    0.043    0.050    0.010    0.146    0.059    0.075
R-S*    0.004    0.049    0.041    0.048    0.009    0.124    0.058    0.071
R-S4    0.003    0.028    0.026    0.030    0.006    0.043    0.022    0.027
R-L     0.001    0.013    0.006    0.008    0.002    0.037    0.012    0.016
R-W1.2  0.001    0.010    0.004    0.006    0.001    0.030    0.008    0.012

Rows with empty cells in the source (values in extraction order; their columns cannot be recovered):

R-2     0.007 0.003 0.003 0.022 0.007 0.008
R-S9    0.006 0.002 0.003 0.027 0.006 0.008
R-SU*   0.004 0.001 0.002 0.016 0.003 0.005
R-SU4   0.004 0.001 0.002 0.018 0.004 0.006
R-SU9   0.007 0.003 0.004 0.028 0.006 0.009

SLIDE 24

IMTKU at NTCIR-13 QA-Lab3 Extraction task at Phase 2 (N = 5 and N = 10)

Run          Lang.  Passage Precision  Nugget Recall  Ave. of Tokens
IMTKU RUN01  EN     0.260              0.061          249.2
IMTKU RUN02  EN     0.234              0.058          249.2

SLIDE 25

Conclusions

  • IMTKU Question Answering System for World History Exams at NTCIR-13 QALab-3
  • IMTKU submitted runs for QALab-3 Phase-2:
    – 3 English End-to-End multiple-choice
    – 2 English and 2 Japanese End-to-End essay
    – 2 English and 2 Japanese extraction essay
    – 1 English and 1 Japanese summarization essay
  • IMTKU achieved the best passage precision and the best nugget recall in the English Extraction task.

SLIDE 26

IMTKU Question Answering System for World History Exams at NTCIR-13 QALab-3

myday@mail.tku.edu.tw NTCIR-13 Conference, December 5-8, 2017, Tokyo, Japan

Department of Information Management Tamkang University, Taiwan

Tamkang University

Yue-Da Lin I-Hsuan Huang Wanchu Huang Shi-Ya Zheng Tz-Rung Chen Min-Chun Kuo

Yi-Jing Lin

Q&A

Min-Yuh Day

Chao-Yu Chen