Semantically-Aligned Equation Generation for Solving and Reasoning Math Word Problems



SLIDE 1

NTU MiuLab

Semantically-Aligned Equation Generation for Solving and Reasoning Math Word Problems

Ting-Rui Chiang and Yun-Nung (Vivian) Chen

https://github.com/MiuLab/E2EMathSolver

SLIDE 2

Math Word Problem

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

Reasoning & Solving

x = (10 − 1 × 5) ÷ 0.5
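The arithmetic behind the example can be checked in a few lines. This is only a worked illustration of the problem on this slide; the variable names are chosen here for readability and do not come from the slides.

```python
# Quantities stated in the problem text.
notebook_price = 0.5   # dollars per notebook
pen_price = 1.0        # dollars per pen
budget = 10.0          # dollars Tom has
pens_bought = 5

# Money left after buying the pens, then how many notebooks it pays for.
remaining = budget - pen_price * pens_bought
notebooks = remaining / notebook_price

print(remaining, notebooks)
```

The subtraction has to happen before the division, which is why the equation needs the grouping (10 − 1 × 5) ÷ 0.5.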

SLIDE 3

Prior Work

Non-neural approaches

  • Template-based (Kushman et al., Upadhyay and Chang): fill the slots of an equation template, e.g. x = (? + ?) × ? - ? is filled to x = (1 + 2) × 3 - 4.

Rely on hand-crafted features!

Deep learning

  • Seq2Seq (Wang et al., Ling et al.): generate the equation directly from the problem, e.g. Problem → x = (1 + 2) × 3 - 4.

Does not use the structure of math expressions.

Our model is end-to-end and structural!

SLIDE 4

Overview of the Proposed Model

[Figure: Encoder → Decoder → a sequence of stack actions]

Input: Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

Output: x = (10 − 1 × 5) ÷ 0.5

SLIDE 5

Look Again at the Problem

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

[Figure: the quantities $0.5, $1 and $10, and the unknown "?"]

SLIDE 6

Semantic Meaning of the Operands

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

x = (10 − 1 × 5) ÷ 0.5

  • 10: the amount of money Tom has
  • 1: price of a pen
  • 5: number of pens bought
  • 0.5: price of a notebook

SLIDE 7

Idea: Bridging Symbolic and Semantic Worlds

[Figure: Symbolic World ↔ Semantic World]

SLIDE 8

Preprocess

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

Preprocess → Symbolic Part: 0.5 1 10 5
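The slide does not say how the numbers are pulled out of the text. One plausible way, sketched here purely as an assumption, is a regular expression over the problem string; `extract_numbers` is a hypothetical helper and is not taken from the authors' code.

```python
import re
from typing import List

def extract_numbers(problem: str) -> List[float]:
    """Return the numeric constants in the order they appear in the text.

    Hypothetical helper: matches integers and simple decimals only.
    """
    return [float(tok) for tok in re.findall(r"\d+(?:\.\d+)?", problem)]

problem = ("Each notebook takes $0.5 and each pen takes $1. Tom has $10. "
           "How many notebooks can he buy after buying 5 pens?")
numbers = extract_numbers(problem)
print(numbers)
```

The pattern requires a digit after the decimal point, so the sentence-ending period in "$1." is not swallowed into the number.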

SLIDE 9

Symbol Encoding

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

Preprocess → Symbolic Part: 0.5 1 10 5

Encode → Semantic Part

SLIDE 10

Inside Encoder

[Figure: the encoder reads the input tokens: Each notebook takes $ 0.5 and ...]

SLIDE 11

Semantic Generation for Unknown x

[Figure: the encoder reads the input tokens: Each notebook takes $ 0.5 and ...]

* This part is actually done during decoding, but is shown here for illustration. Check our paper for more information.

SLIDE 12

Operands & Their Semantics

Symbolic Part: 0.5 1 10 5 x

Semantic Part: Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

SLIDE 13

Intuition of Using Semantics

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

x = (10 − 1 ? 5)

  • 1: price of a pen
  • 5: number of pens bought

SLIDE 14

Equation Generation in Postfix

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?

x 10 1 5 × − 0.5 ÷ =
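To see why postfix order is convenient, the right-hand side of the sequence above can be evaluated with a plain stack. This is a generic postfix evaluator written for illustration, not the authors' decoder; it uses ASCII `*` and `/` in place of × and ÷, and `eval_postfix` is a name chosen here.

```python
import operator

# Binary operators, keyed by their ASCII spelling.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_postfix(tokens):
    """Evaluate a list of postfix tokens (number strings and operators)."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()   # the most recent operand is the right-hand one
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]

# Right-hand side of "x 10 1 5 × − 0.5 ÷ =", without the unknown x and "=".
rhs = ["10", "1", "5", "*", "-", "0.5", "/"]
value = eval_postfix(rhs)
print(value)
```

No parentheses are needed: the order of the tokens alone fixes that the multiplication happens first, then the subtraction, then the division.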

SLIDE 15

Equation Generation by Stack Actions

  • Stack is used.
  • The decoder generates stack actions.
  • An equation is generated with actions on the stack.

[Figure: Decoder → a sequence of stack actions]

x = (10 − 1 × 5) ÷ 0.5

SLIDE 16

Action Selection in Each Step

[Figure: Encoder → Decoder → classifier → stack action]

Stack actions: {+, -, ×, ÷, =, Push}
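The slide only states that a classifier picks one of six stack actions at each step. A common way to do that, assumed here rather than confirmed by the slides, is a softmax over six scores followed by an argmax. The scores below are made up for illustration; in a real model they would come from the decoder state.

```python
import math

# The six stack actions from the slide, with ASCII operators.
ACTIONS = ["+", "-", "*", "/", "=", "push"]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def select_action(logits):
    """Return the most probable action and the full distribution."""
    probs = softmax(logits)
    best = max(range(len(ACTIONS)), key=lambda i: probs[i])
    return ACTIONS[best], probs

logits = [0.1, -0.3, 0.2, -1.0, -2.0, 2.5]  # made-up scores, one per action
action, probs = select_action(logits)
print(action)
```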

SLIDE 17

Equation Generation by Stack Actions

Target Equation: x = (10 − 1 × 5) ÷ 0.5

Generated Actions:

Operands: 0.5 1 10 5 x

Action: push

SLIDE 18

Equation Generation by Stack Actions

Target Equation: x = (10 − 1 × 5) ÷ 0.5

Generated Actions (one push at a time): x, then x 10, then x 10 1, then x 10 1 5

Operands: 0.5 1 10 5 x

Stack: x 10 1 5

Action: push

SLIDE 19

Equation Generation by Stack Actions

Target Equation: x = (10 − 1 × 5) ÷ 0.5

Generated Actions: x 10 1 5

Operands: 0.5 1 10 5 x

Stack before the action: x 10 1 5

Action: ×

New top of stack: 1 × 5

SLIDE 20

Equation Generation by Stack Actions

Target Equation: x = (10 − 1 × 5) ÷ 0.5

After many steps…

Generated Actions: x 10 1 5 × − 0.5 ÷ =

Operands: 0.5 1 10 5 x

Stack: x = (10 − 1 × 5) ÷ 0.5
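The walkthrough on the last few slides can be replayed in code. The sketch below is a simplified simulation, not the authors' implementation: every non-operator token is treated as a push of that operand, each operator pops two entries and pushes the combined expression, and ASCII `*` and `/` stand in for × and ÷. The name `run_actions` is chosen here.

```python
OPERATORS = {"+", "-", "*", "/", "="}

def run_actions(actions):
    """Apply push/operator actions to a stack and return the final stack."""
    stack = []
    for act in actions:
        if act in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            if act == "=":
                stack.append(f"{left} = {right}")
            else:
                # Parenthesize so the nesting stays explicit in the string.
                stack.append(f"({left} {act} {right})")
        else:
            stack.append(act)  # push action: the operand chosen at this step
    return stack

actions = ["x", "10", "1", "5", "*", "-", "0.5", "/", "="]
final_stack = run_actions(actions)
print(final_stack)
```

After the last action the stack holds a single entry, the full equation, which matches the end state shown on this slide up to redundant parentheses.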

SLIDE 21

Training Process

  • Target equation is given.
  • Trained as Seq2Seq.

[Figure: Encoder → Decoder. The encoder reads the problem; the decoder input is <bos> x 10 1 5 … and the decoder output is x 10 1 5 …]

Each notebook takes $0.5 and each pen takes $1. Tom has $10. How many notebooks can he buy after buying 5 pens?
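Training "as Seq2Seq" with a given target usually means teacher forcing: at each step the decoder is fed the previous gold action and asked to predict the next one. The sketch below shows only how the input and target sequences line up, under that assumption; `teacher_forcing_pairs` is a hypothetical helper and no model is involved.

```python
BOS = "<bos>"

def teacher_forcing_pairs(target_actions):
    """Pair each decoder input with the action it should predict.

    The input sequence is the target shifted right by one, starting at <bos>.
    """
    decoder_inputs = [BOS] + target_actions[:-1]
    return list(zip(decoder_inputs, target_actions))

target = ["x", "10", "1", "5", "*", "-", "0.5", "/", "="]
pairs = teacher_forcing_pairs(target)
for fed, expected in pairs[:4]:
    print(f"input={fed!r} -> predict {expected!r}")
```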

SLIDE 22

Experiments

  • Dataset: Math23k
  • In Chinese
  • 23,000 math word problems
  • Operators: +, -, ×, ÷

SLIDE 23

Results

[Bar chart: accuracy (Acc., axis from 45 to 70) of Retrieval, BLSTM, Self-Attention, Seq2Seq w/SNI, Proposed, and Hybrid, grouped as Retrieval, Template, Generation, and Ensemble methods. Annotated gaps: ≈ 8% and > 1%.]

SLIDE 24

Ablation Test

[Bar chart: accuracy (Acc., axis from 59 to 66) of the following variants. Annotated gaps: ≈ 3% and ≈ 2.5%.]

  • Char-Based
  • Word-Based
  • Word-Based + Semantic
  • Word-Based + Gate
  • Word-Based + Gate + Attention
  • Word-Based + Gate + Attention + Stack

SLIDE 25

Self-Attention for Qualitative Analysis

[Figure: self-attention in the encoder over the input tokens: Each notebook takes $ 0.5 and ...]

SLIDE 27

Attention for Operand Semantics

The attention focuses on:

  • Informative verbs: “gain”, “get”, “fill”, etc.
  • Quantifier-related words: “every”, “how many”, etc.

SLIDE 28

Conclusion

Three main contributions

  • Approach: equation generation with stack
  • Originality: automatic extraction of operand semantics
  • Performance: a SOTA end-to-end neural model on Math23k

SLIDE 29

Code available at https://github.com/MiuLab/E2EMathSolver