Learning Fine-Grained Image Representations for Mathematical Expression Recognition



SLIDE 1

Computer Vision for Human Computer Interaction Lab Institute for Anthropomatics and Robotics

KIT – The Research University in the Helmholtz Association

www.kit.edu

[Figure: an Input Image is mapped to LaTeX Markup by a model with blocks labeled Encoder, Decoder and FGFE]

Example LaTeX Markup:

\left[\;\Lambda\;\right]_{R}\ ^{S}= \left(\begin{array}{ll}{ \operatorname{cos}\Psi}& {-\operatorname{sin}\Psi}\\ {\operatorname{sin}\Psi}& {\operatorname{cos}\Psi}\\ \end{array}\right),

Learning Fine-Grained Image Representations for Mathematical Expression Recognition

Sidney Bender*, Monica Haurilet*, Alina Roitberg and Rainer Stiefelhagen

SLIDE 2

Mathematical Expression Recognition (MER)

Problem Definition

Input Image → Model → Markup (e.g., LaTeX)

\left[\;\Lambda\;\right]_{R}\ ^{S}= \left(\begin{array}{ll}{ \operatorname{cos}\Psi}& {-\operatorname{sin}\Psi}\\ {\operatorname{sin}\Psi}& {\operatorname{cos}\Psi}\\ \end{array}\right),

Different types of MER Tasks

  • Online MER [CROHME]: the Model receives the expression step by step (T=1, T=2, T=3, T=4)
  • Offline MER [IM2LATEX]: the Model receives a single image of the expression

SLIDE 3

Related Work

  • Infty [Suzuki et al.]
  • Caption [Xu et al.]
  • CTC [Graves et al.]
  • Im2Tex [Deng et al.]

SLIDE 4

Overview of the Model

[Figure: Input Image (H x W x 1) is mapped to LaTeX Markup by a model with blocks labeled Encoder, Decoder and FGFE]

\left[ \; \Lambda \; \right] _ { R } \ ^ { S } = \left( \begin{array} { l l } { \operatorname{cos} \Psi } & { - \operatorname{sin} \Psi } \\ { \operatorname{sin} \Psi } & { \operatorname{cos} \Psi } \\ \end{array} \right) ,

SLIDE 5

Visual Encoder and LaTeX Decoder

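[Figure: data flow through the blocks labeled Encoder, Decoder and FGFE, with an LSTM; the decoding step is marked "for each t" and "repeat"]

  • Encoder Input: H' x W' x C
  • Encoder Output: H' x W' x D
  • Decoder Input: H' x W' x D
  • Decoder output tokens: \left[ \; \Lambda \; … \Psi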
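
The decoder reads an H' x W' x D feature grid and is repeated for each step t. A minimal sketch of one such step follows, assuming soft attention in the style of [Xu et al.]. The dot-product scoring, the function names and the toy grid are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, query):
    """One soft-attention step over an H' x W' x D feature grid.

    features: nested list indexed as [H'][W'][D].
    query: list of length D, a stand-in for the decoder state at step t
           (hypothetical; a real decoder would produce it with an LSTM).
    Returns (attention map [H'][W'], context vector of length D).
    """
    cells = [cell for row in features for cell in row]  # flatten to H'*W' cells
    # Assumed scoring function: dot product between each cell and the query.
    scores = [sum(f * q for f, q in zip(cell, query)) for cell in cells]
    weights = softmax(scores)
    dim = len(query)
    # Context vector: attention-weighted sum of all cells.
    context = [sum(w * cell[d] for w, cell in zip(weights, cells))
               for d in range(dim)]
    width = len(features[0])
    attn_map = [weights[i:i + width] for i in range(0, len(weights), width)]
    return attn_map, context


# Toy grid with H' = 2, W' = 2, D = 2.
grid = [[[1.0, 0.0], [0.0, 1.0]],
        [[0.0, 0.0], [5.0, 0.0]]]
attn_map, context = attend(grid, [1.0, 0.0])
```

In a full decoder the context vector would be fed to the recurrent unit together with the previous token, and the step would repeat until an end token is produced.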

SLIDE 6

Final Results

[Table: Performance on the IM2LATEX-100K Test Set, with text-based and image-based metrics]
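
Text-based scores are typically computed by comparing predicted and reference markup as token sequences. A minimal sketch of one such measure is given here: token-level Levenshtein distance, normalized by the longer sequence. That the table reports exactly this score is an assumption, and the function names are hypothetical.

```python
def edit_distance(pred, ref):
    """Token-level Levenshtein distance between two token lists."""
    prev = list(range(len(ref) + 1))
    for i, p in enumerate(pred, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if p == r else 1
            curr.append(min(prev[j] + 1,          # drop a predicted token
                            curr[j - 1] + 1,      # add a missing reference token
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def edit_accuracy(pred, ref):
    """1 minus the edit distance normalized by the longer sequence."""
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(pred, ref) / longest


# Toy example: one substituted token out of five.
ref = r"\operatorname { sin } \Psi".split()
pred = r"\operatorname { cos } \Psi".split()
score = edit_accuracy(pred, ref)
```

Image-based scores instead compare the compiled prediction with the input image, which requires rendering the markup and is not sketched here.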

SLIDE 7

Impact of Formula Length on Performance

  • Long formulas are difficult to recognize
  • Performance also drops at short formulas
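
An analysis by formula length could be approximated by grouping test formulas into length buckets and computing exact-match accuracy per bucket. The sketch below assumes that length is measured in reference tokens and uses a bucket width of 50; both are illustrative choices, not values taken from the slides.

```python
from collections import defaultdict


def accuracy_by_length(pairs, bucket_width=50):
    """Exact-match accuracy per length bucket.

    pairs: iterable of (predicted_tokens, reference_tokens).
    Returns {bucket_start: accuracy}, with buckets keyed by the lower
    bound of the reference length range.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for pred, ref in pairs:
        start = (len(ref) // bucket_width) * bucket_width
        totals[start] += 1
        hits[start] += int(pred == ref)
    return {start: hits[start] / totals[start] for start in sorted(totals)}


# Toy data: two short formulas (one wrong) and one 60-token formula.
pairs = [
    (["x"], ["x"]),
    (["x", "+", "y"], ["x", "-", "y"]),
    (["a"] * 60, ["a"] * 60),
]
result = accuracy_by_length(pairs, bucket_width=50)
```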

SLIDE 8

Impact of Rare Token Classes
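
One way to identify rare token classes is to count how often each token occurs in the training markup and flag those below a threshold. This is a sketch only: the threshold, the function name and the toy formulas are assumptions, and the slides do not state how rarity was defined.

```python
from collections import Counter


def rare_tokens(formulas, max_count=1):
    """Return the set of tokens seen at most max_count times.

    formulas: iterable of token lists (tokenized markup).
    """
    counts = Counter(tok for formula in formulas for tok in formula)
    return {tok for tok, n in counts.items() if n <= max_count}


# Toy training set of three tokenized formulas.
train = [
    r"\frac { 1 } { 2 }".split(),
    r"\frac { x } { 2 }".split(),
    r"\aleph _ { 1 }".split(),
]
rare = rare_tokens(train, max_count=1)
```

Per-token accuracy could then be reported separately for the rare and the frequent groups.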

SLIDE 9

Importance of a Fine-grained Visual Representation

Im2Tex  Ours  \Delta \alpha

FE Type Attention Maps Predictions

SLIDE 10

Recursive Behavior in Long Formulas

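[Figure: an Input Image is passed through the blocks labeled Encoder, Decoder and FGFE; the LaTeX Prediction is shown next to its Compiled Image]

LaTeX Prediction: +\frac{1}{2}\[ … D_1((x-x\prime)^2) … …

Time steps shown: T = 108, T = 109, T = 132, T = 133, T = 134

Tokens shown: \lambda, \delta, ), _, \lambda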

SLIDE 11

Conclusion

Approach

  • We tackled the offline MER task
  • Our model was evaluated on the IM2LATEX dataset
  • We were able to improve results by over 4% in Img-Abs

Analysis

  • Analysis of the performance by formula length
  • Visualization of attention maps
  • Impact of rare tokens on performance
  • Typical errors our model produced

SLIDE 12

References

  • [Suzuki et al.] INFTY: an integrated OCR system for mathematical documents. M. Suzuki, F. Tamari, R. Fukuda, S. Uchida, and T. Kanahori. In DocEng, 2003.
  • [Graves et al.] Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber. In ICML, 2006.
  • [Xu et al.] Show, attend and tell: Neural image caption generation with visual attention. K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio. In ICML, 2015.
  • [Deng et al.] Image-to-markup generation with coarse-to-fine attention. Y. Deng, A. Kanervisto, J. Ling, and A. Rush. In ICML, 2017.

SLIDE 13

Learning Fine-Grained Image Representations for Mathematical Expression Recognition

Sidney Bender*, Monica Haurilet*, Alina Roitberg and Rainer Stiefelhagen

Karlsruhe Institute of Technology, Germany

haurilet@kit.edu

[Figure: an Input Image is mapped to LaTeX Markup by a model with blocks labeled Encoder, Decoder and FGFE]

\left[\;\Lambda\;\right]_{R}\ ^{S}= \left(\begin{array}{ll}{ \operatorname{cos}\Psi}& {-\operatorname{sin}\Psi}\\ {\operatorname{sin}\Psi}& {\operatorname{cos}\Psi}\\ \end{array}\right),