The world's information is not digital! - PowerPoint PPT Presentation



slide-1
SLIDE 1

The world's information is not digital!

An estimated 130 million unique books are in existence.

Attempting to digitize this information:

  • Google Books
  • Project Gutenberg
  • Million Book Project
  • LOC: American Memory
  • And many more

slide-2
SLIDE 2

Viewing scanned documents on mobile devices: two options

  • TOO BIG
  • TOO SMALL

slide-3
SLIDE 3

Is OCR the solution?

slide-4
SLIDE 4

BITeR

Bidirectional Image Text Reflow

slide-5
SLIDE 5

Concept: Reflowed Image

* Actual program output!

Rearrange text and images while preserving document layout

slide-6
SLIDE 6

Special Properties of text

Usually organized in paragraphs

slide-7
SLIDE 7

Special Properties of text

High directional variance in gradient

slide-8
SLIDE 8

Special Properties of text

Spacing between words is directly related to line height

slide-9
SLIDE 9

Special Properties of text

Paragraph margins are usually smaller than image/graph margins

slide-10
SLIDE 10

Methodology: Outline

  • Line height detection
  • Segmentation to textual and non-textual elements
  • Verification of segment classification
  • Further segmentation of textual segments
  • Ordering of textual segments
slide-11
SLIDE 11

Methodology: Line height detection

  • Sum pixel values on X axis
  • Filter using median value
  • Find median length of continuous segments
  • Robust – automatic re-targeting
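
The line height steps above can be sketched roughly as follows. This is a minimal illustration, not the presenters' code: it assumes the page is a list of pixel rows where ink has a high value, and it reads "filter using median value" as thresholding the row sums at their median (a median smoothing filter is another possible reading). The function names are hypothetical.

```python
from statistics import median


def row_profile(image):
    """Sum pixel values along the X axis: one total per image row."""
    return [sum(row) for row in image]


def run_lengths(mask):
    """Lengths of each continuous run of True values in a list of booleans."""
    runs, current = [], 0
    for flag in mask:
        if flag:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def estimate_line_height(image):
    """Median height (in rows) of the bands whose ink sum exceeds the median sum."""
    profile = row_profile(image)
    threshold = median(profile)  # assumption: threshold at the median row sum
    mask = [value > threshold for value in profile]
    runs = run_lengths(mask)
    return median(runs) if runs else 0
```

Taking the median of the run lengths, rather than the mean, keeps a single tall figure or a stray speck from shifting the estimate. Note that this sketch returns 0 when more than half of the rows carry the same amount of ink, since nothing then exceeds the median.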
slide-12
SLIDE 12

Methodology: Line height detection

28

slide-13
SLIDE 13

Methodology: Image Segmentation

  • Compute directional variance of gradient along the X coordinate
  • Threshold result using median value
  • Absorb segments smaller than computed document line height
  • Verify and reclassify using two metrics:
      1. Margin width
      2. Segment line height
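
A rough sketch of the first three segmentation steps is given below. It is an assumption-laden illustration: "directional variance of gradient along the X coordinate" is read here as the per-row variance of horizontal pixel differences, small segments are merged into the segment above them, and the verification step (margin width, segment line height) is left out. All function names are hypothetical.

```python
from statistics import median, pvariance


def gradient_variance_per_row(image):
    """Variance of the horizontal gradient (differences along X) for each row."""
    scores = []
    for row in image:
        grad = [b - a for a, b in zip(row, row[1:])]
        scores.append(pvariance(grad) if grad else 0.0)
    return scores


def label_rows(image):
    """True for rows whose gradient variance exceeds the median (text-like rows)."""
    scores = gradient_variance_per_row(image)
    threshold = median(scores)
    return [score > threshold for score in scores]


def to_segments(labels):
    """Group consecutive equal labels into [label, start, end) segments."""
    segments = []
    for i, label in enumerate(labels):
        if segments and segments[-1][0] == label:
            segments[-1][2] = i + 1
        else:
            segments.append([label, i, i + 1])
    return segments


def absorb_small(segments, line_height):
    """Merge segments shorter than line_height into the segment before them."""
    merged = []
    for label, start, end in segments:
        too_small = (end - start) < line_height
        if merged and (too_small or merged[-1][0] == label):
            merged[-1][2] = end  # extend the previous segment over this one
        else:
            merged.append([label, start, end])
    return merged
```

Text rows alternate sharply between ink and paper, so their horizontal differences vary far more than those of flat background rows. A leading segment that is too small is kept as-is in this sketch because there is nothing before it to absorb it.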

slide-14
SLIDE 14

Methodology: Image Segmentation

[Figure: three panels labeled Original, Initial Segmentation, and Final Segmentation, with regions marked IMAGE and TEXT]

slide-15
SLIDE 15

Methodology: Word Identification

  • Smooth segment using a filter based on detected segment line height.
  • Detect connected components after smoothing
  • Filter out small components.
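
The word identification steps could look something like the sketch below. The slides do not say which smoothing filter is used, so this assumes a horizontal "any ink nearby" filter whose half-width is a quarter of the line height, 4-connected components, and an area cutoff that defaults to the line height. These parameters and all names are illustrative guesses.

```python
from collections import deque


def smooth_rows(binary, width):
    """Set a pixel if any ink lies within `width` pixels of it on the same row."""
    out = []
    for row in binary:
        n = len(row)
        out.append([1 if any(row[max(0, x - width):min(n, x + width + 1)]) else 0
                    for x in range(n)])
    return out


def connected_components(binary):
    """Bounding boxes (x0, y0, x1, y1, area) of 4-connected ink regions."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            area = 0
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1, area))
    return boxes


def find_words(binary, line_height, min_area=None):
    """Smooth, label components, and drop components smaller than min_area."""
    width = max(1, line_height // 4)  # assumption: filter size tied to line height
    if min_area is None:
        min_area = line_height        # assumption: cutoff tied to line height
    smoothed = smooth_rows(binary, width)
    return [box for box in connected_components(smoothed) if box[4] >= min_area]
```

Smoothing closes the narrow gaps between letters while leaving the wider gaps between words open, so each surviving component approximates one word.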
slide-16
SLIDE 16

Methodology: Text Baseline

Problem: Due to close cropping, words will become misaligned

Solution: Detect true word baseline using local maxima of vertical gradient
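
One way to picture the baseline step is sketched below. It is a simplified interpretation, not the presenters' implementation: the vertical gradient is taken as the drop in per-row ink from one row to the next, and the baseline is the strongest local maximum of that drop. The word image is assumed to be a list of pixel rows with ink = 1, and the function names are hypothetical.

```python
def row_ink(word):
    """Amount of ink on each row of a cropped word image."""
    return [sum(row) for row in word]


def baseline_row(word):
    """Index of the row just above the sharpest downward drop in ink."""
    # Pad with an empty row so a word that ends at the crop edge still shows a drop.
    ink = row_ink(word) + [0]
    drops = [ink[y] - ink[y + 1] for y in range(len(ink) - 1)]
    maxima = [
        y for y in range(len(drops))
        if drops[y] > 0
        and (y == 0 or drops[y] >= drops[y - 1])
        and (y == len(drops) - 1 or drops[y] >= drops[y + 1])
    ]
    # Fall back to the bottom row when no drop is found (for example, a blank image).
    return max(maxima, key=lambda y: drops[y]) if maxima else len(word) - 1
```

A word with a descender (such as "p") is cropped lower than a word without one, so aligning the bottoms of the crops would misplace it. Aligning the rows returned by `baseline_row` instead puts both words on the same line.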

slide-17
SLIDE 17

Methodology: Word order

  • Detect text line locations using the same methodology as line height detection
  • Order words by lines, then by X coordinate
  • RTL and LTR languages easily accommodated
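
The ordering rule above amounts to a two-key sort. The sketch below assumes each word is an `(x, y, label)` tuple with `y` at the word's vertical centre, and that the detected lines are `(top, bottom)` bands. These representations are illustrative choices, not taken from the slides.

```python
def line_index(y, lines):
    """Index of the (top, bottom) line band whose centre is closest to y."""
    return min(range(len(lines)),
               key=lambda i: abs((lines[i][0] + lines[i][1]) / 2 - y))


def order_words(words, lines, rtl=False):
    """Sort (x, y, label) words by text line, then by X (descending X for RTL)."""
    sign = -1 if rtl else 1
    return sorted(words, key=lambda w: (line_index(w[1], lines), sign * w[0]))
```

For example, with `lines = [(0, 10), (12, 22)]`, calling `order_words(words, lines, rtl=True)` reads each line from its right edge, which is all that is needed to switch between LTR and RTL scripts.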
slide-18
SLIDE 18

Methodology: Output

  • Segmented sections are output as individual images
  • Original document order is preserved along segments
  • Result is displayed as an HTML file, allowing easy viewing on multiple platforms.
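
As an illustration of the output stage, the sketch below writes the segments into a single HTML page. The markup, class names, and file names are assumptions: word images are placed inline so the browser reflows them to the screen width, while non-text segments are emitted as block images in their original order.

```python
from html import escape


def build_html(segments, title="Reflowed document"):
    """Build a page from segments in document order.

    Each segment is ('text', [word image paths]) or ('image', image path).
    """
    parts = [
        "<!DOCTYPE html>",
        '<html><head><meta charset="utf-8">',
        f"<title>{escape(title)}</title>",
        "<style>img.word{display:inline-block;margin:0 .2em;vertical-align:baseline}"
        "img.figure{display:block;max-width:100%}</style>",
        "</head><body>",
    ]
    for kind, payload in segments:
        if kind == "text":
            # Inline word images wrap like ordinary text when the viewport narrows.
            imgs = "".join(
                f'<img class="word" src="{escape(p, quote=True)}" alt="">'
                for p in payload
            )
            parts.append(f"<p>{imgs}</p>")
        else:
            parts.append(
                f'<img class="figure" src="{escape(payload, quote=True)}" alt="">'
            )
    parts.append("</body></html>")
    return "\n".join(parts)
```

A call such as `build_html([("text", ["w0.png", "w1.png"]), ("image", "fig0.png")])` returns the page as a string, which can then be saved and opened in any browser.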

slide-19
SLIDE 19

Results

Test Set                                                                   Accuracy
Excerpt from V.S. Nalwa's “A guided tour of computer vision”, 1993 (1)     96.1%
Excerpt from V.S. Nalwa's “A guided tour of computer vision”, 1993 (2)     91.6%
2010 English journal article                                               97.8%
1907 German journal article                                                99.7%
Hebrew sample text                                                         96.8%
Hebrew sample text 2                                                       84.9%
1953 article by Kuffler                                                    93.7%
1925, Gestalt theory by Max Wertheimer                                     57.3%
1983 excerpt from Human and Machine Vision, Vitkin & Tenenbaum             98.2%

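\[
\text{accuracy} = 0.25 \cdot \frac{\text{correctly identified}}{\text{total segments}}
+ \frac{0.75 \cdot (\text{correctly segmented}) + 0.5 \cdot (\text{baseline/segmenting errors}) - \frac{(\text{dropped/misplaced words})}{2}}{\text{total words}}
\]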

slide-20
SLIDE 20

Future Work

  • Improved adaptive parameters for filters
  • Better verification of segment identification
  • Support for multi-column layouts
  • Detection of special text formats (lists etc.)
slide-21
SLIDE 21

Thank you!

TOO BIG   TOO SMALL   JUST RIGHT!