Image Filtering Huicheng Zheng, Mohamed Daoudi Enic Telecom Lille 1 - - PowerPoint PPT Presentation



SLIDE 2

Image Filtering

Huicheng Zheng, Mohamed Daoudi
Enic Telecom Lille 1
Final Workshop, Pisa, 21-22 January 2004
(in collaboration with Bruno Jedynak, USTL)

SLIDE 3

Plan

  • Pornographic image filtering
  • Symbol recognition
  • Conclusion and perspectives

SLIDE 4

Motivation

  • A study of more than 4 million web pages shows that 70.1% of web pages contain images.
  • URL lists need manual updating.
  • Many pornographic web pages contain very little text.

SLIDE 5

Architecture of the Pornographic Image Filter

[Diagram: Skin detection, then Form analysis; each stage has Accept and Refuse outputs.]

SLIDE 6

Architecture of the Pornographic Image Filter

                    .. .. .. .. .. .. Feature Extraction Neural Network Skin detection

SLIDE 7

Skin Detection

Motivation: There is a strong correlation between images with large skin patches and pornographic images.

SLIDE 8

Difficulties

  • 1. Variations in skin color

SLIDE 9

Difficulties

  • 2. Variations in capture conditions (illumination, camera, compression, noise, ...)

SLIDE 10

Training Image Database

Compaq labeled image database: 18,696 images. Nearly 2 billion pixels in the training set.

SLIDE 11

Maximum Entropy Principle

Task: infer the (image, label) joint distribution model from the training data.

Tool: the maximum entropy principle. Choose a probability model that is consistent with the training data but otherwise as uniform as possible.
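One common way to write this principle down is sketched here. The notation is assumed for illustration and is not taken from the slides: $x$ is an image, $y$ its labels, and $f_i$ are the statistics measured on the training data.

```latex
\begin{aligned}
p^{*} &= \arg\max_{p} \; H(p), \qquad H(p) = -\sum_{x,y} p(x,y)\,\log p(x,y),\\
&\text{subject to } \; \mathbb{E}_{p}\big[f_i(x,y)\big] = \hat{\mathbb{E}}\big[f_i(x,y)\big] \quad \text{for all } i,\\
p^{*}(x,y) &= \frac{1}{Z(\lambda)} \exp\Big(\sum_i \lambda_i f_i(x,y)\Big).
\end{aligned}
```

The last line is the standard exponential form of the solution under such linear constraints; the $\lambda_i$ are the parameters that have to be estimated.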

SLIDE 12

Maximum Entropy Principle

Steps:

  • 1. Calculate the (color, label) histograms.
  • 2. Write down the maximum entropy model among those that have the calculated histograms.
  • 3. Estimate the parameters.
  • 4. Use the model for classification.
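As a rough illustration of the steps above in the simplest case, the sketch below assumes that pixels are treated independently, so that the model reduces to the (color, label) histogram itself and the parameter estimation step becomes trivial. The 32-level quantisation, the additive smoothing and all function names are assumptions made for this example, not details from the slides.

```python
import numpy as np

BINS = 32  # assumed quantisation: 32 levels per RGB channel


def color_index(pixels, bins=BINS):
    """Map uint8 RGB pixels of shape (N, 3) to one histogram bin index each."""
    q = (pixels.astype(np.int64) * bins) // 256
    return (q[:, 0] * bins + q[:, 1]) * bins + q[:, 2]


def fit_histograms(pixels, labels, bins=BINS):
    """Count (color, label) pairs; result has shape (bins**3, 2), label 1 = skin."""
    hist = np.zeros((bins ** 3, 2))
    np.add.at(hist, (color_index(pixels, bins), labels), 1.0)
    return hist


def skin_posterior(hist, pixels, bins=BINS, eps=1.0):
    """Per-pixel P(skin | color) from the histogram, with additive smoothing."""
    counts = hist[color_index(pixels, bins)] + eps
    return counts[:, 1] / counts.sum(axis=1)
```

Thresholding the returned probabilities gives a per-pixel skin mask; colors never seen in training fall back to 0.5 under this smoothing.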

SLIDE 13

Examples of Skin Detection

[Figure panels: Baseline model, THMM, TFOM]

  • Baseline model: assumes conditional independence between pixels.
  • THMM: hidden Markov model with tree approximation for probability inference.
  • TFOM: first-order model with tree approximation for probability inference.

SLIDE 14

ROC Curves

SLIDE 15

ROC Curve of TFOM

SLIDE 16

Examples of Skin Detection by TFOM

SLIDE 17

Examples of False Positives

SLIDE 18

Examples of False Negatives

SLIDE 19

Comparison [Vez 03] (same test database)

Method                                              TP           FP
Bayes SPM in RGB [Jones and Rehg 1999]              80% / 90%    8.5% / 14.2%
Bayes SPM in RGB [Jones and Rehg 2000]              93.4%        19.8%
Poesia Algorithm, Maxent [2002]                     80%          8%
Gaussian mixture [Jones and Rehg 1999]              80% / 90%    9.5% / 15.5%
SOM in TS [Brown et al. 2001]                       78%          32%
Elliptical boundary in CIE-xy [Lee and Yoo 2002]    90%          20.9%
Single Gaussian in CbCr [Lee and Yoo 2002]          90%          33.3%
Gaussian Mixture in IQ [Lee and Yoo 2002]           90%          30%
Thresholding of I axis [Brand and Masson 2000]      94.7%        30.2%

SLIDE 20

Feature Extraction

Features are based on two ellipses, the GFE (Global Fit Ellipse) and the LFE (Local Fit Ellipse):

  • Average probability in the image
  • Average probability in the GFE
  • Number of regions in the image
  • Position of the LFE
  • Orientation of the LFE
  • Shape of the LFE
  • Relative area of the LFE
  • Average probability in the LFE
  • Average probability outside the LFE
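The slides do not say how the fit ellipses are computed. One common choice, assumed here purely for illustration, is to derive an ellipse from the first- and second-order moments of the skin probability map; position, orientation and shape then follow directly, and the average probability inside the ellipse can be read off the map. The function names and the returned dictionary layout are hypothetical.

```python
import numpy as np


def fit_ellipse(prob):
    """Moment-based ellipse for a 2-D skin probability map (assumed method)."""
    h, w = prob.shape
    ys, xs = np.mgrid[0:h, 0:w]
    total = prob.sum()
    if total <= 0:
        raise ValueError("probability map is empty")
    cx = (prob * xs).sum() / total          # weighted centroid
    cy = (prob * ys).sum() / total
    dx, dy = xs - cx, ys - cy
    cov = np.array([[(prob * dx * dx).sum(), (prob * dx * dy).sum()],
                    [(prob * dx * dy).sum(), (prob * dy * dy).sum()]]) / total
    evals, evecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    # For a uniformly filled ellipse, semi-axis = 2 * sqrt(variance along it).
    minor, major = 2.0 * np.sqrt(np.maximum(evals, 0.0))
    vx, vy = evecs[:, 1]                    # direction of the major axis
    orientation = np.arctan2(vy, vx) % np.pi
    return {"center": (cx, cy), "major": major, "minor": minor,
            "orientation": orientation}


def mean_prob_inside(prob, ell):
    """Average probability over the pixels that fall inside the ellipse."""
    h, w = prob.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dx, dy = xs - ell["center"][0], ys - ell["center"][1]
    c, s = np.cos(ell["orientation"]), np.sin(ell["orientation"])
    u = dx * c + dy * s                     # coordinate along the major axis
    v = -dx * s + dy * c                    # coordinate along the minor axis
    inside = ((u / max(ell["major"], 1e-9)) ** 2
              + (v / max(ell["minor"], 1e-9)) ** 2) <= 1.0
    return float(prob[inside].mean()) if inside.any() else 0.0
```

Applying `fit_ellipse` to the whole map would correspond to a global ellipse, and applying it to the largest region only would correspond to a local one; that mapping is also an assumption.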

SLIDE 21

Learning of Pornographic Image Neural Network

Bosson and Cawley [2002]: The neural network offers a statistically significant performance improvement over several other approaches.

Training set: 1,297 pornographic images collected by the end users and 3,787 other images

SLIDE 22

ROC Curve of the Test Database

Test database: 1,297 pornographic images and 3,787 other images.
Elapsed time: 0.19 s/image.
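A ROC curve of this kind is obtained by sweeping a decision threshold over the classifier's output score and recording the true positive and false positive rates at each threshold. The sketch below shows that computation on generic scores; the function name and the convention that higher scores mean "pornographic" are assumptions, and no data from the slides is used.

```python
import numpy as np


def roc_points(scores, labels):
    """Return (FP rate, TP rate, threshold) triples, strictest threshold first."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)     # True = positive class
    n_pos = labels.sum()
    n_neg = (~labels).sum()
    points = []
    for t in np.unique(scores)[::-1]:           # thresholds in descending order
        flagged = scores >= t                   # images rejected at threshold t
        tpr = (flagged & labels).sum() / n_pos
        fpr = (flagged & ~labels).sum() / n_neg
        points.append((float(fpr), float(tpr), float(t)))
    return points
```

Choosing an operating point then amounts to picking the threshold whose (FP, TP) pair is acceptable for the application.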

SLIDE 23

Examples of False Detection

Op=0.006828 Op=0.000005 Op=0.899044 Op=0.938251

SLIDE 24

Symbol filtering

Symbol recognition is one of the challenging problems in the pattern recognition community. There is no general solution to this problem, and few solutions exist.

SLIDE 25

Symbol recognition

Edge detection

Invariant descriptors:

  • Moment descriptors
  • Zernike moments (recommended by MPEG-7 for image retrieval)
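To illustrate why Zernike moments are useful as invariant descriptors, the sketch below evaluates the textbook definition directly and returns the magnitude of a single moment, which does not change when the symbol is rotated. This is a plain, slow evaluation and not a fast algorithm; the function names, the mapping of pixels onto the unit disk and the omission of a pixel-area normalisation are assumptions made for the example.

```python
import math
import numpy as np


def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm evaluated on an array of radii."""
    m = abs(m)
    out = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        coef = ((-1) ** s * math.factorial(n - s)
                / (math.factorial(s)
                   * math.factorial((n + m) // 2 - s)
                   * math.factorial((n - m) // 2 - s)))
        out += coef * rho ** (n - 2 * s)
    return out


def zernike_magnitude(img, n, m):
    """|Z_nm| of a 2-D image (at least 2x2) mapped onto the unit disk."""
    if abs(m) > n or (n - abs(m)) % 2 != 0:
        raise ValueError("need |m| <= n and n - |m| even")
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x = (2 * xs - (w - 1)) / (w - 1)        # pixel centres mapped to [-1, 1]
    y = (2 * ys - (h - 1)) / (h - 1)
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = rho <= 1.0                     # pixels outside the disk are ignored
    basis_conj = radial_poly(n, m, rho) * np.exp(-1j * m * theta)
    z = (n + 1) / np.pi * np.sum(img[inside] * basis_conj[inside])
    return abs(z)
```

A descriptor for a symbol can then be formed by collecting the magnitudes for several (n, m) pairs; for example, `zernike_magnitude(img, 4, 2)` gives the same value for `img` and for `np.rot90(img)` on a square image, up to rounding.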

SLIDE 26

Architecture

[Diagram components: New Symbol, Compute Fast Zernike Moments, Symbols Collection, Fast Zernike Moments, Feature Extraction.]

SLIDE 27

Symbol recognition

194 harmful symbols collected; 21 non-harmful symbols.

SLIDE 28

Results

Test set: 375 harmful symbols (obtained by rotations at different angles, scaling with different ratios, translations by different pixel offsets, and JPEG compression with different quality factors) and 105 benign symbols downloaded from the web. The TN rate for benign symbols is 0.89 and the TP rate for harmful symbols is 0.85. The average elapsed time per symbol is 0.13 s.

SLIDE 29

Conclusion

  • Our adult image filter is more practical than existing systems in terms of processing speed.
  • We propose the first web symbol filter.
  • http://cvs.sourceforge.net/viewcvs.py/poesia/PoesiaSoft/ImageFilter