Detection and Localization of HTML Presentation Failures Using Computer Vision-Based Techniques
Sonal Mahajan and William G. J. Halfond
Department of Computer Science, University of Southern California
Presentation of a Website
- What do we mean by presentation?
– “Look and feel” of the website in a browser
- What is a presentation failure?
– Web page rendering ≠ expected appearance
- Why is it important?
– It takes users only 50 ms to form an opinion about a website (Google research, 2012)
– Affects impressions of trustworthiness, usability, company branding, and perceived quality
End user – no penalty to move to another website
Business – loses out on valuable customers
Motivation
- Manual detection is difficult
– Complex interactions between HTML, CSS, and JavaScript
– Hundreds of HTML elements and CSS properties
– Labor-intensive and error-prone
- Our approach: automate the debugging of presentation failures
Two Key Insights
- 1. Detect presentation failures
[Diagram: oracle image + test web page → visual comparison → presentation failures]
Use computer vision techniques
Two Key Insights
- 2. Localize to faulty HTML elements
[Diagram: test web page + layout tree → faulty HTML elements]
Use rendering maps
Limitations of Existing Techniques
- Regression debugging
– The current version of the web app is modified
- Correct a bug
- Refactor HTML (e.g., convert <table> layout to <div> layout)
– DOM comparison techniques (XBT) are not useful if the DOM has changed significantly
- Mockup-driven development
– Front-end developers convert high-fidelity mockups to HTML pages
– DOM comparison techniques cannot be used, since there is no existing DOM
– Invariant-specification techniques (Selenium, Cucumber, Sikuli) are not practical, since all correctness properties must be specified
– Fighting Layout Bugs: an application-independent correctness checker
Running Example
Web page rendering ≠ expected appearance (oracle)
Our Approach
Goal – Automatically detect and localize presentation failures in web pages
- P1. Detection
- P2. Localization
[Diagram: oracle image + test web page → visual differences → pixel–HTML mapping → report]
- P1. Detection
- Find visual differences (presentation failures)
- Compare the oracle image and a screenshot of the test page
- Simple approach: strict pixel-to-pixel equivalence comparison
– Drawbacks
- Spurious differences due to platform differences
- Small differences may be “OK”
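The strict comparison above can be sketched in a few lines (an illustrative sketch, not WebSee's implementation; `pixel_diff` and the list-of-tuples image representation are our own assumptions — a real tool would load browser screenshots with an imaging library):

```python
# Strict pixel-to-pixel equivalence comparison.
# Images are modeled as 2D lists of (r, g, b) tuples.

def pixel_diff(oracle, test):
    """Return the set of (x, y) coordinates whose pixels differ exactly."""
    diffs = set()
    for y, (row_o, row_t) in enumerate(zip(oracle, test)):
        for x, (p_o, p_t) in enumerate(zip(row_o, row_t)):
            if p_o != p_t:
                diffs.add((x, y))
    return diffs
```

Because equality is exact, a one-level anti-aliasing difference on every edge pixel is reported just like a real failure — the drawback noted above.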
Perceptual Image Differencing (PID)
- Uses models of the human visual system
– Spatial sensitivity
– Luminance sensitivity
– Color sensitivity
- Configurable parameters
– Δ : Threshold value for a perceptible difference
– F : Field of view of the observer
– L : Brightness of the display
– C : Sensitivity to colors
Shows only human-perceptible differences
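PID proper (Yee's algorithm) models spatial, luminance, and color sensitivity together; the sketch below keeps only the luminance-threshold idea (the Δ parameter) to show how sub-threshold differences are filtered out. The function names and the Rec. 601 luma weights are our simplification, not the full model:

```python
def luminance(rgb):
    # Rec. 601 luma approximation of perceived brightness.
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def perceptual_diff(oracle, test, delta=10.0):
    """Flag only pixels whose luminance difference exceeds the
    perceptibility threshold delta; tiny rendering differences
    (e.g. anti-aliasing noise) fall below it and are ignored."""
    diffs = set()
    for y, (row_o, row_t) in enumerate(zip(oracle, test)):
        for x, (p_o, p_t) in enumerate(zip(row_o, row_t)):
            if abs(luminance(p_o) - luminance(p_t)) > delta:
                diffs.add((x, y))
    return diffs
```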
Filter differences belonging to dynamic areas
- P1. Detection – Example
[Diagram: test web page screenshot vs. oracle → visual comparison using PID → clustering (DBSCAN) → difference clusters A, B, C]
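The clustering step can be illustrated with a minimal DBSCAN over 2-D difference pixels (a textbook O(n²) sketch with made-up `eps`/`min_pts` defaults; a real implementation would use a tuned library version):

```python
from collections import deque

def dbscan(points, eps=2.0, min_pts=3):
    """Minimal DBSCAN: group difference pixels into dense clusters.
    Returns a list of clusters (lists of points); noise is dropped.
    Neighbour search is O(n^2), fine for a sketch."""
    points = list(points)
    labels = {}  # point index -> cluster id

    def neighbours(i):
        px, py = points[i]
        return [j for j, (qx, qy) in enumerate(points)
                if (px - qx) ** 2 + (py - qy) ** 2 <= eps ** 2]

    cluster_id = 0
    for i in range(len(points)):
        if i in labels:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            continue  # noise (may still be claimed later as a border point)
        labels[i] = cluster_id
        queue = deque(nbrs)
        while queue:
            j = queue.popleft()
            if j in labels:
                continue
            labels[j] = cluster_id
            j_nbrs = neighbours(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)
        cluster_id += 1

    clusters = [[] for _ in range(cluster_id)]
    for i, cid in labels.items():
        clusters[cid].append(points[i])
    return clusters
```

Each returned cluster corresponds to one contiguous region of difference pixels (like A, B, C above), which is then localized independently.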
- P2. Localization
- Identify the faulty HTML element
- Use an R-tree to map pixel-level visual differences to HTML elements
- “R”ectangle tree: a height-balanced tree, popular for storing multidimensional data
Use rendering maps to find the faulty HTML elements corresponding to visual differences
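An R-tree answers "which element boxes contain this pixel?" in logarithmic time; the linear scan below computes the same answer and is enough to show the pixel-to-element mapping. The `rendering_map` list of (XPath, box) pairs is our illustrative stand-in for the rendering map:

```python
def elements_at(pixel, rendering_map):
    """rendering_map: list of (xpath, (x1, y1, x2, y2)) rendered
    bounding boxes. Return the XPaths of every element whose box
    contains the pixel -- the candidate faulty elements."""
    x, y = pixel
    return [xpath for xpath, (x1, y1, x2, y2) in rendering_map
            if x1 <= x <= x2 and y1 <= y <= y2]
```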
- P2. Localization - Example
[Diagram: sub-tree of the R-tree rooted at rectangle R1]
- P2. Localization - Example
[Diagram: R-tree rectangles R2–R5; difference pixel at (100, 400)]
Map pixel-level visual differences to HTML elements
- P2. Localization - Example
[Diagram: R-tree rectangles R1–R3 over layout elements table, tr[2], td]
Result Set:
/html/body/…/tr[2]
/html/body/…/tr[2]/td[1]
/html/body/…/tr[2]/td[1]/table[1]
/html/body/…/tr[2]/td[1]/table[1]/tr[1]
/html/body/…/tr[2]/td[1]/table[1]/td[1]
Special Regions Handling
- Special regions = dynamic portions (actual content not known)
- 1. Exclusion Region
- 2. Dynamic Text Region
- 1. Exclusion Regions
- Only apply size bounding property
[Figure: test web page vs. oracle — difference pixels filtered during detection; the advertisement box <element> is reported as faulty]
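Filtering difference pixels that fall inside an exclusion region can be sketched as follows (illustrative names; boxes are inclusive (x1, y1, x2, y2) rectangles):

```python
def filter_exclusion_regions(diff_pixels, exclusion_regions):
    """Drop difference pixels inside any exclusion region (dynamic
    content such as an ad box, whose actual pixels cannot be
    compared); only the region's size/position still matters."""
    def excluded(pixel):
        x, y = pixel
        return any(x1 <= x <= x2 and y1 <= y <= y2
                   for (x1, y1, x2, y2) in exclusion_regions)
    return {p for p in diff_pixels if not excluded(p)}
```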
- 2. Dynamic Text Regions
- The style properties of the text are known
[Figure: test web page vs. modified test web page (oracle) — the known text styles (color: red; font-size: 12px; font-weight: bold) are applied, then P1 and P2 are run; the news box <element> is reported as faulty]
- P3. Result Set Processing
- Rank the HTML elements in order of likelihood of being faulty
- Weighted prioritization score: the lower the score, the higher the likelihood of being faulty
Use heuristics based on element relationships
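A weighted prioritization score over the four heuristics described next (contained elements C, overlapped elements O, cascading D, pixel ratio P) might be combined as below. The equal weights and the convention that each heuristic is normalized to [0, 1] with lower meaning more suspicious are our assumptions, not the paper's tuned values:

```python
def prioritize(candidates, weights=(1.0, 1.0, 1.0, 1.0)):
    """candidates: dict mapping an element's XPath to its
    (C, O, D, P) heuristic scores, each in [0, 1] with lower =
    more suspicious. Return XPaths sorted by weighted score,
    most likely faulty element first."""
    def score(item):
        _, feats = item
        return sum(w * f for w, f in zip(weights, feats))
    return [xpath for xpath, _ in sorted(candidates.items(), key=score)]
```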
3.1 Contained Elements (C)
[Figure: expected vs. actual appearance — a parent element containing child1 and child2; parent, child1, and child2 all marked ✖]
3.2 Overlapped Elements (O)
[Figure: expected vs. actual appearance of parent, child1, and child2; overlapped elements marked ✖]
3.3 Cascading (D)
[Figure: expected vs. actual appearance of element 1, element 2, and element 3; cascaded elements marked ✖]
3.4 Pixels Ratio (P)
[Figure: parent and child both marked ✖ — child pixel ratio = 100%, parent pixel ratio = 20%]
Candidate set: /html, /html/body, /html/body/table, …, /html/body/table/…/img
- 1. /html/body/table/…/img
. . .
- 5. /html/body/table
- 6. /html/body
- 7. /html
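The pixel-ratio heuristic in the figure — the child fully covered by difference pixels, the parent only 20% covered — can be computed as follows (an illustrative sketch; boxes are inclusive (x1, y1, x2, y2) rectangles):

```python
def pixel_ratio(element_box, diff_pixels):
    """Fraction of the element's rendered area covered by difference
    pixels. An element fully covered (ratio 1.0) is ranked as more
    likely faulty than an ancestor only partially covered."""
    x1, y1, x2, y2 = element_box
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    inside = sum(1 for (x, y) in diff_pixels
                 if x1 <= x <= x2 and y1 <= y <= y2)
    return inside / area
```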
- P3. Result Set Processing - Example
[Figure: report containing one ranked list per difference cluster (A–E); e.g., Cluster A: /html, /html/body, /html/body/table, …, /html/body/table/…/img]
Empirical Evaluation
- RQ1: What is the accuracy of our approach for detecting and localizing presentation failures?
- RQ2: What is the quality of the localization results?
- RQ3: How long does it take to detect and localize presentation failures with our approach?
Experimental Protocol
- Approach implemented in “WebSee”
- Five real-world subject applications
- For each subject application
– Download the page and take a screenshot to use as the oracle
– Seed a unique presentation failure to create a variant
– Run WebSee on the oracle and the variant
Subject Applications
Subject Application | Size (Total HTML Elements) | # Generated Test Cases
Gmail               | 72    | 52
USC CS Research     | 322   | 59
Craigslist          | 1,100 | 53
Virgin America      | 998   | 39
Java Tutorial       | 159   | 50
RQ1: What is the accuracy?
[Chart: localization accuracy — Gmail 92%, USC CS Research 92%, Craigslist 90%, Virgin America 97%, Java Tutorial 94%; 93% overall]
- Detection accuracy: Sanity check for PID
- Localization accuracy: % of test cases in which the expected faulty element was reported in the result set
RQ2: What is the quality of localization?
[Chart: result set size per subject — Gmail 12 (16%), USC CS Research 17 (5%), Craigslist 32 (3%), Virgin America 49 (5%), Java Tutorial 8 (5%); overall 23 (10%)]
[Illustration: a 23-element ranked result set; rank of the expected faulty element = 4.8 (2%); when the faulty element is not present, distance = 6]
RQ3: What is the running time?
[Chart: running time breakdown — P1 Detection 21%, P2 Localization 25%, P3 Result Set Processing 54%]
Running time: 7 sec – 3 min, 87 sec on average; dominated by the sub-image search for the cascading heuristic
Comparison with User Study
Accuracy | Detection | Localization
Students | 76%       | 36%
WebSee   | 100%      | 93%
- Graduate-level students
- Manual detection and localization using Firebug
- Time
– Students: 7 min
– WebSee: 87 sec
Case Study with Real Mockups
- Three subject applications
- 45% of the faulty elements reported in the top five
- 70% reported in the top ten
- Analysis time similar
- Analysis time similar
Summary
- A technique for automatically detecting and localizing presentation failures
- Uses computer vision techniques for detection
- Uses rendering maps for localization
- Empirical evaluation shows positive results
Thank you
Sonal Mahajan (spmahaja@usc.edu) and William G. J. Halfond (halfond@usc.edu)
Normalization Process
- Pre-processing step before detection
- 1. The browser window size is adjusted based on the oracle
- 2. The zoom level is adjusted
- 3. Scrolling is handled
Difference with XBT
- XBT techniques use DOM comparison
– Find matched nodes, then compare them
- Regression debugging
– Correct a bug, or refactor HTML (e.g., <table> to <div> layout)
– The DOM changes significantly
- XBT cannot find matching DOM nodes, so the comparison is not accurate
- Mockup-driven development
– No “golden” version of the page (DOM) exists
– XBT techniques cannot be used
- Our approach
– Uses computer vision techniques for detection
– Applies to both scenarios
Pixel-to-pixel Comparison
Oracle Test Webpage Screenshot
Pixel-to-pixel Comparison
Difference image
98% of the entire image is flagged as different!
[Legend: difference pixel / matched pixel]
- P1. Detection
Analyze using computer vision techniques
- Our approach: perceptual image differencing (PID)
- Find visual differences (presentation failures)
- Simple approach: strict pixel-to-pixel equivalence comparison
Perceptual Image Differencing
Difference image
[Legend: difference pixel / matched pixel]
High fidelity mockups… reasonable?
[Series of slides showing real high-fidelity mockup examples]